Open-Source Nihon Kohden → EDF(+) Converter: Features and Usage
Electrophysiological recording systems from manufacturers like Nihon Kohden produce proprietary file formats that are rich in device-specific metadata and optimized for clinical workflows. For research, long-term archiving, or cross-platform analysis, it is often necessary to convert those recordings to a standardized, open format such as EDF (European Data Format) or its extension, EDF+. This article describes an open-source Nihon Kohden → EDF(+) converter: its main features, how it handles signals and metadata, installation and usage examples, implementation notes, limitations, and best practices for preserving data integrity during conversion.
Why convert Nihon Kohden files to EDF(+)
- Interoperability: EDF(+) is widely supported by research tools (EEGLAB, MNE-Python, Polyman, BIOSIG, EDFbrowser) and by many analytics pipelines.
- Long-term access and reproducibility: Open formats reduce vendor lock-in and make data easier to share with collaborators, repositories, and journals.
- Standardized annotations: EDF+ supports event annotations in a structured way, simplifying subsequent scoring and automated analysis.
- Preservation of timestamps and metadata: A careful converter can map device-specific timestamps, channel labels, and sampling rates into EDF+ fields so clinical context is retained.
Key features of a quality open-source Nihon Kohden → EDF(+) converter
- Support for common Nihon Kohden formats: Ability to read proprietary files generated by a range of Nihon Kohden devices (e.g., EEG/PSG/ECG recorders). Many vendors use container files plus binary signal blocks; a robust converter parses headers, data blocks, and any accompanying annotation/auxiliary files.
- Accurate sampling-rate and scaling conversion: Properly interpret per-channel ADC gains, offsets, and units so the EDF values represent physical units (µV, mV, etc.) correctly.
- Timestamps and continuity handling: Preserve absolute start time and handle discontinuities or gaps in recordings (e.g., by inserting annotation events or padding).
- Annotation and event mapping: Convert device-specific event markers and free-text logs into EDF+ annotations with timestamps and duration where applicable.
- Batch processing: Command-line support for recursively converting directories of files, queuing, and parallel processing.
- Lossless or controlled lossy options: Prefer lossless conversion (integer-to-integer scaling when possible). Optionally allow resampling where necessary with clear warnings.
- Channel selection and metadata editing: Let users include/exclude channels, rename labels, adjust montages, or inject study-level metadata (patient ID, recording reason) while maintaining provenance.
- Validation and verification tools: Provide checksum or sample-level comparisons, and a viewer integration to visually inspect signals before/after conversion.
- Cross-platform packaging and licensing: Distribute as source code under a permissive open-source license (MIT, BSD, or Apache) and provide prebuilt binaries or containers for Linux, macOS, and Windows.
- Extensible design: Modular IO and conversion pipelines so contributors can add support for other vendors or additional EDF variants.
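To make the annotation and event mapping feature concrete, here is a minimal sketch of how a single event could be encoded as an EDF+ time-stamped annotation list (TAL): a signed onset in seconds, an optional duration introduced by byte 0x15, and the text delimited by byte 0x14, terminated by a zero byte. The function name `encode_tal` is hypothetical and not part of any existing converter; in practice a writer library such as pyEDFlib would normally handle this encoding.

```python
from typing import Optional


def _format_seconds(value: float, signed: bool) -> str:
    """Render seconds as a compact decimal string, e.g. 180.0 -> '+180'."""
    text = f"{value:+.6f}" if signed else f"{value:.6f}"
    return text.rstrip("0").rstrip(".")


def encode_tal(onset_s: float, text: str, duration_s: Optional[float] = None) -> bytes:
    """Encode one event as an EDF+ TAL (hypothetical helper, for illustration)."""
    if duration_s is not None and duration_s < 0:
        raise ValueError("duration must be non-negative")
    # The separator bytes must not appear inside the annotation text itself.
    for forbidden in ("\x14", "\x15", "\x00"):
        text = text.replace(forbidden, " ")
    tal = _format_seconds(onset_s, signed=True).encode("ascii")
    if duration_s is not None:
        tal += b"\x15" + _format_seconds(duration_s, signed=False).encode("ascii")
    tal += b"\x14" + text.encode("utf-8") + b"\x14" + b"\x00"
    return tal


print(encode_tal(1800.2, "Apnea", 25.5))
print(encode_tal(180, "Lights off"))
```

The onset is relative to the recording start time stored in the EDF+ header, so device markers with absolute timestamps need to be converted to an offset first.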
Installation and prerequisites
A typical open-source converter is implemented in Python for portability, often using libraries such as numpy for numeric operations and pyEDFlib (or MNE-Python's export functions) for EDF writing. Example prerequisites:
- Python 3.10+
- numpy, scipy (for numeric ops and optional resampling)
- pyEDFlib (for writing EDF/EDF+ files) or mne (for higher-level IO and validation)
- click or argparse (for CLI parsing)
- packaging: Dockerfile or PyInstaller for standalone binaries
Installation (example with pip):
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install .
Or run via Docker:
docker build -t nk2edf .
docker run --rm -v /data:/data nk2edf convert /data/input/* -o /data/output
Basic usage examples
Command-line examples assume a converter binary or entry-point called nk2edf.
Single-file conversion:
nk2edf convert session1.ekg -o session1.edf
Batch conversion (directory):
nk2edf convert /recordings/nihonkohden/ -o /edf_out/ --recursive --parallel 4
Select channels, rename, and add metadata:
nk2edf convert file.nk --channels "EEG Fp1,Fp2,ECG" --rename "Fp1:Fp1-LE, Fp2:Fp2-RE" --patient "ID12345" --recording "Sleep study 2025-06-01" -o file.edf
Handle discontinuities by inserting annotations:
nk2edf convert gapfile.nk --annotate-gaps --gap-label "DISCONTINUITY" -o gapfile.edf
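What a gap-annotation option might do internally can be sketched as follows. This is an illustration under stated assumptions, not the converter's actual implementation: it assumes the parser yields a sorted list of `(start_time, n_samples)` blocks for a single sampling rate, and the function name `find_gaps` and the 1 ms tolerance are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(blocks, sfreq, label="DISCONTINUITY", tolerance_s=0.001):
    """Return (onset_s, duration_s, label) tuples for gaps between blocks.

    blocks: list of (start_time: datetime, n_samples: int), sorted by start.
    Onsets are wall-clock offsets from the start of the first block.
    """
    if not blocks:
        return []
    recording_start = blocks[0][0]
    gaps = []
    for (start, n_samples), (next_start, _) in zip(blocks, blocks[1:]):
        # Where this block should end if the recording were continuous.
        expected_end = start + timedelta(seconds=n_samples / sfreq)
        gap_s = (next_start - expected_end).total_seconds()
        if gap_s > tolerance_s:
            onset_s = (expected_end - recording_start).total_seconds()
            gaps.append((onset_s, gap_s, label))
    return gaps


t0 = datetime(2025, 6, 1, 22, 0, 0)
blocks = [
    (t0, 500 * 60),                          # 60 s of data at 500 Hz
    (t0 + timedelta(seconds=60), 500 * 60),  # contiguous with the first block
    (t0 + timedelta(seconds=150), 500 * 30), # starts 30 s late
]
print(find_gaps(blocks, sfreq=500))  # [(120.0, 30.0, 'DISCONTINUITY')]
```

Each returned tuple maps directly onto an EDF+ annotation (onset, duration, text). Whether the gap is then padded or written as a discontinuous file is a separate design choice.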
Programmatic usage (Python API):
from nk2edf import Converter
conv = Converter(input_path="session.ekg")
conv.set_channels(["EEG Fp1", "EEG Fp2"])
conv.add_metadata(patient_id="ID123")
conv.convert("session.edf")
How the converter maps Nihon Kohden data to EDF(+)
- Header parsing: Extracts acquisition start time, channel count, sampling rates (per channel if variable), physical min/max, digital min/max, patient and recording IDs, and any free-text notes.
- Signal scaling: Uses ADC calibration (gain, offset) to compute physical values. EDF requires per-channel physical_min/physical_max and digital_min/digital_max; the converter computes these to avoid clipping and preserve units. Typical mapping:
physical_value = (digital_value - offset) * scale
where scale and offset are derived from device ADC parameters.
- Annotations: Maps event markers (alarms, manual flags) into EDF+ annotation records with onset times relative to the recording start. If a marker has a duration, that is stored in EDF+ duration field. Free-text tags go into the annotation text.
- Time zones and DST: The EDF header stores the start date and time without a time-zone field. The converter can keep the device's local time and record the time zone in accompanying metadata, or normalize timestamps to UTC for multi-site studies.
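The signal-scaling step can be illustrated with a short sketch. Assuming the device formula given above, with a per-channel `scale` and `offset` and 16-bit samples, the EDF physical range follows from applying that formula to the digital extremes; EDF readers then recover physical values by linear interpolation between the two ranges. The function names are hypothetical and the 16-bit digital range is an assumption about the source data.

```python
import math


def edf_physical_range(scale, offset, digital_min=-32768, digital_max=32767):
    """Physical min/max that make EDF's linear mapping match the device formula."""
    physical_min = (digital_min - offset) * scale
    physical_max = (digital_max - offset) * scale
    return physical_min, physical_max


def edf_digital_to_physical(digital, physical_min, physical_max,
                            digital_min=-32768, digital_max=32767):
    """How an EDF reader turns a stored integer back into a physical value."""
    gain = (physical_max - physical_min) / (digital_max - digital_min)
    return physical_min + (digital - digital_min) * gain


# Example: 0.5 uV per bit with an ADC offset of 10 counts (illustrative values).
pmin, pmax = edf_physical_range(scale=0.5, offset=10)
device_value = (110 - 10) * 0.5
edf_value = edf_digital_to_physical(110, pmin, pmax)
assert math.isclose(device_value, edf_value)
```

Because the digital samples are copied unchanged and only the header ranges are computed, this mapping is lossless. One caveat: the EDF header stores each physical minimum and maximum in an 8-character ASCII field, so the computed ranges may need rounding to fit.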
Implementation notes & challenges
- Proprietary formats: Nihon Kohden formats vary across devices and firmware versions; some use plain binary with well-known header structures, others use encrypted or undocumented containers. Successful converters rely on reverse engineering, vendor documentation, or community-shared format specifications.
- Variable sampling rates: EDF allows each channel its own sampling frequency, but every data record must hold a whole number of samples per channel. The converter therefore needs a data-record duration that works for all rates; when none is practical, it may need to resample some channels or write them to separate files.
- Large file handling: Long polysomnography or continuous EEG files can be many GBs; implement streaming I/O, memory-mapped arrays, or chunked processing to avoid excessive RAM usage.
- Time alignment: Multi-file sessions (separate files per hour or per device module) require careful stitching and annotation of discontinuities.
- Legal/ethical: If reverse engineering vendor formats, comply with local laws and licenses. Open-source projects should avoid including proprietary binaries.
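For the variable-sampling-rate case, one possible approach is to compute the shortest data-record duration for which every channel contributes a whole number of samples. The sketch below does this with exact rational arithmetic; `record_layout` is a hypothetical helper, and a real converter would also have to check the resulting record size against the limits of its EDF writer.

```python
from fractions import Fraction
from math import gcd, lcm


def record_layout(sampling_rates_hz):
    """Return (record_duration_s, samples_per_record) for mixed sampling rates.

    The duration is the smallest positive value for which rate * duration
    is an integer for every channel.
    """
    if not sampling_rates_hz:
        raise ValueError("at least one sampling rate is required")
    # Parse via str() so decimal rates such as 0.5 become exact fractions.
    rates = [Fraction(str(rate)) for rate in sampling_rates_hz]
    duration = Fraction(
        lcm(*(rate.denominator for rate in rates)),
        gcd(*(rate.numerator for rate in rates)),
    )
    samples = [int(rate * duration) for rate in rates]
    return duration, samples


print(record_layout([256, 100]))     # (Fraction(1, 4), [64, 25])
print(record_layout([500, 200, 1]))  # (Fraction(1, 1), [500, 200, 1])
```

If the resulting duration is impractically long or short, that is a signal to resample the outlier channels rather than force them into a shared record.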
Validation and QC
- Visual inspection: Open converted EDF+ in EDFbrowser or MNE’s plotting tools to check channel waveforms, timing, and annotation placement.
- Signal-level comparison: Compare sample statistics (mean, std, min/max) between source and target to detect scaling or clipping errors.
- Check metadata: Confirm patient ID, start time, sampling rates, channel labels, and units are present and correct.
- Automated tests: Include unit tests with known input files and golden EDF outputs, plus integration tests for batch conversions.
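The signal-level comparison could look something like the following sketch, which summarizes one channel before and after conversion and reports any statistic that differs by more than a tolerance. It assumes both sequences are already in physical units; the function names and the choice of tolerance are illustrative assumptions.

```python
import statistics


def summarize(samples):
    """Basic per-channel statistics used for the comparison."""
    return {
        "mean": statistics.fmean(samples),
        "std": statistics.pstdev(samples),
        "min": min(samples),
        "max": max(samples),
    }


def compare_channel(source, converted, tolerance):
    """Return {statistic: absolute difference} for stats exceeding tolerance."""
    src, dst = summarize(source), summarize(converted)
    return {
        name: abs(src[name] - dst[name])
        for name in src
        if abs(src[name] - dst[name]) > tolerance
    }


source = [0.0, 10.0, -10.0, 5.0]
clipped = [0.0, 8.0, -8.0, 5.0]  # simulates clipping at +/- 8 units
print(compare_channel(source, source, tolerance=0.05))   # {}
print(sorted(compare_channel(source, clipped, tolerance=0.05)))
```

An empty result means the channel passed. A mismatch in min and max with an unchanged mean, as in the clipped example, typically points to a physical-range error rather than a gain error.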
Limitations and known issues
- Some Nihon Kohden features (embedded video, proprietary compression, or device diagnostics) may not be representable in EDF+; these can be exported as sidecar files and, if the user requests, referenced from annotations.
- Lossy resampling: Resampling can distort the signal. Apply an anti-aliasing filter before downsampling and document the change.
- Metadata fidelity: Vendor-specific metadata fields may not have a 1:1 mapping into EDF header fields; use complementary sidecar JSON files to preserve full provenance.
Best practices for reliable conversions
- Always keep original files unchanged; store converted EDF(+) alongside originals with clear provenance metadata (e.g., converter name, version, command-line used).
- Use checksums (SHA256) on both original and converted files to ensure file integrity during transfer.
- Prefer lossless integer mappings; only resample when necessary and document it.
- Maintain a log of conversion warnings/errors for each file to allow targeted review.
- For clinical or regulatory use, validate conversion on representative datasets and document verification steps.
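The checksum and provenance recommendations can be combined in a small sidecar file written next to each converted recording. The sketch below uses only the Python standard library; the JSON field names, the `.provenance.json` suffix, and the function names are assumptions made for illustration, not an established convention.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-GB recordings do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(source, converted, converter_version, command):
    """Write a JSON sidecar next to the converted file and return its path."""
    record = {
        "converter": "nk2edf",
        "version": converter_version,
        "command": command,
        "source": {"file": Path(source).name, "sha256": sha256_of(source)},
        "converted": {"file": Path(converted).name, "sha256": sha256_of(converted)},
    }
    sidecar = Path(str(converted) + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling `write_provenance("session1.ekg", "session1.edf", "0.1.0", "nk2edf convert session1.ekg -o session1.edf")` would produce `session1.edf.provenance.json`; the version string here is a placeholder.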
Example project layout (for developers)
A minimal open-source project structure:
- LICENSE (MIT/BSD/Apache)
- README.md (usage and examples)
- nk2edf/ (Python package)
  - __init__.py
  - io/ (parsers for Nihon Kohden formats)
  - writer/ (EDF/EDF+ writers using pyEDFlib)
  - cli.py
  - utils.py (scaling, timestamps, annotations)
- tests/ (unit and integration tests)
- docker/ (Dockerfile and scripts)
- examples/ (sample input files and expected outputs)
Community and contributions
Open-source success depends on documentation, reproducible examples, and welcoming contributors. Useful ways to contribute:
- Add parsers for additional Nihon Kohden firmware versions or devices.
- Improve unit tests with varied real-world recordings.
- Enhance GUI front-ends or integrate with viewers like EDFbrowser.
- Add support for sidecar metadata (BIDS for EEG/PSG).
- Improve performance (streaming, parallelism) and packaging (conda, PyPI, Windows binaries).
Conclusion
An open-source Nihon Kohden → EDF(+) converter bridges the gap between vendor-specific clinical recordings and open, analysis-friendly data formats. The best converters prioritize accurate signal scaling, preservation of timestamps and annotations, robust handling of large files, and transparent provenance. Proper validation and careful handling of edge cases (variable sampling rates, discontinuities, proprietary features) are key to trustworthy conversion suitable for research, archiving, and cross-platform analysis.