NM000323: EEG dataset, 54 subjects#
Lee et al. 2019 (ERP) — EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy
Citation: Min-Ho Lee, O-Yeon Kwon, Yong-Jeong Kim, Hong-Kyung Kim, Young-Eun Lee, John Williamson, Siamac Fazli, Seong-Whan Lee (2019). Lee et al. 2019 (ERP) — EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy. 10.1093/gigascience/giz002
54-participant EEG dataset — Lee et al. 2019 (ERP) — EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy.
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import NM000323
dataset = NM000323(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = NM000323(cache_dir="./data", subject="01")
Advanced query
dataset = NM000323(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
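To make the `$in` operator in the query above concrete, here is a minimal pure-Python sketch of how a MongoDB-style filter selects records. The `matches` helper and the sample records are hypothetical and only illustrate the semantics; this is not how eegdash evaluates queries internally.

```python
def matches(record: dict, query: dict) -> bool:
    """Return True if the record satisfies every condition (implicit AND)."""
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # {"$in": [...]} matches when the value is one of the listed options.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            # A plain value means an equality test.
            return False
    return True


# Hypothetical recording metadata, for illustration only.
records = [
    {"subject": "01", "session": "1"},
    {"subject": "02", "session": "1"},
    {"subject": "03", "session": "1"},
]
query = {"subject": {"$in": ["01", "02"]}}

selected = [r["subject"] for r in records if matches(r, query)]
print(selected)
```

With these sample records, only subjects "01" and "02" pass the filter.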
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{nm000323,
title = {Lee et al. 2019 (ERP) — EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy},
author = {Min-Ho Lee and O-Yeon Kwon and Yong-Jeong Kim and Hong-Kyung Kim and Young-Eun Lee and John Williamson and Siamac Fazli and Seong-Whan Lee},
doi = {10.1093/gigascience/giz002},
url = {https://doi.org/10.1093/gigascience/giz002},
}
About This Dataset#
BMI/OpenBMI dataset for P300.
Code: Lee2019-ERP
Paradigm: p300
DOI: 10.5524/100542
Subjects: 54
Sessions per subject: 2
Events: Target=1, NonTarget=2
Trial interval: [0.0, 1.0] s
Runs per session: 2
File format: MAT
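As a rough illustration of what the event codes and trial interval imply, the sketch below converts a stimulus onset into a sample range, assuming the 1000 Hz sampling rate listed under Acquisition. The `epoch_bounds` helper and the event onsets are hypothetical; the half-open range convention is a choice made here, not something the dataset prescribes.

```python
EVENT_ID = {"Target": 1, "NonTarget": 2}  # event codes from the summary above
TRIAL_INTERVAL_S = (0.0, 1.0)             # window relative to stimulus onset
SFREQ_HZ = 1000.0                         # sampling rate listed under Acquisition


def epoch_bounds(onset_sample, sfreq=SFREQ_HZ, interval=TRIAL_INTERVAL_S):
    """Return the half-open sample range [start, stop) for one trial."""
    tmin, tmax = interval
    start = onset_sample + int(round(tmin * sfreq))
    stop = onset_sample + int(round(tmax * sfreq))
    return start, stop


# Hypothetical (onset_sample, event_code) pairs, for illustration only.
events = [(5000, 1), (5215, 2), (5430, 2)]
code_to_label = {code: label for label, code in EVENT_ID.items()}

epochs = [(code_to_label[code], *epoch_bounds(onset)) for onset, code in events]
for label, start, stop in epochs:
    print(label, start, stop, stop - start)
```

With a half-open range each trial spans 1000 samples; a convention that includes the end point would give 1001.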
Lee2019-ERP
Acquisition
Sampling rate: 1000.0 Hz
Number of channels: 62
Channel types: eeg=62, emg=4
Channel names: AF3, AF4, AF7, AF8, C1, C2, C3, C4, C5, C6, CP1, CP2, CP3, CP4, CP5, CP6, CPz, Cz, EMG1, EMG2, EMG3, EMG4, F10, F3, F4, F7, F8, F9, FC1, FC2, FC3, FC4, FC5, FC6, FT10, FT9, FTT10h, FTT9h, Fp1, Fp2, Fz, O1, O2, Oz, P1, P2, P3, P4, P7, P8, PO10, PO3, PO4, PO9, POz, Pz, T7, T8, TP10, TP7, TP8, TP9, TPP10h, TPP8h, TPP9h, TTP7h
Montage: standard_1005
Hardware: BrainAmp
Software: OpenBMI
Reference: nasion
Ground: AFz
Sensor type: Ag/AgCl
Line frequency: 60.0 Hz
Impedance threshold: 10 kOhm
Cap manufacturer: Brain Products
Auxiliary channels: EMG (4 ch)
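The channel list above contains more names than the stated 62 channels because the four EMG channels are listed alongside the EEG channels. A minimal sketch of the split, assuming channel type can be inferred from the `EMG` name prefix (an assumption for illustration; the dataset's own channel metadata is authoritative):

```python
CHANNEL_TEXT = """
AF3, AF4, AF7, AF8, C1, C2, C3, C4, C5, C6, CP1, CP2, CP3, CP4, CP5, CP6,
CPz, Cz, EMG1, EMG2, EMG3, EMG4, F10, F3, F4, F7, F8, F9, FC1, FC2, FC3,
FC4, FC5, FC6, FT10, FT9, FTT10h, FTT9h, Fp1, Fp2, Fz, O1, O2, Oz, P1, P2,
P3, P4, P7, P8, PO10, PO3, PO4, PO9, POz, Pz, T7, T8, TP10, TP7, TP8, TP9,
TPP10h, TPP8h, TPP9h, TTP7h
"""

# Split on commas and trim the surrounding whitespace and line breaks.
channel_names = [name.strip() for name in CHANNEL_TEXT.split(",")]

# Assumption: EMG channels are exactly those whose name starts with "EMG".
emg_channels = [ch for ch in channel_names if ch.startswith("EMG")]
eeg_channels = [ch for ch in channel_names if not ch.startswith("EMG")]

print(len(channel_names), len(eeg_channels), len(emg_channels))
```

The counts (66 names, 62 EEG, 4 EMG) agree with the channel types above and with the 66-channel recordings reported under Dataset Statistics.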
Participants
Number of subjects: 54
Health status: healthy
Age: mean=29.5, min=24, max=35
Gender distribution: female=25, male=29
Handedness: right
BCI experience: mixed
Species: human
Experimental Protocol
Paradigm: p300
Task type: copy_spelling
Number of classes: 2
Class labels: Target, NonTarget
Study design: 36-symbol ERP row-column speller with random-set presentation and face stimuli, offline training and online test phases
Feedback type: visual
Stimulus type: rc_speller
Stimulus modalities: visual
Primary modality: visual
Mode: offline
Training/test split: True
Instructions: Subjects were asked to copy-spell given sentences by gazing at target characters on screen. In training: ‘NEURAL NETWORKS AND DEEP LEARNING’ (33 characters); in test: ‘PATTERN RECOGNITION MACHINE LEARNING’ (36 characters). Participants counted the number of times each target character flashed.
HED Event Annotations
Schema: HED 8.4.0 | Browse: https://www.hedtags.org/hed-schema-browser
Target
├─ Sensory-event
├─ Experimental-stimulus
├─ Visual-presentation
└─ Target
NonTarget
├─ Sensory-event
├─ Experimental-stimulus
├─ Visual-presentation
└─ Non-target
Paradigm-Specific Parameters
Detected paradigm: p300
Number of targets: 36
Number of repetitions: 5
Inter-stimulus interval: 135.0 ms
Stimulus onset asynchrony: 215.0 ms
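The timing parameters above can be combined into a rough per-character stimulation time. The sketch below assumes the inter-stimulus interval is measured from flash offset to the next onset (so flash duration is SOA minus ISI) and uses the 12 flashes per sequence given under Data Structure; it ignores any pauses between sequences or characters, so the result is a lower bound.

```python
ISI_MS = 135.0             # inter-stimulus interval
SOA_MS = 215.0             # stimulus onset asynchrony
FLASHES_PER_SEQUENCE = 12  # from the Data Structure section
REPETITIONS = 5            # sequences per character

# Assumption: ISI runs from flash offset to the next flash onset.
flash_ms = SOA_MS - ISI_MS

# Onset-to-onset time for one sequence and for all repetitions of a character.
sequence_ms = FLASHES_PER_SEQUENCE * SOA_MS
character_ms = REPETITIONS * sequence_ms

print(f"flash duration: {flash_ms:.0f} ms")
print(f"one sequence:   {sequence_ms / 1000:.2f} s")
print(f"one character:  {character_ms / 1000:.1f} s")
```

Under these assumptions each flash lasts 80 ms, one sequence takes 2.58 s, and one character takes 12.9 s of stimulation.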
Data Structure
Trials: {‘training’: 1980, ‘test’: 2160}
Trials context: Training: copy-spell ‘NEURAL NETWORKS AND DEEP LEARNING’ (33 characters). Test: copy-spell ‘PATTERN RECOGNITION MACHINE LEARNING’ (36 characters). Each character received 5 sequences of 12 flashes (60 flashes total).
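The trial counts follow directly from the sentence lengths and the flash schedule. A short check, counting spaces as characters (which is what the stated 33 and 36 imply):

```python
SENTENCES = {
    "training": "NEURAL NETWORKS AND DEEP LEARNING",
    "test": "PATTERN RECOGNITION MACHINE LEARNING",
}
SEQUENCES_PER_CHARACTER = 5
FLASHES_PER_SEQUENCE = 12

flashes_per_character = SEQUENCES_PER_CHARACTER * FLASHES_PER_SEQUENCE

# Spaces count as spelled characters, matching the stated 33 and 36.
n_characters = {phase: len(text) for phase, text in SENTENCES.items()}
trials = {phase: n * flashes_per_character for phase, n in n_characters.items()}

print(n_characters)
print(trials)
```

This reproduces 33 × 60 = 1980 training trials and 36 × 60 = 2160 test trials.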
Preprocessing
Data state: raw
Preprocessing applied: False
Signal Processing
Classifiers: LDA
Feature extraction: Mean Amplitudes
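Mean-amplitude features are commonly computed by averaging each channel over consecutive time bins and concatenating the results. The sketch below shows only that feature step, in plain Python; the function name, the bin count, and the toy epoch are assumptions for illustration and are not the settings of the original study.

```python
from statistics import fmean


def mean_amplitude_features(epoch, n_bins):
    """Average each channel over n_bins equal-length time bins.

    epoch is a list of channels, each a list of samples. Samples left over
    when the length is not divisible by n_bins are dropped.
    """
    features = []
    for channel in epoch:
        bin_len = len(channel) // n_bins
        for b in range(n_bins):
            segment = channel[b * bin_len:(b + 1) * bin_len]
            features.append(fmean(segment))
    return features


# Toy epoch: 2 channels x 4 samples, for illustration only.
epoch = [
    [0.0, 0.0, 2.0, 2.0],
    [1.0, 3.0, 5.0, 7.0],
]
features = mean_amplitude_features(epoch, n_bins=2)
print(features)
```

The resulting vector (one value per channel and bin) is what a linear classifier such as LDA would then receive as input.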
Cross-Validation
Method: training-test split
Evaluation type: within_session, cross_session
Performance (Original Study)
Accuracy: 96.7%
Accuracy Std: 0.05
Illiteracy Rate: 11.1
BCI Application
Applications: speller, communication
Online feedback: True
Tags
Pathology: Healthy
Modality: Visual
Type: Perception
Documentation
Description: EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy
DOI: 10.1093/gigascience/giz002
License: GPL-3.0
Investigators: Min-Ho Lee, O-Yeon Kwon, Yong-Jeong Kim, Hong-Kyung Kim, Young-Eun Lee, John Williamson, Siamac Fazli, Seong-Whan Lee
Senior author: Seong-Whan Lee
Contact: sw.lee@korea.ac.kr; Tel: +82-2-3290-3197; Fax: +82-2-3290-3583
Institution: Korea University
Department: Department of Brain and Cognitive Engineering
Address: 145 Anam-ro, Seongbuk-gu, Seoul, 02841, Korea
Country: KR
Repository: GigaDB
Publication year: 2019
Keywords: EEG datasets, brain-computer interface, event-related potential, steady-state visually evoked potential, motor-imagery, OpenBMI toolbox, BCI illiteracy
References
Lee, M. H., Kwon, O. Y., Kim, Y. J., Kim, H. K., Lee, Y. E., Williamson, J., … Lee, S. W. (2019). EEG dataset and OpenBMI toolbox for three BCI paradigms: An investigation into BCI illiteracy. GigaScience, 8(5), 1–16. https://doi.org/10.1093/gigascience/giz002
Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Hochenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A., Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software, 4(44), 1896. https://doi.org/10.21105/joss.01896
Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8
Generated by MOABB 1.5.0 (Mother of All BCI Benchmarks), NeuroTechX/moabb
Cohort#
Dataset Statistics#
Age distribution by gender (n=54, range 24–35 yr, mean 29.5 yr)
Channel counts: 66 ch (n=216 recordings)
Sampling frequencies: 1000.0 Hz (n=216 recordings)
Total recording duration: 58 h
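The recording count above is consistent with the subject, session, and run counts given earlier, and it allows a rough estimate of the average recording length. A quick check (the 58 h total is rounded, so the mean is approximate):

```python
N_SUBJECTS = 54
SESSIONS_PER_SUBJECT = 2
RUNS_PER_SESSION = 2
TOTAL_HOURS = 58  # rounded total from the statistics above

# One recording per subject, session, and run.
n_recordings = N_SUBJECTS * SESSIONS_PER_SUBJECT * RUNS_PER_SESSION

# Approximate average length of one recording, in minutes.
mean_minutes = TOTAL_HOURS * 60 / n_recordings

print(n_recordings)
print(f"{mean_minutes:.1f} min per recording")
```

This gives 216 recordings averaging roughly 16 minutes each.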
NEMAR Processing Statistics#
The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.
HED event descriptors word cloud
Manifest#
Full dataset metadata table
Dataset ID | NM000323
Title | Lee et al. 2019 (ERP) — EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy
Author (year) | Lee2019_ERP
Canonical | —
Importable as | NM000323, Lee2019_ERP
Year | 2019
Authors | Min-Ho Lee, O-Yeon Kwon, Yong-Jeong Kim, Hong-Kyung Kim, Young-Eun Lee, John Williamson, Siamac Fazli, Seong-Whan Lee
License | GPL-3.0
Citation / DOI | 10.1093/gigascience/giz002
Source links | OpenNeuro, NeMAR, Source URL
API Reference#
class eegdash.dataset.NM000323(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Base class: EEGDashDataset. Source: eegdash/dataset/registry.py
Lee et al. 2019 (ERP) — EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy
Study: nm000323 (NeMAR)
Author (year): Lee2019_ERP
Canonical: —
Also importable as: NM000323, Lee2019_ERP.
Modality: eeg; Experiment type: Attention; Subject type: Healthy. Subjects: 54; recordings: 216; tasks: 1.
Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
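The `query` parameter description implies that user filters are ANDed with a fixed dataset selection and that a user-supplied `dataset` key is rejected. A minimal sketch of that behaviour, where `build_query` is a hypothetical helper and the `"nm000323"` filter value is an assumption; this is not the library's actual implementation.

```python
from typing import Optional


def build_query(dataset_id: str, user_query: Optional[dict] = None) -> dict:
    """Combine a fixed dataset filter with optional user filters (implicit AND)."""
    user_query = dict(user_query or {})
    if "dataset" in user_query:
        # The dataset selection is fixed by the class, so it cannot be overridden.
        raise ValueError("query must not contain the key 'dataset'")
    return {"dataset": dataset_id, **user_query}


merged = build_query("nm000323", {"subject": {"$in": ["01", "02"]}})
print(merged)
```

In a MongoDB-style filter, several top-level keys in one document are combined with AND, which is why merging the two dictionaries is enough here.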
data_dir
Local dataset cache directory (cache_dir / dataset_id).
Type: Path
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/nm000323
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000323
DOI: https://doi.org/10.1093/gigascience/giz002
Examples
>>> from eegdash.dataset import NM000323
>>> dataset = NM000323(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
save(path: str, overwrite: bool = False, offset: int = 0)
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
Parameters:
path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
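To illustrate how `offset` shifts the saved layout, the sketch below lists the paths one would expect for a few datasets. `expected_layout` is a hypothetical helper that performs no I/O; the assumption that the file-name prefix matches the shifted subdirectory id is inferred from the documented tree, not confirmed against the implementation.

```python
from pathlib import PurePosixPath


def expected_layout(path, n_datasets, offset=0, kind="raw"):
    """List the signal and description files expected under path."""
    base = PurePosixPath(path)
    files = []
    for i in range(n_datasets):
        ds_id = i + offset  # offset is added to the dataset's position
        sub_dir = base / str(ds_id)
        # Assumption: the file-name prefix equals the subdirectory id.
        files.append(str(sub_dir / f"{ds_id}-{kind}.fif"))
        files.append(str(sub_dir / "description.json"))
    return files


layout = expected_layout("out", n_datasets=2, offset=10)
for entry in layout:
    print(entry)
```

Saving a large collection in chunks with increasing offsets would keep each recording at its original position in the combined directory.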
braindecode: BaseDataset from braindecode, windowed via create_windows_from_events.
pytorch: DataLoader; supports parallel workers and on-the-fly augmentations.
Swap any load_dataset(...) call for nm000323 to reproduce the tutorial on this dataset.
Provenance
¹Contributed to nemar in BIDS format.
²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.
³Persistent identifier: 10.1093/gigascience/giz002.
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset