NM000207: EEG dataset, 15 subjects#
Kojima et al. 2024 (Dataset B) — Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing
Citation: Simon Kojima, Shin’ichiro Kanoh (2024). Kojima et al. 2024 (Dataset B) — Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing.
15-participant EEG dataset — Kojima et al. 2024 (Dataset B) — Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing.
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import NM000207
dataset = NM000207(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = NM000207(cache_dir="./data", subject="01")
Advanced query
dataset = NM000207(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
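The `$in` operator in the query above follows MongoDB conventions: a record matches when its field value is one of the listed values. As a rough illustration only, the pure-Python sketch below mimics that behaviour on a list of dictionaries. The helper `matches` and the sample records are hypothetical and are not part of eegdash.

```python
def matches(record, query):
    """Return True if `record` satisfies every condition in `query`.

    Hypothetical helper: supports plain equality and the `$in` operator only.
    """
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # `$in` matches when the value is one of the listed options.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


# Made-up records standing in for catalog entries (subjects "01" to "15").
records = [{"subject": f"{i:02d}", "task": "p300"} for i in range(1, 16)]
query = {"subject": {"$in": ["01", "02"]}}
selected = [r["subject"] for r in records if matches(r, query)]
print(selected)
```

Conditions on several fields in the same dictionary are combined with a logical AND, which is why every condition must hold before a record is kept.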
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{nm000207,
title = {Kojima et al. 2024 (Dataset B) — Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing},
author = {Simon Kojima and Shin'ichiro Kanoh},
}
About This Dataset#
Class for Kojima2024B dataset management. P300 dataset.
Schema: HED 8.4.0 | Browse: https://www.hedtags.org/hed-schema-browser
Target
├─ Sensory-event
├─ Experimental-stimulus
├─ Visual-presentation
└─ Target
NonTarget
├─ Sensory-event
├─ Experimental-stimulus
├─ Visual-presentation
└─ Non-target
Paradigm-Specific Parameters
Detected paradigm: p300
Number of targets: 4
Number of repetitions: 15
Stimulus onset asynchrony: ASME-4stream overall 150 ms; ASME-2stream overall 300 ms; within stream 600 ms
Data Structure
Trials: ASME-4stream: 600 stimuli per trial (4 trials per run, 6 runs); ASME-2stream: 300 stimuli per trial (4 trials per run, 6 runs)
Blocks per session: 12
Block duration: 90.0 s
Trials context: 12 runs alternating between ASME-4stream and ASME-2stream, 4 trials per run
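The timing figures above can be cross-checked with a little arithmetic. The sketch below assumes that stimuli are interleaved across streams at a constant overall SOA, which the parameter list suggests but does not state outright. The function name `trial_timing` is made up for this example.

```python
# Values copied from the parameters listed above.
paradigms = {
    "ASME-4stream": {"streams": 4, "overall_soa_ms": 150, "stimuli_per_trial": 600},
    "ASME-2stream": {"streams": 2, "overall_soa_ms": 300, "stimuli_per_trial": 300},
}


def trial_timing(streams, overall_soa_ms, stimuli_per_trial):
    """Return (within-stream SOA in ms, trial duration in s).

    Assumption: streams take turns, so each stream is revisited once every
    `streams` stimuli.
    """
    within_stream_soa_ms = streams * overall_soa_ms
    duration_s = stimuli_per_trial * overall_soa_ms / 1000
    return within_stream_soa_ms, duration_s


timing = {name: trial_timing(**p) for name, p in paradigms.items()}
for name, (soa_ms, duration_s) in timing.items():
    print(f"{name}: within-stream SOA {soa_ms} ms, trial duration {duration_s} s")
```

Under that assumption both paradigms give a within-stream SOA of 600 ms and a trial duration of 90 s, in line with the listed block duration.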
Preprocessing
Data state: raw
Preprocessing applied: False
Signal Processing
Classifiers: Linear Discriminant Analysis (LDA), shrinkage-LDA
Feature extraction: mean amplitudes in 10 intervals (0.1s non-overlapping, 0-1.0s)
Frequency bands: analyzed=[0.1, 8.0] Hz
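The feature extraction described above (mean amplitude in ten non-overlapping 0.1 s windows between 0 and 1.0 s) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: it assumes epochs that start at -0.1 s and are sampled at 250 Hz, as given in the Methodology, and `interval_mean_features` is a hypothetical name.

```python
from statistics import fmean


def interval_mean_features(epoch, sfreq=250.0, tmin=-0.1, n_intervals=10, width=0.1):
    """Concatenate per-channel mean amplitudes over consecutive time windows.

    `epoch` is a list of channels; each channel is a list of samples whose
    first sample corresponds to time `tmin` (in seconds).
    """
    features = []
    for channel in epoch:
        for k in range(n_intervals):
            # Convert window edges (seconds after stimulus) to sample indices.
            start = round((k * width - tmin) * sfreq)
            stop = round(((k + 1) * width - tmin) * sfreq)
            features.append(fmean(channel[start:stop]))
    return features


# Synthetic two-channel epoch spanning -0.1 s to 1.2 s (a ramp, not real EEG).
n_samples = round((1.2 - (-0.1)) * 250) + 1
ramp = [float(i) for i in range(n_samples)]
epoch = [ramp, [2 * v for v in ramp]]
features = interval_mean_features(epoch)
print(len(features))
```

Each channel contributes ten values, so the feature vector length is ten times the number of channels passed in.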
Cross-Validation
Method: 3-fold chronological cross-validation (BCI simulation); 4-fold chronological cross-validation (binary classification)
Evaluation type: offline simulation
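One plausible reading of "chronological cross-validation" is that the data are cut into contiguous folds in recording order, with no shuffling, and each fold serves once as the test set. The sketch below implements that reading only; the source does not spell out the exact splitting rule, and `chronological_folds` is a hypothetical helper.

```python
def chronological_folds(n_samples, n_folds):
    """Split indices 0..n_samples-1 into contiguous folds, in order.

    Returns a list of (train_indices, test_indices) pairs, one per fold.
    """
    base, extra = divmod(n_samples, n_folds)
    folds = []
    start = 0
    for i in range(n_folds):
        # Spread any remainder over the first `extra` folds.
        size = base + (1 if i < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        folds.append((train, test))
        start = stop
    return folds


folds = chronological_folds(n_samples=12, n_folds=4)
for train, test in folds:
    print("test:", test)
```

Because the order is preserved, neighbouring samples stay in the same fold, which is the usual motivation for avoiding a shuffled split on time-series data.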
Performance (Original Study)
ASME-4stream accuracy: 0.83
ASME-2stream accuracy: 0.86
BCI Application
Applications: communication
Environment: laboratory
Online feedback: False
Tags
Pathology: Healthy
Modality: auditory
Type: ERP, P300
Documentation
Description: Four-class ASME BCI investigation comparing two strategies for multiclassing: ASME-4stream (four streams with single target stimulus each) vs ASME-2stream (two streams with two target stimuli each)
DOI: 10.3389/fnhum.2024.1461960
Associated paper DOI: 10.3389/fnhum.2024.1461960
License: CC0-1.0
Investigators: Simon Kojima, Shin’ichiro Kanoh
Senior author: Shin’ichiro Kanoh
Contact: simon.kojima@ieee.org
Institution: Shibaura Institute of Technology
Department: Graduate School of Engineering and Science (Simon Kojima); College of Engineering (Shin’ichiro Kanoh)
Address: Tokyo, Japan
Country: JP
Repository: Harvard Dataverse
Data URL: https://doi.org/10.7910/DVN/1UJDV6
Publication year: 2024
Funding: JSPS KAKENHI (Grant Number JP23K11811 to Shin’ichiro Kanoh)
Ethics approval: Review Board on Bioengineering Research Ethics of the Shibaura Institute of Technology
Keywords: brain-computer interface, electroencephalogram, event-related potential, auditory scene analysis, stream segregation, machine learning, NASA-TLX
Abstract
The ASME (Auditory Stream segregation Multiclass ERP) paradigm is used for an auditory brain-computer interface (BCI). Two approaches for achieving four-class ASME were investigated: ASME-4stream (four streams with a single target stimulus each) and ASME-2stream (two streams with two target stimuli each). Fifteen healthy subjects participated. ERPs were analyzed, and binary classification and BCI simulation were conducted offline using linear discriminant analysis. Average accuracies were 0.83 (ASME-4stream) and 0.86 (ASME-2stream). The ASME-2stream paradigm showed shorter latency and larger amplitude of P300, higher binary classification accuracy, and smaller workload. Both paradigms achieved sufficiently high accuracy (over 80%) for practical auditory BCI.
Methodology
Subjects performed 12 runs alternating between ASME-4stream and ASME-2stream paradigms. Each run contained 4 trials with ~90s duration. ASME-4stream presented 4 streams (SOA=0.15s, 600 stimuli/trial, ratio 9:1 standard:deviant). ASME-2stream presented 2 streams with 2 deviant stimuli each (SOA=0.3s, 300 stimuli/trial, ratio 8:1:1). EEG recorded at 1000 Hz from 64 channels. EOG artifacts removed using ICA on 15 PCs. Data filtered (1-40 Hz for ERP, 0.1-8 Hz for classification), epoched (-0.1 to 1.2s), downsampled to 250 Hz. Classification used shrinkage-LDA with mean amplitudes from 10 intervals (0-1.0s) as features. Performance evaluated using 4-fold chronological cross-validation. Usability assessed via NASA-TLX questionnaire.
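The stimulus ratios quoted above translate into per-stream counts if two things are assumed: stimuli are shared equally among the streams, and the standard:deviant ratio applies within each stream. Neither assumption is stated explicitly in the text, so the sketch below is a consistency check rather than a description of the actual stimulus sequence. `per_stream_counts` is a hypothetical helper.

```python
def per_stream_counts(stimuli_per_trial, n_streams, ratio):
    """Split one stream's stimuli according to `ratio` (standard first).

    Assumes an equal share of stimuli per stream and a ratio that applies
    within each stream.
    """
    per_stream = stimuli_per_trial // n_streams
    total_parts = sum(ratio)
    return [per_stream * part // total_parts for part in ratio]


counts_4stream = per_stream_counts(600, n_streams=4, ratio=(9, 1))
counts_2stream = per_stream_counts(300, n_streams=2, ratio=(8, 1, 1))
print(counts_4stream, counts_2stream)
```

Under these assumptions each deviant would occur 15 times per stream in a trial, which agrees with the 15 repetitions listed under Paradigm-Specific Parameters.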
References
Kojima, S. (2024). Replication Data for: Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing. Harvard Dataverse, V1. https://doi.org/10.7910/DVN/1UJDV6
Kojima, S. & Kanoh, S. (2024). Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing. Frontiers in Human Neuroscience, 18, 1461960. https://doi.org/10.3389/fnhum.2024.1461960
Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Hochenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A. & Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software, 4, 1896. https://doi.org/10.21105/joss.01896
Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A. & Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8
Generated by MOABB 1.5.0 (Mother of All BCI Benchmarks), NeuroTechX/moabb.
Cohort#
Dataset Statistics#
Age distribution by gender (n=15, range 23–23 yr, mean 22.0 yr)
Channel counts: 64 ch (n=180 recordings)
Sampling frequencies: 1000.0 Hz (n=180 recordings)
Total recording duration: 21 h 37 min
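These totals can be related back to the protocol with simple arithmetic. The sketch below assumes one recording per run, which is suggested by the 12 runs per subject but not stated on this page.

```python
n_subjects = 15
n_recordings = 180
total_minutes = 21 * 60 + 37  # 21 h 37 min

recordings_per_subject = n_recordings // n_subjects
mean_recording_s = total_minutes * 60 / n_recordings
stimulation_s = 4 * 90  # 4 trials of roughly 90 s per run (from the Methodology)

print(recordings_per_subject)
print(round(mean_recording_s, 1), stimulation_s)
```

The result is 12 recordings per subject and a mean recording length of about 432 s (roughly 7 min 12 s), against about 360 s of stimulation per run; the difference would be time between trials and around the run.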
Signal · Electrodes & live trace#
Electrode layout — EEG · 64 sensors — 64 channels
NEMAR Processing Statistics#
The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.
HED event descriptors word cloud
Manifest#
Full dataset metadata table
Field | Value
Dataset ID | NM000207
Title | Kojima et al. 2024 (Dataset B) — Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing
Author (year) | Kojima2024B_P300
Canonical | —
Importable as | NM000207, Kojima2024B_P300
Year | 2024
Authors | Simon Kojima, Shin’ichiro Kanoh
License | CC0-1.0
Citation / DOI | Unknown
Source links | OpenNeuro, NeMAR, Source URL
API Reference#
class eegdash.dataset.NM000207(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Source: eegdash/dataset/registry.py
Kojima et al. 2024 (Dataset B) — Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing
- Study: nm000207 (NeMAR)
- Author (year): Kojima2024B_P300
- Canonical: —
Also importable as: NM000207, Kojima2024B_P300.
Modality: eeg; Experiment type: Attention; Subject type: Healthy. Subjects: 15; recordings: 180; tasks: 1.
- Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
- data_dir
Local dataset cache directory (cache_dir / dataset_id).
- Type: Path
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/nm000207
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000207
Examples
>>> from eegdash.dataset import NM000207
>>> dataset = NM000207(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
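The parameter description says that `query` is ANDed with the dataset selection and must not contain the key `dataset`. A minimal sketch of what such a combination could look like is given below. It is not the eegdash implementation: `build_query` is a hypothetical function, and the only assumption borrowed from MongoDB is that several keys in one filter document are combined with a logical AND.

```python
def build_query(dataset_id, user_query=None):
    """Merge a user filter with the dataset filter (hypothetical helper).

    Raises ValueError if the user filter tries to set `dataset` itself.
    """
    user_query = dict(user_query or {})
    if "dataset" in user_query:
        raise ValueError("query must not contain the key 'dataset'")
    # Keys in a single MongoDB-style filter document are implicitly ANDed.
    return {"dataset": dataset_id, **user_query}


combined = build_query("nm000207", {"subject": {"$in": ["01", "02"]}})
print(combined)
```

Rejecting a user-supplied `dataset` key keeps the dataset-specific class from being silently redirected to a different dataset.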
- __init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
- save(path: str, overwrite: bool = False, offset: int = 0)
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
- Parameters:
path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
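To make the role of `offset` concrete, the sketch below lists the signal files one would expect for a given number of datasets. It assumes that the file name carries the same shifted id as its subdirectory, which matches the `0/0-raw.fif` pattern in the layout above but is not spelled out there for nonzero offsets. `expected_layout` is a hypothetical helper and does not touch the file system.

```python
from pathlib import PurePosixPath


def expected_layout(path, n_datasets, offset=0, kind="raw"):
    """List the expected signal file paths, one per dataset.

    `kind` is "raw" or "epo"; ids are shifted by `offset` (assumption: the
    same shifted id is used for the directory and the file name).
    """
    root = PurePosixPath(path)
    layout = []
    for i in range(n_datasets):
        dataset_id = i + offset
        layout.append(str(root / str(dataset_id) / f"{dataset_id}-{kind}.fif"))
    return layout


layout = expected_layout("out", n_datasets=2, offset=10)
print(layout)
```

With an offset of 10, two datasets would land in subdirectories 10 and 11, so a second batch can be saved next to a first batch of ten without overwriting it.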
braindecode: BaseDataset from braindecode, windowed via create_windows_from_events.
pytorch: DataLoader; supports parallel workers and on-the-fly augmentations.
Swap any load_dataset(...) call for nm000207 to reproduce the tutorial on this dataset.
Provenance
¹Contributed to nemar in BIDS format.
²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset