NM000207: EEG dataset, 15 subjects#
Kojima2024B: an auditory P300 (ASME) dataset.
Access recordings and metadata through EEGDash.
Citation: Simon Kojima, Shin’ichiro Kanoh (2024). Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing.
Modality: EEG | Subjects: 15 | Recordings: 180 | License: CC0-1.0 | Source: NeMAR
Metadata: Complete (90%)
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import NM000207
dataset = NM000207(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = NM000207(cache_dir="./data", subject="01")
Advanced query
dataset = NM000207(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{nm000207,
  title  = {Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing},
  author = {Simon Kojima and Shin'ichiro Kanoh},
  year   = {2024},
  doi    = {10.7910/DVN/1UJDV6},
}
About This Dataset#
Kojima2024B is an auditory P300 dataset recorded with the ASME (Auditory Stream segregation Multiclass ERP) paradigm.
Dataset Overview
Code: Kojima2024B
Paradigm: p300
DOI: 10.7910/DVN/1UJDV6
Subjects: 15
Sessions per subject: 1
Events: Target=[111, 112, 113, 114], NonTarget=[101, 102, 103, 104]
Trial interval: [-0.5, 1.2] s
Runs per session: 12
File format: BrainVision
Number of contributing labs: 1
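When building label vectors from the event stream, the numeric codes listed above collapse into the two classes. A minimal sketch, assuming only the event table above (the `label_events` helper is hypothetical, not part of the EEGDash API):

```python
# Event codes from the dataset overview above.
TARGET_CODES = {111, 112, 113, 114}     # Target deviants
NONTARGET_CODES = {101, 102, 103, 104}  # NonTarget standards

def label_events(codes):
    """Map numeric event codes to class labels, skipping unknown codes."""
    labels = []
    for c in codes:
        if c in TARGET_CODES:
            labels.append("Target")
        elif c in NONTARGET_CODES:
            labels.append("NonTarget")
    return labels

print(label_events([111, 101, 104, 113]))  # ['Target', 'NonTarget', 'NonTarget', 'Target']
```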
Acquisition
Sampling rate: 1000.0 Hz
Number of channels: 64
Channel types: eeg=64, eog=2
Channel names: AF3, AF4, AF7, AF8, AFz, C1, C2, C3, C4, C5, C6, CP1, CP2, CP3, CP4, CP5, CP6, CPz, Cz, F1, F2, F3, F4, F5, F6, F7, F8, FC1, FC2, FC3, FC4, FC5, FC6, FCz, FT10, FT7, FT8, FT9, Fp1, Fp2, Fz, O1, O2, Oz, P1, P2, P3, P4, P5, P6, P7, P8, PO3, PO4, PO7, PO8, POz, Pz, T7, T8, TP10, TP7, TP8, TP9, hEOG, vEOG
Montage: standard_1020
Hardware: BrainAmp
Reference: right mastoid
Ground: left mastoid
Sensor type: EEG
Line frequency: 50.0 Hz
Cap manufacturer: EasyCap
Electrode type: passive Ag/AgCl
Electrode material: Ag/AgCl
Auxiliary channels: EOG (2 ch, vertical, horizontal)
Participants
Number of subjects: 15
Health status: healthy
Age: mean=22.8, min=21.0, max=24.0
Gender distribution: male=13, female=2
Species: human
Experimental Protocol
Paradigm: p300
Task type: auditory stream segregation with oddball
Number of classes: 2
Class labels: Target, NonTarget
Trial duration: 90.0 s
Tasks: ASME-4stream, ASME-2stream
Study design: within-subject comparison
Study domain: auditory BCI
Feedback type: none
Stimulus type: auditory tones
Stimulus modalities: auditory
Primary modality: auditory
Synchronicity: synchronous
Mode: offline
Training/test split: False
Instructions: focus selectively on deviant stimuli in one of the streams and count target deviant stimuli
HED Event Annotations
Schema: HED 8.4.0 | Browse: https://www.hedtags.org/hed-schema-browser
Target
├─ Sensory-event
├─ Experimental-stimulus
├─ Auditory-presentation
└─ Target
NonTarget
├─ Sensory-event
├─ Experimental-stimulus
├─ Auditory-presentation
└─ Non-target
Paradigm-Specific Parameters
Detected paradigm: p300
Number of targets: 4
Number of repetitions: 15
Stimulus onset asynchrony: 150.0 ms overall (ASME-4stream), 300.0 ms overall (ASME-2stream), 600.0 ms within each stream
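The SOA values above are consistent with the 90 s trial duration and the per-trial stimulus counts listed elsewhere on this page; a quick sanity check (pure arithmetic, no EEGDash API involved):

```python
def stimuli_per_trial(trial_s, soa_ms):
    """How many stimuli fit in one trial at the given overall SOA."""
    return round(trial_s * 1000 / soa_ms)

print(stimuli_per_trial(90.0, 150.0))  # 600 (ASME-4stream)
print(stimuli_per_trial(90.0, 300.0))  # 300 (ASME-2stream)
```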
Data Structure
Trials: ASME-4stream: 600 stimuli per trial (4 trials per run, 6 runs); ASME-2stream: 300 stimuli per trial (4 trials per run, 6 runs)
Blocks per session: 12
Block duration: 90.0 s
Trials context: 12 runs alternating between ASME-4stream and ASME-2stream, 4 trials per run
Preprocessing
Data state: raw
Preprocessing applied: False
Signal Processing
Classifiers: Linear Discriminant Analysis (LDA), shrinkage-LDA
Feature extraction: mean amplitudes in 10 non-overlapping 0.1 s intervals (0 to 1.0 s)
Frequency bands: analyzed=[0.1, 8.0] Hz
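The feature extraction described above (mean amplitudes in ten non-overlapping 0.1 s windows) can be sketched with NumPy. This is an illustrative reimplementation, not the authors' code; it assumes epochs already downsampled to 250 Hz with t = 0 s at the first sample:

```python
import numpy as np

def interval_means(epoch, sfreq=250.0, n_intervals=10, width_s=0.1):
    """Mean amplitude per channel in consecutive non-overlapping windows.

    epoch : array of shape (n_channels, n_samples), time axis starting at 0 s.
    Returns an array of shape (n_channels, n_intervals).
    """
    step = int(round(width_s * sfreq))  # 25 samples per 0.1 s window at 250 Hz
    return np.stack(
        [epoch[:, k * step:(k + 1) * step].mean(axis=1) for k in range(n_intervals)],
        axis=1,
    )

epoch = np.ones((64, 300))   # 64 channels, 1.2 s at 250 Hz
feats = interval_means(epoch)
print(feats.shape)           # (64, 10)
```

The resulting (channels x intervals) matrix would then be flattened into a single feature vector per epoch before classification.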
Cross-Validation
Method: 3-fold chronological cross-validation (BCI simulation); 4-fold chronological cross-validation (binary classification)
Evaluation type: offline simulation
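Chronological cross-validation splits trials into contiguous, temporally ordered folds instead of shuffling them, which avoids leaking temporally adjacent data between train and test sets. A minimal sketch (the function is illustrative, not taken from the original study):

```python
def chronological_folds(n_trials, n_folds):
    """Split indices 0..n_trials-1 into contiguous folds in temporal order."""
    base, extra = divmod(n_trials, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# e.g. 12 runs x 4 trials = 48 trials, 4-fold chronological CV
print([len(f) for f in chronological_folds(48, 4)])  # [12, 12, 12, 12]
```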
Performance (Original Study)
ASME-4stream accuracy: 0.83
ASME-2stream accuracy: 0.86
BCI Application
Applications: communication
Environment: laboratory
Online feedback: False
Tags
Pathology: Healthy
Modality: auditory
Type: ERP, P300
Documentation
Description: Four-class ASME BCI investigation comparing two strategies for multiclassing: ASME-4stream (four streams with single target stimulus each) vs ASME-2stream (two streams with two target stimuli each)
DOI: 10.3389/fnhum.2024.1461960
Associated paper DOI: 10.3389/fnhum.2024.1461960
License: CC0-1.0
Investigators: Simon Kojima, Shin’ichiro Kanoh
Senior author: Shin’ichiro Kanoh
Contact: simon.kojima@ieee.org
Institution: Shibaura Institute of Technology
Department: Graduate School of Engineering and Science (Simon Kojima); College of Engineering (Shin’ichiro Kanoh)
Address: Tokyo, Japan
Country: JP
Repository: Harvard Dataverse
Data URL: https://doi.org/10.7910/DVN/1UJDV6
Publication year: 2024
Funding: JSPS KAKENHI (Grant Number JP23K11811 to Shin’ichiro Kanoh)
Ethics approval: Review Board on Bioengineering Research Ethics of the Shibaura Institute of Technology
Keywords: brain-computer interface, electroencephalogram, event-related potential, auditory scene analysis, stream segregation, machine learning, NASA-TLX
Abstract
The ASME (Auditory Stream segregation Multiclass ERP) paradigm is used for an auditory brain-computer interface (BCI). Two approaches for achieving four-class ASME were investigated: ASME-4stream (four streams with a single target stimulus each) and ASME-2stream (two streams with two target stimuli each). Fifteen healthy subjects participated. ERPs were analyzed, and binary classification and BCI simulation were conducted offline using linear discriminant analysis. Average accuracies were 0.83 (ASME-4stream) and 0.86 (ASME-2stream). The ASME-2stream paradigm showed shorter latency and larger amplitude of P300, higher binary classification accuracy, and smaller workload. Both paradigms achieved sufficiently high accuracy (over 80%) for practical auditory BCI.
Methodology
Subjects performed 12 runs alternating between ASME-4stream and ASME-2stream paradigms. Each run contained 4 trials with ~90s duration. ASME-4stream presented 4 streams (SOA=0.15s, 600 stimuli/trial, ratio 9:1 standard:deviant). ASME-2stream presented 2 streams with 2 deviant stimuli each (SOA=0.3s, 300 stimuli/trial, ratio 8:1:1). EEG recorded at 1000 Hz from 64 channels. EOG artifacts removed using ICA on 15 PCs. Data filtered (1-40 Hz for ERP, 0.1-8 Hz for classification), epoched (-0.1 to 1.2s), downsampled to 250 Hz. Classification used shrinkage-LDA with mean amplitudes from 10 intervals (0-1.0s) as features. Performance evaluated using 4-fold chronological cross-validation. Usability assessed via NASA-TLX questionnaire.
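The epoching and downsampling step above (1000 Hz to 250 Hz is a factor-of-4 decimation over a -0.1 to 1.2 s window) can be sketched as follows. This is a rough illustration with hypothetical helper names; it assumes the low-pass filtering described above has already been applied, so naive decimation is safe:

```python
import numpy as np

def epoch_and_decimate(data, onsets, sfreq=1000, tmin=-0.1, tmax=1.2, factor=4):
    """Cut (tmin, tmax) epochs around stimulus onset samples, then keep
    every `factor`-th sample (valid only after anti-alias filtering)."""
    n_pre = int(round(-tmin * sfreq))
    n_post = int(round(tmax * sfreq))
    epochs = np.stack([data[:, on - n_pre:on + n_post] for on in onsets])
    return epochs[:, :, ::factor]

data = np.zeros((64, 10_000))            # 10 s of 64-channel data at 1000 Hz
epochs = epoch_and_decimate(data, [1000, 3000, 5000])
print(epochs.shape)                      # (3, 64, 325): 1.3 s per epoch at 250 Hz
```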
References
Kojima, S. (2024). Replication Data for: Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing. Harvard Dataverse, V1. https://doi.org/10.7910/DVN/1UJDV6
Kojima, S. & Kanoh, S. (2024). Four-class ASME BCI: investigation of the feasibility and comparison of two strategies for multiclassing. Frontiers in Human Neuroscience, 18:1461960. https://doi.org/10.3389/fnhum.2024.1461960
Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Hochenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A., & Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software, 4, 1896. https://doi.org/10.21105/joss.01896
Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., & Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8
Generated by MOABB 1.5.0 (Mother of All BCI Benchmarks): https://github.com/NeuroTechX/moabb
Dataset Information#
Dataset ID | NM000207
Title | Class for Kojima2024B dataset management. P300 dataset
Author (year) | Kojima2024B_P300
Canonical | —
Importable as | NM000207, Kojima2024B_P300
Year | 2024
Authors | Simon Kojima, Shin’ichiro Kanoh
License | CC0-1.0
Citation / DOI | 10.7910/DVN/1UJDV6
Source links | OpenNeuro | NeMAR | Source URL
Found an issue with this dataset?
If you encounter any problems with this dataset (missing files, incorrect metadata, loading errors, etc.), please let us know!
Technical Details#
Subjects: 15
Recordings: 180
Tasks: 1
Channels: 64
Sampling rate (Hz): 1000.0
Duration (hours): ~21.63
Pathology: Healthy
Modality: Auditory
Type: Attention
Size on disk: 13.9 GB
File count: 180
Format: BIDS
License: CC0-1.0
DOI: 10.7910/DVN/1UJDV6
API Reference#
Use the NM000207 class to access this dataset programmatically.
- class eegdash.dataset.NM000207(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#
Bases: EEGDashDataset
Class for Kojima2024B dataset management. P300 dataset.
- Study: nm000207 (NeMAR)
- Author (year): Kojima2024B_P300
- Canonical: —
- Also importable as: NM000207, Kojima2024B_P300
- Modality: eeg; Experiment type: Attention; Subject type: Healthy. Subjects: 15; recordings: 180; tasks: 1.
- Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
- data_dir#
Local dataset cache directory (cache_dir / dataset_id).
- Type: Path
- query#
Merged query with the dataset filter applied.
- Type: dict
- records#
Metadata records used to build the dataset, if pre-fetched.
- Type: list[dict] | None
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/nm000207
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000207
Examples
>>> from eegdash.dataset import NM000207
>>> dataset = NM000207(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset