Iss. 250 · 87 subjects · 520 recordings · CC-BY-4.0

NM000250: EEG dataset, 87 subjects

Dreyer et al. 2023 — A large EEG database with users’ profile information for motor imagery brain-computer interface research

Citation: Pauline Dreyer, Aline Roc, Léa Pillette, Sébastien Rimbert, Fabien Lotte (2023). A large EEG database with users’ profile information for motor imagery brain-computer interface research. https://doi.org/10.1038/s41597-023-02445-z


EEG · 27 ch · 512 Hz · BIDS 1.9.0 · Task: imagery · Healthy · Visual · Motor
Layer 01 · Study
What was asked
Hypothesis, independent & dependent variables, paradigm, cohort, and the editorial caveats around what the recordings can and cannot answer.
Layer 02 · Signal · BIDS
What was recorded
Sidecars, channels & electrodes, coordinate system, event semantics, and quality stats from the NEMAR pipeline when available.
Layer 03 · Training · ML
What you can train on
Recommended access modes (MNE Raw, braindecode windows, PyTorch DataLoader) plus the targets the metadata makes addressable.
§ 01 Access · Get started

Quickstart

Install

pip install eegdash

Access the data

from eegdash.dataset import NM000250

dataset = NM000250(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)

Filter by subject

dataset = NM000250(cache_dir="./data", subject="01")

Advanced query

dataset = NM000250(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
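
The `$in` operator in the query above follows MongoDB-style semantics: a record matches when its field value is one of the listed values. The sketch below illustrates that matching rule on plain dictionaries. It is a toy model covering only equality and `$in`, not EEGDash's implementation; the `matches` helper and the sample records are hypothetical.

```python
def matches(record: dict, query: dict) -> bool:
    """Return True if `record` satisfies every condition in `query`.

    Toy subset of MongoDB-style filters: plain equality and `$in` only.
    """
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            if value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


# Hypothetical metadata records, for illustration only.
records = [
    {"subject": "01", "task": "imagery"},
    {"subject": "02", "task": "imagery"},
    {"subject": "03", "task": "imagery"},
]

query = {"subject": {"$in": ["01", "02"]}}
selected = [r["subject"] for r in records if matches(r, query)]
print(selected)  # ['01', '02']
```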

Iterate recordings

for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])

If you use this dataset in your research, please cite the original authors.

BibTeX

@dataset{nm000250,
  title = {Dreyer et al. 2023 — A large EEG database with users' profile information for motor imagery brain-computer interface research},
  author = {Pauline Dreyer and Aline Roc and Léa Pillette and Sébastien Rimbert and Fabien Lotte},
  year = {2023},
  doi = {10.1038/s41597-023-02445-z},
}
§ 02 Study · The README

About This Dataset

Class for Dreyer2023 dataset management. MI dataset.

Schema: HED 8.4.0 | Browse: https://www.hedtags.org/hed-schema-browser

left_hand
├─ Sensory-event, Experimental-stimulus, Visual-presentation
└─ Agent-action
   └─ Imagine
      ├─ Move
      └─ Left, Hand

right_hand
├─ Sensory-event, Experimental-stimulus, Visual-presentation
└─ Agent-action
   └─ Imagine
      ├─ Move
      └─ Right, Hand

Paradigm-Specific Parameters

  • Detected paradigm: motor_imagery

  • Imagery tasks: right_hand, left_hand

  • Cue duration: 1.25 s

  • Imagery duration: 3.75 s

Data Structure

  • Trials: 240

  • Trials per class: right_hand=120, left_hand=120

  • Blocks per session: 6

  • Block duration: 420.0 s

  • Trials context: per subject (120 per class)
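
As a quick consistency check, the counts in this list can be related to one another. The sketch assumes 40 trials per block (the per-run figure given in the Methodology section) and otherwise uses only numbers stated on this page.

```python
# Figures taken from this page.
blocks_per_session = 6
trials_per_block = 40        # 20 per class, per the Methodology section
block_duration_s = 420.0
n_subjects = 87

trials_per_subject = blocks_per_session * trials_per_block
trials_per_class = trials_per_subject // 2
total_trials = n_subjects * trials_per_subject
session_minutes = blocks_per_session * block_duration_s / 60

print(trials_per_subject)  # 240
print(trials_per_class)    # 120
print(total_trials)        # 20880
print(session_minutes)     # 42.0
```

The total of 20,880 trials agrees with the abstract's "more than 20,800 trials".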

Preprocessing

  • Data state: raw

  • Preprocessing applied: False

  • Bandpass filter: [5.0, 35.0] Hz (applied during online processing only; the shared recordings are raw, see Notes)

  • Filter type: Butterworth

  • Filter order: 5

  • Artifact methods: visual inspection

  • Re-reference: Laplacian (C3, C4 for feature extraction)

  • Notes: The raw signals were recorded without any hardware filters. For online processing, a fifth-order Butterworth filter was applied in a participant-specific discriminant frequency band, selected within the 5 Hz to 35 Hz range in 0.5 Hz wide bins. Impedance could not be measured with the active electrodes; EEG signals were visually checked, and regularly re-checked, to ensure good signal quality.
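
To make the frequency grid in the note concrete, the sketch below enumerates the 0.5 Hz bins that tile the 5 Hz to 35 Hz range. How the original pipeline scores these bins to pick each participant's band is not described on this page, so only the grid itself is shown; the variable names are illustrative.

```python
F_MIN, F_MAX, BIN_WIDTH = 5.0, 35.0, 0.5  # Hz, from the note above

n_bins = round((F_MAX - F_MIN) / BIN_WIDTH)
# Each bin is a (low, high) pair in Hz; multiples of 0.5 are exact in floats.
bins = [(F_MIN + i * BIN_WIDTH, F_MIN + (i + 1) * BIN_WIDTH) for i in range(n_bins)]

print(n_bins)    # 60
print(bins[0])   # (5.0, 5.5)
print(bins[-1])  # (34.5, 35.0)
```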

Signal Processing

  • Classifiers: LDA

  • Feature extraction: CSP, Bandpower

  • Frequency bands: analyzed=[5.0, 35.0] Hz; alpha=[8.0, 13.0] Hz; mu=[8.0, 13.0] Hz; beta=[13.0, 30.0] Hz

  • Spatial filters: CSP, Laplacian

Cross-Validation

  • Method: calibration-feedback

  • Evaluation type: within_session

Performance (Original Study)

  • Accuracy: 63.35%

  • Mean Accuracy Std: 17.36%

  • Mean Accuracy R3: 63.14%

  • Mean Accuracy R4: 64.82%

  • Chance Level Individual: 58.7%

  • Chance Level Database: 51.0%

BCI Application

  • Applications: rehabilitation, assistive_technology, neurofeedback, user_training

  • Environment: laboratory

  • Online feedback: True

Tags

  • Pathology: Healthy

  • Modality: Motor

  • Type: Motor Imagery

Documentation

  • Description: A large EEG database with users’ profile information for motor imagery brain-computer interface research. Contains electroencephalographic signals from 87 human participants, collected during a single day of brain-computer interface (BCI) experiments, organized into 3 datasets (A, B, and C) that were all recorded using the same protocol: right and left hand motor imagery (MI).

  • DOI: 10.1038/s41597-023-02445-z

  • Associated paper DOI: 10.1038/s41597-023-02445-z

  • License: CC-BY-4.0

  • Investigators: Pauline Dreyer, Aline Roc, Léa Pillette, Sébastien Rimbert, Fabien Lotte

  • Senior author: Fabien Lotte

  • Contact: fabien.lotte@inria.fr

  • Institution: Centre Inria de l’université de Bordeaux

  • Department: LaBRI (Univ. Bordeaux/CNRS/Bordeaux INP)

  • Address: Talence, 33405, France

  • Country: FR

  • Repository: Zenodo

  • Data URL: https://doi.org/10.5281/zenodo.8089820

  • Publication year: 2023

  • Funding: European Research Council (ERC Starting Grant project BrainConquest, grant ERC-2016-STG-714567)

  • Ethics approval: Inria’s ethics committee, the COERLE (Approval number: 2018-13)

  • Keywords: motor imagery, brain-computer interface, EEG, BCI illiteracy, user training, personality profile, cognitive traits, user profile

Abstract

We present and share a large database containing electroencephalographic signals from 87 human participants, collected during a single day of brain-computer interface (BCI) experiments, organized into 3 datasets (A, B, and C) that were all recorded using the same protocol: right and left hand motor imagery (MI). Each session contains 240 trials (120 per class), which represents more than 20,800 trials, or approximately 70 hours of recording time. It includes the performance of the associated BCI users, detailed information about the demographics, personality profile as well as some cognitive traits and the experimental instructions and codes (executed in the open-source platform OpenViBE). Such database could prove useful for various studies, including but not limited to: (1) studying the relationships between BCI users’ profiles and their BCI performances, (2) studying how EEG signals properties varies for different users’ profiles and MI tasks, (3) using the large number of participants to design cross-user BCI machine learning algorithms or (4) incorporating users’ profile information into the design of EEG signal classification algorithms.

Methodology

Participants performed a Graz-protocol MI-BCI task with 6 runs (2 calibration runs with sham feedback, 4 online training runs with real feedback). Each run consisted of 40 trials (20 per MI task), each lasting 8 s. Trial structure: green cross (t = 0 s), acoustic signal (t = 2 s), red arrow cue (t = 3 s, 1.25 s duration), continuous visual feedback (t = 4.25 s, 3.75 s duration), then an inter-trial interval of 1.5 to 3.5 s. Signal processing used participant-specific Most Discriminant Frequency Band (MDFB) selection within the 5 to 35 Hz range, fifth-order Butterworth filtering, Common Spatial Pattern (CSP) with 3 pairs of spatial filters, and a Linear Discriminant Analysis (LDA) classifier trained on the calibration data. Participants completed 6 questionnaires assessing demographics, personality (16PF5), cognitive traits, spatial abilities (Mental Rotation test), learning style (ILS), and pre/post-experiment states (NeXT questionnaire).
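
The trial timing described above translates directly into sample offsets at the dataset's 512 Hz sampling rate, which is useful when choosing an epoch window. The following is a minimal sketch of that conversion. The event labels are names chosen here for illustration and are not the dataset's event codes; `trial_end` simply marks the 8 s trial duration.

```python
SFREQ = 512  # Hz, sampling frequency reported for this dataset

# Event onsets within one trial, in seconds (from the Methodology paragraph).
trial_events = {
    "green_cross": 0.0,
    "acoustic_signal": 2.0,
    "red_arrow_cue": 3.0,
    "visual_feedback": 4.25,
    "trial_end": 8.0,
}


def to_samples(seconds: float, sfreq: int = SFREQ) -> int:
    """Convert a time in seconds to a sample offset at `sfreq`."""
    return round(seconds * sfreq)


onsets = {name: to_samples(t) for name, t in trial_events.items()}

cue_len = onsets["visual_feedback"] - onsets["red_arrow_cue"]
feedback_len = onsets["trial_end"] - onsets["visual_feedback"]

print(onsets["red_arrow_cue"])  # 1536
print(cue_len)                  # 640  (1.25 s cue)
print(feedback_len)             # 1920 (3.75 s feedback)
```

An epoch covering the cue plus the feedback period would therefore span 2560 samples (5 s) starting at the cue onset.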

References

Pillette, L., Roc, A., N’kaoua, B., & Lotte, F. (2021). Experimenters’ influence on mental-imagery based brain-computer interface user training. International Journal of Human-Computer Studies, 149, 102603.

Benaroch, C., Yamamoto, M. S., Roc, A., Dreyer, P., Jeunet, C., & Lotte, F. (2022). When should MI-BCI feature optimization include prior knowledge, and which one? Brain-Computer Interfaces, 9(2), 115-128.

Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Hochenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A., & Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software, 4(44), 1896. https://doi.org/10.21105/joss.01896

Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., & Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8

Generated by MOABB 1.5.0 (Mother of All BCI Benchmarks), NeuroTechX/moabb.

§ 03 Cohort · Participants

Cohort

Dataset Statistics

Age distribution by gender (n=87, range 29–29 yr, mean 29.0 yr): Female · 42, Male · 45

Sex composition (n = 87 subjects with reported sex): Female 42 · Male 45 · F : M ratio 0.93 : 1 · 48% female

Handedness: Right · 87

Channel counts: 27 ch (n=520 recordings)

Sampling frequencies: 512.0 Hz (n=520 recordings)

Total recording duration: 63 h

§ 04 Signal · Electrodes & trace

Signal · Electrodes & live trace

Fig. 01 Signal & montage: 27 ch · EEG · 512 Hz · 87 subjects, 520 recordings. Representative recording shown in the trace viewer: sub-13 · ses-0 · task-imagery · run-3. Electrode layout: EEG, 27 sensors (27 channels).

NEMAR Processing Statistics

The plots in this section are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.

Figure: HED event descriptors word cloud (NM000250)
§ 05 Manifest · BIDS tree

Manifest

Full dataset metadata table

Dataset ID

NM000250

Title

Dreyer et al. 2023 — A large EEG database with users’ profile information for motor imagery brain-computer interface research

Author (year)

Dreyer2023

Canonical

Importable as

NM000250, Dreyer2023

Year

2023

Authors

Pauline Dreyer, Aline Roc, Léa Pillette, Sébastien Rimbert, Fabien Lotte

License

CC-BY-4.0

Citation / DOI

10.1038/s41597-023-02445-z

Source links

OpenNeuro | NeMAR | Source URL

§ 06 API · Programmatic access

API Reference

class eegdash.dataset.NM000250(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Bases: EEGDashDataset
Source: eegdash/dataset/registry.py

Dreyer et al. 2023 — A large EEG database with users’ profile information for motor imagery brain-computer interface research

Study:

nm000250 (NeMAR)

Author (year):

Dreyer2023

Canonical:

Also importable as: NM000250, Dreyer2023.

Modality: eeg; Experiment type: Motor; Subject type: Healthy. Subjects: 87; recordings: 520; tasks: 1.

Parameters:
  • cache_dir (str | Path) – Directory where data are cached locally.

  • query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.

  • s3_bucket (str | None) – Base S3 bucket used to locate the data.

  • **kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.

data_dir

Local dataset cache directory (cache_dir / dataset_id).

Type:

Path

query

Merged query with the dataset filter applied.

Type:

dict

records

Metadata records used to build the dataset, if pre-fetched.

Type:

list[dict] | None

Notes

Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
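
The rule that `query` is ANDed with the dataset selection, and must not contain the key `dataset`, can be pictured with the sketch below. This is an illustrative model only: `merge_query` is a hypothetical helper, and EEGDash's actual merging logic and error type may differ.

```python
from __future__ import annotations


def merge_query(dataset_id: str, query: dict | None = None) -> dict:
    """Combine a user query with the fixed dataset filter (logical AND)."""
    query = dict(query or {})
    if "dataset" in query:
        raise ValueError("query must not contain the key 'dataset'")
    # Top-level keys of a MongoDB-style filter are implicitly ANDed.
    return {"dataset": dataset_id, **query}


merged = merge_query("nm000250", {"subject": {"$in": ["01", "02"]}})
print(merged)
# {'dataset': 'nm000250', 'subject': {'$in': ['01', '02']}}
```

Rejecting a user-supplied `dataset` key up front keeps the class bound to a single dataset, so a filter cannot silently redirect it elsewhere.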

References

OpenNeuro dataset: https://openneuro.org/datasets/nm000250
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000250

Examples

>>> from eegdash.dataset import NM000250
>>> dataset = NM000250(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
save(path: str, overwrite: bool = False, offset: int = 0)

Save datasets to files by creating one subdirectory for each dataset:

path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
Parameters:
  • path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.

  • overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.

  • offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
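
The effect of `offset` on the directory layout can be sketched as follows. The snippet models only the subdirectory naming described above (one folder per dataset, numbered by its index plus `offset`); `planned_subdirs` is a hypothetical helper and does not call or reproduce the library's `save` method.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def planned_subdirs(path: str, n_datasets: int, offset: int = 0) -> list[str]:
    """List the subdirectories a save call would create, one per dataset."""
    root = PurePosixPath(path)
    return [str(root / str(i + offset)) for i in range(n_datasets)]


# A first chunk of 3 recordings, then a second chunk saved with offset=3.
first = planned_subdirs("out", 3)
second = planned_subdirs("out", 2, offset=3)
print(first)   # ['out/0', 'out/1', 'out/2']
print(second)  # ['out/3', 'out/4']
```

Saving chunk by chunk with a running offset keeps each recording at the index it would have had if the whole collection had been saved in one call.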

Access modes: MNE → braindecode → PyTorch → ML
  • .raw: MNE Raw object, usable with the standard tools (filter, epoch, ICA, plot_psd).
  • DataLoader: wraps the windowed dataset into a PyTorch DataLoader; supports parallel workers and on-the-fly augmentations.
  • Zarr cache: optional braindecode Zarr mirror for fast resume; persisted to cache_dir.
  • Hugging Face: no per-dataset mirror published yet. Browse the EEGDash org listing for sibling datasets, and see the datasets loader API.
  • Croissant 1.0: machine-readable JSON-LD descriptor NM000250.croissant.json (MLCommons schema, ingestible by PyTorch / TensorFlow / JAX).

Examples using EEGDash (curated · start here)

Swap any load_dataset(...) call for nm000250 to reproduce the tutorial on this dataset.

Citation

Pauline Dreyer, Aline Roc, Léa Pillette, Sébastien Rimbert, Fabien Lotte (2023). A large EEG database with users' profile information for motor imagery brain-computer interface research. https://doi.org/10.1038/s41597-023-02445-z

Provenance

¹Contributed to nemar in BIDS format.

²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.

BIDS: BIDS 1.9.0
Sidecars: events · events.json · channels · eeg.json
Provenance: CC-BY-4.0 · DOI 10.1038/s41597-023-02445-z
