DS007630: EEG dataset, 3 subjects#
EEG-Speech Brain Decoding Dataset
Access recordings and metadata through EEGDash.
Citation: Motoshige Sato, Ilya Horiguchi, Masakazu Inoue, Kenichi Tomeoka, Eri Hatakeyama, Yuya Kita, Atsushi Yamamoto, Ippei Fujisawa, Shuntaro Sasai (2026). EEG-Speech Brain Decoding Dataset. 10.18112/openneuro.ds007630.v1.0.0
Modality: eeg | Subjects: 3 | Recordings: 1974 | License: CC0 | Source: openneuro
Metadata: Complete (100%)
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import DS007630
dataset = DS007630(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = DS007630(cache_dir="./data", subject="01")
Advanced query
dataset = DS007630(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{ds007630,
  title  = {EEG-Speech Brain Decoding Dataset},
  author = {Motoshige Sato and Ilya Horiguchi and Masakazu Inoue and Kenichi Tomeoka and Eri Hatakeyama and Yuya Kita and Atsushi Yamamoto and Ippei Fujisawa and Shuntaro Sasai},
  year   = {2026},
  doi    = {10.18112/openneuro.ds007630.v1.0.0},
  url    = {https://doi.org/10.18112/openneuro.ds007630.v1.0.0},
}
About This Dataset#
EEG-Speech Brain Decoding Dataset
Overview
This dataset contains EEG recordings and audio data.
Sessions
Sessions are labeled by recording date in YYYYMMDD format.
- Example: ses-20240401 = recorded on April 1, 2024
Multiple recordings on the same day are distinguished by run numbers:
- run-N: Nth recording of the day
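Given the ses-YYYYMMDD convention above, session labels can be decoded into calendar dates with the standard library. A minimal sketch (the helper name is ours, not part of the dataset or eegdash):

```python
import re
from datetime import date

def parse_session(label: str) -> date:
    """Decode a ses-YYYYMMDD session label into a calendar date."""
    m = re.fullmatch(r"ses-(\d{4})(\d{2})(\d{2})", label)
    if m is None:
        raise ValueError(f"not a ses-YYYYMMDD label: {label!r}")
    year, month, day = map(int, m.groups())
    return date(year, month, day)

print(parse_session("ses-20240401"))  # 2024-04-01
```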
Tasks
- speechopen: Overt speech production task - participants vocalize visually presented text
- listening: Auditory listening task - participants listen to prerecorded speech stimuli
File Format Notes
EEG Data
Raw EEG data is stored:
- Path: sub-*/ses-*/eeg/*_eeg.edf
- Note: EDF format is not officially part of BIDS-EEG specification
- Files are excluded in .bidsignore but documented here for reference
- Future releases may include EDF conversions for full BIDS compliance
Behavioral Data (Audio)
Task audio files are stored in beh/ directories:
- Speech production: sub-*/ses-*/beh/*_recording-vocal_beh.wav
- Listening: sub-*/ses-*/beh/*_recording-audio_beh.wav
- Note: Not officially part of BIDS-EEG spec, but included for analysis convenience
- Excluded in .bidsignore
Directory Structure
dataset_root/
├── README (this file)
├── CHANGES (version history)
├── dataset_description.json (dataset metadata)
├── participants.tsv (participant information)
├── participants.json (participant column descriptions)
├── task-speechopen_acq-pangolin_eeg.json (speech production EEG metadata)
├── task-listening_acq-pangolin_eeg.json (listening EEG metadata)
├── task-speechopen_acq-pangolin_events.json (speech production events column descriptions)
├── task-speechopen_acq-pangolin_recording-vocal_beh.json
├── task-listening_acq-pangolin_recording-audio_beh.json
├── .bidsignore (files to ignore in validation)
│
├── code/ (analysis and preprocessing code)
│ ├── preprocessing/ (EEG and audio preprocessing)
│ ├── training/ (model training scripts)
│ ├── evaluation/ (evaluation metrics)
│ └── bids/ (BIDS conversion scripts)
│
├── sub-01/ (participant data)
│ └── ses-YYYYMMDD/ (session by date)
│ ├── eeg/ (EEG recordings)
│ └── beh/ (behavioral/audio data)
│
└── derivatives/ (processed data)
└── pipeline-standard/ (standard preprocessing)
Dataset Information#
| Field | Value |
|---|---|
| Dataset ID | ds007630 |
| Title | EEG-Speech Brain Decoding Dataset |
| Author (year) | — |
| Canonical | — |
| Importable as | DS007630 |
| Year | 2026 |
| Authors | Motoshige Sato, Ilya Horiguchi, Masakazu Inoue, Kenichi Tomeoka, Eri Hatakeyama, Yuya Kita, Atsushi Yamamoto, Ippei Fujisawa, Shuntaro Sasai |
| License | CC0 |
| Citation / DOI | 10.18112/openneuro.ds007630.v1.0.0 |
| Source links | OpenNeuro, NeMAR, Source URL |
Found an issue with this dataset?
If you encounter any problems with this dataset (missing files, incorrect metadata, loading errors, etc.), please let us know!
Technical Details#
Subjects: 3
Recordings: 1974
Tasks: 3
Channels, with recording counts in parentheses: 134 (1496), 90 (172), 140 (168), 70 (138)
Sampling rate (Hz), with recording counts in parentheses: 1200.0 (1802), 1024.0 (161), 2048.0 (11)
Duration (hours): Not calculated
Pathology: Not specified
Modality: —
Type: —
Size on disk: 955.3 GB
File count: 1974
Format: BIDS
License: CC0
DOI: 10.18112/openneuro.ds007630.v1.0.0
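The parenthesized numbers in the channel and sampling-rate rows are per-configuration recording counts; a quick sanity check (values copied from the summary above) confirms that each distribution covers all 1974 recordings:

```python
# Recording counts per channel configuration and per sampling rate,
# copied from the technical summary above.
channel_counts = {134: 1496, 90: 172, 140: 168, 70: 138}
srate_counts = {1200.0: 1802, 1024.0: 161, 2048.0: 11}

assert sum(channel_counts.values()) == 1974
assert sum(srate_counts.values()) == 1974
print("both distributions cover all 1974 recordings")
```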
API Reference#
Use the DS007630 class to access this dataset programmatically.
- class eegdash.dataset.DS007630(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#
  Bases: EEGDashDataset
  EEG-Speech Brain Decoding Dataset
  - Study: ds007630 (OpenNeuro)
  - Author (year): —
  - Canonical: —
  - Also importable as: DS007630
  - Modality: eeg. Subjects: 3; recordings: 1974; tasks: 3.
  - Parameters:
    - cache_dir (str | Path) – Directory where data are cached locally.
    - query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
    - s3_bucket (str | None) – Base S3 bucket used to locate the data.
    - **kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
- data_dir#
  Local dataset cache directory (cache_dir / dataset_id).
  Type: Path
- query#
  Merged query with the dataset filter applied.
  Type: dict
- records#
  Metadata records used to build the dataset, if pre-fetched.
  Type: list[dict] | None
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
- OpenNeuro dataset: https://openneuro.org/datasets/ds007630
- NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds007630
- DOI: https://doi.org/10.18112/openneuro.ds007630.v1.0.0
Examples
>>> from eegdash.dataset import DS007630
>>> dataset = DS007630(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
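The MongoDB-style filter semantics of the query parameter can be illustrated with a small pure-Python matcher. This is a simplified sketch of the semantics only, not eegdash's actual matching code; it handles just plain equality and the $in operator:

```python
def matches(record: dict, query: dict) -> bool:
    """Check a metadata record against a minimal MongoDB-style filter:
    a plain value means equality, {"$in": [...]} means membership."""
    for field, cond in query.items():
        value = record.get(field)
        if isinstance(cond, dict) and "$in" in cond:
            if value not in cond["$in"]:
                return False
        elif value != cond:
            return False
    return True

records = [{"subject": "01"}, {"subject": "02"}, {"subject": "03"}]
query = {"subject": {"$in": ["01", "02"]}}
print([r["subject"] for r in records if matches(r, query)])  # ['01', '02']
```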
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset