DS007602: EEG dataset, 3 subjects
EEG-Speech Brain Decoding Dataset
Access recordings and metadata through EEGDash.
Citation: Motoshige Sato, Masakazu Inoue, Kenichi Tomeoka, Ilya Horiguchi, Eri Hatakeyama, Yuya Kita, Atsushi Yamamoto, Ippei Fujisawa, Shuntaro Sasai (2026). EEG-Speech Brain Decoding Dataset. doi:10.18112/openneuro.ds007602.v1.0.1
Modality: EEG | Subjects: 3 | Recordings: 113 | License: CC0 | Source: OpenNeuro
Metadata: Complete (100%)
Quickstart
Install
pip install eegdash
Access the data
from eegdash.dataset import DS007602
dataset = DS007602(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = DS007602(cache_dir="./data", subject="01")
Advanced query
dataset = DS007602(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
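To make the filter semantics concrete, here is a minimal pure-Python sketch of how a MongoDB-style `$in` condition selects records. It does not call EEGDash: the `matches` helper and the sample records are hypothetical, and only plain equality and `$in` are handled.

```python
from typing import Any


def matches(record: dict[str, Any], query: dict[str, Any]) -> bool:
    """Return True if `record` satisfies every condition in `query`.

    Hypothetical helper for illustration: supports plain equality and
    the `$in` operator only. All conditions are ANDed together.
    """
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # `$in` passes when the value is one of the listed options.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            # Anything else is treated as an exact-match condition.
            return False
    return True


# Made-up metadata records, for illustration only.
records = [
    {"subject": "01", "task": "speechopen"},
    {"subject": "02", "task": "speechopen"},
    {"subject": "03", "task": "speechopen"},
]

query = {"subject": {"$in": ["01", "02"]}}
selected = [r["subject"] for r in records if matches(r, query)]
print(selected)  # ['01', '02']
```

In real use the filtering is done by EEGDash against its metadata records; the sketch only illustrates which recordings a query of this shape is expected to keep.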
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{ds007602,
title = {EEG-Speech Brain Decoding Dataset},
author = {Motoshige Sato and Masakazu Inoue and Kenichi Tomeoka and Ilya Horiguchi and Eri Hatakeyama and Yuya Kita and Atsushi Yamamoto and Ippei Fujisawa and Shuntaro Sasai},
doi = {10.18112/openneuro.ds007602.v1.0.1},
url = {https://doi.org/10.18112/openneuro.ds007602.v1.0.1},
}
About This Dataset
EEG-Speech Brain Decoding Dataset
Overview
This dataset contains EEG recordings and audio data.
Sessions
Sessions are labeled by recording date in YYYYMMDD format.
- Example: ses-20240401 = recorded on April 1, 2024
Multiple recordings on the same day are distinguished by run numbers:
- run-N: Nth recording of the day
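The naming convention above can be decoded with a few lines of standard-library Python. This is a sketch under assumptions: the helper names are hypothetical, and the example file name is made up to match the `ses-YYYYMMDD` and `run-N` patterns, not taken from the dataset.

```python
import re
from datetime import date, datetime
from typing import Optional

SESSION_RE = re.compile(r"ses-(\d{8})")
RUN_RE = re.compile(r"run-(\d+)")


def parse_session_date(name: str) -> date:
    """Extract the recording date from a `ses-YYYYMMDD` label."""
    match = SESSION_RE.search(name)
    if match is None:
        raise ValueError(f"no ses-YYYYMMDD label in {name!r}")
    # strptime also rejects impossible dates such as 20241340.
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def parse_run(name: str) -> Optional[int]:
    """Extract the run number from a `run-N` label, or None if absent."""
    match = RUN_RE.search(name)
    return int(match.group(1)) if match else None


# Hypothetical file name, for illustration only.
example = "sub-01_ses-20240401_task-speechopen_run-2_eeg.edf"
print(parse_session_date(example), parse_run(example))
```

Sorting recordings by `(parse_session_date(name), parse_run(name))` would then recover chronological order within a participant, assuming run numbers increase through the day as described.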
Tasks
- speechopen: overt speech production task in which participants vocalize visually presented text
File Format Notes
EEG Data
Raw EEG data are stored at:
- Path: sub-*/ses-*/eeg/*_eeg.edf
- Note: EDF format is not officially part of BIDS-EEG specification
- Files are excluded in .bidsignore but documented here for reference
- Future releases may include EDF conversions for full BIDS compliance
Behavioral Data (Audio)
Vocal recordings are stored in beh/ directories:
- Path: sub-*/ses-*/beh/*_recording-vocal_beh.wav
- Note: Not officially part of BIDS-EEG spec, but included for analysis convenience
- Excluded in .bidsignore
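The two path patterns above can be used directly with `pathlib` to inventory a local copy of the dataset. The sketch below is illustrative only: `list_recordings` is a hypothetical helper, and the file names in the demo are invented to fit the patterns, so the demo builds an empty temporary tree instead of reading real data.

```python
import tempfile
from pathlib import Path

# Glob patterns copied from the README notes above.
EEG_PATTERN = "sub-*/ses-*/eeg/*_eeg.edf"
AUDIO_PATTERN = "sub-*/ses-*/beh/*_recording-vocal_beh.wav"


def list_recordings(root) -> dict:
    """Return sorted EEG and audio file paths found under `root`."""
    root = Path(root)
    return {
        "eeg": sorted(root.glob(EEG_PATTERN)),
        "audio": sorted(root.glob(AUDIO_PATTERN)),
    }


# Demo on a throwaway directory with hypothetical, empty files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    stem = "sub-01_ses-20240401_task-speechopen_run-1"
    demo_files = [
        root / "sub-01" / "ses-20240401" / "eeg" / f"{stem}_eeg.edf",
        root / "sub-01" / "ses-20240401" / "beh" / f"{stem}_recording-vocal_beh.wav",
    ]
    for path in demo_files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    found = list_recordings(root)
    counts = {kind: len(paths) for kind, paths in found.items()}

print(counts)  # {'eeg': 1, 'audio': 1}
```

On a real download, pass the dataset root directory to `list_recordings` in place of the temporary directory.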
Directory Structure
dataset_root/
├── README                         (this file)
├── CHANGES                        (version history)
├── dataset_description.json       (dataset metadata)
├── participants.tsv               (participant information)
├── participants.json              (participant column descriptions)
├── task-speechopen_eeg.json       (task-level EEG metadata)
├── task-speechopen_events.json    (events column descriptions)
├── .bidsignore                    (files to ignore in validation)
│
├── code/                          (analysis and preprocessing code)
│   ├── preprocessing/             (EEG and audio preprocessing)
│   ├── training/                  (model training scripts)
│   ├── evaluation/                (evaluation metrics)
│   └── bids/                      (BIDS conversion scripts)
│
├── sub-01/                        (participant data)
│   └── ses-YYYYMMDD/              (session by date)
│       ├── eeg/                   (EEG recordings)
│       └── beh/                   (behavioral/audio data)
│
└── derivatives/                   (processed data)
    └── pipeline-standard/         (standard preprocessing)
Dataset Information
Dataset ID: DS007602
Title: EEG-Speech Brain Decoding Dataset
Author (year): Sato2026_Speech
Canonical: Sato2024
Importable as: DS007602, Sato2026_Speech, Sato2024
Year: 2026
Authors: Motoshige Sato, Masakazu Inoue, Kenichi Tomeoka, Ilya Horiguchi, Eri Hatakeyama, Yuya Kita, Atsushi Yamamoto, Ippei Fujisawa, Shuntaro Sasai
License: CC0
Citation / DOI: doi:10.18112/openneuro.ds007602.v1.0.1
Source links: OpenNeuro | NeMAR | Source URL
Technical Details
Subjects: 3
Recordings: 113
Tasks: 1
Channels: 134
Sampling rate (Hz): 1200.0
Duration (hours): 44.18638888888889
Pathology: Healthy
Modality: Visual
Type: Motor
Size on disk: 49.6 GB
File count: 113
Format: BIDS
License: CC0
DOI: doi:10.18112/openneuro.ds007602.v1.0.1
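A quick back-of-the-envelope calculation helps put these figures in perspective. The sketch below assumes that the reported duration is the total across all 113 recordings and that every channel is sampled at the listed rate for the whole duration; neither assumption is confirmed by the metadata.

```python
# Figures copied from the Technical Details list above.
recordings = 113
channels = 134
sfreq_hz = 1200.0
duration_h = 44.18638888888889

# Total recording time in seconds (assumed to span all recordings).
total_seconds = duration_h * 3600

# Average length of one recording, in minutes.
mean_minutes = total_seconds / recordings / 60

# Number of samples per channel, and across all channels.
samples_per_channel = total_seconds * sfreq_hz
samples_total = samples_per_channel * channels

print(f"mean recording length: {mean_minutes:.1f} min")
print(f"samples per channel:   {samples_per_channel:,.0f}")
print(f"samples, all channels: {samples_total:,.0f}")
```

Under those assumptions the average recording is a little over 23 minutes long, which is useful when estimating memory needs before loading a full recording.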
API Reference
Use the DS007602 class to access this dataset programmatically.
class eegdash.dataset.DS007602(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Bases: EEGDashDataset
EEG-Speech Brain Decoding Dataset
- Study: ds007602 (OpenNeuro)
- Author (year): Sato2026_Speech
- Canonical: Sato2024
Also importable as: DS007602, Sato2026_Speech, Sato2024.
Modality: eeg; Experiment type: Motor; Subject type: Healthy. Subjects: 3; recordings: 113; tasks: 1.
Parameters:
- cache_dir (str | Path) – Directory where data are cached locally.
- query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
- s3_bucket (str | None) – Base S3 bucket used to locate the data.
- **kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
- data_dir
  Local dataset cache directory (cache_dir / dataset_id).
  Type: Path
- query
  Merged query with the dataset filter applied.
  Type: dict
- records
  Metadata records used to build the dataset, if pre-fetched.
  Type: list[dict] | None
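The relationship between the `query` parameter and the merged `query` attribute can be sketched as follows. This is not the library's implementation: `merge_query` is a hypothetical function that only illustrates the documented behavior, namely that user filters are ANDed with the dataset selection and must not contain the key `dataset`.

```python
from typing import Any, Optional


def merge_query(dataset_id: str, query: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Combine a user query with the dataset filter (hypothetical sketch).

    Top-level keys of a MongoDB-style filter are implicitly ANDed, so adding
    the dataset key to the user's filter narrows the selection to one dataset.
    """
    user_query = dict(query or {})
    if "dataset" in user_query:
        raise ValueError("query must not contain the key 'dataset'")
    return {"dataset": dataset_id, **user_query}


merged = merge_query("ds007602", {"subject": {"$in": ["01", "02"]}})
print(merged)
```

Rejecting a user-supplied `dataset` key, as opposed to silently overwriting it, keeps a class bound to one dataset from being pointed at another by accident.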
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
- OpenNeuro dataset: https://openneuro.org/datasets/ds007602
- NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds007602
- DOI: https://doi.org/10.18112/openneuro.ds007602.v1.0.1
Examples
>>> from eegdash.dataset import DS007602
>>> dataset = DS007602(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
See Also
- eegdash.dataset.EEGDashDataset
- eegdash.dataset