DS004015: eeg dataset, 36 subjects#
Attended speaker paradigm (cEEGrid data)
Citation: Bjoern Holtze, Marc Rosenkranz, Manuela Jaeger, Stefan Debener, Bojana Mirkovic (2020). Attended speaker paradigm (cEEGrid data). 10.18112/openneuro.ds004015.v1.0.2
36-participant EEG dataset — Attended speaker paradigm (cEEGrid data).
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import DS004015
dataset = DS004015(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = DS004015(cache_dir="./data", subject="01")
Advanced query
dataset = DS004015(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
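The filter shown under "Advanced query" can also be built programmatically. The helper below is a minimal sketch: `subject_query` is a hypothetical name, not part of eegdash, and the default two-digit zero padding simply mirrors the "01", "02" labels used above. The trace viewer on this page labels a recording sub-021, so the labels in your copy may be three digits wide; the `width` parameter is there for that case.

```python
def subject_query(subject_numbers, width=2):
    """Build a MongoDB-style ``$in`` filter for zero-padded subject labels.

    ``width=2`` is an assumption taken from the Quickstart labels
    ("01", "02"); pass ``width=3`` if the dataset uses labels like "021".
    """
    labels = [str(int(n)).zfill(width) for n in subject_numbers]
    return {"subject": {"$in": labels}}


# Hypothetical usage, assuming eegdash is installed:
#   from eegdash.dataset import DS004015
#   dataset = DS004015(cache_dir="./data", query=subject_query([1, 2]))
print(subject_query([1, 2]))
```

Keeping the padding in one place avoids silently matching nothing when a label is written as "1" instead of "01".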
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{ds004015,
  title = {Attended speaker paradigm (cEEGrid data)},
  author = {Bjoern Holtze and Marc Rosenkranz and Manuela Jaeger and Stefan Debener and Bojana Mirkovic},
  doi = {10.18112/openneuro.ds004015.v1.0.2},
  url = {https://doi.org/10.18112/openneuro.ds004015.v1.0.2},
}
About This Dataset#
In this study, cEEGrid data from two previous studies were pooled.
15 participants from Jaeger et al. (2020) and 21 from Holtze et al. (2021) were included.
Participants performed a two-competing-speaker paradigm in both original studies.
Participants were instructed to attend to either the left or the right audio book.
The paradigm consisted of six (Jaeger et al. 2020) or five (Holtze et al. 2021) 10-minute blocks of audio book presentation. In Jaeger et al. (2020), both audio books were always presented equally loud. In Holtze et al. (2021), a 10-minute block was presented either in the omnidirectional condition (both audio books equally loud) or in the beamforming condition (the to-be-attended audio book louder than the to-be-ignored one). The first 10-minute block was always presented in the omnidirectional condition, whereas the conditions alternated across the remaining four blocks, with half of the participants starting with the omnidirectional condition and the other half with the beamforming condition.
The article (https://doi.org/10.3389/fnins.2022.869426) contains all methodological details - Björn Holtze (February, 2022)
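The block structure described above can be written out as a small sketch. The function names and condition labels are illustrative assumptions, not identifiers from the dataset or its event files, and "alternated" is read here as strict alternation across blocks two to five. Check the article for the authoritative design.

```python
CONDITIONS = ("omnidirectional", "beamforming")
BLOCK_MINUTES = 10  # block length stated in the dataset description


def holtze_block_order(first_alternating):
    """Condition of each of the five blocks for one Holtze et al. (2021) participant.

    Block 1 is always omnidirectional; blocks 2-5 alternate, starting with
    ``first_alternating`` (assumed to be strict alternation).
    """
    if first_alternating not in CONDITIONS:
        raise ValueError(f"unknown condition: {first_alternating!r}")
    second = CONDITIONS[1 - CONDITIONS.index(first_alternating)]
    order = ["omnidirectional"]
    for i in range(4):
        order.append(first_alternating if i % 2 == 0 else second)
    return order


def jaeger_block_order():
    """Six blocks for Jaeger et al. (2020), all with both audio books equally loud."""
    return ["equal_loudness"] * 6


for start in CONDITIONS:
    print(start, "->", holtze_block_order(start))
print("Jaeger listening minutes:", len(jaeger_block_order()) * BLOCK_MINUTES)
```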
Cohort#
Dataset Statistics#
Age distribution by gender (n=36, range 18–33 yr, mean 23.6 yr)
Sex composition
Channel counts: 18 ch (n=36 recordings)
Sampling frequencies: 500.0 Hz (n=36 recordings)
Total recording duration: 47 h
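For planning storage and windowing, the figures above translate into a few rough numbers. This is a back-of-the-envelope sketch: the 47 h total is rounded, the 10-minute block length comes from the dataset description, and the memory estimate assumes samples are held as 64-bit floats once loaded, which is an assumption about your loading setup rather than a property of the files.

```python
SFREQ_HZ = 500.0      # sampling frequency listed above
N_CHANNELS = 18       # channel count listed above
N_RECORDINGS = 36     # one recording per subject
TOTAL_HOURS = 47.0    # rounded total recording duration
BLOCK_MINUTES = 10    # block length from the dataset description


def samples_per_block(sfreq_hz=SFREQ_HZ, minutes=BLOCK_MINUTES):
    """Number of samples per channel in one presentation block."""
    return int(round(sfreq_hz * minutes * 60))


def mean_recording_hours(total_hours=TOTAL_HOURS, n_recordings=N_RECORDINGS):
    """Average recording length implied by the rounded total."""
    return total_hours / n_recordings


def approx_float64_gigabytes(total_hours=TOTAL_HOURS, sfreq_hz=SFREQ_HZ,
                             n_channels=N_CHANNELS):
    """Approximate in-memory size of all data, assuming 8 bytes per sample."""
    n_values = total_hours * 3600 * sfreq_hz * n_channels
    return n_values * 8 / 1e9


print(samples_per_block())
print(round(mean_recording_hours(), 2))
print(round(approx_float64_gigabytes(), 2))
```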
Signal · Electrodes & live trace#
Live trace viewer — sub-021 · task-AttendedSpeakerParadigmcEEGridAttention
Electrode layout — EEG · 18 sensors — 18 channels
NEMAR Processing Statistics#
The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.
HED event descriptors word cloud
Manifest#
Full dataset metadata table
Dataset ID | ds004015 |
Title | Attended speaker paradigm (cEEGrid data) |
Author (year) | Holtze2022_Attended |
Canonical | - |
Importable as | DS004015, Holtze2022_Attended |
Year | 2020 |
Authors | Bjoern Holtze, Marc Rosenkranz, Manuela Jaeger, Stefan Debener, Bojana Mirkovic |
License | CC0 |
Citation / DOI | 10.18112/openneuro.ds004015.v1.0.2 |
Source links | OpenNeuro, NeMAR, Source URL |
API Reference#
class eegdash.dataset.DS004015(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Source: eegdash/dataset/registry.py
Attended speaker paradigm (cEEGrid data)
Study: ds004015 (OpenNeuro)
Author (year): Holtze2022_Attended
Canonical: -
Also importable as: DS004015, Holtze2022_Attended.
Modality: eeg. Subjects: 36; recordings: 36; tasks: 1.
Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
data_dir
Local dataset cache directory (cache_dir / dataset_id).
Type: Path
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/ds004015
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds004015
DOI: https://doi.org/10.18112/openneuro.ds004015.v1.0.2
NEMAR citation count: 3
Examples
>>> from eegdash.dataset import DS004015
>>> dataset = DS004015(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
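The Notes and the `query` parameter description say that a user filter is ANDed with the dataset selection and must not contain the key `dataset`. The sketch below illustrates those semantics only; `combine_with_dataset_filter` is a hypothetical helper and is not how eegdash is known to implement the merge. It relies on the MongoDB convention that several top-level fields in one filter document are implicitly ANDed.

```python
def combine_with_dataset_filter(dataset_id, query=None):
    """Merge a user query with the dataset selection (illustrative only).

    Rejects queries that try to set ``dataset`` themselves, mirroring the
    documented restriction on the ``query`` parameter.
    """
    query = dict(query or {})
    if "dataset" in query:
        raise ValueError("query must not contain the key 'dataset'")
    # Top-level fields in a MongoDB-style filter are combined with AND.
    return {"dataset": dataset_id, **query}


combined = combine_with_dataset_filter(
    "ds004015", {"subject": {"$in": ["01", "02"]}}
)
print(combined)
```

Validating the query before constructing the dataset surfaces a conflicting `dataset` key as a clear error instead of an empty result.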
__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
save(path: str, overwrite: bool = False, offset: int = 0)
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
Parameters:
path (str) – Directory in which subdirectories are created to store -raw.fif | -epo.fif and .json files to.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
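The directory layout documented for save can be previewed without writing anything. The helper below is a sketch inferred from that layout: `expected_save_paths` is a hypothetical name, and it assumes that `offset` shifts both the subdirectory name and the file prefix, which matches the description but is not verified against the library source.

```python
from pathlib import PurePosixPath


def expected_save_paths(path, n_datasets, offset=0, kind="raw"):
    """List the main files save() is documented to create per dataset.

    ``kind`` is "raw" for continuous data or "epo" for windowed data.
    Optional JSON files (preprocessing and window kwargs) are omitted.
    """
    if kind not in ("raw", "epo"):
        raise ValueError("kind must be 'raw' or 'epo'")
    base = PurePosixPath(path)
    paths = []
    for i in range(n_datasets):
        idx = i + offset  # assumed: offset applies to directory and file name
        sub = base / str(idx)
        paths.append((str(sub / f"{idx}-{kind}.fif"),
                      str(sub / "description.json")))
    return paths


for fif, desc in expected_save_paths("out", 2, offset=10):
    print(fif, desc)
```

With a non-zero offset, batches saved one at a time can share a single output directory without colliding on subdirectory names.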
braindecode: BaseDataset from braindecode, windowed via create_windows_from_events.
pytorch: DataLoader; supports parallel workers and on-the-fly augmentations.
huggingface: datasets.load_dataset("EEGDash/ds004015").
Swap any load_dataset(...) call for ds004015 to reproduce the tutorial on this dataset.
Provenance
¹Contributed to OpenNeuro in BIDS format.
²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.
³Persistent identifier: 10.18112/openneuro.ds004015.v1.0.2.
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset