DS007629: EEG dataset, 1 subject#
ROAMM
Access recordings and metadata through EEGDash.
Citation: Haorui Sun, Ardyn Vivienne Olszko, Niharika Singh, David C. Jangraw (2026). ROAMM. 10.18112/openneuro.ds007629.v1.0.1
Modality: EEG | Subjects: 1 | Recordings: 5 | License: CC0 | Source: OpenNeuro
Metadata: Complete (100%)
Quickstart#
Install

```bash
pip install eegdash
```
Access the data

```python
from eegdash.dataset import DS007629

dataset = DS007629(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
```
Filter by subject

```python
dataset = DS007629(cache_dir="./data", subject="01")
```
Advanced query

```python
dataset = DS007629(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
```
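The `query` argument uses MongoDB-style filter syntax. EEGDash evaluates the real query against its metadata database, but a pure-Python sketch can illustrate what a `$in` clause means (the records and `matches` helper below are illustrative, not part of the EEGDash API):

```python
# Illustrative only: shows the semantics of a MongoDB-style {"$in": ...} filter.
records = [
    {"subject": "01", "task": "reading"},
    {"subject": "02", "task": "reading"},
    {"subject": "03", "task": "reading"},
]

query = {"subject": {"$in": ["01", "02"]}}

def matches(record, query):
    """Return True if `record` satisfies every clause in `query`."""
    for field, cond in query.items():
        if isinstance(cond, dict) and "$in" in cond:
            # $in: the field value must be one of the listed values
            if record.get(field) not in cond["$in"]:
                return False
        elif record.get(field) != cond:
            # plain equality clause
            return False
    return True

selected = [r for r in records if matches(r, query)]
print([r["subject"] for r in selected])  # ['01', '02']
```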
Iterate recordings

```python
for rec in dataset:
    print(rec.subject, rec.raw.info["sfreq"])
```
If you use this dataset in your research, please cite the original authors.
BibTeX

```bibtex
@dataset{ds007629,
  title  = {ROAMM},
  author = {Haorui Sun and Ardyn Vivienne Olszko and Niharika Singh and David C. Jangraw},
  year   = {2026},
  doi    = {10.18112/openneuro.ds007629.v1.0.1},
  url    = {https://doi.org/10.18112/openneuro.ds007629.v1.0.1},
}
```
About This Dataset#
ROAMM: Reading Observed At Mindless Moments
**ROAMM** is a large-scale multimodal dataset featuring simultaneous **EEG and eye-tracking** data collected during naturalistic reading with **span-level mind-wandering annotations**. ROAMM provides a benchmark for mind-wandering (MW) detection and EEG-to-text decoding tasks, and enables the study of attention-related degradation in language decoding from brain activity during naturalistic reading.
Dataset Status
- **Synchronized ML Dataset:** For researchers looking for the pre-processed, synchronized EEG and eye-tracking data (Pickle format), please navigate to `derivatives/synced/`.
- **Linguistic Content:** Reading materials (words with coordinate information) are stored in `derivatives/stimuli/wiki_stories`. Each word is assigned a unique key to enable mapping fixated words back to their original corpus.
- **Raw EEG (BIDS):** **Work in progress.** We are currently converting the full raw EEG dataset for all participants into a BIDS-compliant format.
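The synchronized frames under `derivatives/synced/` are distributed as Pickle files. A minimal loading sketch with pandas follows; the file naming scheme and frame contents are assumptions, so inspect the derivative folder before relying on this:

```python
from pathlib import Path

import pandas as pd

def load_synced_frames(synced_dir):
    """Load every pickled data frame found in a derivatives/synced/ folder.

    Returns a dict mapping file stem -> DataFrame. The *.pkl naming and
    per-file layout are assumptions, not documented in this summary.
    """
    frames = {}
    for pkl in sorted(Path(synced_dir).glob("*.pkl")):
        frames[pkl.stem] = pd.read_pickle(pkl)
    return frames
```

Only unpickle files from sources you trust: `read_pickle` executes arbitrary code embedded in the pickle stream.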
Project Details
- Task: Naturalistic reading of standardized articles with a retrospective self-report paradigm (ReMind task).
- Participants: 44 subjects (50+ hours of data).
- Modalities:
  - EEG (BioSemi ActiveTwo, 64 channels).
  - Simultaneous eye-tracking (SR Research EyeLink 1000 Plus).
  - Span-level mind-wandering annotations.
  - Reading comprehension scores (page-level, multiple-choice questions).
Structure
This repository follows the Brain Imaging Data Structure (BIDS).
- `participants.tsv`: Demographic information (age, sex, handedness, ADHD/Reading Disability status).
- `derivatives/synced/`: Synchronized multi-modal data frames ready for machine-learning pipelines.
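A BIDS `participants.tsv` is a plain tab-separated file, so pandas reads it directly. The sketch below uses a fabricated excerpt; the exact column names for handedness and ADHD/Reading Disability status in this dataset are assumptions:

```python
import io

import pandas as pd

# Hypothetical excerpt of a BIDS participants.tsv; only participant_id,
# age, and sex are standard BIDS columns, the rest are assumed here.
tsv = io.StringIO(
    "participant_id\tage\tsex\thandedness\n"
    "sub-01\t24\tF\tright\n"
    "sub-02\t31\tM\tleft\n"
)
participants = pd.read_csv(tsv, sep="\t")

# Select right-handed participants
print(participants[participants["handedness"] == "right"])
```

For the real file, replace the `StringIO` buffer with the path to `participants.tsv` in the downloaded dataset.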
Publication & Citation
The dataset paper describing the collection, synchronization, and baseline modeling of this data will be available online shortly. Once published, please use the citation provided here to credit the work.
Dataset Information#
| Field | Value |
| --- | --- |
| Dataset ID | ds007629 |
| Title | ROAMM |
| Author (year) | — |
| Canonical | — |
| Importable as | `DS007629` |
| Year | 2026 |
| Authors | Haorui Sun, Ardyn Vivienne Olszko, Niharika Singh, David C. Jangraw |
| License | CC0 |
| Citation / DOI | 10.18112/openneuro.ds007629.v1.0.1 |
| Source links | OpenNeuro \| NeMAR \| Source URL |
Found an issue with this dataset?
If you encounter any problems with this dataset (missing files, incorrect metadata, loading errors, etc.), please let us know!
Technical Details#
Subjects: 1
Recordings: 5
Tasks: 1
Channels: 64
Sampling rate (Hz): 256.0
Duration (hours): 0.99
Pathology: Not specified
Modality: —
Type: —
Size on disk: 223.8 MB
File count: 5
Format: BIDS
License: CC0
DOI: doi:10.18112/openneuro.ds007629.v1.0.1
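The reported duration is consistent with a whole number of samples at the stated 256 Hz sampling rate; a quick sanity check (this is the total across all 5 recordings, since the per-file split is not given here):

```python
duration_hours = 0.9863834635416666  # "Duration (hours)" from the metadata
sfreq = 256.0                        # "Sampling rate (Hz)" from the metadata

total_seconds = duration_hours * 3600
total_samples = total_seconds * sfreq

print(total_seconds)         # ~3550.98 s, i.e. just under an hour
print(round(total_samples))  # 909051 samples in total
```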
API Reference#
Use the DS007629 class to access this dataset programmatically.
class eegdash.dataset.DS007629(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#

Bases: EEGDashDataset

ROAMM.

- Study: ds007629 (OpenNeuro)
- Author (year): —
- Canonical: —
- Also importable as: `DS007629`
- Modality: eeg. Subjects: 1; recordings: 5; tasks: 1.

Parameters:

- cache_dir (str | Path) – Directory where data are cached locally.
- query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key `dataset`.
- s3_bucket (str | None) – Base S3 bucket used to locate the data.
- **kwargs (dict) – Additional keyword arguments forwarded to `EEGDashDataset`.

Attributes:

- data_dir (Path) – Local dataset cache directory (`cache_dir / dataset_id`).
- query (dict) – Merged query with the dataset filter applied.
- records (list[dict] | None) – Metadata records used to build the dataset, if pre-fetched.

Notes

Each item is a recording; recording-level metadata are available via `dataset.description`. `query` supports MongoDB-style filters on fields in `ALLOWED_QUERY_FIELDS` and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.

References
- OpenNeuro dataset: https://openneuro.org/datasets/ds007629
- NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds007629
- DOI: https://doi.org/10.18112/openneuro.ds007629.v1.0.1
Examples

```python
>>> from eegdash.dataset import DS007629
>>> dataset = DS007629(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
```
See Also#

- `eegdash.dataset.EEGDashDataset`
- `eegdash.dataset`