DS007629: EEG dataset, 1 subject#

ROAMM

Access recordings and metadata through EEGDash.

Citation: Haorui Sun, Ardyn Vivienne Olszko, Niharika Singh, David C. Jangraw (2026). ROAMM. 10.18112/openneuro.ds007629.v1.0.1

Modality: EEG · Subjects: 1 · Recordings: 5 · License: CC0 · Source: OpenNeuro

Metadata: Complete (100%)

Quickstart#

Install

pip install eegdash

Access the data

from eegdash.dataset import DS007629

dataset = DS007629(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)

Filter by subject

dataset = DS007629(cache_dir="./data", subject="01")

Advanced query

dataset = DS007629(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)

Iterate recordings

for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
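Building on the iteration pattern above, one might aggregate the total recording time across a dataset. The helper below is our own sketch, not part of EEGDash; it assumes only that each recording exposes an MNE-style sample count and sampling frequency (e.g. `rec.raw.n_times` and `rec.raw.info['sfreq']`).

```python
# Sketch: total duration across recordings, given (n_samples, sfreq) pairs
def total_hours(recordings):
    """Sum recording durations (in hours) over (n_samples, sfreq) pairs."""
    return sum(n / sf for n, sf in recordings) / 3600.0

# Example with made-up sample counts at this dataset's 256 Hz rate
print(total_hours([(256 * 3600, 256.0), (256 * 1800, 256.0)]))  # 1.5
```

With real data you would build the pairs as `[(rec.raw.n_times, rec.raw.info['sfreq']) for rec in dataset]`.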

If you use this dataset in your research, please cite the original authors.

BibTeX

@dataset{ds007629,
  title = {ROAMM},
  author = {Haorui Sun and Ardyn Vivienne Olszko and Niharika Singh and David C. Jangraw},
  doi = {10.18112/openneuro.ds007629.v1.0.1},
  url = {https://doi.org/10.18112/openneuro.ds007629.v1.0.1},
}

About This Dataset#

ROAMM: Reading Observed At Mindless Moments

**ROAMM** is a large-scale multimodal dataset featuring simultaneous **EEG and eye-tracking** data collected during naturalistic reading with **span-level mind-wandering annotations**. ROAMM provides a benchmark for mind-wandering (MW) detection and EEG-to-text decoding tasks, and enables the study of attention-related degradation in language decoding from brain activity during naturalistic reading.

Dataset Status

**Synchronized ML Dataset:** For researchers looking for the pre-processed, synchronized EEG and eye-tracking data (Pickle format), please navigate to:

derivatives/synced/


**Linguistic Content:** Reading materials (words with coordinate information) are stored in derivatives/stimuli/wiki_stories. Each word is assigned a unique key to enable mapping fixated words back to their original corpus.

**Raw EEG (BIDS):** **Work in Progress.** We are currently converting the full raw EEG dataset for all participants into BIDS-compliant format.

Project Details

  • Task: Naturalistic reading of standardized articles with retrospective self-report paradigm (ReMind task).

  • Participants: 44 subjects (50+ hours of data).

  • Modalities:
    • EEG (BioSemi ActiveTwo, 64 channels).

    • Simultaneous eye-tracking (SR Research EyeLink 1000 Plus).

    • Span-level mind-wandering annotations.

    • Reading comprehension scores (page-level, multiple-choice questions).

Structure

This repository follows the Brain Imaging Data Structure (BIDS).

  • participants.tsv: Demographic information (age, sex, handedness, ADHD/Reading Disability status).

  • derivatives/synced/: Synchronized multi-modal data frames ready for machine-learning pipelines.
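As an illustration of reading BIDS tabular metadata such as participants.tsv, here is a minimal standard-library sketch. The inline TSV and its column names are hypothetical stand-ins based on the demographics listed above; the actual file's headers may differ.

```python
import csv
import io

# Inline stand-in for participants.tsv (tab-separated, per BIDS);
# real column names and values may differ.
tsv = "participant_id\tage\tsex\thandedness\nsub-01\t24\tF\tR\n"

# Parse each row into a dict keyed by the header line
rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))
print(rows[0]["participant_id"], rows[0]["age"])  # sub-01 24
```

With a downloaded copy of the dataset, the same `csv.DictReader` call works on `open("participants.tsv")` in place of the inline string.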

Publication & Citation

The dataset paper describing the collection, synchronization, and baseline modeling of this data will be available online shortly. Once published, please use the citation provided here to credit the work.

Dataset Information#

Dataset ID

DS007629

Title

ROAMM

Author (year)

Canonical

Importable as

DS007629

Year

2026

Authors

Haorui Sun, Ardyn Vivienne Olszko, Niharika Singh, David C. Jangraw

License

CC0

Citation / DOI

doi:10.18112/openneuro.ds007629.v1.0.1

Source links

OpenNeuro | NeMAR | Source URL


Found an issue with this dataset?

If you encounter any problems with this dataset (missing files, incorrect metadata, loading errors, etc.), please let us know!

Report an Issue on GitHub

Technical Details#

Subjects & recordings
  • Subjects: 1

  • Recordings: 5

  • Tasks: 1

Channels & sampling rate
  • Channels: 64

  • Sampling rate (Hz): 256.0

  • Duration (hours): ≈0.99

Tags
  • Pathology: Not specified

  • Modality: —

  • Type: —

Files & format
  • Size on disk: 223.8 MB

  • File count: 5

  • Format: BIDS

License & citation
  • License: CC0

  • DOI: doi:10.18112/openneuro.ds007629.v1.0.1

Provenance

API Reference#

Use the DS007629 class to access this dataset programmatically.

class eegdash.dataset.DS007629(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#

Bases: EEGDashDataset

ROAMM

Study:

ds007629 (OpenNeuro)

Canonical:

Also importable as: DS007629.

Modality: eeg. Subjects: 1; recordings: 5; tasks: 1.

Parameters:
  • cache_dir (str | Path) – Directory where data are cached locally.

  • query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.

  • s3_bucket (str | None) – Base S3 bucket used to locate the data.

  • **kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.

data_dir#

Local dataset cache directory (cache_dir / dataset_id).

Type:

Path

query#

Merged query with the dataset filter applied.

Type:

dict

records#

Metadata records used to build the dataset, if pre-fetched.

Type:

list[dict] | None

Notes

Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
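To make the MongoDB-style filtering concrete, here is a small, hypothetical evaluator for plain equality and the `$in` operator used in the advanced-query example above. This is our own sketch of the matching semantics; EEGDash performs the real filtering itself and supports more operators and fields.

```python
def matches(record, query):
    """Check a metadata record (dict) against a tiny subset of
    MongoDB-style filters: plain equality and the $in operator."""
    for field, cond in query.items():
        value = record.get(field)
        if isinstance(cond, dict) and "$in" in cond:
            # $in: value must be one of the listed alternatives
            if value not in cond["$in"]:
                return False
        elif value != cond:
            # plain equality match
            return False
    return True

# Hypothetical records; the query mirrors the advanced-query example
records = [{"subject": "01", "task": "ReMind"}, {"subject": "03", "task": "ReMind"}]
query = {"subject": {"$in": ["01", "02"]}}
print([r["subject"] for r in records if matches(r, query)])  # ['01']
```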

References

OpenNeuro dataset: https://openneuro.org/datasets/ds007629
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds007629
DOI: https://doi.org/10.18112/openneuro.ds007629.v1.0.1

Examples

>>> from eegdash.dataset import DS007629
>>> dataset = DS007629(cache_dir="./data")
>>> raw = dataset.datasets[0].raw
>>> print(raw.info)
__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#
save(path, overwrite=False)[source]#

Save the dataset to disk.

Parameters:
  • path (str or Path) – Destination file path.

  • overwrite (bool, default False) – If True, overwrite existing file.

Return type:

None

See Also#