EEGChallengeDataset

class eegdash.dataset.EEGChallengeDataset(release: str, cache_dir: str, mini: bool = True, query: dict | None = None, s3_bucket: str | None = 's3://nmdatasets/NeurIPS25', **kwargs)

Bases: EEGDashDataset

A dataset helper for the EEG 2025 Challenge.

This class simplifies access to the EEG 2025 Challenge datasets. It is a specialized version of EEGDashDataset that is pre-configured for the challenge’s data releases. It automatically maps a release name (e.g., “R1”) to the corresponding OpenNeuro dataset and, when mini is True, restricts the data to the official “mini” subset of subjects for that release.

Parameters:
  • release (str) – The name of the challenge release to load. Must be one of the keys in RELEASE_TO_OPENNEURO_DATASET_MAP (e.g., “R1”, “R2”, …, “R11”).

  • cache_dir (str) – The local directory where the dataset will be downloaded and cached.

  • mini (bool, default True) – If True, the dataset is restricted to the official “mini” subset of subjects for the specified release. If False, all subjects for the release are included.

  • query (dict, optional) – An additional MongoDB-style query to apply as a filter. This query is combined with the release and subject filters using a logical AND. The query must not contain the dataset key, as this is determined by the release parameter.

  • s3_bucket (str, optional) – The base S3 bucket URI where the challenge data is stored. Defaults to the official challenge bucket.

  • **kwargs – Additional keyword arguments that are passed directly to the EEGDashDataset constructor.
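As a minimal sketch of how these parameters fit together, the helper below builds a challenge dataset for one release. The helper name `load_challenge_release` and its `task` argument are hypothetical and not part of eegdash; only the constructor arguments come from the signature above, and the `task` query key is borrowed from the base-class examples on this page.

```python
def load_challenge_release(release="R1", cache_dir="./data", task=None):
    """Build an EEGChallengeDataset for one release (requires eegdash).

    Hypothetical convenience wrapper, written only for illustration.
    """
    # Imported lazily so that defining this helper does not require eegdash.
    from eegdash.dataset import EEGChallengeDataset

    # Extra filters go in `query`; the "dataset" key must not appear there,
    # because the dataset is determined by `release`.
    query = {"task": task} if task is not None else None

    return EEGChallengeDataset(
        release=release,
        cache_dir=cache_dir,
        mini=True,  # keep only the official "mini" subset of subjects
        query=query,
    )
```

Calling `load_challenge_release("R1", task="RestingState")` would need eegdash installed and, presumably, network access to fetch the data into `cache_dir`.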

Raises:

ValueError – If the specified release is unknown, or if the query argument contains a dataset key. Also raised if mini is True and a requested subject is not part of the official mini-release subset.
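The validation and filter-combination rules described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `build_challenge_query` is a hypothetical helper, and `PLACEHOLDER_RELEASE_MAP` stands in for RELEASE_TO_OPENNEURO_DATASET_MAP with placeholder dataset identifiers.

```python
# Hypothetical stand-in for RELEASE_TO_OPENNEURO_DATASET_MAP; the dataset IDs
# below are placeholders, not the real OpenNeuro identifiers.
PLACEHOLDER_RELEASE_MAP = {
    "R1": "ds-placeholder-1",
    "R2": "ds-placeholder-2",
}


def build_challenge_query(release, query=None, subjects=None):
    """Combine the release filter, an optional subject filter, and a user query."""
    if release not in PLACEHOLDER_RELEASE_MAP:
        raise ValueError(f"Unknown release: {release!r}")

    query = dict(query or {})
    if "dataset" in query:
        raise ValueError("query must not contain 'dataset'; it is set by release")

    # Each clause must hold, so the clauses are joined with a logical AND.
    clauses = [{"dataset": PLACEHOLDER_RELEASE_MAP[release]}]
    if subjects is not None:
        clauses.append({"subject": {"$in": list(subjects)}})
    if query:
        clauses.append(query)

    # A single clause needs no "$and" wrapper.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}
```

Here the release and subject filters are kept as separate clauses under `$and` rather than merged into one dictionary, so a user-supplied `subject` filter cannot silently overwrite the subset restriction.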

See also

EEGDashDataset

The base class for creating datasets from queries.

Examples

Basic usage of the base class EEGDashDataset with dataset and subject filtering:

>>> from eegdash import EEGDashDataset
>>> dataset = EEGDashDataset(
...     cache_dir="./data",
...     dataset="ds002718",
...     subject="012"
... )
>>> print(f"Number of recordings: {len(dataset)}")

Filter by multiple subjects and specific task:

>>> subjects = ["012", "013", "014"]
>>> dataset = EEGDashDataset(
...     cache_dir="./data",
...     dataset="ds002718",
...     subject=subjects,
...     task="RestingState"
... )

Load and inspect EEG data from recordings:

>>> if len(dataset) > 0:
...     recording = dataset[0]
...     raw = recording.load()
...     print(f"Sampling rate: {raw.info['sfreq']} Hz")
...     print(f"Number of channels: {len(raw.ch_names)}")
...     print(f"Duration: {raw.times[-1]:.1f} seconds")

Advanced filtering with raw MongoDB queries:

>>> from eegdash import EEGDashDataset
>>> query = {
...     "dataset": "ds002718",
...     "subject": {"$in": ["012", "013"]},
...     "task": "RestingState"
... }
>>> dataset = EEGDashDataset(cache_dir="./data", query=query)

Working with dataset collections and braindecode integration:

>>> # EEGDashDataset is a braindecode BaseConcatDataset
>>> for i, recording in enumerate(dataset):
...     if i >= 2:  # limit output
...         break
...     print(f"Recording {i}: {recording.description}")
...     raw = recording.load()
...     print(f"  Channels: {len(raw.ch_names)}, Duration: {raw.times[-1]:.1f}s")
