EEGChallengeDataset
- class eegdash.dataset.EEGChallengeDataset(release: str, cache_dir: str, mini: bool = True, query: dict | None = None, s3_bucket: str | None = 's3://nmdatasets/NeurIPS25', **kwargs)
Bases: EEGDashDataset
A dataset helper for the EEG 2025 Challenge.
This class simplifies access to the EEG 2025 Challenge datasets. It is a specialized version of EEGDashDataset that is pre-configured for the challenge’s data releases. It automatically maps a release name (e.g., “R1”) to the corresponding OpenNeuro dataset and handles the selection of subject subsets (e.g., “mini” release).
- Parameters:
release (str) – The name of the challenge release to load. Must be one of the keys in RELEASE_TO_OPENNEURO_DATASET_MAP (e.g., “R1”, “R2”, …, “R11”).
cache_dir (str) – The local directory where the dataset will be downloaded and cached.
mini (bool, default True) – If True, the dataset is restricted to the official “mini” subset of subjects for the specified release. If False, all subjects for the release are included.
query (dict, optional) – An additional MongoDB-style query to apply as a filter. This query is combined with the release and subject filters using a logical AND. The query must not contain the dataset key, as this is determined by the release parameter.
s3_bucket (str, optional) – The base S3 bucket URI where the challenge data is stored. Defaults to the official challenge bucket.
**kwargs – Additional keyword arguments that are passed directly to the EEGDashDataset constructor.
- Raises:
ValueError – If the specified release is unknown, or if the query argument contains a dataset key. Also raised if mini is True and a requested subject is not part of the official mini-release subset.
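The three error conditions and the AND-combination of filters described above can be pictured with a small standalone sketch. This is not the eegdash implementation: `build_challenge_query`, `RELEASE_TO_DATASET`, and `MINI_SUBJECTS` are hypothetical names, and the dataset and subject IDs are made-up placeholders rather than the real release mapping.

```python
# Hypothetical stand-ins. The real release mapping and mini-subject lists
# live in eegdash and are NOT reproduced here.
RELEASE_TO_DATASET = {"R1": "ds-example-r1", "R2": "ds-example-r2"}
MINI_SUBJECTS = {"R1": {"sub-A", "sub-B"}, "R2": {"sub-C"}}


def build_challenge_query(release, mini=True, query=None):
    """Combine release, mini-subset, and user filters into one AND-ed query.

    Assumes a "subject" filter, if given, is a string or a list of strings.
    """
    if release not in RELEASE_TO_DATASET:
        raise ValueError(f"Unknown release: {release!r}")

    query = dict(query or {})  # copy so the caller's dict is not mutated
    if "dataset" in query:
        raise ValueError("query must not contain 'dataset'; it is set by release")

    combined = {"dataset": RELEASE_TO_DATASET[release]}
    if mini:
        allowed = MINI_SUBJECTS[release]
        requested = query.pop("subject", None)
        if requested is None:
            # No explicit subjects: restrict to the whole mini subset.
            combined["subject"] = {"$in": sorted(allowed)}
        else:
            wanted = [requested] if isinstance(requested, str) else list(requested)
            extra = [s for s in wanted if s not in allowed]
            if extra:
                raise ValueError(f"Subjects not in mini release: {extra}")
            combined["subject"] = {"$in": wanted}

    # Top-level keys in a MongoDB-style query are implicitly AND-ed.
    combined.update(query)
    return combined


print(build_challenge_query("R1", query={"task": "RestingState"}))
```

Keeping validation separate from the final merge means every rejected input fails before any filter is built, which matches the documented behavior of raising ValueError rather than silently returning an empty dataset.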
See also
EEGDashDataset
The base class for creating datasets from queries.
Examples
Basic usage with dataset and subject filtering:
>>> from eegdash import EEGDashDataset
>>> dataset = EEGDashDataset(
...     cache_dir="./data",
...     dataset="ds002718",
...     subject="012"
... )
>>> print(f"Number of recordings: {len(dataset)}")
Filter by multiple subjects and specific task:
>>> subjects = ["012", "013", "014"]
>>> dataset = EEGDashDataset(
...     cache_dir="./data",
...     dataset="ds002718",
...     subject=subjects,
...     task="RestingState"
... )
Load and inspect EEG data from recordings:
>>> if len(dataset) > 0:
...     recording = dataset[0]
...     raw = recording.load()
...     print(f"Sampling rate: {raw.info['sfreq']} Hz")
...     print(f"Number of channels: {len(raw.ch_names)}")
...     print(f"Duration: {raw.times[-1]:.1f} seconds")
Advanced filtering with raw MongoDB queries:
>>> from eegdash import EEGDashDataset
>>> query = {
...     "dataset": "ds002718",
...     "subject": {"$in": ["012", "013"]},
...     "task": "RestingState"
... }
>>> dataset = EEGDashDataset(cache_dir="./data", query=query)
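To make the semantics of such a query concrete, here is a minimal pure-Python sketch of how a MongoDB-style filter with plain equality and `$in` selects records. It is only an illustration: `matches` is a hypothetical helper, the sample records are invented, and eegdash is not assumed to evaluate queries this way.

```python
def matches(record, query):
    """Return True if `record` satisfies every condition in `query` (implicit AND)."""
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # {"$in": [...]} matches when the field value is one of the listed values.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            # A bare value means exact equality.
            return False
    return True


# Invented metadata records, for illustration only.
records = [
    {"dataset": "ds002718", "subject": "012", "task": "RestingState"},
    {"dataset": "ds002718", "subject": "014", "task": "RestingState"},
    {"dataset": "ds002718", "subject": "013", "task": "OtherTask"},
]
query = {
    "dataset": "ds002718",
    "subject": {"$in": ["012", "013"]},
    "task": "RestingState",
}

selected = [r["subject"] for r in records if matches(r, query)]
print(selected)  # → ['012']
```

Only the first record passes: the second fails the `$in` test on subject, and the third fails the equality test on task.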
Working with dataset collections and braindecode integration:
>>> # EEGDashDataset is a braindecode BaseConcatDataset
>>> for i, recording in enumerate(dataset):
...     if i >= 2:  # limit output
...         break
...     print(f"Recording {i}: {recording.description}")
...     raw = recording.load()
...     print(f" Channels: {len(raw.ch_names)}, Duration: {raw.times[-1]:.1f}s")