Iss. 133 · 8 subjects · 13 recordings · CC-BY-NC-ND-4.0

Dataset Brief · Alljoined1

NM000133: EEG dataset, 8 subjects

Name: Alljoined1
Published: 2024-01-01
License: CC-BY-NC-ND-4.0

Access recordings and metadata through EEGDash.

Citation: Jonathan Xu, Si Kai Lee, Wangshu Jiang (2024). Alljoined1. 10.82901/nemar.nm000133

Modality: eeg · Subjects: 8 · Recordings: 13 · License: CC-BY-NC-ND-4.0 · Source: nemar

Metadata: Complete (100%)

8-participant EEG dataset — Alljoined1.

Data & curation: Jonathan Xu · Si Kai Lee · Wangshu Jiang
Year 2024 · Distributed via NeMAR

EEG · 64 ch · 512 Hz · BIDS 1.9.0 · Task: images · 2 sessions

Layer 01 · Study

What was asked

Hypothesis, independent & dependent variables, paradigm, cohort, and the editorial caveats around what the recordings can and cannot answer.

Layer 02 · Signal · BIDS

What was recorded

Sidecars, channels & electrodes, coordinate system, event semantics, and quality stats from the NEMAR pipeline when available.

Layer 03 · Training · ML

What you can train on

Recommended access modes — MNE Raw, braindecode windows, PyTorch DataLoader — plus the targets the metadata makes addressable.

§ 01 Access · Get started

Quickstart

Get Started

Install

pip install eegdash

Access the data

from eegdash.dataset import NM000133

dataset = NM000133(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)

Query & Filter

Filter by subject

dataset = NM000133(cache_dir="./data", subject="01")

Advanced query

dataset = NM000133(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
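To make the meaning of the `$in` operator in the query above concrete, the following sketch applies the same kind of filter to a list of plain dictionaries. It only illustrates the matching rule and is not how EEGDash evaluates queries internally; the helper name `matches` and the record layout are assumptions made for this example.

```python
def matches(record: dict, query: dict) -> bool:
    """Return True if `record` satisfies every condition in `query`.

    Only plain equality and the MongoDB-style `$in` operator are handled.
    """
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # `$in` passes when the field value is one of the listed options.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


# Hypothetical records; real catalogue records carry many more fields.
records = [
    {"subject": "01", "session": "01"},
    {"subject": "02", "session": "01"},
    {"subject": "03", "session": "02"},
]
query = {"subject": {"$in": ["01", "02"]}}
selected = [r for r in records if matches(r, query)]
print([r["subject"] for r in selected])
```

All conditions in the query dictionary are combined with AND, which mirrors the way the `query` parameter is described in the API reference.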

Iterate recordings

for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])

Cite This Dataset

If you use this dataset in your research, please cite the original authors.

BibTeX

@dataset{nm000133,
  title = {Alljoined1},
  author = {Jonathan Xu and Si Kai Lee and Wangshu Jiang},
  doi = {10.82901/nemar.nm000133},
  url = {https://doi.org/10.82901/nemar.nm000133},
}

§ 02 Study · The README

About This Dataset

Alljoined1 is an EEG dataset of neural responses to rapid serial visual presentation (RSVP) of natural images, designed for EEG-to-image decoding research. Eight healthy right-handed adults (6 male, 2 female; mean age 22 +/- 0.64 years, normal or corrected-to-normal vision) each viewed 10,000 natural images across two recording sessions on separate days.

The original data were recorded in BioSemi Data Format (BDF) via a 64-channel BioSemi ActiveTwo system with 24-bit A/D conversion, digitized at 512 Hz. This BIDS-formatted version preserves the BDF format to maintain full 24-bit data fidelity.

Reference: Xu, J., Aristimunha, B., Feucht, M. E., Qian, E., Liu, C., Shahjahan, T., … & Nestor, A. (2024). Alljoined–A dataset for EEG-to-Image decoding. Workshop Data Curation and Augmentation in Medical Imaging at 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 1–9. https://doi.org/10.48550/arXiv.2404.05553

Alljoined1: EEG Responses to Natural Images

Overview

Recording Setup

Equipment: BioSemi ActiveTwo, 64 Ag/AgCl sintered electrodes

Montage: International 10-20 system
Sampling rate: 512 Hz
Reference: CMS/DRL (BioSemi default); average reference applied in preprocessing
Electrode offset: kept below 40 mV
Power line: 60 Hz notch filter applied during preprocessing

Task Paradigm

Participants viewed natural images in a rapid serial visual presentation (RSVP) paradigm with an oddball detection task. Each trial consisted of an image presented for 300 ms, followed by 300 ms of black screen, plus 0-50 ms of random jitter. Participants pressed the space bar when two consecutive trials contained the same image (oddball detection). Oddball trials (24 per block) were excluded from analysis.
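The trial timing and the oddball rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' presentation code: the function names are hypothetical, and flagging the second trial of a repeated pair as the oddball is an assumption about how the exclusion was applied.

```python
SFREQ_HZ = 512          # sampling rate of the recordings
IMAGE_MS = 300          # image on screen
BLANK_MS = 300          # black screen after the image
JITTER_RANGE_MS = (0, 50)


def trial_duration_ms(jitter_ms: float) -> float:
    """Total trial length for a given jitter value, in milliseconds."""
    low, high = JITTER_RANGE_MS
    if not low <= jitter_ms <= high:
        raise ValueError("jitter must lie within the documented 0-50 ms range")
    return IMAGE_MS + BLANK_MS + jitter_ms


def ms_to_samples(duration_ms: float) -> float:
    """Convert a duration in milliseconds to a (fractional) sample count."""
    return duration_ms * SFREQ_HZ / 1000


def find_oddballs(image_ids: list[int]) -> list[int]:
    """Indices of trials that repeat the image of the preceding trial."""
    return [i for i in range(1, len(image_ids)) if image_ids[i] == image_ids[i - 1]]


shortest, longest = trial_duration_ms(0), trial_duration_ms(50)
print(shortest, longest, ms_to_samples(shortest), ms_to_samples(longest))
print(find_oddballs([5, 7, 7, 9, 2, 2]))
```

Dropping the flagged indices from a trial list reproduces the kind of oddball exclusion mentioned in the paragraph above.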

Stimulus Set

10,000 natural images per participant drawn from the Natural Scenes Dataset (NSD), which itself is sourced from MS-COCO:

- 1,000 shared images: the first 960 images from the NSD “shared1000” subset, shown to all participants (each image repeated 4 times per participant)
- 9,000 unique images: different for each participant

Each image was shown 4 times per participant across blocks and sessions (presented twice per block, with blocks repeated within sessions).

The BIDS event tables (every events.tsv) reference the stimuli as trial_type = "image/N" where N is a 1-indexed position (1..960) into the shared subset. The full mapping chain is:

events.tsv `value` (N, 1..960)
    ↓
sharedix[N-1]                                    (from code/0_data_collection/nsd_expdesign.mat; 1-indexed NSD id)
    ↓
nsdId = sharedix[N-1] - 1                        (0-indexed)
    ↓
code/1_preprocessing/data/nsd_stim_info_merged.csv
    ↓
cocoId, cocoSplit (val2017 / train2017)
    ↓
stimuli/<cocoSplit>/<cocoId:012d>.jpg            (preserves original COCO 2017 layout; file names are 12 digits, zero-padded)

Empirically the 960 shared NSD ids are all in train2017, so every stimulus path under this dataset has the form stimuli/train2017/<cocoId:012d>.jpg (for a six-digit COCO id, stimuli/train2017/000000<id>.jpg).
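The mapping chain can be sketched as a small pure function. This is a minimal illustration, not the repository's `StimulusAligner`: the function name `stimulus_path` is hypothetical, `sharedix` is passed in as a plain list rather than read from `nsd_expdesign.mat`, the CSV is replaced by a dictionary keyed by 0-indexed `nsdId`, and the toy ids below are invented. It assumes the COCO 2017 convention of 12-digit zero-padded file names.

```python
from pathlib import PurePosixPath


def stimulus_path(n: int, sharedix: list[int], stim_info: dict) -> PurePosixPath:
    """Resolve an event index N (1-indexed) to a relative stimulus path.

    sharedix  : 1-indexed NSD ids of the shared images, in presentation order
    stim_info : maps 0-indexed nsdId -> (cocoId, cocoSplit)
    """
    if not 1 <= n <= len(sharedix):
        raise ValueError(f"event index {n} outside 1..{len(sharedix)}")
    nsd_id = sharedix[n - 1] - 1              # 1-indexed NSD id -> 0-indexed
    coco_id, coco_split = stim_info[nsd_id]   # row lookup in the merged table
    return PurePosixPath("stimuli") / coco_split / f"{coco_id:012d}.jpg"


# Toy values for illustration only; they are not real NSD or COCO ids.
sharedix = [2951, 2991, 3050]
stim_info = {
    2950: (262145, "train2017"),
    2990: (262239, "train2017"),
    3049: (9, "train2017"),
}
print(stimulus_path(1, sharedix, stim_info))
print(stimulus_path(3, sharedix, stim_info))
```

Keeping the two off-by-one steps (`N-1` for the list position, then `- 1` for the NSD id) on a single line makes the conversion between 1-indexed and 0-indexed identifiers easy to audit.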

To populate stimuli/:

python code/download_stimuli.py            # ~140 MB, fetches only the 960 needed
python code/smoke_test.py                  # confirms every event row resolves

A small alignment helper is provided:

import pandas as pd
from code.align_stimuli import StimulusAligner
aligner = StimulusAligner('.')
events = pd.read_csv('sub-01/ses-01/eeg/sub-01_ses-01_task-images_events.tsv', sep='\t')
paths = aligner.paths_for_events(events, subject=1, session=1)   # list[Path | None]
img   = aligner.image_for_event(events.iloc[0], subject=1, session=1)  # PIL.Image

Subjects and Sessions

8 subjects, 1-2 sessions each (13 sessions total):

| Subject | Sessions | Notes |
|---------|----------|-------|
| sub-01 | ses-01, ses-02 | |
| sub-02 | ses-01 | |
| sub-03 | ses-01, ses-02 | Epoched file missing for ses-01 |
| sub-04 | ses-01, ses-02 | |
| sub-05 | ses-01, ses-02 | |
| sub-06 | ses-01, ses-02 | |
| sub-07 | ses-01 | |
| sub-08 | ses-01 | |

Total: approximately 46,080 epochs across all participants (approximately 3,839 events per session after oddball exclusion).

Data Format

Raw continuous EEG recordings are stored as BDF files (BioSemi Data Format, 24-bit resolution). The original data were distributed as MNE-Python FIF files; conversion to BDF was performed to preserve the native 24-bit precision of the BioSemi ActiveTwo system. Round-trip validation confirmed data integrity to within 1.55e-8 V (about 15.5 nV, well below one microvolt), and event onsets match exactly (zero timing error).

Per-session files:

| Path | Description |
|------|-------------|
| `sub-XX/ses-YY/eeg/sub-XX_ses-YY_task-images_eeg.bdf` | Raw EEG |
| `sub-XX/ses-YY/eeg/sub-XX_ses-YY_task-images_events.tsv` | Event markers |

Shared sidecar files (root level, BIDS inheritance principle):

| File | Description |
|------|-------------|
| `task-images_eeg.json` | Recording parameters |
| `task-images_channels.tsv` | Channel descriptions (64 EEG channels) |
| `task-images_electrodes.tsv` | Electrode positions (standard 10-20, CapTrak) |
| `task-images_coordsystem.json` | Coordinate system specification |

Event values in the events.tsv files represent image indices (1-960+) corresponding to NSD image identifiers. The trial_type column uses the format image/{index}.
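As a minimal sketch of reading these event tables without any EEG library, the snippet below extracts the image index from the `trial_type` column using only the standard library. The function name `image_indices` and the sample rows are hypothetical; `onset` and `duration` are the columns BIDS requires in every events.tsv, while the exact set of additional columns in this dataset is an assumption.

```python
import csv
import io


def image_indices(events_tsv_text: str) -> list[int]:
    """Return the integer index of every `image/{index}` row, in file order."""
    reader = csv.DictReader(io.StringIO(events_tsv_text), delimiter="\t")
    indices = []
    for row in reader:
        prefix, _, suffix = row["trial_type"].partition("/")
        # Skip rows that do not follow the image/{index} format.
        if prefix == "image" and suffix.isdigit():
            indices.append(int(suffix))
    return indices


# Invented rows that follow the documented trial_type format.
sample = (
    "onset\tduration\ttrial_type\tvalue\n"
    "1.000\t0.3\timage/17\t17\n"
    "1.625\t0.3\timage/960\t960\n"
    "2.250\t0.3\tn/a\tn/a\n"
)
print(image_indices(sample))
```

For a file on disk, the same function can be fed the result of `Path(...).read_text()`.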

Derivatives

The derivatives/epoched/ directory contains preprocessed and epoched data provided by the original authors, stored in MNE-Python FIF format (.fif).

Preprocessing pipeline applied by the original authors:

1. Band-pass filter: 0.5-125 Hz
2. Notch filter: 60 Hz (power line)
3. Independent Component Analysis (ICA): FastICA, retaining 95% of variance
4. Epoch extraction: -50 ms to 600 ms relative to stimulus onset
5. Artifact rejection: AutoReject algorithm (mean 130.75 epochs dropped per subject, SD 260.44)
6. Baseline correction
7. Average re-referencing

These epoched files are derivative products, not raw recordings, and are stored separately per BIDS conventions. Note: the epoched file for sub-03 ses-01 was not available in the source distribution.
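To give a feel for what the epoch window in step 4 means at 512 Hz, the sketch below converts the -50 ms to 600 ms window into sample indices and slices one epoch out of a single-channel signal. It is a simplified illustration, not the authors' pipeline: the function names are hypothetical, and rounding to the nearest sample with an inclusive end point is an assumption that may differ from the convention used by MNE-Python.

```python
SFREQ_HZ = 512.0   # sampling rate of the recordings
TMIN_S = -0.050    # epoch start relative to stimulus onset
TMAX_S = 0.600     # epoch end relative to stimulus onset


def epoch_bounds(onset_sample: int) -> tuple[int, int]:
    """Return inclusive (first, last) sample indices of one epoch.

    Assumption: window edges are rounded to the nearest sample.
    """
    first = onset_sample + round(TMIN_S * SFREQ_HZ)
    last = onset_sample + round(TMAX_S * SFREQ_HZ)
    return first, last


def extract_epoch(signal: list[float], onset_sample: int) -> list[float]:
    """Slice one epoch out of a single-channel signal."""
    first, last = epoch_bounds(onset_sample)
    if first < 0 or last >= len(signal):
        raise ValueError("epoch window falls outside the recording")
    return signal[first:last + 1]


signal = [float(i) for i in range(2048)]   # 4 s of synthetic data
epoch = extract_epoch(signal, onset_sample=1000)
print(epoch_bounds(1000), len(epoch))
```

Because the window edges do not fall on exact sample boundaries at 512 Hz, the rounding convention determines the epoch length, which is worth checking against the shipped epoched files before mixing them with epochs cut from the raw data.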

Code

The code/ directory contains the original Alljoined1 analysis code, cloned from Alljoined/alljoined-dataset1.

BIDS Conversion

Converted to BIDS by Yahya Shirazi (Swartz Center for Computational Neuroscience, UC San Diego) using MNE-Python and custom scripts.

- Source data: OSF repository https://osf.io/kqgs8/
- Conversion validated with round-trip integrity checks (data, channels, sampling frequency, event count, event values, and event timing)
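The data part of such a round-trip check can be sketched as a maximum-absolute-difference comparison against a tolerance. This is an illustration only and not the conversion scripts themselves: the function names are hypothetical, the tolerance is the 1.55e-8 V figure reported under Data Format, and the 30 nV quantization step used to simulate a lossy write is an arbitrary demo value, not a property of the BDF files.

```python
TOLERANCE_V = 1.55e-8   # round-trip tolerance reported for this dataset


def max_abs_difference(original: list[float], restored: list[float]) -> float:
    """Largest absolute sample-wise difference between two signals."""
    if len(original) != len(restored):
        raise ValueError("signals must have the same number of samples")
    return max((abs(a - b) for a, b in zip(original, restored)), default=0.0)


def roundtrip_ok(original: list[float], restored: list[float],
                 tolerance: float = TOLERANCE_V) -> bool:
    """True when every sample survives the round trip within `tolerance`."""
    return max_abs_difference(original, restored) <= tolerance


def quantize(value_v: float, step_v: float) -> float:
    """Snap a voltage to the nearest multiple of `step_v` (simulated storage)."""
    return round(value_v / step_v) * step_v


DEMO_STEP_V = 30e-9   # hypothetical step; worst-case error is half of it
original = [12.3456e-6, -4.2e-6, 0.0, 99.999e-6]
restored = [quantize(v, DEMO_STEP_V) for v in original]
print(max_abs_difference(original, restored), roundtrip_ok(original, restored))
```

The same comparison applies per channel; event counts, values, and onsets are discrete and would be checked for exact equality instead.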

License and Terms of Use

This dataset is distributed under CC-BY-NC-ND-4.0 (Creative Commons Attribution-NonCommercial-NoDerivatives 4.0). The Alljoined team imposes additional terms on their datasets. By using this dataset you agree to all conditions below.

1. Researcher shall use the Dataset only for non-commercial research and educational purposes, in accordance with Alljoined’s Terms of Use.
2. No Warranties: Alljoined makes no representations or warranties regarding the Dataset, including but not limited to warranties of non-infringement or fitness for a particular purpose.
3. Full Responsibility: Researcher accepts full responsibility for his or her use of the Dataset and shall defend and indemnify Alljoined, including their employees, officers and agents, against any and all claims arising from Researcher’s use of the Dataset.
4. Privacy Compliance: Researcher shall comply with Alljoined’s Privacy Policy and ensure that any use of the Dataset respects the privacy rights of individuals whose data may be included.
5. Sharing Rights: Researcher may provide research associates and colleagues with access to the Dataset provided that they first agree to be bound by these terms and conditions.
6. Termination Rights: Alljoined reserves the right to terminate Researcher’s access to the Dataset at any time.
7. Commercial Entity Binding: If Researcher is employed by a for-profit, commercial entity, Researcher’s employer shall also be bound by these terms and conditions, and Researcher hereby represents that he or she is fully authorized to enter into this agreement on behalf of such employer.
8. Governing Law: The law of the State of California shall apply to all disputes under this agreement.

Note: The original Alljoined1 dataset on OSF (https://osf.io/kqgs8/) does not specify an explicit license. The terms above are from the Alljoined-1.6M HuggingFace distribution and the Alljoined website; they are included here as the best available guidance. Contact the Alljoined team (team@alljoined.com) for clarification on redistribution rights.

Full terms: https://www.alljoined.com/terms-of-use
Privacy policy: https://www.alljoined.com/privacy-policy

References

Xu, J., Aristimunha, B., Feucht, M. E., Qian, E., Liu, C., Shahjahan, T., … & Nestor, A. (2024). Alljoined–A dataset for EEG-to-Image decoding. Workshop Data Curation and Augmentation in Medical Imaging at 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 1–9. https://doi.org/10.48550/arXiv.2404.05553

NEMAR Metadata

License: CC-BY-NC-ND-4.0

Authors:

Jonathan Xu
Si Kai Lee
Wangshu Jiang

Versions:

Version	DOI	Released
`current`	10.82901/nemar.nm000133

§ 03 Cohort · Participants

Cohort

Dataset Statistics

Channel counts: 64 ch (n=13 recordings)

Sampling frequencies: 512.0 Hz (n=13 recordings)

§ 04 Signal · Electrodes & trace

Signal · Electrodes & live trace

Fig. 01 Signal & montage 64 ch · EEG · 512 Hz · 8 subjects, 13 recordings

Electrode layout — EEG · 64 sensors — 64 channels

NEMAR Processing Statistics

The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.

HED event descriptors word cloud

§ 05 Manifest · BIDS tree

Manifest

File Explorer

Browse the BIDS file structure of this dataset. Records are fetched on demand from the EEGDash catalog the first time you open the explorer.

§ 06 API · Programmatic access

API Reference

Signature

class eegdash.dataset.NM000133(cache_dir, query=None, s3_bucket=None, **kwargs)

Bases: EEGDashDataset
Author (year): Xu2024
Canonical: (none)
Importable as: NM000133 · Xu2024
Source: eegdash/dataset/registry.py

class eegdash.dataset.NM000133(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)

Alljoined1

Study: nm000133 (NeMAR)
Author (year): Xu2024
Canonical: (none)

Also importable as: NM000133, Xu2024.

Modality: eeg; Subject type: Unknown. Subjects: 8; recordings: 13; tasks: 1.

Parameters:

cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.

data_dir

Local dataset cache directory (cache_dir / dataset_id).

Type: Path

query

Merged query with the dataset filter applied.

Type: dict

records

Metadata records used to build the dataset, if pre-fetched.

Type: list[dict] | None

Notes

Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.

References

OpenNeuro dataset: https://openneuro.org/datasets/nm000133
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000133
DOI: https://doi.org/10.82901/nemar.nm000133

Examples

>>> from eegdash.dataset import NM000133
>>> dataset = NM000133(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()

__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)

save(path: str, overwrite: bool = False, offset: int = 0)

Save datasets to files by creating one subdirectory for each dataset:

path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)

Parameters:

path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
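The effect of `offset` on the directory layout can be sketched as follows. This only models the naming rule stated above (the offset is added to each dataset's position in the concat) and is not the library's implementation; the helper name `subdirectory_names` is hypothetical.

```python
def subdirectory_names(n_datasets: int, offset: int = 0) -> list[str]:
    """Names of the subdirectories created for `n_datasets` recordings.

    Each recording's position in the concat is shifted by `offset`.
    """
    if n_datasets < 0 or offset < 0:
        raise ValueError("n_datasets and offset must be non-negative")
    return [str(position + offset) for position in range(n_datasets)]


# Saving a large collection in two passes without clashing directory names:
first_pass = subdirectory_names(3)              # recordings 0..2
second_pass = subdirectory_names(2, offset=3)   # recordings 3..4
print(first_pass, second_pass)
```

Choosing the offset of each pass as the number of recordings already saved keeps the subdirectory ids aligned with the recordings' original positions.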

Access modes: MNE → braindecode → PyTorch → ML

.raw (mne): MNE Raw object with standard tools (filter, epoch, ICA, plot_psd).

BaseConcatDataset (braindecode): Each record is a lazy BaseDataset from braindecode, windowed via create_windows_from_events.

DataLoader (pytorch): Wraps the windowed dataset into a PyTorch DataLoader; supports parallel workers and on-the-fly augmentations.

Zarr cache (zarr): Optional braindecode Zarr mirror for fast resume; persisted to cache_dir.

Hugging Face (huggingface): Pre-bundled mirror at EEGDash/nm000133 · pull with datasets.load_dataset("EEGDash/nm000133").

Croissant 1.0 (mlcommons): Machine-readable JSON-LD descriptor, NM000133.croissant.json (MLCommons schema, ingestible by PyTorch / TensorFlow / JAX).

Examples using EEGDash (curated · start here)

Find datasets with the EEGDash API: Query the catalogue, filter by task or modality, list candidates.

Load one EEG recording: Resolve a single record to an MNE Raw with channels and events.

EEG recording to PyTorch DataLoader: Wrap braindecode windows in a DataLoader for model training.

Preprocess EEG and create windows: Filter, resample, epoch, and persist the windowed dataset.

Save and reload prepared data: Cache a windowed dataset to disk and reattach it without recompute.

Download a dataset locally: Prefetch BIDS files to a local cache and validate the layout.

Swap any load_dataset(...) call for nm000133 to reproduce the tutorial on this dataset.

Citation

Jonathan Xu, Si Kai Lee, Wangshu Jiang (2024). Alljoined1. 10.82901/nemar.nm000133

Provenance

¹Contributed to nemar in BIDS format.

²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.

³Persistent identifier: 10.82901/nemar.nm000133.

Related & sibling datasets

ON004022 · EEG · 7 subj
ON007315 · EEG · 2 subj
NM000150 · EEG
ON005028 · EEG · 11 subj
ON003380 · EEG · 1 subj

+ 1 more — see See Also below →

BIDS: BIDS 1.9.0
Sidecars: events
Provenance: CC-BY-NC-ND-4.0 · 10.82901/nemar.nm000133
Machine-readable: schema.org/Dataset · Croissant
Mirrors: OpenNeuro · NEMAR · HuggingFace · Paper

Dataset ID	`NM000133`
Title	Alljoined1
Author (year)	`Xu2024`
Canonical	—
Importable as	`NM000133`, `Xu2024`
Year	2024
Authors	Jonathan Xu, Si Kai Lee, Wangshu Jiang
License	CC-BY-NC-ND-4.0
Citation / DOI	10.82901/nemar.nm000133
Source links	OpenNeuro \| NeMAR

See Also#