Iss. 3029 · 35 subjects · 106 recordings · CC0
Dataset Brief · Epilepsy-iEEG-Multicenter-Dataset

DS003029: iEEG dataset, 35 subjects

Epilepsy-iEEG-Multicenter-Dataset

Citation: Adam Li, Sara Inati, Kareem Zaghloul, Nathan Crone, William Anderson, Emily Johnson, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Sridevi Sarma (2019). Epilepsy-iEEG-Multicenter-Dataset. 10.18112/openneuro.ds003029.v1.0.5

35-participant iEEG dataset — Epilepsy-iEEG-Multicenter-Dataset.

iEEG · Channel counts: 129 (30), 132 (8), 135 (6), 88 (6), 123 (6), 147 (6), 101 (5), 91 (4), 98 (4), 110 (3), 86 (3), 81 (3), 111 (3), 99 (3), 89 (3), 80 (3), 53 (3), 60 (3), 65 (2), 47, 216 ch · Sampling rates: 250, 500, 999, 1000, 1025, 2000 Hz · BIDS 1.4.0 · Task: ictal · Epilepsy · Other · Clinical/Intervention
Layer 01 · Study
What was asked
Hypothesis, independent & dependent variables, paradigm, cohort, and the editorial caveats around what the recordings can and cannot answer.
Layer 02 · Signal · BIDS
What was recorded
Sidecars, channels & electrodes, coordinate system, event semantics, and quality stats from the NEMAR pipeline when available.
Layer 03 · Training · ML
What you can train on
Recommended access modes (MNE Raw, braindecode windows, PyTorch DataLoader), plus the targets the metadata makes addressable.
§ 01 Access · Get started

Quickstart

Install

pip install eegdash

Access the data

from eegdash.dataset import DS003029

dataset = DS003029(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)

Filter by subject

# Subject labels follow the dataset's own naming (for example "pt6"), not "01"
dataset = DS003029(cache_dir="./data", subject="pt6")

Advanced query

dataset = DS003029(
    cache_dir="./data",
    # "pt6" is a subject label from this dataset; extend the list with other labels as needed
    query={"subject": {"$in": ["pt6"]}},
)

Iterate recordings

for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])

If you use this dataset in your research, please cite the original authors.

BibTeX

@dataset{ds003029,
  title = {Epilepsy-iEEG-Multicenter-Dataset},
  author = {Adam Li and Sara Inati and Kareem Zaghloul and Nathan Crone and William Anderson and Emily Johnson and Iahn Cajigas and Damian Brusko and Jonathan Jagid and Angel Claudio and Andres Kanner and Jennifer Hopp and Stephanie Chen and Jennifer Haagensen and Sridevi Sarma},
  doi = {10.18112/openneuro.ds003029.v1.0.5},
  url = {https://doi.org/10.18112/openneuro.ds003029.v1.0.5},
}
§ 02 Study · The README

About This Dataset

This dataset was updated and prepared for release as part of a manuscript by Bernabei & Li et al. (in preparation). A subset of the data has been featured in [1].

iEEG and EEG data from 5 centers are organized in our study, with a total of 100 subjects. We publish the datasets from 4 of those centers here because of data sharing issues.

Fragility Multi-Center Retrospective Study

Acquisitions include ECoG and SEEG. Each run is a different snapshot of EEG data from that subject's session. For seizure sessions, this means that each run is an EEG snapshot around a different seizure event. For additional clinical metadata about each subject, refer to the clinical Excel table in the publication.

Data Availability

NIH, JHH, UMMC, and UMF agreed to share. Cleveland Clinic did not, so access to its data requires an additional data use agreement (DUA).

All data, except those from Cleveland Clinic, were approved by their centers to be de-identified and shared. No data in this dataset contain PHI or other identifiers associated with a patient. To access Cleveland Clinic data, please forward all requests to Amber Sours, SOURSA@ccf.org:

Amber Sours, MPH
Research Supervisor | Epilepsy Center
Cleveland Clinic | 9500 Euclid Ave. S3-399 | Cleveland, OH 44195
(216) 444-8638

You will need to sign a data use agreement (DUA).

Sourcedata

For each subject, there was a raw EDF file, which was converted into the BrainVision format with mne_bids.

Each subject with an SEEG implantation also has an Excel table, called electrode_layout.xlsx, which outlines where the clinicians marked each electrode anatomically. Note that no rigorous atlas was applied, so the main labels of interest are WM, GM, VENTRICLE, CSF, and OUT, which represent white matter, gray matter, ventricle, cerebrospinal fluid, and outside the brain. Channels labeled WM, VENTRICLE, CSF, and OUT were removed from further analysis and are labeled status=bad in the corresponding BIDS channels.tsv sidecar file. The dataset uploaded to openneuro.org does not contain the sourcedata, since an extra anonymization step occurred when fully converting to BIDS.
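Since excluded contacts are flagged through the status column of channels.tsv, a reader could select the usable channels with a few lines of standard-library Python. The sketch below is only illustrative: the sample rows and channel names are made up, and it assumes the sidecar uses the standard BIDS column names (name, type, units, status, status_description).

```python
import csv
import io

# Made-up channels.tsv content for illustration; real files come from the dataset.
SAMPLE_CHANNELS_TSV = (
    "name\ttype\tunits\tstatus\tstatus_description\n"
    "L1\tSEEG\tuV\tgood\tn/a\n"
    "L2\tSEEG\tuV\tbad\tWM\n"
    "L3\tSEEG\tuV\tgood\tn/a\n"
    "L4\tSEEG\tuV\tbad\tOUT\n"
)


def good_channels(tsv_text):
    """Return the names of channels whose status column is 'good'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["name"] for row in reader if row["status"].strip().lower() == "good"]


def bad_channels(tsv_text):
    """Return the names of channels whose status column is 'bad'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["name"] for row in reader if row["status"].strip().lower() == "bad"]


print(good_channels(SAMPLE_CHANNELS_TSV))
print(bad_channels(SAMPLE_CHANNELS_TSV))
```

For a real recording, the text of the sidecar file would be passed in place of SAMPLE_CHANNELS_TSV.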

Derivatives

Derivatives include:
  • fragility analysis
  • frequency analysis
  • graph metrics analysis
  • figures

These can be computed by following the methods in the paper: Neural Fragility as an EEG Marker of the Seizure Onset Zone [1]

Events and Descriptions

Each EDF file contains event markers annotated by clinicians. These may indicate specific clinical events as they occur in time, or when the clinicians saw seizure onset and offset (clinical and electrographic).

During a seizure event, the event markers may follow this time course:
  • eeg onset, or clinical onset - the onset of a seizure that is either marked electrographically, or by clinical behavior. Note that the clinical onset may not always be present, since some seizures manifest without clinical behavioral changes.

  • Marker/Mark On - in some cases, a health practitioner injects a chemical marker for use in ICTAL SPECT imaging after a seizure occurs, and these annotations mark that injection. This is commonly done to see which portions of the brain are metabolically active.

  • Marker/Mark Off - This is when the ICTAL SPECT stops imaging.

  • eeg offset, or clinical offset - this is the offset of the seizure, as determined either electrographically, or by clinical symptoms.

Other included events may help you understand the time course of each seizure. Note that ICTAL SPECT occurs in all Cleveland Clinic data. Seizure markers are not named consistently, so you may need to encode specific regular-expression rules to capture seizure onset/offset markers consistently across all datasets. For UMMC data, the clinicians provided all onset and offset markers on an Excel sheet instead of in the EDF file, so we added the annotations manually to each EDF file.
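Because marker names vary between centers, one possible approach is a small set of regular expressions that maps each free-text annotation to a coarse category. The sketch below is only an assumption about what such rules might look like: the patterns and the classify_marker helper are hypothetical, and the real annotation strings should be inspected before trusting them.

```python
import re

# Hypothetical patterns; adjust after inspecting the actual annotation strings.
ONSET_RE = re.compile(
    r"\b(eeg|clin(ical)?|sz|seizure)\W*(onset|on|start)\b", re.IGNORECASE
)
OFFSET_RE = re.compile(
    r"\b(eeg|clin(ical)?|sz|seizure)\W*(offset|off|end)\b", re.IGNORECASE
)
SPECT_RE = re.compile(r"\bmark(er)?\W*(on|off)\b", re.IGNORECASE)


def classify_marker(description):
    """Map a free-text annotation to 'spect', 'onset', 'offset', or 'other'."""
    if SPECT_RE.search(description):
        return "spect"
    if ONSET_RE.search(description):
        return "onset"
    if OFFSET_RE.search(description):
        return "offset"
    return "other"


for text in ["eeg onset", "Clinical offset", "Marker/Mark On", "patient sleeping"]:
    print(text, "->", classify_marker(text))
```

Checking the SPECT pattern first keeps "Mark On" and "Mark Off" from being mistaken for seizure onset or offset.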

Seizure Electrographic and Clinical Onset Annotations

Many recordings in this dataset contain seizures. Generally there is only one seizure per EDF file. When seizures are present, they are marked electrographically (and clinically if present) via standard approaches in the epilepsy clinical workflow.

Clinical onset is simply the manifestation of the seizure through clinical symptoms. Sometimes the marker may not be present.
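Once onset and offset times are known for a recording, they can be turned into a sample range for cropping. The following is a minimal sketch under stated assumptions: times are in seconds from the start of the file, and the function name seizure_sample_range and its pad_s parameter are hypothetical, not part of any library.

```python
def seizure_sample_range(onset_s, offset_s, sfreq, pad_s=0.0, n_samples=None):
    """Convert seizure onset/offset times (seconds) to a (start, stop) sample range.

    pad_s adds context before onset and after offset; the range is clipped to
    [0, n_samples] when n_samples is given.
    """
    if offset_s <= onset_s:
        raise ValueError("offset must come after onset")
    start = max(0, int(round((onset_s - pad_s) * sfreq)))
    stop = int(round((offset_s + pad_s) * sfreq))
    if n_samples is not None:
        stop = min(stop, n_samples)
    return start, stop


# A seizure from 10 s to 40 s in a recording sampled at 1000 Hz.
print(seizure_sample_range(10.0, 40.0, 1000.0))
# The same seizure with 5 s of padding, clipped to a 42 s recording.
print(seizure_sample_range(10.0, 40.0, 1000.0, pad_s=5.0, n_samples=42000))
```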

Seizure Onset Zone Annotations

What matters most when evaluating these datasets is the clinicians' annotation of their localization hypotheses for the seizure onset zone.

These generally include:
  • early onset: the earliest onset electrodes participating in the seizure that clinicians saw

  • early/late spread (optional): the electrodes that showed epileptic spread activity after seizure onset. Not all seizures have spread contacts annotated.

Surgical Zone (Resection or Ablation) Annotations

For patients with a post-surgical MRI available, the segmentation process outlined above tells us which electrodes were within the surgically removed brain region.

Otherwise, clinicians give us their best estimate of which electrodes were resected/ablated, based on their surgical notes. For surgical patients whose postoperative medical records did not explicitly indicate specific resected or ablated contacts, manual visual inspection was performed to determine the approximate contacts that were located in later resected/ablated tissue. Postoperative T1 MRI scans were compared against post-SEEG implantation CT scans or CURRY coregistrations of preoperative MRI/post-SEEG CT scans. Contacts of interest in and around the area of the reported resection were selected individually, and the corresponding slice was navigated to on the CT scan or CURRY coregistration. After identifying landmarks of that slice (e.g. skull shape, skull features, shape of prominent brain structures like the ventricles, central sulcus, superior temporal gyrus, etc.), the location of a given contact in relation to these landmarks, and the location of the slice along the axial plane, the corresponding slice in the postoperative MRI scan was navigated to. The resected tissue within the slice was then visually inspected and compared against the distinct landmarks identified in the CT scans. If brain tissue was not present in the corresponding location of the contact, then the contact was marked as resected/ablated. This process was repeated for each contact of interest.

References

[1] Adam Li, Chester Huynh, Zachary Fitzgerald, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Emily Johnson, William Anderson, Nathan Crone, Sara Inati, Kareem Zaghloul, Juan Bulacio, Jorge Gonzalez-Martinez, Sridevi V. Sarma. Neural Fragility as an EEG Marker of the Seizure Onset Zone. bioRxiv 862797; doi: https://doi.org/10.1101/862797

[2] Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Höchenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A. and Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software 4: (1896). https://doi.org/10.21105/joss.01896

[3] Holdgraf, C., Appelhoff, S., Bickel, S., Bouchard, K., D’Ambrosio, S., David, O., … Hermes, D. (2019). iEEG-BIDS, extending the Brain Imaging Data Structure specification to human intracranial electrophysiology. Scientific Data, 6, 102. https://doi.org/10.1038/s41597-019-0105-7

[4] Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8

§ 03 Cohort · Participants

Cohort

Dataset Statistics

Age distribution by gender (n=28, range 13–59 yr, mean 36.5 yr): Female 12 · Male 16

Sex composition: 28 subjects with reported sex · Female 12 · Male 16 · F : M ratio 0.75 : 1 (43% female)

Handedness: Right 21 · Left 2

Channel counts (ch): 47, 53, 60, 65, 80, 81, 86, 88, 89, 91, 98, 99, 101, 110, 111, 123, 129, 132, 135, 147, 216

Sampling frequencies (Hz): 249.9, 499.7, 999.4, 1000.0, 1000, 1000.0, 1024.6, 2000.0

Total recording duration: 8 h 12 min
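The measured sampling frequencies sit slightly off the nominal rates quoted in the dataset summary (249.9 Hz versus 250 Hz, for instance), and the recordings span several rates. Before pooling recordings, it can help to check which ones reduce to a common rate by an integer factor. The sketch below is only illustrative: the helper names and the 250 Hz target are assumptions, and the actual resampling step is left to the signal-processing library of your choice.

```python
def nominal_rate(measured_hz):
    """Round a measured sampling frequency to the nearest integer rate."""
    return round(measured_hz)


def decimation_factor(rate_hz, target_hz=250):
    """Return the integer factor that reduces rate_hz to target_hz, or None.

    None means the rate is not an integer multiple of the target, so plain
    decimation is not enough and a rational resampler would be needed.
    """
    factor, remainder = divmod(rate_hz, target_hz)
    return factor if remainder == 0 and factor >= 1 else None


# Distinct measured rates reported in the dataset statistics.
measured = [249.9, 499.7, 999.4, 1000.0, 1024.6, 2000.0]
for hz in measured:
    rate = nominal_rate(hz)
    print(hz, "->", rate, "Hz, decimation factor:", decimation_factor(rate))
```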

§ 04 Signal · Electrodes & trace

Signal · Electrodes & live trace

Fig. 01 · Signal & montage · 129 (30), 132 (8), 135 (6), 88 (6), 123 (6), 147 (6), 101 (5), 91 (4), 98 (4), 110 (3), 86 (3), 81 (3), 111 (3), 99 (3), 89 (3), 80 (3), 53 (3), 60 (3), 65 (2), 47, 216 ch · iEEG · 250, 500, 999, 1000, 1025, 2000 Hz · 35 subjects, 106 recordings

Representative recording: sub-pt6 · ses-presurgery · task-ictal · run-01, one of 106 recordings from 35 subjects in this dataset. Browse the full set on OpenNeuro.

NEMAR Processing Statistics

The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.

Fig. HED event descriptors word cloud · DS003029
§ 05 Manifest · BIDS tree

Manifest

Full dataset metadata table

Dataset ID: DS003029
Title: Epilepsy-iEEG-Multicenter-Dataset
Author (year): Li2020
Canonical:
Importable as: DS003029, Li2020
Year: 2019
Authors: Adam Li, Sara Inati, Kareem Zaghloul, Nathan Crone, William Anderson, Emily Johnson, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Sridevi Sarma
License: CC0
Citation / DOI: doi:10.18112/openneuro.ds003029.v1.0.5
Source links: OpenNeuro | NeMAR | Source URL

§ 06 API · Programmatic access

API Reference

class eegdash.dataset.DS003029(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Bases: EEGDashDataset
Source: eegdash/dataset/registry.py

Epilepsy-iEEG-Multicenter-Dataset

Study: ds003029 (OpenNeuro)

Author (year): Li2020

Canonical:

Also importable as: DS003029, Li2020.

Modality: ieeg; Experiment type: Clinical/Intervention; Subject type: Epilepsy. Subjects: 35; recordings: 106; tasks: 1.

Parameters:
  • cache_dir (str | Path) – Directory where data are cached locally.

  • query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.

  • s3_bucket (str | None) – Base S3 bucket used to locate the data.

  • **kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.

data_dir
Local dataset cache directory (cache_dir / dataset_id).
Type: Path

query
Merged query with the dataset filter applied.
Type: dict

records
Metadata records used to build the dataset, if pre-fetched.
Type: list[dict] | None

Notes

Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
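As a rough illustration of the query rules described above, the helper below builds a MongoDB-style filter and rejects the reserved dataset key before the filter reaches the constructor. This is a sketch, not part of eegdash: build_query is a hypothetical helper, and whether a given field such as task is in ALLOWED_QUERY_FIELDS is an assumption to verify against the library.

```python
def build_query(subjects, extra=None):
    """Build a MongoDB-style filter for a list of subject labels.

    Raises ValueError if the filter contains the key 'dataset', which the
    dataset class sets itself.
    """
    query = {"subject": {"$in": list(subjects)}}
    if extra:
        query.update(extra)
    if "dataset" in query:
        raise ValueError("query must not contain the key 'dataset'")
    return query


# "pt6" is a subject label from this dataset; "task" is an assumed query field.
query = build_query(["pt6"], extra={"task": "ictal"})
print(query)
```

The resulting dict could then be passed as DS003029(cache_dir="./data", query=query).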

References

OpenNeuro dataset: https://openneuro.org/datasets/ds003029
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds003029
DOI: https://doi.org/10.18112/openneuro.ds003029.v1.0.5
NEMAR citation count: 19

Examples

>>> from eegdash.dataset import DS003029
>>> dataset = DS003029(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()

__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)

save(path: str, overwrite: bool = False, offset: int = 0)

Save datasets to files by creating one subdirectory for each dataset:

path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
Parameters:
  • path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.

  • overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.

  • offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
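To make the effect of offset concrete, the sketch below computes the subdirectory names that a chunked save would produce when each chunk passes its original starting position as offset. It only mirrors the naming rule described above (position in the concat plus offset); planned_subdirs is a hypothetical helper and does not call or reproduce the library's save code.

```python
def planned_subdirs(n_recordings, offset=0):
    """Return the subdirectory names for n_recordings saved with a given offset."""
    return [str(index + offset) for index in range(n_recordings)]


# Saving 106 recordings in two chunks of 53 keeps the original numbering
# when the second chunk is saved with offset=53.
first_chunk = planned_subdirs(53)
second_chunk = planned_subdirs(53, offset=53)
print(first_chunk[:3], "...", first_chunk[-1])
print(second_chunk[:3], "...", second_chunk[-1])
```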

Access modes: MNE → braindecode → PyTorch → ML
  • .raw (mne) – MNE Raw object: standard tools (filter, epoch, ICA, plot_psd).
  • DataLoader (pytorch) – Wraps the windowed dataset into a PyTorch DataLoader; supports parallel workers and on-the-fly augmentations.
  • Zarr cache (zarr) – Optional braindecode Zarr mirror for fast resume; persisted to cache_dir.
  • Hugging Face (huggingface) – Pre-bundled mirror at EEGDash/ds003029; pull with datasets.load_dataset("EEGDash/ds003029").
  • Croissant 1.0 (mlcommons) – Machine-readable JSON-LD descriptor DS003029.croissant.json (MLCommons schema, ingestible by PyTorch / TensorFlow / JAX).
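The DataLoader mode operates on a windowed dataset, so it may help to see how fixed-length windows tile a recording. The sketch below computes window start indices in plain Python; it illustrates the idea only, is not braindecode's windowing code, and the window_starts name and its parameters are hypothetical.

```python
def window_starts(n_samples, window_size, stride):
    """Return start indices of all full windows that fit in n_samples."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if n_samples < window_size:
        return []
    return list(range(0, n_samples - window_size + 1, stride))


# One second of data at 1000 Hz, cut into 250-sample windows.
print(window_starts(1000, 250, 250))  # non-overlapping windows
print(window_starts(1000, 500, 250))  # 50% overlap
```

A stride smaller than the window size yields overlapping windows, which increases the number of training examples at the cost of correlated samples.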
Examples using EEGDash · curated · start here

Swap any load_dataset(...) call for ds003029 to reproduce the tutorial on this dataset.

Provenance

¹Contributed to OpenNeuro in BIDS format.

²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.

³Persistent identifier: 10.18112/openneuro.ds003029.v1.0.5.

BIDS: BIDS 1.4.0
Sidecars: events · channels
Machine-readable

See Also