DS003029: ieeg dataset, 35 subjects#
Epilepsy-iEEG-Multicenter-Dataset
Citation: Adam Li, Sara Inati, Kareem Zaghloul, Nathan Crone, William Anderson, Emily Johnson, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Sridevi Sarma (2019). Epilepsy-iEEG-Multicenter-Dataset. 10.18112/openneuro.ds003029.v1.0.5
35-participant iEEG dataset — Epilepsy-iEEG-Multicenter-Dataset.
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import DS003029
dataset = DS003029(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = DS003029(cache_dir="./data", subject="01")
Advanced query
dataset = DS003029(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{ds003029,
  title = {Epilepsy-iEEG-Multicenter-Dataset},
  author = {Adam Li and Sara Inati and Kareem Zaghloul and Nathan Crone and William Anderson and Emily Johnson and Iahn Cajigas and Damian Brusko and Jonathan Jagid and Angel Claudio and Andres Kanner and Jennifer Hopp and Stephanie Chen and Jennifer Haagensen and Sridevi Sarma},
  doi = {10.18112/openneuro.ds003029.v1.0.5},
  url = {https://doi.org/10.18112/openneuro.ds003029.v1.0.5},
}
About This Dataset#
This dataset was updated and prepared for release as part of a manuscript by Bernabei & Li et al. (in preparation). A subset of the data has been featured in [1].
Our study organizes iEEG and EEG data from 5 centers, covering 100 subjects in total. Data from 4 of those centers are published here; the fifth could not be included because of data-sharing restrictions.
Fragility Multi-Center Retrospective Study
Acquisitions include ECoG and SEEG. Each run is a different snapshot of EEG data from that subject’s session. For seizure sessions, each run is an EEG snapshot around a different seizure event. For additional clinical metadata about each subject, refer to the clinical Excel table in the publication.
Data Availability
NIH, JHH, UMMC, and UMF agreed to share their data. Cleveland Clinic did not, so access to its data requires an additional data use agreement (DUA).
All data, except those from Cleveland Clinic, were approved by their centers for de-identification and sharing. No data in this dataset contain PHI or other identifiers associated with a patient. To access Cleveland Clinic data, please forward all requests to Amber Sours, SOURSA@ccf.org:
Amber Sours, MPH
Research Supervisor | Epilepsy Center
Cleveland Clinic | 9500 Euclid Ave. S3-399 | Cleveland, OH 44195
(216) 444-8638
You will need to sign a data use agreement (DUA).
Sourcedata
For each subject, there was a raw EDF file, which was converted into the BrainVision format with mne_bids.
Each subject with an SEEG implantation also has an Excel table, called electrode_layout.xlsx, which outlines where the clinicians marked each electrode anatomically. Note that no rigorous atlas was applied, so the main labels of interest are WM, GM, VENTRICLE, CSF, and OUT, which represent white matter, gray matter, ventricle, cerebrospinal fluid, and outside the brain. Channels labeled WM, VENTRICLE, CSF, or OUT were removed from further analysis. They are marked in the corresponding BIDS channels.tsv sidecar file as status=bad.
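As a rough illustration of how the status=bad flag can be used, the sketch below splits channel names by the status column of a channels.tsv file. The TSV excerpt, the channel names, and the helper split_channels_by_status are hypothetical; only the name, status, and status_description columns follow the BIDS sidecar convention.

```python
import csv
import io

# Hypothetical excerpt of a BIDS channels.tsv sidecar (tab-separated).
# Channel names and descriptions are invented for illustration.
CHANNELS_TSV = (
    "name\ttype\tunits\tstatus\tstatus_description\n"
    "L1\tSEEG\tuV\tgood\tn/a\n"
    "L2\tSEEG\tuV\tbad\tWM\n"
    "R1\tSEEG\tuV\tbad\tOUT\n"
    "R2\tSEEG\tuV\tgood\tn/a\n"
)


def split_channels_by_status(tsv_text):
    """Return (good, bad) lists of channel names from channels.tsv text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    good, bad = [], []
    for row in reader:
        # Anything not explicitly marked "bad" is kept.
        if row["status"].strip().lower() == "bad":
            bad.append(row["name"])
        else:
            good.append(row["name"])
    return good, bad


good, bad = split_channels_by_status(CHANNELS_TSV)
print("keep:", good)
print("drop:", bad)
```

For a real recording, the same function could be applied to the text of the sidecar file that sits next to the corresponding _ieeg recording.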
The dataset uploaded to openneuro.org does not contain the sourcedata, because an extra anonymization step occurred during the full conversion to BIDS.
Derivatives
Derivatives include:
- fragility analysis
- frequency analysis
- graph metrics analysis
- figures
These can be computed by following the methods in the paper: Neural Fragility as an EEG Marker for the Seizure Onset Zone
Events and Descriptions
Each EDF file contains event markers annotated by clinicians. These may describe specific clinical events as they occur in time, or mark when the clinicians saw seizure onset and offset (clinical and electrographic).
- During a seizure event, the event markers may follow this time course:
eeg onset, or clinical onset - the onset of a seizure, marked either electrographically or by clinical behavior. Note that the clinical onset may not always be present, since some seizures manifest without clinical behavioral changes.
Marker/Mark On - in some cases, these annotations mark when a health practitioner injects a chemical marker for use in ICTAL SPECT imaging after a seizure occurs. This is commonly done to see which portions of the brain are metabolically active.
Marker/Mark Off - this is when the ICTAL SPECT imaging stops.
eeg offset, or clinical offset - the offset of the seizure, as determined either electrographically or by clinical symptoms.
Other included events may help you understand the time course of each seizure. Note that ICTAL SPECT occurs in all Cleveland Clinic data. Note also that seizure markers are not named consistently, so you may need to encode specific regular-expression rules to capture seizure onset/offset markers consistently across all datasets. For the UMMC data, all onset and offset markers were provided by the clinicians on an Excel sheet instead of in the EDF file, so we added the annotations manually to each EDF file.
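One possible shape for such regular-expression rules is sketched below. The two patterns and the helper classify_marker are assumptions made for illustration, not the rules used by the dataset authors; they should be extended after inspecting the annotation strings that actually occur in each recording.

```python
import re

# Assumed patterns: a seizure-related prefix followed by an onset or
# offset keyword. These are illustrative, not the authors' rules.
ONSET_RE = re.compile(
    r"\b(?:sz|seizure|eeg|clin(?:ical)?)\s*(?:onset|start)\b", re.IGNORECASE
)
OFFSET_RE = re.compile(
    r"\b(?:sz|seizure|eeg|clin(?:ical)?)\s*(?:offset|end|stop)\b", re.IGNORECASE
)


def classify_marker(description):
    """Map a free-text annotation to 'onset', 'offset', or 'other'."""
    text = description.strip()
    if ONSET_RE.search(text):
        return "onset"
    if OFFSET_RE.search(text):
        return "offset"
    return "other"


# Hypothetical annotation strings with inconsistent naming.
for label in ["eeg onset", "Clinical Offset", "SZ END", "Marker On"]:
    print(label, "->", classify_marker(label))
```

Keeping the rules in two compiled patterns makes it easy to add center-specific spellings without touching the classification logic.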
Seizure Electrographic and Clinical Onset Annotations
Seizures are present in many of the recordings. Generally there is only one seizure per EDF file. When seizures are present, they are marked electrographically (and clinically, if a clinical onset is present) via standard approaches in the epilepsy clinical workflow.
Clinical onsets are the manifestation of a seizure through clinical symptoms. Sometimes this marker is not present.
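The sketch below shows one way to turn such markers into a single seizure window, assuming one seizure per file and labels that already end in "onset" or "offset". The helper seizure_window, the choice of earliest onset and latest offset, and the example timings are all assumptions for illustration.

```python
def seizure_window(events):
    """Return (start, stop, duration) in seconds, or None if incomplete.

    `events` is a list of (time_sec, label) pairs. The earliest onset and
    the latest offset are used, which assumes one seizure per recording.
    """
    onsets = [t for t, label in events if label.strip().lower().endswith("onset")]
    offsets = [t for t, label in events if label.strip().lower().endswith("offset")]
    if not onsets or not offsets:
        return None
    start, stop = min(onsets), max(offsets)
    return start, stop, stop - start


# Hypothetical annotations: electrographic and clinical markers both present.
with_clinical = [
    (120.0, "eeg onset"),
    (127.5, "clinical onset"),
    (150.0, "Marker On"),
    (185.0, "eeg offset"),
    (190.0, "clinical offset"),
]
# Hypothetical annotations: no clinical markers.
eeg_only = [(30.0, "eeg onset"), (75.0, "eeg offset")]

print(seizure_window(with_clinical))
print(seizure_window(eeg_only))
```

Because the clinical marker may be absent, the function only requires at least one onset and one offset of any kind.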
Seizure Onset Zone Annotations
What matters most when evaluating these datasets is the clinicians’ annotation of their localization hypotheses for the seizure onset zone.
- These generally include:
early onset: the earliest onset electrodes participating in the seizure that clinicians saw
early/late spread (optional): the electrodes that showed epileptic spread activity after seizure onset. Not all seizures have spread contacts annotated.
Surgical Zone (Resection or Ablation) Annotations
For patients with a post-surgical MRI available, the segmentation process outlined above tells us which electrodes were within the surgically removed brain region.
Otherwise, clinicians give us their best estimate of which electrodes were resected/ablated, based on their surgical notes. For surgical patients whose postoperative medical records did not explicitly indicate specific resected or ablated contacts, manual visual inspection was performed to determine the approximate contacts that were located in later resected/ablated tissue. Postoperative T1 MRI scans were compared against post-SEEG implantation CT scans or CURRY coregistrations of preoperative MRI/post-SEEG CT scans. Contacts of interest in and around the area of the reported resection were selected individually, and the corresponding slice was located on the CT scan or CURRY coregistration. After identifying landmarks of that slice (e.g. skull shape, skull features, shape of prominent brain structures like the ventricles, central sulcus, superior temporal gyrus, etc.), the location of a given contact in relation to these landmarks, and the location of the slice along the axial plane, the corresponding slice in the postoperative MRI scan was located. The resected tissue within the slice was then visually inspected and compared against the distinct landmarks identified in the CT scans. If brain tissue was not present in the corresponding location of the contact, the contact was marked as resected/ablated. This process was repeated for each contact of interest.
References
[1] Adam Li, Chester Huynh, Zachary Fitzgerald, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Emily Johnson, William Anderson, Nathan Crone, Sara Inati, Kareem Zaghloul, Juan Bulacio, Jorge Gonzalez-Martinez, Sridevi V. Sarma. Neural Fragility as an EEG Marker of the Seizure Onset Zone. bioRxiv 862797; doi: https://doi.org/10.1101/862797
[2] Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Höchenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A. and Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software 4: (1896). https://doi.org/10.21105/joss.01896
[3] Holdgraf, C., Appelhoff, S., Bickel, S., Bouchard, K., D’Ambrosio, S., David, O., … Hermes, D. (2019). iEEG-BIDS, extending the Brain Imaging Data Structure specification to human intracranial electrophysiology. Scientific Data, 6, 102. https://doi.org/10.1038/s41597-019-0105-7
[4] Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8
Cohort#
Dataset Statistics#
Age distribution by gender (n=28, range 13–59 yr, mean 36.5 yr)
Sex composition
Channel counts (ch)
Sampling frequencies (Hz)
Total recording duration: 8 h 12 min
Signal · Electrodes & live trace#
Representative recording: sub-pt6 · ses-presurgery · task-ictal · run-01, one of 106 recordings from 35 subjects in this dataset. The full set can be browsed on OpenNeuro.
NEMAR Processing Statistics#
The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.
HED event descriptors word cloud
Manifest#
File Explorer#
Browse the BIDS file structure of this dataset. Records are fetched on demand from the EEGDash catalog the first time you open the explorer.
Full dataset metadata table
Dataset ID | ds003029
Title | Epilepsy-iEEG-Multicenter-Dataset
Author (year) | Li2020
Canonical | —
Importable as | DS003029, Li2020
Year | 2019
Authors | Adam Li, Sara Inati, Kareem Zaghloul, Nathan Crone, William Anderson, Emily Johnson, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Sridevi Sarma
License | CC0
Citation / DOI | 10.18112/openneuro.ds003029.v1.0.5
Source links | OpenNeuro, NeMAR, Source URL
API Reference#
class eegdash.dataset.DS003029(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Defined in eegdash/dataset/registry.py.
Epilepsy-iEEG-Multicenter-Dataset
- Study: ds003029 (OpenNeuro)
- Author (year): Li2020
- Canonical: —
Also importable as: DS003029, Li2020.
Modality: ieeg; Experiment type: Clinical/Intervention; Subject type: Epilepsy. Subjects: 35; recordings: 106; tasks: 1.
- Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
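To make the meaning of a MongoDB-style filter such as {"subject": {"$in": [...]}} concrete, the sketch below evaluates a query against plain dictionaries. It illustrates the filter semantics only and is not how EEGDash implements queries; the helper matches and the example records are hypothetical, and only equality and the $in operator are handled.

```python
def matches(record, query):
    """Return True if `record` satisfies every condition in `query`.

    Supports plain equality and the `$in` operator only. All conditions
    are combined with AND.
    """
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict):
            if "$in" in condition and value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


# Hypothetical recording-level metadata.
records = [
    {"subject": "01", "run": "01"},
    {"subject": "02", "run": "01"},
    {"subject": "03", "run": "01"},
]

query = {"subject": {"$in": ["01", "02"]}}
selected = [r for r in records if matches(r, query)]
print([r["subject"] for r in selected])
```

A record passes only if it satisfies every key in the query, which mirrors the "AND with the dataset selection" behavior described for the query parameter.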
- data_dir
Local dataset cache directory (cache_dir / dataset_id).
- Type: Path
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/ds003029
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds003029
DOI: https://doi.org/10.18112/openneuro.ds003029.v1.0.5
NEMAR citation count: 19
Examples
>>> from eegdash.dataset import DS003029
>>> dataset = DS003029(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
- __init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#
- save(path: str, overwrite: bool = False, offset: int = 0)[source]#
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
- Parameters:
path (str) – Directory in which subdirectories are created to store -raw.fif | -epo.fif and .json files to.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
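The effect of offset on the directory layout can be sketched as follows. This assumes, based on the parameter description above, that each subdirectory is named by the dataset's index within the call plus offset; the helper expected_save_dirs and the "out" path are hypothetical and do not call save itself.

```python
from pathlib import PurePosixPath


def expected_save_dirs(path, n_datasets, offset=0):
    """List the subdirectories a save call is assumed to create.

    Assumption: subdirectory name = index within this call + offset.
    """
    return [PurePosixPath(path) / str(offset + i) for i in range(n_datasets)]


# Hypothetical scenario: a large collection saved in two chunks.
first_chunk = expected_save_dirs("out", n_datasets=2, offset=0)
second_chunk = expected_save_dirs("out", n_datasets=3, offset=2)

print([str(p) for p in first_chunk])
print([str(p) for p in second_chunk])
```

Passing the number of already saved datasets as offset keeps the subdirectory numbering contiguous across chunks.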
- braindecode: BaseDataset from braindecode, windowed via create_windows_from_events.
- pytorch: DataLoader; supports parallel workers and on-the-fly augmentations.
- huggingface: datasets.load_dataset("EEGDash/ds003029").
Swap any load_dataset(...) call for ds003029 to reproduce the tutorial on this dataset.
Provenance
¹Contributed to openneuro in BIDS format.
²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.
³Persistent identifier: 10.18112/openneuro.ds003029.v1.0.5.
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset