DS005555#
The Bitbrain Open Access Sleep (BOAS) dataset
Access recordings and metadata through EEGDash.
Citation: Eduardo López-Larraz, María Sierra-Torralba, Sergio Clemente, Galit Fierro, David Oriol, Javier Minguez, Luis Montesano, Jens G. Klinzing (2024). The Bitbrain Open Access Sleep (BOAS) dataset. 10.18112/openneuro.ds005555.v1.1.1
Modality: eeg Subjects: 128 Recordings: 1414 License: CC0 Source: openneuro Citations: 0.0
Metadata: Complete (100%)
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import DS005555
dataset = DS005555(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = DS005555(cache_dir="./data", subject="01")
Advanced query
dataset = DS005555(
cache_dir="./data",
query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{ds005555,
title = {The Bitbrain Open Access Sleep (BOAS) dataset},
author = {Eduardo López-Larraz and María Sierra-Torralba and Sergio Clemente and Galit Fierro and David Oriol and Javier Minguez and Luis Montesano and Jens G. Klinzing},
doi = {10.18112/openneuro.ds005555.v1.1.1},
url = {https://doi.org/10.18112/openneuro.ds005555.v1.1.1},
}
About This Dataset#
README
The Bitbrain Open Access Sleep (BOAS) dataset.
Overview
View full README
README
The Bitbrain Open Access Sleep (BOAS) dataset.
Overview
This project aimed at bridging the gap between gold-standard clinical sleep monitoring and emerging wearable EEG technologies. The dataset contains data from **128 nights**in which participants were simultaneously monitored with two technologies: a ** Brain Quick Plus Evolution PSG system by Micromed**and a **wearable EEG headband by Bitbrain**. The Micromed PSG system records a comprehensive and clinically validated set of physiological sleep parameters, while the Bitbrain wearable EEG headband offers a user-friendly, self-administered alternative, limited to forehead EEG electrodes, movement sensors, and photo-plethysmography. **Data from both systems were acquired simultaneously**, allowing for direct comparison and validation of the wearable EEG device against the established PSG standard. This dual-recording approach provides a rich resource for evaluating the performance and potential of wearable EEG technology in sleep studies.
Human sleep scoring: To ensure robust and reliable sleep staging, we followed a rigorous labeling process. Three expert sleep scorers independently annotated the PSG recordings**following criteria developed by the American Academy of Sleep Medicine (AASM) (Berry et al., 2015). From the resulting three scorings, a ** consensus label was derived: each epoch of sleep data received the label scored by at least two of the scorers. In cases where all three scorers had given different labels, a fourth scorer made the final decision. This consensus labeling approach addresses the inherent variability in human-derived sleep scoring, with an estimated inter-scorer agreement of approximately 85% (Danker-Hopfe et al., 2009; Rosenberg and Van Hout, 2013).
Automatic scoring: We used the human expert consensus labels to train a deep learning model (Esparza-Iaizzo et al., 2024). By implementing a cross-validation procedure, we trained and validated the model separately on the PSG and wearable EEG datasets. The model achieved an 87.13% match between human-consensus and network-provided labels for the PSG data, and an 86.71% match for the wearable EEG data.
Our dataset includes:
**PSG recordings**from 128 nights (files ending with “psg_eeg.edf”),
**Wearable EEG recordings**from the same nights (files ending with “headband_eeg.edf”),
Human-consensus sleep stage labels, obtained from the PSG recordings (”stage_hum” in the PSG data’s event files),
AI-generated sleep stage labels, separately obtained from PSG recordings and from wearable EEG recordings (”stage_ai” in both the PSG and headband data’s event files).
**Further meta data**for each recording (i.e., the participants’ age, sex, and BMI, provided in the file “participants.tsv”)
Participants
Participants were members of the general population, provided written informed consent, and received economic compensation of 50€ per night.
In order to represent the general population, we recruited a broad spectrum of participants along the dimensions of age, sex, and body mass index. We did not recruit patients with particular health conditions but only excluded severe conditions that could have affected the feasibility or safety of the study. In detail, inclusion and exclusion criteria were as follows.
Inclusion criteria - Age > 18 years, - Sufficient knowledge of Spanish to understand the explanatory text, the consent form and study-related instructions.
Exclusion criteria - Current severe medical interventions or medication, - History of severe neurological or psychiatric disorders, - Severe health problems in the last 12 months (especially neurological or cardiac disorders), - Current pregnancy or nursing, - Use of psychotropic medication, benzodiazepines, gamma-hydroxybutyric acid, and similar drugs before or during the study.
Format
The dataset is formatted according to the Brain Imaging Data Structure (BIDS). Please note that while the recordings are named from sub-1 up to sub-128, some come from the same participants. 108 unique individuals participated in the recordings, data of which can be matched using the pid (= unique participant ID) property in the file “participants.tsv”
The folder of each recording contains the data recorded with the PSG (”sub-xx_task-Sleep_acq-psg_eeg.edf”) and with the wearable EEG headband (”sub-xx_task-Sleep_acq-headband_eeg.edf”).
Channel groups
Not all recordings contain data from all available sensors. The full list of available sensors for each recording can be obtained on the “channels.tsv” file. Channels in this file are coded in groups: - PSG_EEG: Electroencephalography recorded with the PSG system. Channels available are F3, F4, C3, C4, O1, O2 (PSG_F3, PSG_F4, PSG_C3, PSG_C4, PSG_O1, PSG_O2). - PSG_EOG: Electrooculography signals recorded with the PSG system. The location of the EOG electrodes was lateral of the eyes; one slightly lower than the participant’s left eye and one slightly higher than the participant’s right eye (according to AASM guidelines). For recordings containing only one EOG channel (PSG_EOG), the electrodes were recorded as a bipolar derivation. If two EOG channels are present (PSG_EOGR, PSG_EOGL), both electrodes were referenced against the left mastoid. - PSG_EMG: Electromyography signals recorded with the PSG system. Data contain a single EMG channel (PSG_EMG), which is the result of a bipolar derivation of two chin electrodes. - PSG_BELTS: Breathing activity recorded by the PSG system using abdominal and thoracic breathing belts (PSG_ABD, PSG_THOR). - PSG_THER: Respiratory airflow recorded with the PSG system using a thermistor (PSG_THER). - PSG_CAN: Respiratory airflow recorded with the PSG system using a nasal cannula (PSG_CAN). - PSG_PPG: Photopletismographic (PPG) activity recorded with the PSG system. Channels available are pulse (PSG_PULSE), heart beat (PSG_BEAT) and oxygen saturation (PSG_SPO2). - HB_EEG: Electroencephalography recorded with the wearable EEG headband. Headband channels are approximately located at AF7 and AF8 (HB_1, HB_2). - HB_IMU: Movement activity recorded by an Inertial Measurement Unit (IMU) in the headband. Signals are derived from an accelerometer (HB_IMU_1, HB_IMU_2, HB_IMU_3) and gyroscope (HB_IMU_4, HB_IMU_5, HB_IMU_6), both recording signals for all three spatial dimensions. - HB_PULSE: Pulse activity recorded with the wearable EEG headband using a PPG sensor (HB_PULSE).
Sleep staging labels
The sleep stage labels for each recording are coded as events in corresponding event files (stage_hum and stage_ai; see above). Stages are coded as follows: - 0: Wake, - 1: NonREM sleep stage 1 (N1), - 2: NonREM sleep stage 2 (N2), - 3: NonREM sleep stage 3 (N3), - 4: REM sleep, - 8: PSG disconnections (e.g., due to bathroom breaks; human-scored only) - -2: Artifacts and missing data (AI-scored only)
References
Berry, R. B., Brooks, R., Gamaldo, C. E., Harding, S. M., Lloyd, R. M., Marcus, C. L., et al. (2015). The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications, Version 2.2. Darien, Illinois.
Danker-Hopfe, H., Anderer, P., Zeitlhofer, J., Boeck, M., Dorn, H., Gruber, G., et al. (2009). Interrater reliability for sleep scoring according to the Rechtschaffen & Kales and the new AASM standard. J. Sleep Res. 18, 74–84. doi: 10.1111/j.1365-2869.2008.00700.x.
Esparza-Iaizzo, M., Sierra-Torralba, M., Klinzing, J. G., Minguez, J., Montesano, L., and López-Larraz, E. (2024). Automatic sleep scoring for real-time monitoring and stimulation in individuals with and without sleep apnea. bioRxiv, 2024.06.12.597764. doi: 10.1101/2024.06.12.597764.
Rosenberg, R. S., and Van Hout, S. (2013). The American Academy of Sleep Medicine inter-scorer reliability program: sleep stage scoring. J. Clin. sleep Med. 9, 81–87. doi: 10.5664/jcsm.2350.
Contact
If you have any questions or comments, please contact:
Eduardo López-Larraz: eduardo.lopez@bitbrain.com Jens G. Klinzing: jens.klinzing@bitbrain.com
Dataset Information#
Dataset ID |
|
Title |
The Bitbrain Open Access Sleep (BOAS) dataset |
Year |
2024 |
Authors |
Eduardo López-Larraz, María Sierra-Torralba, Sergio Clemente, Galit Fierro, David Oriol, Javier Minguez, Luis Montesano, Jens G. Klinzing |
License |
CC0 |
Citation / DOI |
|
Source links |
OpenNeuro | NeMAR | Source URL |
Copy-paste BibTeX
@dataset{ds005555,
title = {The Bitbrain Open Access Sleep (BOAS) dataset},
author = {Eduardo López-Larraz and María Sierra-Torralba and Sergio Clemente and Galit Fierro and David Oriol and Javier Minguez and Luis Montesano and Jens G. Klinzing},
doi = {10.18112/openneuro.ds005555.v1.1.1},
url = {https://doi.org/10.18112/openneuro.ds005555.v1.1.1},
}
Found an issue with this dataset?
If you encounter any problems with this dataset (missing files, incorrect metadata, loading errors, etc.), please let us know!
Technical Details#
Subjects: 128
Recordings: 1414
Tasks: 1
Channels: 2 (144), 11 (106), 9 (98), 8 (73), 14 (25), 13 (19), 16 (19), 3 (18), 12 (5), 15 (5)
Sampling rate (Hz): 256.0
Duration (hours): 0.0
Pathology: Not specified
Modality: —
Type: —
Size on disk: 33.5 GB
File count: 1414
Format: BIDS
License: CC0
DOI: doi:10.18112/openneuro.ds005555.v1.1.1
API Reference#
Use the DS005555 class to access this dataset programmatically.
- class eegdash.dataset.DS005555(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#
Bases:
EEGDashDatasetOpenNeuro dataset
ds005555. Modality:eeg; Experiment type:Unknown; Subject type:Unknown. Subjects: 128; recordings: 256; tasks: 1.- Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key
dataset.s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to
EEGDashDataset.
- data_dir#
Local dataset cache directory (
cache_dir / dataset_id).- Type:
Path
- query#
Merged query with the dataset filter applied.
- Type:
dict
- records#
Metadata records used to build the dataset, if pre-fetched.
- Type:
list[dict] | None
Notes
Each item is a recording; recording-level metadata are available via
dataset.description.querysupports MongoDB-style filters on fields inALLOWED_QUERY_FIELDSand is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.References
OpenNeuro dataset: https://openneuro.org/datasets/ds005555 NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds005555
Examples
>>> from eegdash.dataset import DS005555 >>> dataset = DS005555(cache_dir="./data") >>> recording = dataset[0] >>> raw = recording.load()
See Also#
eegdash.dataset.EEGDashDataseteegdash.dataset