Iss. 4119 · 21 subjects · 22 recordings · CC0

Dataset Brief · BCIT Basic Guard Duty

DS004119: eeg dataset, 21 subjects#

BCIT Basic Guard Duty

Access recordings and metadata through EEGDash.

Citation: Jonathan Touryan (data and curation), Brent Lance (data), Scott Kerick (data), Anthony Ries (data), Kaleb McDowell (data), Tony Johnson (curation), Kay Robbins (curation) (20). BCIT Basic Guard Duty. 10.18112/openneuro.ds004119.v1.0.0

Modality: eeg · Subjects: 21 · Recordings: 22 · License: CC0 · Source: openneuro · Citations: 0.0

Metadata: Complete (100%)

21-participant EEG dataset — BCIT Basic Guard Duty.

Data & curation: Jonathan Touryan (data and curation) · Brent Lance (data) · Scott Kerick (data) · Anthony Ries (data) · Kaleb McDowell (data) · Tony Johnson (curation) · …
Year: 20 · Distributed via OpenNeuro
Funding: This research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-10-0-0002.

EEG · 262 ch · 1024 Hz · BIDS 1.7.0 · HED ✓ · Task: GuardDuty · Healthy · Visual · Attention

Layer 01 Study

What was asked

Hypothesis, independent & dependent variables, paradigm, cohort, and the editorial caveats around what the recordings can and cannot answer.

Layer 02 Signal · BIDS

What was recorded

Sidecars, channels & electrodes, coordinate system, event semantics, and quality stats from the NEMAR pipeline when available.

Layer 03 Training · ML

What you can train on

Recommended access modes — MNE Raw, braindecode windows, PyTorch DataLoader — plus the targets the metadata makes addressable.

§ 01 Access · Get started

Quickstart#

Get Started

Install

pip install eegdash

Access the data

from eegdash.dataset import DS004119

dataset = DS004119(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)

Query & Filter

Filter by subject

dataset = DS004119(cache_dir="./data", subject="01")

Advanced query

dataset = DS004119(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)

Iterate recordings

for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])

Cite This Dataset

If you use this dataset in your research, please cite the original authors.

BibTeX

@dataset{ds004119,
  title = {BCIT Basic Guard Duty},
  author = {Jonathan Touryan (data and curation) and Brent Lance (data) and Scott Kerick (data) and Anthony Ries (data) and Kaleb McDowell (data) and Tony Johnson (curation) and Kay Robbins (curation)},
  doi = {10.18112/openneuro.ds004119.v1.0.0},
  url = {https://doi.org/10.18112/openneuro.ds004119.v1.0.0},
}

§ 02 Study · The README

About This Dataset#

Overview: The Basic Guard Duty study was designed to measure sustained vigilance in realistic settings by having subjects verify information on replica ID badges. The task was performed in conjunction with two other tasks: a calibration driving task and a baseline driving task. The data collected for the two driving tasks are not included in this dataset.

BCIT Basic Guard Duty

Introduction

Another study (Advanced Guard Duty), which included a similar set-up but a different experimental design and a different subject pool, is not included in this dataset. In the Basic Guard Duty study the rate of ID presentation varied among tasks. In the Advanced Guard Duty study both the rate of ID presentation and the criteria for verification varied among blocks. Further information is available on request from cancta.net.

Methods

Subjects: Volunteers from the local community, recruited through advertisements.

Apparatus: Driving simulator with steering wheel and brake / foot pedals (Real Time Technologies; Dearborn, MI); Video Refresh Rate (VRR) = 900 Hz; vehicle data log file Sampling Rate (SR) = 100 Hz; EEG (BioSemi 256 (+8) channel systems with 4 eye and 2 mastoid channels recorded; SR = 1024 Hz); Eye Tracking (SensoMotoric Instruments (SMI); REDEYE250).

Initial setup: Upon arrival at the lab, subjects were given an introduction to the primary study for which they were recruited, provided informed consent, and provided demographic information.

This was followed by a practice session to acclimate the subject to the driving simulator. The driving practice task lasted 10-15 min, until asymptotic performance in steering and speed control was demonstrated and lack of motion sickness was reported. Subjects were then outfitted and prepped for eye tracking and EEG acquisition.

Task organization: Subjects always began recording sessions by performing a Calibration Driving task, which was a 15-minute drive where the subject controlled only the steering (and speed was controlled by the simulator).

Following this, subjects would perform the Baseline Driving task and the Guard Duty task, with counter-balancing used across subjects as to which of them came first.

The Baseline Driving and Calibration Driving tasks are not included in this dataset.

Guard duty task details: The guard duty task entailed a serial presentation of replica identification (ID) cards (750 x 450 pixels) paired with a reference image (300 x 400 pixels).

The replica ID cards had eight components or fields in addition to a common background. These components were: photo, name, date of birth (DOB), date of issue, date of expiration, area access, ID number, bar code and watermark. The reference images consisted of color photographs of faces.

Both the ID photo and reference image were chosen from the Multi-PIE database (Gross, Matthews, Cohn, Kanade, & Baker, 2010). This database consists of color photographs (forward facing head shots) of individuals taken at different points in time. Therefore, while the ID photo and reference image were of the same individual, the images were not identical (e.g., different hair style, different clothes, different lighting). The task was divided into ten blocks of five minutes each.

At the beginning of each block, participants were instructed that they were guarding a restricted area that required a particular letter designation on the ID card for access (e.g., area C access required).

Participants were asked to determine if the individual in the image, paired with the corresponding ID card, should have access to their restricted area. Some of the ID cards were valid and some were not (e.g., expiration date passed, incorrect access area, or photos did not match).

Participants were instructed to press either an allow or deny button for each image-ID pairing. The two-alternative forced-choice response was self-paced with a maximum time limit of 20 s.

If the participants chose to deny access, they were subsequently asked to provide a reason. Reasons for denied access were selected from a numerical list of five options: 1:incorrect access, 2:expired ID, 3:suspicious DOB, 4:face mismatch, 5:no watermark.
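The numbered list of deny reasons can be written down as a small lookup type. This is only a sketch of the menu numbering given in the README; the class name `DenyReason` and the helper `describe` are hypothetical, and how (or whether) these codes appear in the dataset's event files is not stated here.

```python
from enum import IntEnum


class DenyReason(IntEnum):
    """Menu numbering of deny reasons as listed in the README.

    ASSUMPTION: the class and member names are illustrative; the dataset's
    event files may encode these choices differently.
    """

    INCORRECT_ACCESS = 1
    EXPIRED_ID = 2
    SUSPICIOUS_DOB = 3
    FACE_MISMATCH = 4
    NO_WATERMARK = 5


def describe(code: int) -> str:
    """Turn a numeric menu choice into a readable label."""
    return DenyReason(code).name.replace("_", " ").lower()


print(describe(4))
```

Calling `describe` with a number outside 1-5 raises `ValueError`, which makes unexpected codes easy to spot.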

If the participant did not respond within the allotted time, the computer forced a deny decision. The restricted area (area A-E) assigned at the beginning of each block was randomly chosen without replacement such that all participants completed two blocks guarding each of the five areas.
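The block-to-area assignment described above (five areas, two blocks each, drawn at random without replacement) can be illustrated by shuffling a pool of ten labels. This is a minimal sketch, not the authors' randomisation code; the function name `assign_areas` and the seeding are assumptions.

```python
import random
from collections import Counter

AREAS = ["A", "B", "C", "D", "E"]   # restricted areas named in the README
BLOCKS_PER_AREA = 2                 # two blocks per area, ten blocks in total


def assign_areas(seed=None):
    """Return one area label per block.

    Shuffling a pool that holds each area twice is equivalent to drawing
    from that pool without replacement. ASSUMPTION: hypothetical helper,
    not the original experiment code.
    """
    rng = random.Random(seed)
    pool = AREAS * BLOCKS_PER_AREA
    rng.shuffle(pool)
    return pool


schedule = assign_areas(seed=0)
print(schedule)
print(Counter(schedule))
```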

To maintain consistency across participants, expiration dates were automatically generated at the beginning of the experiment to have a symmetrical distribution around the current date.

This distribution was such that the majority of IDs had expiration dates temporally close to the current date (i.e., in the near future or recent past).

In each block, the image-ID pairings were presented at one of six different stochastic queuing rates, ranging from 1 to 25 per minute (1, 2.5, 10, 15, 20, and 25 per minute).

The queuing rate varied within each block according to a predefined profile. The rate profile had randomly permuted epochs of each queuing rate.

Each epoch lasted 30 s, with approximately twice as many low-rate epochs (1 and 2.5 image-IDs per minute) as high-rate epochs. The rate profiles were shifted for each participant (Latin square design) so that each rate profile was assigned to every block for at least two participants. The current rate was indicated through a processing queue, on the extreme right-hand side of the display, notifying each participant how many IDs were waiting to be checked. For slow rates, most participants were able to process all IDs in their queue and had periods where they were waiting for the next ID (i.e., blank screen).
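The timing figures in this description imply some simple arithmetic: a five-minute block holds ten 30-second epochs, and each queuing rate translates into an expected number of IDs per epoch. The sketch below works that out and also builds one randomly permuted rate profile. The repeat counts in `make_profile` are an assumption chosen only so that low-rate epochs total twice the high-rate epochs; the actual predefined profiles are not given in the README.

```python
import random

RATES_PER_MIN = [1, 2.5, 10, 15, 20, 25]  # queuing rates from the README
LOW_RATES = {1, 2.5}                       # rates the README calls "low"
EPOCH_S = 30                               # epoch length in seconds
BLOCK_S = 5 * 60                           # block length in seconds

epochs_per_block = BLOCK_S // EPOCH_S


def expected_ids_per_epoch(rate_per_min):
    """Expected number of image-ID pairings arriving in one epoch."""
    return rate_per_min * EPOCH_S / 60


def make_profile(high_repeats=1, seed=None):
    """Build one permuted rate profile.

    ASSUMPTION: each low rate is repeated 4x as often as each high rate,
    so 2 low rates * 4 = 8 low epochs versus 4 high rates * 1 = 4 high
    epochs (a 2:1 ratio). The real profiles may be composed differently.
    """
    profile = []
    for rate in RATES_PER_MIN:
        repeats = 4 * high_repeats if rate in LOW_RATES else high_repeats
        profile.extend([rate] * repeats)
    random.Random(seed).shuffle(profile)
    return profile


print(epochs_per_block)
print([expected_ids_per_epoch(r) for r in RATES_PER_MIN])
print(make_profile(seed=0))
```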

For fast rates, most participants were not able to process IDs as quickly as they were added to the queue, increasing the size of the processing queue. IDs in the queue persisted until they were processed by the participant or the block ended. At the beginning of the experiment, participants were instructed to correctly process each image-ID while keeping the queue as short as possible.
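The queue behaviour described here (empty at slow rates, growing at fast rates, persisting across epochs) can be illustrated with a deterministic fluid approximation. This is a sketch only: arrivals in the study were stochastic, and the processing speed `service_per_min` is a made-up value, since the README does not report how fast participants worked.

```python
def simulate_queue(rate_profile_per_min, service_per_min, epoch_s=30):
    """Track queue length at the end of each epoch.

    ASSUMPTIONS: arrivals and processing are treated as smooth averages
    (no randomness), and service_per_min is a hypothetical participant
    processing speed in IDs per minute.
    """
    queue = 0.0
    history = []
    for rate in rate_profile_per_min:
        arrivals = rate * epoch_s / 60
        capacity = service_per_min * epoch_s / 60
        # The queue cannot go negative: spare capacity is just idle time.
        queue = max(0.0, queue + arrivals - capacity)
        history.append(queue)
    return history


# One slow epoch, two fast epochs, then a slow epoch again.
print(simulate_queue([1, 25, 25, 2.5], service_per_min=10))
```

With these illustrative numbers the queue stays empty during the slow epoch, builds up over the two fast epochs, and only partly drains once the rate drops.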

Whereas the stochastic queuing rate was used to increase task realism, incorporating periods of high and low task demand, the dynamic rate itself was not explicitly considered an independent factor in the present study.

All blocks contained the same ratio of valid and invalid image-ID pairings (82% valid, 18% invalid). The majority of invalid IDs were due to incorrect access (6%) and expiration (6%) whereas the rest were invalid for the other reasons: suspicious DOB (2%), face mismatch (2%), no watermark (2%).

This second group of invalid IDs served as catch trials to verify that participants were examining all fields of the ID.

Independent variables: ID presentation rate (varied by block).

Dependent variables: ID disposition accuracy and processing times, Task-Induced Fatigue Scale (TIFS), Karolinska Sleepiness Scale (KSS), Visual Analog Scale of Fatigue (VAS-F).
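The trial composition quoted in the task description (82% valid, 18% invalid, with the invalid share split by reason) can be checked with a few lines of arithmetic. The sketch below uses only the percentages from the README; the dictionary keys and the helper `expected_counts` are illustrative names, and the example trial count is arbitrary.

```python
# Percentages of image-ID pairings per block, as stated in the README.
COMPOSITION_PCT = {
    "valid": 82,
    "incorrect access": 6,
    "expired ID": 6,
    "suspicious DOB": 2,
    "face mismatch": 2,
    "no watermark": 2,
}
# The "second group" of invalid IDs that served as catch trials.
CATCH_REASONS = ("suspicious DOB", "face mismatch", "no watermark")

total_pct = sum(COMPOSITION_PCT.values())
invalid_pct = total_pct - COMPOSITION_PCT["valid"]
catch_pct = sum(COMPOSITION_PCT[r] for r in CATCH_REASONS)


def expected_counts(n_trials):
    """Expected number of trials per category for a given trial count."""
    return {name: n_trials * pct / 100 for name, pct in COMPOSITION_PCT.items()}


print(total_pct, invalid_pct, catch_pct)
print(expected_counts(200))  # 200 is an arbitrary example size
```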

Note: The questionnaire data are available upon request from cancta.net.

Additional data acquired: Participant Enrollment Questionnaire, Subject Questionnaire for Current Session, Simulator Sickness Questionnaire.

Experimental location: Science Applications International Corporation, Louisville, CO.

Note 1: This dataset has corresponding runs in BCIT Calibration Driving (ds004118), during which the 15-minute driving task was performed prior to this one.

Note 2: This dataset has corresponding runs in BCIT Baseline Driving (ds004120), which were conducted on the same subjects during the same session, counterbalanced with these.

§ 03 Cohort · Participants

Cohort#

Dataset Statistics#

Channel counts: 262 ch (n=22 recordings)

Sampling frequencies: 1024.0 Hz (n=22 recordings)
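These two figures are enough for a rough size estimate before loading anything. The sketch below checks that the channel breakdown given in the README (256 scalp, 4 eye, 2 mastoid) adds up to the 262 channels reported here, and estimates the in-memory size of one five-minute block. It assumes 8-byte (float64) samples and uses the nominal block length from the task description; actual recording durations are not stated on this page.

```python
SFREQ_HZ = 1024                      # sampling frequency from the statistics
N_CHANNELS = 262                     # channel count from the statistics
CHANNEL_BREAKDOWN = {"scalp": 256, "eye": 4, "mastoid": 2}  # from the README
BYTES_PER_SAMPLE = 8                 # ASSUMPTION: data held as float64


def n_samples(duration_s):
    """Number of samples per channel for a given duration."""
    return int(duration_s * SFREQ_HZ)


def array_mib(duration_s):
    """Approximate size in MiB of a channels x samples array."""
    n_bytes = n_samples(duration_s) * N_CHANNELS * BYTES_PER_SAMPLE
    return n_bytes / 2**20


print(sum(CHANNEL_BREAKDOWN.values()) == N_CHANNELS)
print(n_samples(300), round(array_mib(300), 1))    # one nominal 5-minute block
print(round(array_mib(10 * 300) / 1024, 2))        # ten blocks, in GiB
```

Estimates like this are mainly useful for deciding whether to resample or pick channels before windowing.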

§ 04 Signal · Electrodes & trace

Signal · Electrodes & live trace#

Fig. 01. Signal and montage: EEG, 262 channels, 1024 Hz; 21 subjects, 22 recordings. Representative trace: sub-13, ses-01, task-GuardDuty, run-1. Electrode layout: 256 EEG sensors.

NEMAR Processing Statistics#

The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.

HED event descriptors word cloud

§ 05 Manifest · BIDS tree

Manifest#

File Explorer#

Browse the BIDS file structure of this dataset. Records are fetched on demand from the EEGDash catalog the first time you open the explorer.

§ 06 API · Programmatic access

API Reference#

Signature

class eegdash.dataset.DS004119(cache_dir, query=None, s3_bucket=None, **kwargs)

Bases: EEGDashDataset

Author (year): Touryan2022_BCIT_Basic

Canonical—

Importable as: DS004119 · Touryan2022_BCIT_Basic

Source: eegdash/dataset/registry.py

class eegdash.dataset.DS004119(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)

BCIT Basic Guard Duty

Study: ds004119 (OpenNeuro)
Author (year): Touryan2022_BCIT_Basic
Canonical:: —

Also importable as: DS004119, Touryan2022_BCIT_Basic.

Modality: eeg; Experiment type: Attention; Subject type: Healthy. Subjects: 21; recordings: 22; tasks: 1.

Parameters:

cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
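To make the `query` parameter concrete, the sketch below imitates how a MongoDB-style filter with `$in` could be ANDed with a dataset filter and applied to metadata records. It is a toy model in plain Python, not EEGDash's implementation: `merge_query`, `matches`, and the example records are hypothetical, only equality and `$in` are handled, and raising an error for a `dataset` key is a modelling choice based on the rule stated above.

```python
def merge_query(dataset_id, user_query=None):
    """AND a user filter with the dataset selection (hypothetical helper)."""
    user_query = dict(user_query or {})
    if "dataset" in user_query:
        # The parameter description says the query must not contain this key.
        raise ValueError("query must not contain the key 'dataset'")
    return {"dataset": dataset_id, **user_query}


def matches(record, query):
    """Return True if a record satisfies every condition in the query.

    Supports plain equality and the "$in" operator only.
    """
    for field, cond in query.items():
        value = record.get(field)
        if isinstance(cond, dict):
            unsupported = set(cond) - {"$in"}
            if unsupported:
                raise NotImplementedError(f"unsupported operators: {unsupported}")
            if "$in" in cond and value not in cond["$in"]:
                return False
        elif value != cond:
            return False
    return True


# Made-up records standing in for catalog metadata.
records = [
    {"dataset": "ds004119", "subject": "01"},
    {"dataset": "ds004119", "subject": "02"},
    {"dataset": "ds004119", "subject": "03"},
]
query = merge_query("ds004119", {"subject": {"$in": ["01", "02"]}})
selected = [r["subject"] for r in records if matches(r, query)]
print(selected)
```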

data_dir#

Local dataset cache directory (cache_dir / dataset_id).

Type: Path
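The `cache_dir / dataset_id` notation is `pathlib` path joining. A minimal sketch, assuming the dataset identifier used on disk is the lowercase OpenNeuro ID (the exact casing is not stated here):

```python
from pathlib import Path

cache_dir = Path("./data")
dataset_id = "ds004119"  # ASSUMPTION: lowercase ID as used by OpenNeuro

# The "/" operator on Path objects joins path components.
data_dir = cache_dir / dataset_id
print(data_dir)
```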

query#

Merged query with the dataset filter applied.

Type: dict

records#

Metadata records used to build the dataset, if pre-fetched.

Type: list[dict] | None

Notes

Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.

References

OpenNeuro dataset: https://openneuro.org/datasets/ds004119
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds004119
DOI: https://doi.org/10.18112/openneuro.ds004119.v1.0.0
NEMAR citation count: 0

Examples

>>> from eegdash.dataset import DS004119
>>> dataset = DS004119(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()

__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)

save(path: str, overwrite: bool = False, offset: int = 0)

Save datasets to files by creating one subdirectory for each dataset:

path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json  (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)

Parameters:

path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
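The effect of `offset` on the directory layout can be shown without touching any EEG data. The sketch below only models the naming rule described above (subdirectory name = position in the concat plus offset); `subdir_names` is a hypothetical helper, and the chunk size of 10 is an arbitrary example.

```python
def subdir_names(n_datasets, offset=0):
    """Subdirectory names that a save call would create for n_datasets items.

    ASSUMPTION: models only the documented rule that the offset is added to
    each dataset's index; it does not write any files.
    """
    return [str(i + offset) for i in range(n_datasets)]


# Saving 22 recordings in chunks, one chunk at a time.
n_recordings = 22
chunk = 10
all_dirs = []
for start in range(0, n_recordings, chunk):
    size = min(chunk, n_recordings - start)
    all_dirs.extend(subdir_names(size, offset=start))

print(all_dirs)
```

Passing each chunk's starting index as the offset keeps the directory names aligned with the recordings' original positions.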

Access modes: MNE → braindecode → PyTorch → ML

.raw (mne): MNE Raw object; standard tools (filter, epoch, ICA, plot_psd).

BaseConcatDataset (braindecode): Each record is a lazy BaseDataset from braindecode, windowed via create_windows_from_events.

DataLoader (pytorch): Wraps the windowed dataset into a PyTorch DataLoader; supports parallel workers and on-the-fly augmentations.

Zarr cache (zarr): Optional braindecode Zarr mirror for fast resume; persisted to cache_dir.

Hugging Face (huggingface): Pre-bundled mirror at EEGDash/ds004119; pull with datasets.load_dataset("EEGDash/ds004119").

Croissant 1.0 (mlcommons): Machine-readable JSON-LD descriptor, DS004119.croissant.json (MLCommons schema, ingestible by PyTorch / TensorFlow / JAX).
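When planning the windowing step in the access modes above, it helps to know how many training windows a stretch of signal yields. The sketch below is plain arithmetic for fixed-length sliding windows at this dataset's 1024 Hz sampling rate. It does not call braindecode, whose windowing functions (including the event-based one named above) have their own edge-handling options; the 2-second window and 1-second stride are illustrative choices.

```python
SFREQ_HZ = 1024  # sampling frequency reported for this dataset


def n_windows(n_samples, window_size, stride):
    """Count full windows that fit in n_samples (trailing partial window dropped)."""
    if n_samples < window_size:
        return 0
    return (n_samples - window_size) // stride + 1


# Illustrative choice: 2 s windows with a 1 s stride over one 30 s epoch.
epoch_samples = 30 * SFREQ_HZ
window = 2 * SFREQ_HZ
stride = 1 * SFREQ_HZ
print(n_windows(epoch_samples, window, stride))
```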

Examples using EEGDash (curated · start here)

Find datasets with the EEGDash API: Query the catalogue, filter by task or modality, list candidates.

Load one EEG recording: Resolve a single record to an MNE Raw with channels and events.

EEG recording to PyTorch DataLoader: Wrap braindecode windows in a DataLoader for model training.

Preprocess EEG and create windows: Filter, resample, epoch, and persist the windowed dataset.

Save and reload prepared data: Cache a windowed dataset to disk and reattach it without recompute.

Download a dataset locally: Prefetch BIDS files to a local cache and validate the layout.

Swap any load_dataset(...) call for ds004119 to reproduce the tutorial on this dataset.

Citation

Jonathan Touryan (data and curation), Brent Lance (data), Scott Kerick (data), Anthony Ries (data), Kaleb McDowell (data), … (20). BCIT Basic Guard Duty. 10.18112/openneuro.ds004119.v1.0.0

Provenance

¹Contributed to OpenNeuro in BIDS format.

²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.

³Persistent identifier: 10.18112/openneuro.ds004119.v1.0.0.

Related & sibling datasets

DS004106 · EEG · 27 subj
DS004043 · EEG · 20 subj
DS004816 · EEG · 20 subj
DS004817 · EEG · 20 subj
DS004123 · EEG · 29 subj

BIDS: BIDS 1.7.0
Sidecars: events · channels · electrodes · coordsystem
Provenance: CC0 · 10.18112/openneuro.ds004119.v1.0.0
Machine-readable: schema.org/Dataset · Croissant
Mirrors: OpenNeuro · NEMAR · HuggingFace · Paper

Dataset ID	`DS004119`
Title	BCIT Basic Guard Duty
Author (year)	`Touryan2022_BCIT_Basic`
Canonical	—
Importable as	`DS004119`, `Touryan2022_BCIT_Basic`
Year	20
Authors	Jonathan Touryan (data and curation), Brent Lance (data), Scott Kerick (data), Anthony Ries (data), Kaleb McDowell (data), Tony Johnson (curation), Kay Robbins (curation)
License	CC0
Citation / DOI	doi:10.18112/openneuro.ds004119.v1.0.0
Source links	OpenNeuro \| NeMAR
