DS006923#
Dataset of Electroencephalograms of Juvenile Offenders
Access recordings and metadata through EEGDash.
Citation: Aura Polo, Elmer León, Mariana Pino-Melgarejo, Julie Viloria-Porto (2025). Dataset of Electroencephalograms of Juvenile Offenders. 10.18112/openneuro.ds006923.v1.0.0
Modality: eeg
Subjects: 140
Recordings: 985
License: CC0
Source: openneuro
Metadata: Complete (100%)
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import DS006923
dataset = DS006923(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = DS006923(cache_dir="./data", subject="01")
Advanced query
dataset = DS006923(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    # Print the subject label and sampling frequency of each recording
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{ds006923,
title = {Dataset of Electroencephalograms of Juvenile Offenders},
author = {Aura Polo and Elmer León and Mariana Pino-Melgarejo and Julie Viloria-Porto},
doi = {10.18112/openneuro.ds006923.v1.0.0},
url = {https://doi.org/10.18112/openneuro.ds006923.v1.0.0},
}
About This Dataset#
Dataset of Electroencephalograms of Juvenile Offenders
Project’s name
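Desarrollo de un sistema inteligente multiparamétrico para el reconocimiento de patrones asociados a disfunciones neurocognitivas en jóvenes en conflicto con la ley en el departamento del Atlántico (in English: Development of a multiparametric intelligent system for recognizing patterns associated with neurocognitive dysfunctions in young people in conflict with the law in the department of Atlántico).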
Year of project execution
2021
Authors and acknowledgment
Aura Polo, Elmer León, Mariana Pino-Melgarejo and Julie Viloria-Porto.
We thank Ronald Ruiz for his assistance during the data collection process and Sergio Miranda for his dedication to data processing and cleaning.
Work team
MAGMA Ingeniería research group
Hogares Claret foundation
Institutions
Institución Universitaria de Barranquilla (sede Soledad)
Universidad del Magdalena
Universidad Autónoma del Caribe
Description
This repository contains resting-state EEG data collected with the Biosemi ActiveTwo from 140 participants:
- 74 juvenile offenders (JO)
- 66 juvenile non-offender controls
Exclusion criteria: No psychiatric treatment, dental/orthodontic appliances.
Recruitment: JO were recruited through the Hogares Claret Foundation (Centro de Reeducación el Oasis & Fundación Luz de Esperanza).
Controls: Institución Nacional de Educación Media INEM Miguel Antonio Caro (Barranquilla).
Contents of the dataset
Core Files
dataset_description.json: General information about the study
participants.json: Demographic and group assignment data
participants.tsv: Demographic and group assignment data in table format
Features Data (EEGJODataset/code)
Feature file nomenclature
Files are named using the pattern:
FR_Dats_band_{BAND}_EP_{EYESTATE}_{EPOCH#}_can_{CHANNEL}.xlsx
| Component | Example | Description |
|--------------------|-------------|---------------------------------------------------------------------------|
| **FR_Dats_band** | Fixed | Prefix = "Feature Results Dataset" |
| **{BAND}** | `ALFA` | EEG frequency band: `ALFA` = Alpha (8-13Hz); `BETA` = Beta (13-30Hz); `DELTA` = Delta (1-4Hz); `THETA` = Theta (4-8Hz) |
| **EP_{EYESTATE}_** | `EP_C_` | Eye state during epoch: `C` = Eyes closed; `O` = Eyes open |
| **{EPOCH#}** | `1` | Epoch number (1 or 2); there are two epochs per eye state |
| **can_** | Fixed | "Channel" prefix |
| **{CHANNEL}** | `A1` | Electrode position (ABCD system): first letter = A, B, C, or D; number = electrode ID (1-32) |
File Contents:
Each Excel file contains 7 features for the specified band/channel/epoch combination:
Mean Power
RMS of PSD
Standard Deviation
Min Power
Max Power
Skewness
Kurtosis
Examples:
FR_Dats_band_ALFA_EP_C_1_can_A1.xlsx
- Alpha band features
- First closed-eyes epoch
- Channel A1 (Frontal electrode 1)
FR_Dats_band_THETA_EP_O_2_can_C15.xlsx
- Theta band features
- Second open-eyes epoch
- Channel C15 (Posterior electrode 15)
FR_Dats_band_BETA_EP_C_2_can_B7.xlsx
- Beta band features
- Second closed-eyes epoch
- Channel B7 (Central electrode 7)
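The naming pattern above can be decoded mechanically. The following is a minimal sketch, not part of the dataset's own code: the function name `parse_feature_filename` and the returned field names are illustrative assumptions, while the band codes, eye-state codes, epoch numbers, and channel range come from the nomenclature table.

```python
import re

# Regular expression built from the nomenclature table:
# band in {ALFA, BETA, DELTA, THETA}, eye state C/O, epoch 1/2, channel A1-D32.
FEATURE_FILE_RE = re.compile(
    r"^FR_Dats_band_(?P<band>ALFA|BETA|DELTA|THETA)"
    r"_EP_(?P<eye_state>[CO])_(?P<epoch>[12])"
    r"_can_(?P<channel>[ABCD](?:[1-9]|[12][0-9]|3[0-2]))\.xlsx$"
)

EYE_STATES = {"C": "closed", "O": "open"}


def parse_feature_filename(name: str) -> dict:
    """Split a feature file name into band, eye state, epoch and channel.

    Hypothetical helper for illustration; field names are assumptions.
    """
    match = FEATURE_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"not a feature file name: {name!r}")
    fields = match.groupdict()
    fields["eye_state"] = EYE_STATES[fields["eye_state"]]
    fields["epoch"] = int(fields["epoch"])
    return fields


print(parse_feature_filename("FR_Dats_band_THETA_EP_O_2_can_C15.xlsx"))
```

Names that fall outside the documented ranges (for example a channel `D33`) raise a `ValueError` rather than being silently accepted.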
Dataset Structure:
4 epochs per subject:
- 2 closed-eyes: EP_C_1, EP_C_2
- 2 open-eyes: EP_O_1, EP_O_2
128 channels (A1-D32)
4 frequency bands
Total files per subject: 4 epochs × 128 channels × 4 bands = 2,048 files
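As a quick check of the file count above, the expected names for one subject can be enumerated from the documented components. This is a sketch under the assumption that every band/epoch/channel combination is present for a subject; the helper name `expected_feature_files` is illustrative.

```python
from itertools import product

BANDS = ("ALFA", "BETA", "DELTA", "THETA")
EYE_STATES = ("C", "O")   # closed, open
EPOCHS = (1, 2)           # two epochs per eye state
# ABCD system: letters A-D, electrode IDs 1-32 (128 channels in total)
CHANNELS = [f"{letter}{number}" for letter in "ABCD" for number in range(1, 33)]


def expected_feature_files() -> list:
    """Return every feature file name implied by the naming pattern."""
    return [
        f"FR_Dats_band_{band}_EP_{eye}_{epoch}_can_{channel}.xlsx"
        for band, eye, epoch, channel in product(BANDS, EYE_STATES, EPOCHS, CHANNELS)
    ]


files = expected_feature_files()
print(len(CHANNELS), len(files))  # 128 channels, 2048 files
```

Comparing this list against a subject's actual feature files is one way to spot missing combinations.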
EEG Data
EEG_JO_Dataset/
├── code/
├── sub-{Subject ID}{Group}/
| ├── eeg/
| | ├── sub-{Subject ID}{Group}_coordsystem.json
| | ├── sub-{Subject ID}{Group}_electrodes.tsv
| | ├── sub-{Subject ID}{Group}_task-{Task Name}_acq-{Datatype}_eeg.json # Epoched data sidecar json
| | ├── sub-{Subject ID}{Group}_task-{Task Name}_acq-{Datatype}_eeg.set # Epoched data
| | ├── sub-{Subject ID}{Group}_task-{Task Name}_channels.tsv
| | ├── sub-{Subject ID}{Group}_task-{Task Name}_desc-{Datatype}_eeg.json # Preprocessed data sidecar json
| | └── sub-{Subject ID}{Group}_task-{Task Name}_desc-{Datatype}_eeg.set # Preprocessed data
├── ...
├── CHANGES
├── dataset_description.json
├── participants.json
├── participants.tsv
└── README.md
File Nomenclature
| Denomination | Value | Description |
|-----------------------|-----------------|------------------------------------------------------------------|
| `sub-` | Fixed | Subject prefix |
| `{Subject ID}` | Fixed | **Unique identifier** |
| `{Group}` | `cg`/`sg`/`sg2` | **Group**: `cg`=control, `sg`=study group 1, `sg2`=study group 2 |
| `{Task Name}` | `restingstate` | **Task name** (resting state) |
| `acq-` `desc-` | `acq-`/`desc-` | **Label**: `acq-` = acquisition, `desc-` = description |
| `{Datatype}` | `epochs`/`preprocessed` | Acquisition type |
| `eeg` | Electroencephalography data | Data type |
| Extension | `.set` | **File type**: processed |
Examples
sub-1005sg_task-restingstate_acq-epochs_eeg.set = Epochs EEG for study group 1 subject 005 (full ID 1005)
sub-1005sg_task-restingstate_desc-preprocessing_eeg.set = Preprocessed EEG for study group 1 subject 005 (full ID 1005)
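The EEG file names can be split into their components in the same way. The sketch below is illustrative only: `parse_eeg_filename` is a hypothetical helper, the subject ID is assumed to be purely numeric (as in the `1005` example), and the datatype is matched loosely because the table lists `preprocessed` while the example uses `preprocessing`.

```python
import re

# sub-{Subject ID}{Group}_task-{Task Name}_{acq|desc}-{Datatype}_eeg.{set|json}
EEG_FILE_RE = re.compile(
    r"^sub-(?P<subject_id>\d+)(?P<group>cg|sg2|sg)"
    r"_task-(?P<task>[A-Za-z0-9]+)"
    r"_(?P<label>acq|desc)-(?P<datatype>[A-Za-z0-9]+)"
    r"_eeg\.(?P<extension>set|json)$"
)

GROUPS = {"cg": "control", "sg": "study group 1", "sg2": "study group 2"}


def parse_eeg_filename(name: str) -> dict:
    """Split an EEG file name into its documented components.

    Hypothetical helper; assumes a numeric subject ID.
    """
    match = EEG_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected EEG file name: {name!r}")
    fields = match.groupdict()
    fields["group_name"] = GROUPS[fields["group"]]
    return fields


print(parse_eeg_filename("sub-1005sg_task-restingstate_acq-epochs_eeg.set"))
```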
Methods
EEG Acquisition
Device: Biosemi ActiveTwo system
Electrodes: 128 channels (radial placement, 10-20 system reference)
Additional channels: EOG, ECG recorded
Sampling rate: 2048 Hz (downsampled to 128 Hz during preprocessing)
Online filtering: 0.1-100 Hz bandpass
Setup:
- Participants seated awake
- Continuous monitoring for movements/sleep
- Event markers via serial communication (paradigm triggers)
Paradigms
(Dataset contains only resting-state recordings)
Resting State (RS):
- Total duration: 12 minutes
- Sequence:
  1. 4 min alternating eyes closed/open (COCO: Closed-Open-Closed-Open)
  2. 8 min eyes closed (excluded from current dataset)
- Segment trimming:
  - 5 s after event onset
  - 5 s before event offset (to avoid transition artifacts)
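The trimming rule can be illustrated with a small calculation. This is a sketch only: the README does not state the length of each COCO segment, so the 60 s per segment used here is an assumption (4 minutes split evenly over four segments), and the helper names are hypothetical. The 128 Hz rate is the post-downsampling rate given in the acquisition notes.

```python
SAMPLING_RATE_HZ = 128  # rate after downsampling
SEGMENT_S = 60.0        # ASSUMPTION: four equal segments in the 4-minute COCO block
MARGIN_S = 5.0          # trimmed after onset and before offset


def trim_segment(onset_s, offset_s, margin_s=MARGIN_S):
    """Return the (start, stop) window left after trimming both ends."""
    start, stop = onset_s + margin_s, offset_s - margin_s
    if stop <= start:
        raise ValueError("segment is too short to trim")
    return start, stop


def coco_windows():
    """Trimmed windows for Closed-Open-Closed-Open under the 60 s assumption."""
    labels = ("closed", "open", "closed", "open")
    windows = []
    for index, label in enumerate(labels):
        start, stop = trim_segment(index * SEGMENT_S, (index + 1) * SEGMENT_S)
        windows.append((label, start, stop))
    return windows


for label, start, stop in coco_windows():
    n_samples = int((stop - start) * SAMPLING_RATE_HZ)
    print(f"{label:>6}: {start:5.1f}-{stop:5.1f} s, {n_samples} samples")
```

In practice the onsets and offsets would come from the recorded event markers rather than from a fixed segment length.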
Preprocessing pipeline (EEGLAB/MATLAB)
Visual inspection:
- Raw data review using BDFreader
- Identification of bad channels/artifacts
Downsampling:
- 2048 Hz → 128 Hz (resting-state data)
Rereferencing:
- Average reference (replaced failed earlobe reference)
Filtering:
- Bandpass FIR: 1-40 Hz
- High-pass: 1 Hz (0.5 Hz cutoff, 425 points)
- Low-pass: 40 Hz (45 Hz cutoff, 45 points)
Artifact Removal:
- Bad channel rejection:
  - Flat signals > 5 s
  - SD > 4
  - Correlation < 0.8 with neighbors
- ASR (Artifact Subspace Reconstruction)
- ICA + ICLabel (components >90% non-brain removed)
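The three bad-channel criteria listed above amount to simple threshold checks. The sketch below is not the EEGLAB implementation used by the authors; it only shows the decision logic on precomputed per-channel statistics. The `ChannelStats` container is hypothetical, and treating "SD > 4" as a standardised deviation score is an assumption, since the README does not give its unit.

```python
from dataclasses import dataclass


@dataclass
class ChannelStats:
    """Precomputed statistics for one channel (hypothetical container)."""
    name: str
    longest_flat_s: float  # longest flat-line stretch, in seconds
    sd_score: float        # ASSUMPTION: standardised deviation score
    neighbor_corr: float   # correlation with neighbouring channels


def rejection_reasons(ch, max_flat_s=5.0, max_sd=4.0, min_corr=0.8):
    """List which of the documented criteria a channel violates."""
    reasons = []
    if ch.longest_flat_s > max_flat_s:
        reasons.append("flat")
    if ch.sd_score > max_sd:
        reasons.append("sd")
    if ch.neighbor_corr < min_corr:
        reasons.append("correlation")
    return reasons


# Illustrative values only, not taken from the dataset.
channels = [
    ChannelStats("A1", longest_flat_s=0.0, sd_score=1.2, neighbor_corr=0.93),
    ChannelStats("B7", longest_flat_s=7.5, sd_score=0.1, neighbor_corr=0.40),
]
bad = {ch.name: rejection_reasons(ch) for ch in channels if rejection_reasons(ch)}
print(bad)
```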
Feature Extraction
PSD Calculation: Welch’s method (50% overlap, Hamming window)
Frequency bands:
- Delta (δ): 1-4 Hz
- Theta (θ): 4-8 Hz
- Alpha (α): 8-13 Hz
- Beta (β): 13-30 Hz
Features per band/channel:
1. Mean Power
2. RMS of PSD
3. Standard Deviation
4. Minimum Power
5. Maximum Power
6. Skewness
7. Kurtosis
Feature volume: 14,336 features/subject (4 bands × 128 channels × 4 segments × 7 features)
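To make the seven per-band features concrete, here is a minimal pure-Python sketch that computes them from an already estimated PSD. It is not the authors' MATLAB code. Several choices are assumptions: bands are treated as half-open intervals `[low, high)` because adjacent bands share an edge, the standard deviation and higher moments use population (divide-by-N) formulas, and kurtosis is reported in its non-excess form. The function name `band_features` and the output keys are illustrative.

```python
import math

# Band edges in Hz, keyed by the codes used in the feature file names.
BANDS_HZ = {"DELTA": (1.0, 4.0), "THETA": (4.0, 8.0),
            "ALFA": (8.0, 13.0), "BETA": (13.0, 30.0)}


def band_features(freqs, psd, band):
    """Compute the seven summary features of the PSD inside one band."""
    low, high = BANDS_HZ[band]
    # ASSUMPTION: half-open interval so shared edges are counted once.
    values = [p for f, p in zip(freqs, psd) if low <= f < high]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two PSD bins in the band")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    m2 = sum(c ** 2 for c in centered) / n  # population variance
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    if m2 == 0:
        raise ValueError("PSD is constant; skewness and kurtosis are undefined")
    return {
        "mean_power": mean,
        "rms": math.sqrt(sum(v * v for v in values) / n),
        "std": math.sqrt(m2),
        "min_power": min(values),
        "max_power": max(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # non-excess form (normal distribution = 3)
    }


# Feature volume per subject: bands x channels x segments x features
FEATURES_PER_SUBJECT = len(BANDS_HZ) * 128 * 4 * 7
print(FEATURES_PER_SUBJECT)  # 14336
```

The PSD itself would come from Welch's method as described above; any Welch implementation that returns frequency bins and power values can feed this function.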
Technical Specifications
Processing Hardware:
- Intel Core i5-9400F @2.9GHz
- 16GB RAM
- Windows 10 (64-bit)
Software:
- MATLAB 2020a
- EEGLAB toolbox
- Python (scikit-learn, pandas for feature selection)
Processing Time: ~10 minutes/subject
Funding
This research was funded by the SISTEMA GENERAL DE REGALÍAS - SGR and the MINISTERIO DE CIENCIA TECNOLOGÍA E INNOVACIÓN - MINCIENCIAS from Colombia, in the framework of the project “Desarrollo de un sistema inteligente multiparamétrico para el reconocimiento de patrones asociados a disfunciones neurocognitivas en jóvenes en conflicto con la ley en el departamento del Atlántico”, with grant number BPIN 2020000100006.
Support
Correspondence: Aura Polo (apolol@unimagdalena.edu.co); Elmer León (elmerleondb@unimagdalena.edu.co); Julie Viloria-Porto (julieviloriapp@unimagdalena.edu.co)
Dataset Information#
Dataset ID | ds006923
Title | Dataset of Electroencephalograms of Juvenile Offenders
Year | 2025
Authors | Aura Polo, Elmer León, Mariana Pino-Melgarejo, Julie Viloria-Porto
License | CC0
Citation / DOI | 10.18112/openneuro.ds006923.v1.0.0
Source links | OpenNeuro, NeMAR, Source URL
Found an issue with this dataset?
If you encounter any problems with this dataset (missing files, incorrect metadata, loading errors, etc.), please let us know!
Technical Details#
Subjects: 140
Recordings: 985
Tasks: 1
Channels: 128
Sampling rate (Hz): 128.0
Duration (hours): 0.0
Pathology: Other
Modality: Resting State
Type: Clinical/Intervention
Size on disk: 8.1 GB
File count: 985
Format: BIDS
License: CC0
DOI: doi:10.18112/openneuro.ds006923.v1.0.0
API Reference#
Use the DS006923 class to access this dataset programmatically.
- class eegdash.dataset.DS006923(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Bases: EEGDashDataset
OpenNeuro dataset ds006923. Modality: eeg; Experiment type: Clinical/Intervention; Subject type: Other. Subjects: 140; recordings: 280; tasks: 1.
- Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
- data_dir
Local dataset cache directory (cache_dir / dataset_id).
- Type: Path
- query
Merged query with the dataset filter applied.
- Type: dict
- records
Metadata records used to build the dataset, if pre-fetched.
- Type: list[dict] | None
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/ds006923
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=ds006923
Examples
>>> from eegdash.dataset import DS006923
>>> dataset = DS006923(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
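The query behaviour described in the parameters (user filters are ANDed with the dataset selection and must not contain the key `dataset`) can be pictured with a small stand-alone sketch. This is not EEGDash's implementation: `merge_query` is a hypothetical helper written only to illustrate the documented contract, and the exact error type raised by the library is an assumption.

```python
def merge_query(dataset_id, user_query=None):
    """Combine a user filter with the fixed dataset filter (illustrative only).

    Hypothetical helper; mirrors the documented rule that the user query
    must not contain the key 'dataset'.
    """
    user_query = dict(user_query or {})
    if "dataset" in user_query:
        # ASSUMPTION: the real library may signal this differently.
        raise ValueError("query must not contain the key 'dataset'")
    # Keys in one MongoDB-style document are implicitly ANDed together.
    return {"dataset": dataset_id, **user_query}


merged = merge_query("ds006923", {"subject": {"$in": ["01", "02"]}})
print(merged)
```

With the real class, the equivalent filter is passed through the `query` argument, as in the Quickstart's advanced query.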
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset