NM000105: emg dataset, 100 subjects#
FRL Discrete Gestures: Hand Gesture Recognition from Surface Electromyography
Access recordings and metadata through EEGDash.
Citation: Patrick Kaifosh, Thomas R. Reardon, CTRL-labs at Reality Labs (2019). FRL Discrete Gestures: Hand Gesture Recognition from Surface Electromyography. 10.82901/nemar.nm000105
Modality: emg
Subjects: 100
Recordings: 100
License: CC-BY-NC 4.0
Source: nemar
Metadata: Complete (100%)
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import NM000105
dataset = NM000105(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = NM000105(cache_dir="./data", subject="01")
Advanced query
dataset = NM000105(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    # Print the subject label and sampling frequency of each recording
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{nm000105,
title = {FRL Discrete Gestures: Hand Gesture Recognition from Surface Electromyography},
author = {Patrick Kaifosh and Thomas R. Reardon and CTRL-labs at Reality Labs},
doi = {10.82901/nemar.nm000105},
url = {https://doi.org/10.82901/nemar.nm000105},
}
About This Dataset#
discrete_gestures: Discrete Hand Gesture Detection from EMG
Overview
Dataset: discrete_gestures - Discrete hand gestures from wrist-based surface electromyography
Task: Nine discrete hand gestures (pinches and swipes)
Participants: 100 subjects
Sessions: 100 total (1 per subject)
Publication: Kaifosh et al., 2025 - “A generic non-invasive neuromotor interface for human-computer interaction” (Nature)
Purpose
This dataset captures wrist-based sEMG signals during prompted discrete hand gestures for navigation and activation tasks. The goal is to enable gesture-based computer control without cameras or visible hand movements, with applications in AR/VR, mobile interfaces, and accessibility.
Key research objectives:
- Generic models that work across users without calibration
- Discrete gesture classification with high accuracy
- Real-time gesture detection for interactive systems
- Robustness to electrode placement variability
Dataset Details
Participants
Sample size: 100 participants
Demographics: Not available (age, sex, handedness marked as n/a)
Recording side: Dominant wrist (assumed right-handed, varies by participant)
Sessions: 1 session per participant
Hardware
Device: sEMG Research Device (sEMG-RD)
Configuration: Single wristband (dominant wrist)
Channels: 16
Sampling rate: 2000 Hz
Bit depth: 12 bits
Dynamic range: ±6.6 mV
Bandwidth: 20-850 Hz
Connectivity: Bluetooth
Electrode type: Dry gold-plated differential pairs
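As a rough worked example of what the bit depth and dynamic range listed above imply, the sketch below estimates the size of one quantization step. It assumes the 12 bits span the full ±6.6 mV range in uniform steps, which the README does not state; `lsb_microvolts` is a name chosen here for illustration.

```python
# Assumption: the 12-bit converter covers the full +/-6.6 mV range uniformly.
BIT_DEPTH = 12
HALF_RANGE_MV = 6.6  # dynamic range is +/-6.6 mV, so the full span is 13.2 mV


def lsb_microvolts(bit_depth: int, half_range_mv: float) -> float:
    """Return the size of one quantization step in microvolts."""
    levels = 2 ** bit_depth                  # number of distinct codes
    span_uv = 2 * half_range_mv * 1000.0     # full span converted to microvolts
    return span_uv / levels


step_uv = lsb_microvolts(BIT_DEPTH, HALF_RANGE_MV)
print(f"{step_uv:.2f} uV per step")
```

Under that assumption, one step is a little over 3 µV.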
Gestures
Nine discrete gestures:
Thumb swipes (4):
- Left swipe
- Right swipe
- Up swipe
- Down swipe
Pinches (4):
- Index-to-thumb pinch
- Middle-to-thumb pinch
- Ring-to-thumb pinch
- Pinky-to-thumb pinch
Activation (1):
- Thumb tap
Recording Protocol
Participant dons sEMG-RD on dominant wrist
Gesture prompter displays gesture cue (scrolling left-to-right)
Participant performs prompted gesture
Randomized order with randomized inter-gesture intervals
Multiple repetitions of each gesture type
Session duration: Varies by participant
Total gestures: 1900 prompted gestures across all participants
Stage boundaries: 16 recording stages per session
Data Contents
Files per Session
sub-XXX/ses-XXX/emg/
├── sub-XXX_ses-XXX_task-discretegestures_emg.edf
├── sub-XXX_ses-XXX_task-discretegestures_emg.json
├── sub-XXX_ses-XXX_task-discretegestures_channels.tsv
├── sub-XXX_ses-XXX_task-discretegestures_events.tsv
└── sub-XXX_ses-XXX_electrodes.tsv
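The layout above can be generated from a subject and session label. The following is a minimal sketch, not part of EEGDash; `session_files` is a hypothetical helper, and the `"001"` labels are placeholders because the README elides the actual labels as `XXX`.

```python
from pathlib import PurePosixPath


def session_files(subject: str, session: str,
                  task: str = "discretegestures") -> list[PurePosixPath]:
    """Build the five expected file paths for one session, relative to the dataset root."""
    base = PurePosixPath(f"sub-{subject}") / f"ses-{session}" / "emg"
    stem = f"sub-{subject}_ses-{session}"
    return [
        base / f"{stem}_task-{task}_emg.edf",
        base / f"{stem}_task-{task}_emg.json",
        base / f"{stem}_task-{task}_channels.tsv",
        base / f"{stem}_task-{task}_events.tsv",
        base / f"{stem}_electrodes.tsv",  # electrodes file carries no task entity
    ]


paths = session_files("001", "001")  # placeholder labels, not taken from the dataset
for p in paths:
    print(p)
```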
Channel Configuration
Total channels: 16 (EMG0-EMG15)
Channel naming: Unique identifiers (EMG0-EMG15)
Electrode naming: E0-E15 (physical positions)
Reference: Bipolar (differential sensing)
channels.tsv columns:
- name: Channel identifier (EMG0-EMG15)
- type: EMG
- units: V
- signal_electrode: Physical electrode name (E0-E15)
- reference: bipolar
electrodes.tsv columns:
- name: Electrode identifier (E0-E15)
- x, y, z: 3D coordinates (percent units, no decimals)
Events
events.tsv contains:
- Gesture prompts: Timestamped prompts for each gesture
  - type: gesture_X (where X is the gesture name)
  - latency: Sample index when gesture was prompted
  - gesture_type: Specific gesture (e.g., “index_pinch”, “thumb_swipe_left”)
- Stage boundaries: Recording session phases
  - type: stage_boundary
  - stage_name: Stage identifier
Total events: 1916 (1900 gesture prompts + 16 stage boundaries)
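To illustrate how the two event kinds described above might be separated, here is a small sketch using only the standard library. The TSV text is synthetic and its values are made up; the column set is limited to the fields named above, and a real events.tsv may contain further columns. Treating `latency` as a zero-based sample offset is an assumption.

```python
import csv
import io
from collections import Counter

SFREQ_HZ = 2000.0  # sampling rate stated in the Hardware section

# Synthetic stand-in for an events.tsv file; rows and values are illustrative only.
EXAMPLE_TSV = (
    "type\tlatency\tgesture_type\tstage_name\n"
    "stage_boundary\t0\tn/a\tstage_0\n"
    "gesture_index_pinch\t4000\tindex_pinch\tn/a\n"
    "gesture_thumb_swipe_left\t9000\tthumb_swipe_left\tn/a\n"
)


def summarize_events(tsv_text: str, sfreq: float):
    """Count event kinds and convert gesture prompt latencies from samples to seconds."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter()
    prompts = []
    for row in reader:
        if row["type"] == "stage_boundary":
            counts["stage_boundary"] += 1
        elif row["type"].startswith("gesture_"):
            counts["gesture"] += 1
            # Assumption: latency is a zero-based sample offset from recording start.
            prompts.append((row["gesture_type"], int(row["latency"]) / sfreq))
    return counts, prompts


counts, prompts = summarize_events(EXAMPLE_TSV, SFREQ_HZ)
print(dict(counts))
print(prompts)
```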
Coordinate System
Single coordinate system (no space entity):
EMGCoordinateSystem: Other
EMGCoordinateUnits: percent
X: USP → RSP (0-100%)
Y: Right-hand rule perpendicular (0-100%)
Z: Radial offset (constant 10%)
Anatomical landmarks:
- RSP: Radial Styloid Process
- USP: Ulnar Styloid Process
Note: Right-handed coordinate system for dominant wrist
Signal Processing
Preprocessing Applied
High-pass filtering: 40 Hz cutoff
Clock drift correction: Time synchronization
Irregular sampling handling: Resampling when deviation >1% (up to 9290% deviation detected)
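The README does not define how the sampling deviation is measured. One plausible reading, sketched below, compares each inter-sample interval against the nominal interval at 2000 Hz and flags a recording when the worst relative deviation exceeds 1%. Both function names and this definition are assumptions made for illustration, not the curators' actual procedure.

```python
def max_interval_deviation_pct(timestamps_s, sfreq_hz):
    """Worst-case relative deviation (in percent) of sample intervals from nominal."""
    nominal = 1.0 / sfreq_hz
    worst = 0.0
    for prev, curr in zip(timestamps_s, timestamps_s[1:]):
        deviation = abs((curr - prev) - nominal) / nominal * 100.0
        worst = max(worst, deviation)
    return worst


def needs_resampling(timestamps_s, sfreq_hz, threshold_pct=1.0):
    """Apply the >1% rule from the list above under the assumed definition."""
    return max_interval_deviation_pct(timestamps_s, sfreq_hz) > threshold_pct


# Evenly spaced timestamps at 2000 Hz, then the same series followed by a 10 ms gap.
regular = [i / 2000.0 for i in range(5)]
gappy = regular + [regular[-1] + 0.01]

print(needs_resampling(regular, 2000.0))
print(needs_resampling(gappy, 2000.0))
```

Under this definition, a single dropped stretch of samples produces a very large percentage, which is one way a figure in the thousands of percent could arise.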
Signal Characteristics
Gesture patterns:
- Patterned activity across channels corresponding to flexor/extensor muscles
- Fine differences across gesture instances
- Channel activity correlates with muscle positions (Fig. 1 in paper)
Baseline Performance
Published Results (Kaifosh et al., 2025)
Offline Classification (held-out participants):
- Accuracy: >90% for gesture classification
- False-negative rate improves with more training data
- Generic models trained on hundreds of participants
Closed-loop Performance (n=24 naive test users):
- First-hit probability: Median improvement from 0.74 (practice) to 0.82 (evaluation block 2)
- Gesture completion rate: Median 0.88 gestures/second (evaluation block 2)
- Baseline comparison: Gaming controller achieves 1.45 completions/second
Model architecture: 1D convolution → LSTM layers
Learning effects: Participants improve from practice to evaluation blocks
Representation Analysis
Network learns:
- First layer filters resemble motor unit action potentials (MUAPs)
- Deeper layers progressively separate gesture categories
- Invariance to nuisance variables (participant ID, electrode placement, signal power)
Confusion Matrix
Common confusions (from paper):
- Index and middle holds sometimes released too early
- Similar gestures (e.g., adjacent finger pinches) occasionally confused
- Swipe directions generally well-separated
Note: Some errors are behavioral (wrong gesture performed) not just decoding errors
Use Cases
Machine Learning
Time series classification: Discrete event detection
Generic modeling: Out-of-the-box cross-user generalization
Representation learning: Physiologically-grounded features
Real-time prediction: Low-latency gesture detection
Applications
Grid navigation: Discrete movement in 2D space
Menu selection: Activation gestures for UI elements
Game control: Gesture-based game inputs
AR/VR interfaces: Hands-free navigation
Accessibility: Alternative input modality
Known Issues and Limitations
By Design
Single wrist: Dominant hand only (not bilateral)
Handedness unknown: Assumed right-handed, varies by participant
Gesture novelty: Users needed coaching to learn effective gestures
No demographic data: Age, sex, handedness not collected
Technical
Electrode placement: Single session per user (less cross-session data than emg2qwerty)
Signal amplitude: Varies with gesture force
Hardware unavailable: sEMG-RD not commercially available
Data Quality
Irregular sampling: High deviation detected (up to 9290%), resampling applied
Behavioral errors: Not all errors are decoder errors (some user mistakes)
Comparison to Baselines
Median gesture completion rates:
- Nintendo Joy-Con controller: 1.45 completions/second
- sEMG decoder: 0.88 completions/second (about 39% slower, since 0.88 / 1.45 ≈ 0.61)
However, sEMG does not require a hand-encumbering device.
BIDS Format
Pernet, C.R., et al. (2019). EEG-BIDS, an extension to the brain
imaging data structure for electroencephalography.
Scientific Data, 6(1), 103.
Access and Contact
Original data: Part of Meta Reality Labs neuromotor interface research
BIDS conversion: Custom MATLAB tools using EEGLAB BIDS plugin
Data curator: Yahya Shirazi, SCCN (Swartz Center for Computational Neuroscience), INC (Institute for Neural Computation), UCSD
Contact: See Nature paper for corresponding authors
License
Research and educational use. See original publication.
Citation
Kaifosh, P., Reardon, T.R., & CTRL-labs at Reality Labs. (2025).
A generic non-invasive neuromotor interface for human-computer interaction.
Nature, 645(8081), 702-711. https://doi.org/10.1038/s41586-025-09255-w
Data Curator
Yahya Shirazi
SCCN (Swartz Center for Computational Neuroscience)
INC (Institute for Neural Computation)
University of California San Diego
Version History
v1.0 (2025-10-01): Initial BIDS conversion
BIDS Version: 1.11 | EMG-BIDS: BEP-042 | Updated: Oct 1, 2025
Dataset Information#
Dataset ID: NM000105
Title: FRL Discrete Gestures: Hand Gesture Recognition from Surface Electromyography
Author (year): Kaifosh2025
Canonical: Not specified
Importable as: NM000105, Kaifosh2025
Year: 2019
Authors: Patrick Kaifosh, Thomas R. Reardon, CTRL-labs at Reality Labs
License: CC-BY-NC 4.0
Citation / DOI: 10.82901/nemar.nm000105
Source links: OpenNeuro, NeMAR, Source URL
Found an issue with this dataset?
If you encounter any problems with this dataset (missing files, incorrect metadata, loading errors, etc.), please let us know!
Technical Details#
Subjects: 100
Recordings: 100
Tasks: 1
Channels: 16
Sampling rate (Hz): 2000.0
Duration (hours): 63.93759180555556
Pathology: Not specified
Modality: —
Type: —
Size on disk: 20.6 GB
File count: 100
Format: BIDS
License: CC-BY-NC 4.0
DOI: 10.82901/nemar.nm000105
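As a quick sanity check on the figures listed above, the sketch below derives the mean recording length and the total sample count from the reported duration, sampling rate, and channel count. It assumes all 100 recordings contribute to the reported total duration and that every recording has all 16 channels.

```python
# Values copied from the Technical Details list above.
TOTAL_HOURS = 63.93759180555556
N_RECORDINGS = 100
SFREQ_HZ = 2000.0
N_CHANNELS = 16

# Mean length of one recording, in minutes.
mean_minutes = TOTAL_HOURS * 60 / N_RECORDINGS

# Samples per channel over the whole dataset, then across all channels.
samples_per_channel = TOTAL_HOURS * 3600 * SFREQ_HZ
total_samples = samples_per_channel * N_CHANNELS

print(f"mean recording length: {mean_minutes:.1f} min")
print(f"samples per channel:   {samples_per_channel:,.0f}")
print(f"samples, all channels: {total_samples:,.0f}")
```

This works out to roughly 38 minutes per recording and about 460 million samples per channel.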
Electrode Layout#
Electrode layout — EMG · 16 sensors — 16 channels
Dataset Statistics#
Channel counts: 16 ch (n=100 recordings)
Sampling frequencies: 2000.0 Hz (n=100 recordings)
Total recording duration: 63 h
NEMAR Processing Statistics#
The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.
HED event descriptors word cloud
File Explorer#
Browse the BIDS file structure of this dataset. Records are fetched on demand from the EEGDash catalog the first time you open the explorer.
API Reference#
Use the NM000105 class to access this dataset programmatically.
- class eegdash.dataset.NM000105(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#
Bases: EEGDashDataset
FRL Discrete Gestures: Hand Gesture Recognition from Surface Electromyography
- Study: nm000105 (NeMAR)
- Author (year): Kaifosh2025
- Canonical: Not specified
- Also importable as: NM000105, Kaifosh2025
- Modality: emg
- Subjects: 100; recordings: 100; tasks: 1
- Parameters:
    cache_dir (str | Path) – Directory where data are cached locally.
    query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
    s3_bucket (str | None) – Base S3 bucket used to locate the data.
    **kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
- data_dir#
Local dataset cache directory (cache_dir / dataset_id).
- Type: Path
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/nm000105
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000105
DOI: https://doi.org/10.82901/nemar.nm000105
Examples
>>> from eegdash.dataset import NM000105
>>> dataset = NM000105(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
- __init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)[source]#
- save(path: str, overwrite: bool = False, offset: int = 0)[source]#
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
- Parameters:
path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset