NM000106: emg dataset, 100 subjects#
FRL Handwriting: Handwriting Decoding from Surface Electromyography
Citation: Patrick Kaifosh, Thomas R. Reardon, CTRL-labs at Reality Labs (2025). FRL Handwriting: Handwriting Decoding from Surface Electromyography. 10.82901/nemar.nm000106
100-participant EMG dataset — FRL Handwriting: Handwriting Decoding from Surface Electromyography.
Quickstart#
Install
pip install eegdash
Access the data
from eegdash.dataset import NM000106
dataset = NM000106(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = NM000106(cache_dir="./data", subject="01")
Advanced query
dataset = NM000106(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    # Print the subject label and the sampling frequency of each recording
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{nm000106,
title = {FRL Handwriting: Handwriting Decoding from Surface Electromyography},
author = {Patrick Kaifosh and Thomas R. Reardon and CTRL-labs at Reality Labs},
doi = {10.82901/nemar.nm000106},
url = {https://doi.org/10.82901/nemar.nm000106},
}
About This Dataset#
Dataset: handwriting - Imagined handwriting from wrist-based surface electromyography
Task: Air-writing (imagined handwriting without pen)
Participants: 100 subjects
Sessions: ~700 total (~7 per subject)
Publication: Kaifosh et al., 2025 - “A generic non-invasive neuromotor interface for human-computer interaction” (Nature)
This dataset captures wrist-based sEMG signals during imagined handwriting motions for text entry. Participants “write” prompted text with fingers together (as if holding an invisible pen) without any physical writing surface. Applications include AR/VR text input, mobile computing, and hands-free communication.
handwriting: Handwriting Recognition from EMG
Overview
Dataset Details
Participants
Sample size: 100 participants
Demographics: Not available (marked as n/a)
Recording side: Dominant wrist
Sessions: Average 7 per participant
Hardware
Device: sEMG-RD (single wristband)
Channels: 16 (EMG0-EMG15)
Sampling rate: 2000 Hz
Reference: Bipolar differential
Recording Protocol
Participant holds fingers together (as if holding pen)
Prompted text appears on screen
Participant “writes” the text in air
Session duration: ~11 minutes
Prompts per session: 96 phrases
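The hardware and protocol figures above imply a rough data volume per session. The following is a minimal sketch of that arithmetic, assuming the nominal values quoted in this README (the session length is approximate) and assuming samples are held in memory as 64-bit floats, which is an illustrative choice rather than a property of the files:

```python
# Nominal figures quoted in this README; the session length is approximate.
SAMPLING_RATE_HZ = 2000    # "Sampling rate: 2000 Hz"
N_CHANNELS = 16            # "Channels: 16 (EMG0-EMG15)"
SESSION_MINUTES = 11       # "Session duration: ~11 minutes"
PROMPTS_PER_SESSION = 96   # "Prompts per session: 96 phrases"

# Samples recorded on each channel over one nominal session.
samples_per_channel = SAMPLING_RATE_HZ * SESSION_MINUTES * 60

# Values across all channels, and an in-memory size estimate.
# Assumption: 8 bytes per value (float64); on-disk EDF storage differs.
total_values = samples_per_channel * N_CHANNELS
approx_megabytes = total_values * 8 / 1e6

# Average time budget per prompted phrase, including any pauses.
seconds_per_prompt = SESSION_MINUTES * 60 / PROMPTS_PER_SESSION

print(samples_per_channel, total_values, approx_megabytes, seconds_per_prompt)
```

Under these assumptions a session holds about 1.3 million samples per channel, and each prompt has a budget of just under 7 seconds.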
Data Contents
Files per Session
sub-XXX/ses-XXX/emg/
├── sub-XXX_ses-XXX_task-handwriting_emg.edf
├── sub-XXX_ses-XXX_task-handwriting_emg.json
├── sub-XXX_ses-XXX_task-handwriting_channels.tsv
├── sub-XXX_ses-XXX_task-handwriting_events.tsv
└── sub-XXX_ses-XXX_electrodes.tsv
Events
Handwriting prompts: Text to be written
- prompt_text: Displayed phrase
Stage boundaries: Posture changes (sitting/standing), session phases
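The per-session naming pattern above can be expanded programmatically. This is a small sketch only: `session_files` is a hypothetical helper written for this example, and the subject and session labels passed to it are placeholders, so check the actual labels in the dataset before relying on them.

```python
from pathlib import PurePosixPath


def session_files(subject: str, session: str, task: str = "handwriting") -> list[PurePosixPath]:
    """Hypothetical helper: expected relative paths for one session.

    Follows the sub-XXX/ses-XXX/emg/ layout listed in this README.
    """
    folder = PurePosixPath(f"sub-{subject}") / f"ses-{session}" / "emg"
    stem = f"sub-{subject}_ses-{session}"
    names = [
        f"{stem}_task-{task}_emg.edf",
        f"{stem}_task-{task}_emg.json",
        f"{stem}_task-{task}_channels.tsv",
        f"{stem}_task-{task}_events.tsv",
        f"{stem}_electrodes.tsv",  # electrodes file carries no task entity
    ]
    return [folder / name for name in names]


# Placeholder labels, used only to show the expansion.
for path in session_files("021", "008"):
    print(path)
```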
Coordinate System
Single coordinate system at root (dominant wrist, percent units, no decimals)
Baseline Performance
Published Results (Kaifosh et al., 2025)
Generic Model (6,527 training participants):
- Offline CER: >90% classification accuracy on held-out participants
- Online performance: 20.9 words per minute (WPM)
- Online CER: Median improvement from ~35% (practice) to ~25% (evaluation)
Personalized Model (20 min fine-tuning):
- 16% improvement over generic model
- Better performance for users with higher generic CER
- Diminishing returns with more pretraining data
Comparison:
- Open-loop handwriting (no pen): 25.1 WPM
- sEMG handwriting: 20.9 WPM (83% of baseline)
- Mobile phone keyboard: 36 WPM
Model architecture: MPF features + Conformer (attention mechanism)
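The results above are reported as character error rate (CER). As a point of reference, CER is commonly computed as the Levenshtein edit distance between the decoded text and the prompt, divided by the prompt length. The sketch below implements that common definition; it is an assumption that the published evaluation follows it exactly, and the function names are chosen for this example.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance using a rolling row of the DP table."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1       # drop a reference character
            insertion = current[j - 1] + 1   # extra hypothesis character
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the reference (prompt) length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One deleted and one substituted character out of 11 prompt characters.
print(character_error_rate("hello world", "helo wurld"))
```

Note that CER defined this way can exceed 1.0 when the decoded text is much longer than the prompt.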
Use Cases
Keyboard-free text entry: AR/VR, mobile devices
Silent communication: Private text input in public spaces
Personalization research: Few-shot learning, transfer learning
Sequence modeling: Character-level prediction with attention
Known Limitations
Single wrist (dominant hand only)
Handedness not recorded
Learning curve: Users improve with practice/coaching
Lower WPM than physical writing or typing
Citation
Kaifosh, P., Reardon, T.R., & CTRL-labs at Reality Labs. (2025). A generic non-invasive neuromotor interface for human-computer interaction. Nature, 645(8081), 702-711. https://doi.org/10.1038/s41586-025-09255-w
Data Curator
Yahya Shirazi, SCCN (Swartz Center for Computational Neuroscience), INC (Institute for Neural Computation), University of California San Diego
Version History
v1.0 (2025-10-01): Initial BIDS conversion
BIDS Version: 1.11 | EMG-BIDS: BEP-042 | Updated: Oct 1, 2025
Cohort#
Dataset Statistics#
Channel counts: 16 ch (n=807 recordings)
Sampling frequencies: 2000.0 Hz (n=807 recordings)
Total recording duration: 140 h
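The catalog figures above can be cross-checked with a few lines of arithmetic. This sketch assumes the rounded totals quoted on this page (807 recordings, 140 h, 100 subjects, 2000 Hz), so the derived numbers are estimates rather than exact values:

```python
# Rounded catalog figures quoted on this page.
N_RECORDINGS = 807
N_SUBJECTS = 100
TOTAL_HOURS = 140
SAMPLING_RATE_HZ = 2000.0

# Mean recording length in minutes.
mean_minutes = TOTAL_HOURS * 60 / N_RECORDINGS

# Average number of recordings contributed by each subject.
recordings_per_subject = N_RECORDINGS / N_SUBJECTS

# Total samples per channel across the whole dataset.
total_samples_per_channel = int(TOTAL_HOURS * 3600 * SAMPLING_RATE_HZ)

print(round(mean_minutes, 1), recordings_per_subject, total_samples_per_channel)
```

The mean comes out at roughly 10.4 minutes per recording, in line with the nominal session duration given in the README.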
NEMAR Processing Statistics#
The plots below are generated by NEMAR’s automated EEG pipeline. The histogram shows pipeline success for data cleaning and ICA decomposition, the percentage of data frames and EEG channels retained after artefact removal, line noise per channel (RMS, dB), and the age/gender distribution of participants.
Manifest#
Full dataset metadata table
Field | Value
Dataset ID | NM000106
Title | FRL Handwriting: Handwriting Decoding from Surface Electromyography
Author (year) | Kaifosh2025_106
Canonical | n/a
Importable as | NM000106, Kaifosh2025_106
Year | 2025
Authors | Patrick Kaifosh, Thomas R. Reardon, CTRL-labs at Reality Labs
License | CC-BY-NC 4.0
Citation / DOI | 10.82901/nemar.nm000106
Source links | OpenNeuro, NeMAR, Source URL
API Reference#
class eegdash.dataset.NM000106(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
FRL Handwriting: Handwriting Decoding from Surface Electromyography
Study: nm000106 (NeMAR)
Author (year): Kaifosh2025_106
Canonical: n/a
Also importable as: NM000106, Kaifosh2025_106.
Modality: emg. Subjects: 100; recordings: 807; tasks: 1.
Parameters:
- cache_dir (str | Path) – Directory where data are cached locally.
- query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
- s3_bucket (str | None) – Base S3 bucket used to locate the data.
- **kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
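The query parameter is described as a set of MongoDB-style filters that is ANDed with the dataset selection and must not contain the key dataset. The toy sketch below illustrates those two rules in plain Python. Every function here is hypothetical and written for this example; it covers only equality and `$in`, and it is not how eegdash evaluates queries internally.

```python
from typing import Any


def validate_query(query: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical pre-check: the 'dataset' key is reserved."""
    if "dataset" in query:
        raise ValueError("query must not contain the key 'dataset'")
    return query


def combined_filter(dataset_id: str, query: dict[str, Any]) -> dict[str, Any]:
    """AND the user query with the dataset selection."""
    return {"dataset": dataset_id, **validate_query(query)}


def matches(record: dict[str, Any], query: dict[str, Any]) -> bool:
    """Toy matcher supporting plain equality and the $in operator only."""
    for field, condition in query.items():
        value = record.get(field)
        if isinstance(condition, dict):
            if set(condition) != {"$in"}:
                raise NotImplementedError("only $in is handled in this sketch")
            if value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


# Made-up records, used only to exercise the matcher.
records = [
    {"dataset": "nm000106", "subject": "01"},
    {"dataset": "nm000106", "subject": "03"},
    {"dataset": "other", "subject": "01"},
]
full_query = combined_filter("nm000106", {"subject": {"$in": ["01", "02"]}})
selected = [record for record in records if matches(record, full_query)]
print(selected)
```

Only the first record passes, since it is the only one that satisfies both the dataset selection and the subject filter.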
data_dir
Local dataset cache directory (cache_dir / dataset_id).
Type: Path
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/nm000106
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000106
DOI: https://doi.org/10.82901/nemar.nm000106
Examples
>>> from eegdash.dataset import NM000106
>>> dataset = NM000106(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
__init__(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
save(path: str, overwrite: bool = False, offset: int = 0)
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
Parameters:
- path (str) – Directory in which subdirectories are created to store the -raw.fif | -epo.fif and .json files.
- overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
- offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful for very large datasets, where one dataset has to be processed and saved at a time, to account for its original position.
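The offset parameter is easiest to see with a small example. The sketch below only mimics the subdirectory numbering described above, under the assumption that each dataset is written to a folder named after its index plus the offset; `save_subdirectories` is a hypothetical helper and no files are written.

```python
def save_subdirectories(n_datasets: int, offset: int = 0) -> list[str]:
    """Hypothetical helper: folder names used for n_datasets recordings."""
    return [str(index + offset) for index in range(n_datasets)]


# Saving a large collection in two chunks while keeping global positions:
chunk_sizes = [3, 2]
offset = 0
layout = []
for size in chunk_sizes:
    layout.append(save_subdirectories(size, offset))
    offset += size  # the next chunk continues where this one stopped

print(layout)
```

With the running offset, the second chunk is written to folders 3 and 4 instead of overwriting folders 0 and 1.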
- braindecode: BaseDataset from braindecode, windowed via create_windows_from_events.
- pytorch: DataLoader; supports parallel workers and on-the-fly augmentations.
- Swap any load_dataset(...) call for nm000106 to reproduce the tutorial on this dataset.
Provenance
¹Contributed to nemar in BIDS format.
²Curated & ingested by the EEGDash catalog; see CITATION.cff for canonical reference.
³Persistent identifier: 10.82901/nemar.nm000106.
See Also#
eegdash.dataset.EEGDashDataset
eegdash.dataset