NM000106
handwriting: Handwriting Recognition from EMG
Access recordings and metadata through EEGDash.
Citation: Patrick Kaifosh, Thomas R. Reardon, CTRL-labs at Reality Labs (2025). handwriting: Handwriting Recognition from EMG. 10.5281/zenodo.17283865
Modality: emg | Subjects: 100 | Recordings: 807 | License: CC-BY-NC 4.0 | Source: nemar
Metadata: Complete (100%)
Quickstart
Install
pip install eegdash
Access the data
from eegdash.dataset import NM000106
dataset = NM000106(cache_dir="./data")
# Get the raw object of the first recording
raw = dataset.datasets[0].raw
print(raw.info)
Filter by subject
dataset = NM000106(cache_dir="./data", subject="01")
Advanced query
dataset = NM000106(
    cache_dir="./data",
    query={"subject": {"$in": ["01", "02"]}},
)
Iterate recordings
for rec in dataset:
    print(rec.subject, rec.raw.info['sfreq'])
If you use this dataset in your research, please cite the original authors.
BibTeX
@dataset{nm000106,
  title = {handwriting: Handwriting Recognition from EMG},
  author = {Patrick Kaifosh and Thomas R. Reardon and CTRL-labs at Reality Labs},
  doi = {10.5281/zenodo.17283865},
  url = {https://doi.org/10.5281/zenodo.17283865},
}
About This Dataset
handwriting: Handwriting Recognition from EMG
Overview
Dataset: handwriting - Imagined handwriting from wrist-based surface electromyography
Task: Air-writing (imagined handwriting without pen)
Participants: 100 subjects
Sessions: ~700 total (~7 per subject)
Publication: Kaifosh et al., 2025 - “A generic non-invasive neuromotor interface for human-computer interaction” (Nature)
Purpose
This dataset captures wrist-based sEMG signals during imagined handwriting motions for text entry. Participants “write” prompted text with fingers together (as if holding an invisible pen) without any physical writing surface. Applications include AR/VR text input, mobile computing, and hands-free communication.
Dataset Details
Participants
Sample size: 100 participants
Demographics: Not available (marked as n/a)
Recording side: Dominant wrist
Sessions: Average 7 per participant
Hardware
Device: sEMG-RD (single wristband)
Channels: 16 (EMG0-EMG15)
Sampling rate: 2000 Hz
Reference: Bipolar differential
Recording Protocol
Participant holds fingers together (as if holding pen)
Prompted text appears on screen
Participant “writes” the text in air
Session duration: ~11 minutes
Prompts per session: 96 phrases
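The Hardware and Recording Protocol figures above imply a rough per-session data volume. The following is back-of-the-envelope arithmetic only, using the nominal values quoted in this README (the session length is approximate, so the results are estimates rather than properties of any particular file):

```python
# Nominal figures quoted in the Hardware and Recording Protocol sections.
SAMPLING_RATE_HZ = 2000
N_CHANNELS = 16
SESSION_MINUTES = 11          # approximate ("~11 minutes")
PROMPTS_PER_SESSION = 96

session_seconds = SESSION_MINUTES * 60
samples_per_channel = SAMPLING_RATE_HZ * session_seconds
total_samples = samples_per_channel * N_CHANNELS
seconds_per_prompt = session_seconds / PROMPTS_PER_SESSION

print(f"samples per channel: {samples_per_channel:,}")
print(f"samples across all channels: {total_samples:,}")
print(f"average time per prompt: {seconds_per_prompt:.1f} s")
```

At these nominal values a session holds about 1.3 million samples per channel, and each prompt gets just under 7 seconds on average.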
Data Contents
Files per Session
sub-XXX/ses-XXX/emg/
├── sub-XXX_ses-XXX_task-handwriting_emg.edf
├── sub-XXX_ses-XXX_task-handwriting_emg.json
├── sub-XXX_ses-XXX_task-handwriting_channels.tsv
├── sub-XXX_ses-XXX_task-handwriting_events.tsv
└── sub-XXX_ses-XXX_electrodes.tsv
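The layout above can be turned into concrete paths with a small helper. This is a minimal sketch: `session_files` is a hypothetical function written for illustration (it is not part of EEGDash), and the labels "001" are placeholders, since the exact subject and session label format is not stated here.

```python
from pathlib import PurePosixPath


def session_files(subject: str, session: str, task: str = "handwriting") -> dict:
    """Return the expected per-session file paths, relative to the dataset root."""
    emg_dir = PurePosixPath(f"sub-{subject}") / f"ses-{session}" / "emg"
    stem = f"sub-{subject}_ses-{session}"
    return {
        "emg": emg_dir / f"{stem}_task-{task}_emg.edf",
        "sidecar": emg_dir / f"{stem}_task-{task}_emg.json",
        "channels": emg_dir / f"{stem}_task-{task}_channels.tsv",
        "events": emg_dir / f"{stem}_task-{task}_events.tsv",
        # The electrodes file carries no task entity in the layout above.
        "electrodes": emg_dir / f"{stem}_electrodes.tsv",
    }


files = session_files("001", "001")  # placeholder labels
print(files["events"])
```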
Events
Handwriting prompts: Text to be written
prompt_text: Displayed phrase
Stage boundaries: Posture changes (sitting/standing), session phases
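A sketch of reading prompts from an events file, assuming the standard BIDS `onset` and `duration` columns alongside the `prompt_text` column named above. The sample rows are made up for illustration, and real files may carry additional columns (for example, for stage boundaries) that are not modeled here.

```python
import csv
import io

# Made-up rows for illustration only; not taken from the dataset.
SAMPLE_EVENTS_TSV = (
    "onset\tduration\tprompt_text\n"
    "1.50\t6.20\thello world\n"
    "9.00\tn/a\tn/a\n"
    "12.25\t5.75\tthe quick brown fox\n"
)


def read_prompts(tsv_text: str) -> list:
    """Return (onset_seconds, prompt_text) for rows that carry a prompt."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    prompts = []
    for row in reader:
        text = row.get("prompt_text", "n/a")
        if text and text != "n/a":  # BIDS marks missing values as "n/a"
            prompts.append((float(row["onset"]), text))
    return prompts


prompts = read_prompts(SAMPLE_EVENTS_TSV)
for onset, text in prompts:
    print(f"{onset:8.2f} s  {text}")
```

To use this on a real session, pass the contents of the `_events.tsv` file instead of the sample string.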
Coordinate System
Single coordinate system at root (dominant wrist, percent units, no decimals)
Baseline Performance
Published Results (Kaifosh et al., 2025)
Generic Model (6,527 training participants):
- Offline performance: >90% classification accuracy on held-out participants
- Online performance: 20.9 words per minute (WPM)
- Online character error rate (CER): Median improvement from ~35% (practice) to ~25% (evaluation)
Personalized Model (20 min fine-tuning):
- 16% improvement over generic model
- Better performance for users with higher generic CER
- Diminishing returns with more pretraining data
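The CER figures above are assumed here to follow the usual definition of character error rate: the Levenshtein edit distance between the decoded text and the prompt, divided by the prompt length. The sketch below implements that standard definition; the exact scoring used in the publication may differ (for example, in text normalization).

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference (prompt) length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


cer = character_error_rate("hello world", "helo wrld")
print(f"CER = {cer:.1%}")
```

Because the distance is normalized by the prompt length, CER can exceed 100% when the decoded text contains many spurious characters.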
Comparison:
- Open-loop handwriting (no pen): 25.1 WPM
- sEMG handwriting: 20.9 WPM (83% of baseline)
- Mobile phone keyboard: 36 WPM
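The "83% of baseline" figure follows directly from the two handwriting rates quoted in the comparison. A quick check, which also relates the sEMG rate to the phone-keyboard rate (that second ratio is derived here and is not quoted in the README):

```python
# Words-per-minute figures quoted in the comparison.
OPEN_LOOP_WPM = 25.1        # handwriting without a pen
SEMG_WPM = 20.9             # sEMG handwriting
PHONE_KEYBOARD_WPM = 36.0   # mobile phone keyboard

relative_to_open_loop = SEMG_WPM / OPEN_LOOP_WPM
relative_to_keyboard = SEMG_WPM / PHONE_KEYBOARD_WPM

print(f"sEMG vs open-loop handwriting: {relative_to_open_loop:.0%}")
print(f"sEMG vs phone keyboard: {relative_to_keyboard:.0%}")
```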
Model architecture: MPF features + Conformer (attention mechanism)
Use Cases
Keyboard-free text entry: AR/VR, mobile devices
Silent communication: Private text input in public spaces
Personalization research: Few-shot learning, transfer learning
Sequence modeling: Character-level prediction with attention
Known Limitations
Single wrist (dominant hand only)
Handedness not recorded
Learning curve: Users improve with practice/coaching
Lower WPM than physical writing or typing
Citation
Kaifosh, P., Reardon, T.R., & CTRL-labs at Reality Labs. (2025).
A generic non-invasive neuromotor interface for human-computer interaction.
Nature, 645(8081), 702-711. https://doi.org/10.1038/s41586-025-09255-w
Data Curator
Yahya Shirazi, SCCN (Swartz Center for Computational Neuroscience), INC (Institute for Neural Computation), University of California San Diego
Version History
v1.0 (2025-10-01): Initial BIDS conversion
BIDS Version: 1.11 | EMG-BIDS: BEP-042 | Updated: Oct 1, 2025
Dataset Information
Dataset ID | NM000106
Title | handwriting: Handwriting Recognition from EMG
Year | 2025
Authors | Patrick Kaifosh, Thomas R. Reardon, CTRL-labs at Reality Labs
License | CC-BY-NC 4.0
Citation / DOI | 10.5281/zenodo.17283865
Source links | OpenNeuro, NeMAR, Source URL
Technical Details
Subjects: 100
Recordings: 807
Tasks: 1
Channels: 16
Sampling rate (Hz): 2000.0
Duration (hours): 0.0
Pathology: Healthy
Modality: Visual
Type: Motor
Size on disk: 11.2 MB
File count: 807
Format: BIDS
License: CC-BY-NC 4.0
DOI: 10.5281/zenodo.17283865
API Reference
Use the NM000106 class to access this dataset programmatically.
class eegdash.dataset.NM000106(cache_dir: str, query: dict | None = None, s3_bucket: str | None = None, **kwargs)
Bases: EEGDashDataset
OpenNeuro dataset nm000106. Modality: emg; Experiment type: Motor; Subject type: Healthy. Subjects: 100; recordings: 807; tasks: 1.
Parameters:
cache_dir (str | Path) – Directory where data are cached locally.
query (dict | None) – Additional MongoDB-style filters to AND with the dataset selection. Must not contain the key dataset.
s3_bucket (str | None) – Base S3 bucket used to locate the data.
**kwargs (dict) – Additional keyword arguments forwarded to EEGDashDataset.
data_dir
Local dataset cache directory (cache_dir / dataset_id).
Type: Path
query
Merged query with the dataset filter applied.
Type: dict
records
Metadata records used to build the dataset, if pre-fetched.
Type: list[dict] | None
Notes
Each item is a recording; recording-level metadata are available via dataset.description. query supports MongoDB-style filters on fields in ALLOWED_QUERY_FIELDS and is combined with the dataset filter. Dataset-specific caveats are not provided in the summary metadata.
References
OpenNeuro dataset: https://openneuro.org/datasets/nm000106
NeMAR dataset: https://nemar.org/dataexplorer/detail?dataset_id=nm000106
Examples
>>> from eegdash.dataset import NM000106
>>> dataset = NM000106(cache_dir="./data")
>>> recording = dataset[0]
>>> raw = recording.load()
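The class documentation states that `query` is ANDed with the dataset selection and must not contain the key `dataset`. The sketch below illustrates that rule in plain Python. `merge_with_dataset_filter` is a hypothetical helper written for this example, not an EEGDash function, and the library's actual implementation may differ.

```python
from typing import Optional


def merge_with_dataset_filter(dataset_id: str, query: Optional[dict] = None) -> dict:
    """Combine a user query with a fixed dataset filter (hypothetical helper)."""
    query = dict(query or {})  # copy so the caller's dict is not mutated
    if "dataset" in query:
        raise ValueError("query must not contain the key 'dataset'")
    # Separate keys in one MongoDB-style filter document are ANDed implicitly.
    return {"dataset": dataset_id, **query}


merged = merge_with_dataset_filter("nm000106", {"subject": {"$in": ["01", "02"]}})
print(merged)
```

Rejecting a user-supplied `dataset` key keeps a filter from silently widening or replacing the dataset selection.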
See Also
eegdash.dataset.EEGDashDataset
eegdash.dataset