eegdash.api module
High-level interface to the EEGDash metadata database.
This module provides the main EEGDash class, which serves as the primary entry point for interacting with the EEGDash ecosystem. It offers methods to query, insert, and update metadata records stored in the EEGDash database via a REST API.
- class eegdash.api.EEGDash(*, database: str = 'eegdash', api_url: str | None = None, auth_token: str | None = None)
Bases: object
High-level interface to the EEGDash metadata database.
Provides methods to query, insert, and update metadata records stored in the EEGDash database via a REST API gateway.
For working with collections of recordings as PyTorch datasets, prefer EEGDashDataset.
Create a new EEGDash client.
- Parameters:
database (str, default "eegdash") – Name of the MongoDB database to connect to. Common values: "eegdash" (production), "eegdash_staging" (staging), "eegdash_v1" (legacy archive).
api_url (str, optional) – Override the default API URL. If not provided, uses the default public endpoint or the EEGDASH_API_URL environment variable.
auth_token (str, optional) – Authentication token for admin write operations. Not required for public read operations.
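The api_url lookup can be pictured with a small sketch. It assumes that an explicit api_url wins over the EEGDASH_API_URL environment variable, which in turn wins over the built-in default. That ordering, the resolve_api_url helper, and the placeholder DEFAULT_API_URL are illustrative assumptions, not the library's own code.

```python
import os

# Hypothetical placeholder: the real default endpoint is not stated on this page.
DEFAULT_API_URL = "https://api.example.invalid"


def resolve_api_url(api_url=None, environ=None):
    """Pick an API URL: explicit argument, then EEGDASH_API_URL, then the default.

    The precedence between the environment variable and the default is an
    assumption made for this sketch.
    """
    env = os.environ if environ is None else environ
    if api_url:
        return api_url
    return env.get("EEGDASH_API_URL") or DEFAULT_API_URL


# An explicit argument is returned unchanged.
print(resolve_api_url("https://staging.example.invalid"))
# Without an argument, the environment variable is consulted.
print(resolve_api_url(environ={"EEGDASH_API_URL": "https://mirror.example.invalid"}))
# With neither, the placeholder default is used.
print(resolve_api_url(environ={}))
```

Passing environ explicitly keeps the helper easy to test without touching the real process environment.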
Examples
>>> eegdash = EEGDash()  # production
>>> eegdash = EEGDash(database="eegdash_staging")  # staging
>>> records = eegdash.find({"dataset": "ds002718"})
- count(query: dict[str, Any] = None, /, **kwargs) -> int
Count documents matching the query.
- Parameters:
query (dict, optional) – Complete query dictionary. This is a positional-only argument.
**kwargs – User-friendly field filters (same as find()).
- Returns:
Number of matching documents.
- Return type:
int
Examples
>>> eeg = EEGDash()
>>> count = eeg.count({})  # count all
>>> count = eeg.count(dataset="ds002718")  # count by dataset
- exists(query: dict[str, Any] = None, /, **kwargs) -> bool
Check if at least one record matches the query.
- Parameters:
query (dict, optional) – Complete query dictionary. This is a positional-only argument.
**kwargs – User-friendly field filters (same as find()).
- Returns:
True if at least one matching record exists; False otherwise.
- Return type:
bool
Examples
>>> eeg = EEGDash()
>>> eeg.exists(dataset="ds002718")  # check by dataset
>>> eeg.exists({"data_name": "ds002718_sub-001_eeg.set"})  # check by data_name
- find(query: dict[str, Any] = None, /, **kwargs) -> list[Mapping[str, Any]]
Find records in the collection.
Examples
>>> from eegdash import EEGDash
>>> eegdash = EEGDash()
>>> eegdash.find({"dataset": "ds002718", "subject": {"$in": ["012", "013"]}})  # pre-built query
>>> eegdash.find(dataset="ds002718", subject="012")  # keyword filters
>>> eegdash.find(dataset="ds002718", subject=["012", "013"])  # sequence -> $in
>>> eegdash.find({})  # fetch all (use with care)
>>> eegdash.find({"dataset": "ds002718"}, subject=["012", "013"])  # combine query + kwargs (AND)
- Parameters:
query (dict, optional) – Complete MongoDB query dictionary. This is a positional-only argument.
**kwargs – User-friendly field filters that are converted to a MongoDB query. Values can be scalars (e.g., "sub-01") or sequences (translated to $in queries). Special parameters: limit (int) and skip (int) for pagination.
- Returns:
DB records that match the query.
- Return type:
list[Mapping[str, Any]]
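To make the keyword translation concrete, here is a minimal sketch of how such filters might be turned into a MongoDB query. build_query is a hypothetical helper written for this page, not the function EEGDash uses internally. It assumes that sequences become $in, that limit and skip are set aside as pagination options, and that a positional query is combined with keyword filters through $and.

```python
from collections.abc import Sequence


def build_query(query=None, /, **kwargs):
    """Merge a pre-built query with keyword filters (hypothetical helper).

    Returns a (mongo_query, options) pair, where options holds limit/skip.
    """
    # Pull the pagination parameters out so they are not treated as fields.
    options = {key: kwargs.pop(key) for key in ("limit", "skip") if key in kwargs}

    filters = {}
    for field, value in kwargs.items():
        # Strings are sequences too, so exclude them before the $in translation.
        if isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
            filters[field] = {"$in": list(value)}
        else:
            filters[field] = value

    if query and filters:
        # An explicit $and keeps both conditions even if they name the same field.
        merged = {"$and": [dict(query), filters]}
    else:
        merged = dict(query or filters)
    return merged, options


print(build_query(dataset="ds002718", subject=["012", "013"]))
print(build_query({"dataset": "ds002718"}, subject="012", limit=10, skip=20))
```

Combining through $and rather than merging the two dictionaries avoids silently overwriting a condition when the positional query and a keyword filter target the same field.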
- find_datasets(query: dict[str, Any] | None = None, limit: int = 1000) -> list[Mapping[str, Any]]
Find datasets matching query.
- find_one(query: dict[str, Any] = None, /, **kwargs) -> Mapping[str, Any] | None
Find a single record matching the query.
- Parameters:
query (dict, optional) – Complete query dictionary. This is a positional-only argument.
**kwargs – User-friendly field filters (same as find()).
- Returns:
The first matching record, or None if no match.
- Return type:
dict or None
Examples
>>> eeg = EEGDash()
>>> record = eeg.find_one(data_name="ds002718_sub-001_eeg.set")
- get_dataset(dataset_id: str) -> Mapping[str, Any] | None
Fetch metadata for a specific dataset.
- insert(records: dict[str, Any] | list[dict[str, Any]]) -> int
Insert one or more records (requires auth_token).
- Parameters:
records (dict or list of dict) – A single record or list of records to insert.
- Returns:
Number of records inserted.
- Return type:
int
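Because insert accepts either a single record or a list, calling code often wants one uniform shape. The sketch below shows one way to normalise the argument before sending it. as_record_list is a hypothetical helper, and the sample records are made up for illustration.

```python
def as_record_list(records):
    """Wrap a single record in a list so both input shapes can be handled alike."""
    if isinstance(records, dict):
        return [records]
    # Copy the list so later changes by the caller do not affect the batch.
    return list(records)


single = {"dataset": "ds001", "subject": "01"}
batch = [{"dataset": "ds001", "subject": "02"}, {"dataset": "ds001", "subject": "03"}]

print(len(as_record_list(single)))  # 1
print(len(as_record_list(batch)))   # 2
```

The length of the normalised list is the number of records a caller would expect insert to report when every record is accepted.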
Examples
>>> eeg = EEGDash(auth_token="...")
>>> eeg.insert({"dataset": "ds001", "subject": "01", ...})  # single
>>> eeg.insert([record1, record2, record3])  # batch
- search_datasets(*, modality: str | None = None, task: str | None = None, clinical_group: str | None = None, source: str | None = None, n_subjects_min: int | None = None, license: str | None = None, limit: int = 100)
Search the dataset catalogue with friendly keyword filters.
Convenience wrapper around find_datasets() that translates a small set of human-friendly keyword arguments into a MongoDB-style query and returns a tidy summary pandas.DataFrame. This is the metadata-only entry point used by tutorials such as plot_00_first_search.
- Parameters:
modality (str, optional) – Filter by recording modality (e.g., "eeg", "meeg"). Matched case-insensitively against the modality field.
task (str, optional) – Filter by BIDS task name (e.g., "rest", "FacePerception").
clinical_group (str, optional) – Filter by clinical cohort label (e.g., "healthy", "adhd"). Matched against clinical.group (nested) and falls back to the flat clinical_group field.
source (str, optional) – Filter by data source (e.g., "openneuro", "nemar", "hbn"). Matched against source and provider fields.
n_subjects_min (int, optional) – Minimum number of subjects in the dataset. Maps to {"n_subjects": {"$gte": n_subjects_min}}.
license (str, optional) – Filter by data license (e.g., "CC0", "CC-BY-4.0"). Matched against the license field.
limit (int, default 100) – Maximum number of datasets to return.
- Returns:
One row per matching dataset with summary columns: dataset_id, name, modality, task, n_subjects, source, license, dataset_doi. Missing fields surface as None. The frame is empty (zero rows) when nothing matches.
- Return type:
pandas.DataFrame
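The parameter descriptions above imply a query roughly like the one this sketch builds. build_dataset_query is a hypothetical stand-in, not the wrapper's real implementation. The use of $regex for the case-insensitive modality match and of $or for the two-field lookups are assumptions; only the n_subjects mapping is stated by the documentation.

```python
import re


def build_dataset_query(*, modality=None, task=None, clinical_group=None,
                        source=None, n_subjects_min=None, license=None):
    """Translate friendly filters into a MongoDB-style query (illustrative only)."""
    clauses = []
    if modality is not None:
        # Case-insensitive exact match; the regex form is an assumption.
        pattern = f"^{re.escape(modality)}$"
        clauses.append({"modality": {"$regex": pattern, "$options": "i"}})
    if task is not None:
        clauses.append({"task": task})
    if clinical_group is not None:
        # Nested field first, flat field as the fallback.
        clauses.append({"$or": [{"clinical.group": clinical_group},
                                {"clinical_group": clinical_group}]})
    if source is not None:
        clauses.append({"$or": [{"source": source}, {"provider": source}]})
    if n_subjects_min is not None:
        # This mapping is the one given in the parameter description.
        clauses.append({"n_subjects": {"$gte": n_subjects_min}})
    if license is not None:
        clauses.append({"license": license})

    if not clauses:
        return {}
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


print(build_dataset_query(modality="eeg", n_subjects_min=10))
print(build_dataset_query(task="rest", source="openneuro"))
```

A dictionary built this way is the kind of value that could be passed as the query argument of find_datasets().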
Notes
search_datasets does not download any signal bytes; only small JSON catalogue documents are transferred. Pair with EEGDashDataset once a candidate dataset is chosen.
Examples
>>> client = EEGDash()
>>> df = client.search_datasets(modality="eeg", n_subjects_min=10)
>>> df = client.search_datasets(task="rest", source="openneuro")
- update_dataset(dataset_id: str, update: dict[str, Any]) -> int
Update metadata for a specific dataset (requires auth_token).
- Parameters:
- Returns:
Number of documents modified (0 or 1).
- Return type:
int
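A key such as "clinical.is_clinical" follows MongoDB's dot notation for nested fields. The sketch below mimics that behaviour on a plain dict, assuming the update is applied with $set-like semantics. set_dotted and apply_update are hypothetical helpers, and the sample document is made up.

```python
_MISSING = object()  # sentinel meaning "field not present"


def set_dotted(document, dotted_key, value):
    """Set a nested field addressed by a dotted path; return True if it changed."""
    *parents, leaf = dotted_key.split(".")
    node = document
    for name in parents:
        # Create intermediate objects as needed, as MongoDB's $set does.
        node = node.setdefault(name, {})
    changed = node.get(leaf, _MISSING) != value
    node[leaf] = value
    return changed


def apply_update(document, update):
    """Apply every dotted-key update; return 1 if anything changed, else 0."""
    results = [set_dotted(document, key, value) for key, value in update.items()]
    return int(any(results))


dataset = {"dataset_id": "ds002718", "clinical": {"group": "healthy"}}
print(apply_update(dataset, {"clinical.is_clinical": True}))  # 1
print(apply_update(dataset, {"clinical.is_clinical": True}))  # 0
print(dataset["clinical"])
```

Repeating the same update reports 0 the second time, which is one way a return value of 0 can arise even when the dataset exists.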
Examples
>>> eeg = EEGDash(auth_token="...")
>>> eeg.update_dataset("ds002718", {"clinical.is_clinical": True})
- update_field(query: dict[str, Any] = None, /, *, update: dict[str, Any], **kwargs) -> tuple[int, int]
Update fields on records matching the query (requires auth_token).
Use this to add or modify fields across matching records, e.g., after re-extracting entities with an improved algorithm.
- Parameters:
- Returns:
Number of records matched and actually modified.
- Return type:
tuple of (matched_count, modified_count)
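The two counts can differ: a record that matches the query but already holds the new values is matched without being modified. The following in-memory stand-in illustrates that distinction. update_records and the sample records are hypothetical and only approximate what a database would do.

```python
def update_records(records, query, update):
    """Apply `update` to every record matching `query`; return (matched, modified)."""
    matched = modified = 0
    for record in records:
        # Simple equality matching on every field named in the query.
        if all(record.get(key) == value for key, value in query.items()):
            matched += 1
            # Count the record as modified only if some value actually changes.
            if any(record.get(key) != value for key, value in update.items()):
                record.update(update)
                modified += 1
    return matched, modified


records = [
    {"dataset": "ds002718", "subject": "01", "entities": {"subject": "01"}},
    {"dataset": "ds002718", "subject": "02", "entities": {}},
    {"dataset": "ds000001", "subject": "01", "entities": {}},
]

result = update_records(records, {"dataset": "ds002718"}, {"entities": {"subject": "01"}})
print(result)  # (2, 1)
```

Two records belong to the dataset, but the first already carries the requested entities, so only one is counted as modified.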
Examples
>>> eeg = EEGDash(auth_token="...")
>>> # Update entities for all records in a dataset
>>> eeg.update_field({"dataset": "ds002718"}, update={"entities": {"subject": "01"}})
>>> # Using kwargs for filter
>>> eeg.update_field(dataset="ds002718", update={"entities": new_entities})
>>> # Combine query + kwargs
>>> eeg.update_field({"dataset": "ds002718"}, subject="01", update={"entities": new_entities})