lamindb.curators.MuDataCurator

class lamindb.curators.MuDataCurator(dataset, schema)

Bases: SlotsCurator

Curator for a MuData object.

See also Curator and Schema.

Added in version 1.3.0.

Parameters:
  • dataset (MuData | Artifact) – The MuData-like object to validate & annotate.

  • schema (Schema) – A Schema object that defines the validation constraints.

Example:

import lamindb as ln
import bionty as bt

# define the global obs schema
obs_schema = ln.Schema(
    name="mudata_papalexi21_subset_obs_schema",
    features=[
        ln.Feature(name="perturbation", dtype="cat[ULabel[Perturbation]]").save(),
        ln.Feature(name="replicate", dtype="cat[ULabel[Replicate]]").save(),
    ],
).save()

# define the ['rna'].obs schema
obs_schema_rna = ln.Schema(
    name="mudata_papalexi21_subset_rna_obs_schema",
    features=[
        ln.Feature(name="nCount_RNA", dtype=int).save(),
        ln.Feature(name="nFeature_RNA", dtype=int).save(),
        ln.Feature(name="percent.mito", dtype=float).save(),
    ],
    coerce_dtype=True,
).save()

# define the ['hto'].obs schema
obs_schema_hto = ln.Schema(
    name="mudata_papalexi21_subset_hto_obs_schema",
    features=[
        ln.Feature(name="nCount_HTO", dtype=int).save(),
        ln.Feature(name="nFeature_HTO", dtype=int).save(),
        ln.Feature(name="technique", dtype=bt.ExperimentalFactor).save(),
    ],
    coerce_dtype=True,
).save()

# define ['rna'].var schema
var_schema_rna = ln.Schema(
    name="mudata_papalexi21_subset_rna_var_schema",
    itype=bt.Gene.symbol,
    dtype=float,
).save()

# define composite schema
mudata_schema = ln.Schema(
    name="mudata_papalexi21_subset_mudata_schema",
    otype="MuData",
    components={
        "obs": obs_schema,
        "rna:obs": obs_schema_rna,
        "hto:obs": obs_schema_hto,
        "rna:var": var_schema_rna,
    },
).save()

# curate an AnnData
mdata = ln.core.datasets.mudata_papalexi21_subset()
bt.settings.organism = "human" # set the organism
curator = ln.curators.MuDataCurator(mdata, mudata_schema)
artifact = curator.save_artifact(key="example_datasets/mudata_papalexi21_subset.h5mu")
assert artifact.schema == mudata_schema

Attributes

property slots: dict[str, DataFrameCurator]

Curator objects by slot.

Added in version 1.1.1.

Methods

save_artifact(*, key=None, description=None, revises=None, run=None)

Save an annotated artifact.

Parameters:
  • key (str | None, default: None) – A path-like key to reference artifact in default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a version family.

  • description (str | None, default: None) – A description.

  • revises (Artifact | None, default: None) – Previous version of the artifact. Is an alternative way to passing key to trigger a new version.

  • run (Run | None, default: None) – The run that creates the artifact.

Return type:

Artifact

Returns:

A saved artifact record.

validate()

Validate dataset.

Raises:

lamindb.errors.ValidationError – If validation fails.

Return type:

None