scverse_misc.datasets.parse_registry

scverse_misc.datasets.parse_registry(path)

Parse a YAML registry into (base_url, {name: DatasetEntry}).

The YAML file has a top-level base_url (or s3_base_url) key and a datasets mapping of the form name -> {type, files: [{name, url?/s3_key?, sha256?}], ...}, where a trailing ? marks an optional key. Within each dataset entry, any keys other than type and files are collected into the entry’s metadata.

Return type:

tuple[str | None, dict[str, DatasetEntry]]
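
The registry layout and the metadata rule described above can be sketched as follows. This is a minimal illustration and not the library's implementation: SketchEntry and split_registry are hypothetical stand-ins, the dataset name, URL, and description are made up, and preferring base_url over s3_base_url when both are present is an assumption. The registry is given as the plain mapping a YAML loader would produce, so the sketch runs without a YAML dependency.

```python
from dataclasses import dataclass, field

# Illustrative registry layout as YAML text (all values are made up).
EXAMPLE_YAML = """\
base_url: https://example.org/data
datasets:
  pbmc_demo:
    type: h5ad
    description: Small demo dataset
    files:
      - name: pbmc_demo.h5ad
        sha256: "<hex digest>"
"""

# The same registry as the mapping a YAML loader would return.
raw = {
    "base_url": "https://example.org/data",
    "datasets": {
        "pbmc_demo": {
            "type": "h5ad",
            "description": "Small demo dataset",
            "files": [{"name": "pbmc_demo.h5ad", "sha256": "<hex digest>"}],
        }
    },
}


@dataclass
class SketchEntry:
    """Hypothetical stand-in for DatasetEntry; the real class may differ."""

    type: str
    files: list
    metadata: dict = field(default_factory=dict)


def split_registry(raw):
    """Split a loaded registry mapping into (base_url, {name: SketchEntry})."""
    # Assumption: base_url wins over s3_base_url if both are present.
    base_url = raw.get("base_url") or raw.get("s3_base_url")
    entries = {}
    for name, spec in raw.get("datasets", {}).items():
        # Every key other than "type" and "files" becomes metadata.
        metadata = {k: v for k, v in spec.items() if k not in ("type", "files")}
        entries[name] = SketchEntry(
            type=spec["type"], files=list(spec["files"]), metadata=metadata
        )
    return base_url, entries


base_url, entries = split_registry(raw)
```

Here the description key is neither type nor files, so it ends up in entries["pbmc_demo"].metadata, while base_url is returned separately as the first element of the tuple.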