ontology-resolver — multimodal ontology-resolver, lancell, community, multimodal, ide skills, single-cell, Claude Code, Cursor, Windsurf

v1.0.0

About this skill

The multimodal biomedical data atlas builder

# Core topics

epiblastai
Updated: 3/26/2026

Killer-Skills Review

Decision support comes first. Repository text comes second.

Reference-Only Page Review Score: 1/11

This page remains useful for operators, but Killer-Skills treats it as reference material instead of a primary organic landing page.

Review Score: 1/11
Quality Score: 37
Canonical Locale: en
Detected Body Locale: en

Core value

The multimodal biomedical data atlas builder

Suitable agent types

Suitable for operator workflows that need explicit guardrails before installation and execution.

Main capabilities granted · ontology-resolver

! Usage limits and requirements

Why this page is reference-only

  • Current locale does not satisfy the locale-governance contract.
  • The page lacks a strong recommendation layer.
  • The page lacks concrete use-case guidance.
  • The page lacks explicit limitations or caution signals.
  • The underlying skill quality score is below the review floor.

Source Boundary

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.


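FAQ and installation steps

The questions and steps below match the page's structured data, which helps search engines understand the page content.

? FAQ

What is ontology-resolver?

The multimodal biomedical data atlas builder

How do I install ontology-resolver?

Run: npx killer-skills add epiblastai/lancell/ontology-resolver. Supports 19+ IDEs and agents, including Cursor, Windsurf, VS Code, and Claude Code.

Which IDEs or agents does ontology-resolver support?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. The Killer-Skills CLI installs it for any of them with a single command.

Installation steps

  1. Open a terminal

    Open a terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add epiblastai/lancell/ontology-resolver. The CLI automatically detects your IDE or AI agent and completes the configuration.

  3. Start using the skill

    ontology-resolver is now enabled and can be invoked in the current project right away.

! Reference page mode

This page still serves as an installation and lookup reference, but Killer-Skills no longer treats it as a primary indexable landing page. Read the review conclusions above first, then decide whether to continue to the upstream repository instructions.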

Imported Repository Instructions

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.

Supporting Evidence

Ontology Resolver

Resolve free-text biological metadata values to canonical ontology terms with CELLxGENE-compatible IDs. Covers 9 entity types across 8 ontologies:

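| Entity | Ontology | Prefix |
| --- | --- | --- |
| cell_type | Cell Ontology | CL |
| tissue | UBERON | UBERON |
| disease | MONDO | MONDO |
| organism | NCBITaxon | NCBITaxon |
| assay | EFO | EFO |
| development_stage | HsapDv / MmusDv | HsapDv, MmusDv |
| ethnicity | HANCESTRO | HANCESTRO |
| sex | PATO | PATO |
| cell_line | CLO | CLO |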

Interface

This resolver operates in Phase B (per-experiment obs resolution). It reads raw obs CSVs from an experiment directory and writes a fragment CSV with resolved ontology columns.

Input:

  • Raw obs CSV ({fs}_raw_obs.csv) — read-only, do not modify this file
  • A mapping of column names to OntologyEntity types (provided by the caller or derived from schema field classification)
  • Organism context (required for development_stage, helpful for others)
  • Output fragment path ({fs}_fragment_ontology_obs.csv)

Output:

  • A new fragment CSV file ({fs}_fragment_ontology_obs.csv) indexed by the same index as the raw obs, containing:
    • One column per schema field the resolver is responsible for, using the exact schema field name (e.g., cell_type, tissue, disease) — resolved to canonical ontology term names
    • ontology_resolved boolean indicating whether all ontology fields resolved successfully
  • A markdown report at resolver_reports/ontology-resolver.md in the working directory summarizing the experiment, fields resolved, unresolved values, and any omitted output columns with reasons.

Column naming: Output columns must exactly match the schema field names the resolver is told to fill. No _ontology_id suffixes, no validated_ prefixes. For organism fields, output the resolved scientific name (e.g., "Homo sapiens", "Mus musculus") — do not convert to common names. Ontology CURIEs are recorded in the markdown report for debugging but do not appear in the fragment CSV.

Reporting

Each run must write a markdown report to resolver_reports/ in the working directory.

  • Create the directory if it does not exist.
  • Default report path: resolver_reports/ontology-resolver.md
  • Overwrite the report for the current run unless the caller asks for a different naming scheme.
  • Include:
    • experiment directory and input file path
    • output fragment path
    • ontology fields attempted
    • resolved/unresolved counts per field
    • correction mappings or manual normalizations used
    • any omitted or blank output fields, with reasons
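The reporting requirements above could be met with something like the following standard-library sketch. The function name `write_report` and the shape of the `stats` dictionary are hypothetical and are not part of the resolver's API; the real script may lay out its report differently.

```python
from pathlib import Path


def write_report(report_dir, input_csv, output_csv, stats, notes=()):
    """Write (or overwrite) <report_dir>/ontology-resolver.md.

    `stats` maps field name -> (resolved_count, unresolved_count).
    `notes` holds free-text lines, e.g. corrections used or omitted fields.
    Both shapes are assumptions made for this sketch.
    """
    report_dir = Path(report_dir)
    report_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    lines = [
        "# ontology-resolver report",
        "",
        f"- Input: {input_csv}",
        f"- Output fragment: {output_csv}",
        "",
        "| Field | Resolved | Unresolved |",
        "| --- | --- | --- |",
    ]
    for field, (resolved, unresolved) in stats.items():
        lines.append(f"| {field} | {resolved} | {unresolved} |")
    if notes:
        lines += ["", "## Notes"] + [f"- {note}" for note in notes]
    path = report_dir / "ontology-resolver.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")  # overwrites any earlier run
    return path
```

Calling it twice with the same `report_dir` replaces the earlier report, which matches the default overwrite behavior described above.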

Imports

```python
from homeobox.standardization import (
    OntologyEntity,
    resolve_ontology_terms,
    is_control_label,
    detect_control_labels,
    detect_negative_control_type,
)
from homeobox.standardization.types import OntologyResolution, ResolutionReport
```

Scripts

scripts/resolve_ontology.py

Handles the standard ontology resolution workflow: control detection, resolution via resolve_ontology_terms, column writing, and report generation.

```bash
python .claude/skills/ontology-resolver/scripts/resolve_ontology.py \
    <input_csv> <output_csv> \
    --field <obs_col>:<schema_field>:<entity> [...] \
    [--organism human] \
    [--corrections corrections.json] \
    [--report-dir resolver_reports]
```

| Argument | Description |
| --- | --- |
| input_csv | Path to {fs}_raw_obs.csv (must have index_col=0) |
| output_csv | Path to fragment output ({fs}_fragment_ontology_obs.csv) |
| --field | Repeatable. Format: obs_column:schema_field:ENTITY_TYPE. Entity is the OntologyEntity enum name (e.g., CELL_TYPE, TISSUE, DISEASE). |
| --organism | Organism for development_stage resolution (default: human) |
| --corrections | JSON file with correction mappings (see below) |
| --report-dir | Output directory for markdown report (default: resolver_reports in input dir) |

The script writes the fragment CSV with one column per schema field (exact name match) plus the ontology_resolved boolean. No _ontology_id suffix columns are written — ontology CURIEs are included in the markdown report only. It prints per-field resolution stats and unresolved values to stdout.
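To make the `--field` format concrete, here is a minimal sketch of how an `obs_column:schema_field:ENTITY_TYPE` value might be split. `FieldSpec` and `parse_field` are hypothetical names for illustration; the actual script may parse and validate its arguments differently.

```python
from typing import NamedTuple


class FieldSpec(NamedTuple):
    obs_column: str    # column name in the raw obs CSV
    schema_field: str  # exact output column name in the fragment
    entity: str        # OntologyEntity enum name, e.g. "CELL_TYPE"


def parse_field(spec: str) -> FieldSpec:
    """Split 'obs_column:schema_field:ENTITY_TYPE' into its three parts."""
    parts = spec.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected obs_column:schema_field:ENTITY_TYPE, got {spec!r}"
        )
    obs_column, schema_field, entity = parts
    return FieldSpec(obs_column, schema_field, entity.upper())


print(parse_field("cell_type_annotation:cell_type:CELL_TYPE"))
```

This simple split assumes obs column names contain no colons; a column name with a colon would need a different delimiter strategy.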

Corrections format

A JSON file mapping obs column names to {original: corrected} dictionaries:

```json
{
  "cell_type_annotation": {
    "T-cell": "T cell",
    "B-cell": "B cell",
    "Monocyte/Macrophage": "monocyte"
  }
}
```

Corrections are applied before resolution. Build these after reviewing unresolved values from a first pass.
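As an illustration of what "applied before resolution" means, the sketch below loads a corrections mapping and rewrites raw values before they would be handed to the resolver. The helper `apply_corrections` is hypothetical; values without a correction pass through unchanged.

```python
import json

# Same shape as the corrections file described above.
corrections_json = """
{
  "cell_type_annotation": {
    "T-cell": "T cell",
    "B-cell": "B cell",
    "Monocyte/Macrophage": "monocyte"
  }
}
"""


def apply_corrections(values, column, corrections):
    """Return values with the column's {original: corrected} mapping applied."""
    mapping = corrections.get(column, {})
    return [mapping.get(value, value) for value in values]


corrections = json.loads(corrections_json)
raw = ["T-cell", "NK cell", "Monocyte/Macrophage"]
corrected = apply_corrections(raw, "cell_type_annotation", corrections)
print(corrected)  # ['T cell', 'NK cell', 'monocyte']
```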

finalize_features.py — Schema finalization with type coercion (shared)

Uses the shared gene-resolver/scripts/finalize_features.py script if the ontology fragment needs to be written to parquet for direct LanceDB ingestion.

```bash
python .claude/skills/gene-resolver/scripts/finalize_features.py \
    <resolved_csv> <output_parquet> <schema_module> <schema_class> \
    [--column KEY=VALUE ...]
```

Agent Workflow

1. Inspect raw obs and determine field mappings

Read the raw obs CSV. Identify which columns contain ontology-resolvable metadata and map each to a schema field name and OntologyEntity:

```python
import pandas as pd

# The first CSV column is the obs index.
raw_obs = pd.read_csv("<experiment_dir>/<fs>_raw_obs.csv", index_col=0)

# Print a short preview of each column's distinct non-null values.
for col in raw_obs.columns:
    unique = raw_obs[col].dropna().unique()
    print(f"{col}: {len(unique)} unique — {list(unique[:5])}")
```

Not every dataset has every ontology field. If a column is absent or entirely null, skip it.
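The skip rule can be sketched as a filter over candidate columns. This example uses only the standard library so it runs without pandas; the sample CSV content, the `NULLS` set, and the `usable_columns` helper are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical raw obs content; the first column is the index.
raw_csv = io.StringIO(
    "barcode,cell_type_annotation,donor_tissue,diagnosis\n"
    "AAAC,T-cell,,healthy\n"
    "AAAG,B-cell,,healthy\n"
)

NULLS = {"", "nan", "NaN", "NA"}  # assumed set of null markers


def usable_columns(rows, candidates):
    """Keep candidate columns that exist and hold at least one non-null value."""
    keep = []
    for col in candidates:
        has_value = any(
            col in row and (row[col] or "").strip() not in NULLS for row in rows
        )
        if has_value:
            keep.append(col)
    return keep


rows = list(csv.DictReader(raw_csv))
print(usable_columns(rows, ["cell_type_annotation", "donor_tissue", "sex"]))
# ['cell_type_annotation']
```

Here `donor_tissue` is dropped because it is entirely empty and `sex` is dropped because the column is absent.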

2. Determine organism context

Identify the organism from the dataset metadata or a dedicated organism column. Pass it to --organism for correct development_stage dispatch (HsapDv for human, MmusDv for mouse).
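The dispatch described above amounts to a lookup from organism to development-stage ontology. The sketch below is illustrative only: the accepted labels and the `dev_stage_prefix` helper are assumptions, and the library performs its own dispatch internally.

```python
# Assumed mapping from organism labels to development-stage ontology prefixes.
DEV_STAGE_ONTOLOGY = {
    "human": "HsapDv",
    "homo sapiens": "HsapDv",
    "mouse": "MmusDv",
    "mus musculus": "MmusDv",
}


def dev_stage_prefix(organism: str) -> str:
    """Return the development-stage ontology prefix for an organism label."""
    key = organism.strip().lower()
    if key not in DEV_STAGE_ONTOLOGY:
        # Fail loudly rather than search both ontologies and risk a wrong match.
        raise ValueError(f"no development_stage ontology known for {organism!r}")
    return DEV_STAGE_ONTOLOGY[key]


print(dev_stage_prefix("Homo sapiens"))  # HsapDv
print(dev_stage_prefix("mouse"))         # MmusDv
```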

3. Run the script

```bash
python .claude/skills/ontology-resolver/scripts/resolve_ontology.py \
    /path/to/{fs}_raw_obs.csv \
    /path/to/{fs}_fragment_ontology_obs.csv \
    --field cell_type_annotation:cell_type:CELL_TYPE \
    --field donor_tissue:tissue:TISSUE \
    --field diagnosis:disease:DISEASE \
    --organism human
```

4. Review unresolved values

The script prints unresolved values to stdout. For each:

  • Case/whitespace issues: already handled by resolve_ontology_terms, but check for unusual Unicode
  • Abbreviations: "T cell" vs "T-cell" vs "T lymphocyte". Build corrections.
  • Concatenated annotations: "CD4+ T cell (activated)" may need qualifier stripping
  • Dataset-specific labels: "Cluster_5", "Unknown", "Other" are not ontology terms; accept as unresolved
  • Near-misses: use hierarchy navigation to find the correct term:

```python
from homeobox.standardization import OntologyEntity, get_ontology_descendants

# CL:0000084 is the Cell Ontology term for "T cell".
descendants = get_ontology_descendants("CL:0000084", OntologyEntity.CELL_TYPE, max_depth=2)
```
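For the concatenated-annotation case, a small normalization pass can propose candidate corrections to review. This is a sketch under the assumption that qualifiers appear as a trailing parenthetical; `strip_qualifiers` is a hypothetical helper, and its output still needs a human check before it goes into a corrections file.

```python
import re


def strip_qualifiers(label: str) -> str:
    """Drop a trailing parenthetical qualifier and collapse whitespace."""
    label = re.sub(r"\s*\([^)]*\)\s*$", "", label)
    return re.sub(r"\s+", " ", label).strip()


unresolved = ["CD4+ T cell (activated)", "T-cell", "Cluster_5"]

# Only values that actually change become candidate corrections.
proposed = {
    value: strip_qualifiers(value)
    for value in unresolved
    if strip_qualifiers(value) != value
}
print(proposed)  # {'CD4+ T cell (activated)': 'CD4+ T cell'}
```

A stripped label is not guaranteed to resolve either, so treat the result as input for the next pass rather than a final mapping.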

5. Re-run with corrections (if needed)

Build a corrections JSON and re-run:

```bash
python .claude/skills/ontology-resolver/scripts/resolve_ontology.py \
    /path/to/{fs}_raw_obs.csv \
    /path/to/{fs}_fragment_ontology_obs.csv \
    --field cell_type_annotation:cell_type:CELL_TYPE \
    --corrections /path/to/corrections.json \
    --organism human
```

6. Verify fragment output

Confirm the fragment has the expected columns and the ontology_resolved counts are acceptable.
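One way to carry out this check is sketched below with the standard library. It assumes booleans are serialized as the strings `True` and `False` and that the first CSV column is the index; `verify_fragment` is a hypothetical helper, not part of the skill's scripts.

```python
import csv
import io


def verify_fragment(raw_file, fragment_file, expected_fields):
    """Check index alignment and column names; return ontology_resolved counts."""
    raw_rows = list(csv.reader(raw_file))
    frag_rows = list(csv.reader(fragment_file))
    raw_index = [row[0] for row in raw_rows[1:]]
    frag_index = [row[0] for row in frag_rows[1:]]
    if raw_index != frag_index:
        raise ValueError("fragment index does not match raw obs index")
    header = frag_rows[0][1:]  # skip the index column
    expected = list(expected_fields) + ["ontology_resolved"]
    if sorted(header) != sorted(expected):
        raise ValueError(f"unexpected fragment columns: {header}")
    flag_pos = frag_rows[0].index("ontology_resolved")
    resolved = sum(row[flag_pos] == "True" for row in frag_rows[1:])
    return {"resolved": resolved, "unresolved": len(frag_index) - resolved}


# Hypothetical two-row experiment.
raw = io.StringIO("barcode,cell_type_annotation\nAAAC,T-cell\nAAAG,Cluster_5\n")
frag = io.StringIO(
    "barcode,cell_type,ontology_resolved\nAAAC,T cell,True\nAAAG,Cluster_5,False\n"
)
print(verify_fragment(raw, frag, ["cell_type"]))  # {'resolved': 1, 'unresolved': 1}
```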


Entity-Specific Notes

organism — Common values: "Homo sapiens", "Mus musculus". Use the resolved scientific name directly — do not convert to common names.

development_stage — Pass --organism to select the correct ontology (HsapDv for human, MmusDv for mouse). Without it, both are searched and may produce wrong matches.

sex — Only 3 canonical values: "female" (PATO:0000383), "male" (PATO:0000384), "unknown" (PATO:0000461). "other" maps to "unknown sex". Resolution is hardcoded (not from the local DB).

cell_line — Uses Cellosaurus local DB with FTS fuzzy search fallback. Exact matches get confidence 1.0; fuzzy matches get confidence 0.7. Flag fuzzy matches for user review.

assay — Free-text assay names (e.g., "10x 3' v3", "Smart-seq2") must match EFO terms. Many GEO assay descriptions don't match EFO exactly — investigate failures carefully.
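The cell_line note about flagging fuzzy matches could look like the following sketch. The `Match` dataclass is a hypothetical stand-in for whatever result object the library returns (it is not `OntologyResolution`), and the sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical stand-in for a single cell_line resolution result."""
    input_value: str
    resolved_name: str
    confidence: float


def needs_review(matches, threshold=1.0):
    """Return matches below the exact-match confidence of 1.0."""
    return [m for m in matches if m.confidence < threshold]


matches = [
    Match("HeLa", "HeLa", 1.0),        # exact match
    Match("Hela cells", "HeLa", 0.7),  # fuzzy match, per the confidence values above
]
for m in needs_review(matches):
    print(f"review: {m.input_value!r} -> {m.resolved_name!r} (confidence {m.confidence})")
```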


Rules

  • Read-only input. Never modify the _raw_obs.csv file. Write all output to the fragment file.
  • Exact schema field names only. Output columns must match schema field names exactly. No _ontology_id suffixes, no validated_ prefixes.
  • Organism as scientific name. Do not convert to common names.
  • Use resolve_ontology_terms() for all entities. It handles dispatch to DB lookup, OLS4, or hardcoded tables internally.
  • Pass organism for development_stage. Without it, wrong matches are possible.
  • Do not derive control fields from ontology columns. is_negative_control and negative_control_type are perturbation-level concepts populated by perturbation resolvers.
  • Use detect_control_labels() for control detection. Do not hardcode control label sets.
  • Never set output columns to NaN for failed resolution. Keep the original value in the schema field.
  • Always write ontology_resolved boolean. Named ontology_resolved to avoid collision during assembly.
  • Save after each column pair to prevent losing work.
  • Never modify h5ad files. All validated data goes into the fragment CSV only.
  • Investigate failures before giving up. Build correction mappings for close-but-not-exact values.
  • Dataset-specific labels are acceptable as unresolved. Cluster IDs, "Unknown", "Other" are not ontology terms.
  • Ask before guessing. If a column's entity type is ambiguous, ask the user.

Resolution Strategy

  1. Resolution succeeds — use the canonical ontology name in the schema field. Row is resolved. CURIE is recorded in the report.
  2. Resolution fails — keep the original value in the schema field. Row is unresolved.
  3. NaN only when no value exists — schema field is NaN.
  4. Control labels → None — Control values map to None in the schema field.
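The four rules can be read as a single per-value decision. The sketch below uses a plain dictionary in place of `resolve_ontology_terms` and takes the control labels as a parameter; both are assumptions for illustration, and in practice the control labels would come from the library's control detection rather than a hand-written set.

```python
import math


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_field(values, lookup, control_labels):
    """Apply the resolution rules; return (column values, per-row status)."""
    column, status = [], []
    for value in values:
        if is_missing(value):
            column.append(math.nan)   # rule 3: NaN only when no value exists
            status.append("missing")
        elif value in control_labels:
            column.append(None)       # rule 4: control labels map to None
            status.append("control")
        elif value.strip().lower() in lookup:
            name, _curie = lookup[value.strip().lower()]
            column.append(name)       # rule 1: canonical name; CURIE goes to the report
            status.append("resolved")
        else:
            column.append(value)      # rule 2: keep the original value
            status.append("unresolved")
    return column, status


# Hypothetical lookup keyed by normalized label: (canonical name, CURIE).
lookup = {"t cell": ("T cell", "CL:0000084")}
column, status = fill_field(["t cell ", "Cluster_5", math.nan, "control"], lookup, {"control"})
print(status)  # ['resolved', 'unresolved', 'missing', 'control']
```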
