string-database — community string-database, biochem-allergy, community, ide skills, Claude Code, Cursor, Windsurf

v1.0.0

About this skill

Suited to bioinformatics agents that need comprehensive protein-protein interaction analysis and functional enrichment. Query the STRING API for protein-protein interactions (59M proteins, 20B interactions). Network analysis, GO/KEGG enrichment, interaction discovery, 5000+ species, for systems biology.

felixboehm
Updated: 3/10/2026

Killer-Skills Review

Decision support comes first. Repository text comes second.

Reference-Only Page Review Score: 8/11

This page remains useful for operators, but Killer-Skills treats it as reference material instead of a primary organic landing page.

  • Original recommendation layer
  • Concrete use-case guidance
  • Explicit limitations and caution
  • Quality floor passed for review

Review Score: 8/11
Quality Score: 60
Canonical Locale: en
Detected Body Locale: en


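Core value

Gives agents the ability to query interaction networks, run functional enrichment analysis, and discover protein partners through the REST API for advanced systems biology and pathway analysis, drawing on 59 million proteins and 20+ billion interactions across 5000+ organisms.

Suitable agent types

Suited to bioinformatics agents that need comprehensive protein-protein interaction analysis and functional enrichment.

Main capabilities granted · string-database

Retrieve protein-protein interaction networks for single or multiple proteins
Perform functional enrichment analysis for allergen protein research
Discover interaction partners for research on IgE biology and mast cell signaling

! Usage limits and requirements

  • Requires access to the STRING database through its REST API
  • Limited to protein-protein interaction data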

Why this page is reference-only

  • Current locale does not satisfy the locale-governance contract.

Source Boundary

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.


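FAQ and installation steps

The questions and steps below match the page's structured data so that search engines can understand the page content.

? FAQ

What is string-database?

A skill suited to bioinformatics agents that need comprehensive protein-protein interaction analysis and functional enrichment. Query the STRING API for protein-protein interactions (59M proteins, 20B interactions). Network analysis, GO/KEGG enrichment, interaction discovery, 5000+ species, for systems biology.

How do I install string-database?

Run: npx killer-skills add felixboehm/biochem-allergy/string-database. Supports 19+ IDEs and agents, including Cursor, Windsurf, VS Code, and Claude Code.

Which scenarios is string-database suited to?

Typical scenarios include retrieving protein-protein interaction networks for single or multiple proteins, performing functional enrichment analysis for allergen protein research, and discovering interaction partners for research on IgE biology and mast cell signaling.

Which IDEs or agents does string-database support?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. A single Killer-Skills CLI command installs it for all of them.

What are the limitations of string-database?

It requires access to the STRING database through its REST API, and it is limited to protein-protein interaction data.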

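Installation steps

  1. Open a terminal

    Open a terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add felixboehm/biochem-allergy/string-database. The CLI detects your IDE or AI agent automatically and completes the configuration.

  3. Start using the skill

    string-database is now enabled and can be invoked in the current project right away.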

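! Reference page mode

This page can still be used as an installation and lookup reference, but Killer-Skills no longer treats it as a primary indexable landing page. Read the review conclusions above first, then decide whether to continue to the upstream repository instructions.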

Imported Repository Instructions

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.

Supporting Evidence

string-database

Install string-database, an AI Agent Skill for AI agent workflows and automation. Supports Claude Code, Cursor, and Windsurf with one-command installation.

SKILL.md

STRING Database

Overview

STRING is a comprehensive database of known and predicted protein-protein interactions covering 59M proteins and 20B+ interactions across 5000+ organisms. Use it through the REST API to query interaction networks, perform functional enrichment, and discover interaction partners for systems biology and pathway analysis.

When to Use This Skill

This skill should be used when:

  • Retrieving protein-protein interaction networks for single or multiple proteins
  • Performing functional enrichment analysis (GO, KEGG, Pfam) on protein lists
  • Discovering interaction partners and expanding protein networks
  • Testing if proteins form significantly enriched functional modules
  • Generating network visualizations with evidence-based coloring
  • Analyzing homology and protein family relationships
  • Conducting cross-species protein interaction comparisons
  • Identifying hub proteins and network connectivity patterns

Quick Start

The skill provides:

  1. Python helper functions (scripts/string_api.py) for all STRING REST API operations
  2. Comprehensive reference documentation (references/string_reference.md) with detailed API specifications

When users request STRING data, determine which operation is needed and use the appropriate function from scripts/string_api.py.

Core Operations

1. Identifier Mapping (string_map_ids)

Convert gene names, protein names, and external IDs to STRING identifiers.

When to use: Starting any STRING analysis, validating protein names, finding canonical identifiers.

Usage:

```python
from scripts.string_api import string_map_ids

# Map single protein
result = string_map_ids('TP53', species=9606)

# Map multiple proteins
result = string_map_ids(['TP53', 'BRCA1', 'EGFR', 'MDM2'], species=9606)

# Map with multiple matches per query
result = string_map_ids('p53', species=9606, limit=5)
```

Parameters:

  • species: NCBI taxon ID (9606 = human, 10090 = mouse, 7227 = fly)
  • limit: Number of matches per identifier (default: 1)
  • echo_query: Include query term in output (default: 1)

Best practice: Always map identifiers first for faster subsequent queries.
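For orientation, the request that an ID-mapping helper would send can be sketched with the standard library alone. This is a minimal sketch, assuming the public STRING endpoint `https://string-db.org/api/tsv/get_string_ids` and its `identifiers`, `species`, `limit`, and `echo_query` parameters. `build_map_ids_request` is a hypothetical name; the internals of scripts/string_api.py are not shown on this page and may differ.

```python
from urllib.parse import urlencode

# Assumed public endpoint; scripts/string_api.py may build its requests differently.
STRING_API_BASE = "https://string-db.org/api"


def build_map_ids_request(identifiers, species=9606, limit=1, echo_query=1):
    """Return (url, encoded_body) for a get_string_ids request.

    Hypothetical helper for illustration only; it sends nothing over the network.
    """
    if isinstance(identifiers, str):
        identifiers = [identifiers]
    params = {
        # Multiple identifiers are separated by carriage returns.
        "identifiers": "\r".join(identifiers),
        "species": species,
        "limit": limit,
        "echo_query": echo_query,
    }
    url = f"{STRING_API_BASE}/tsv/get_string_ids"
    return url, urlencode(params)


url, body = build_map_ids_request(["TP53", "BRCA1"], species=9606)
print(url)
print(body)
```

Keeping request construction separate from sending makes the parameters easy to inspect before any call is made.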

2. Network Retrieval (string_network)

Get protein-protein interaction network data in tabular format.

When to use: Building interaction networks, analyzing connectivity, retrieving interaction evidence.

Usage:

```python
from scripts.string_api import string_network

# Get network for single protein
network = string_network('9606.ENSP00000269305', species=9606)

# Get network with multiple proteins
proteins = ['9606.ENSP00000269305', '9606.ENSP00000275493']
network = string_network(proteins, required_score=700)

# Expand network with additional interactors
network = string_network('TP53', species=9606, add_nodes=10, required_score=400)

# Physical interactions only
network = string_network('TP53', species=9606, network_type='physical')
```

Parameters:

  • required_score: Confidence threshold (0-1000)
    • 150: low confidence (exploratory)
    • 400: medium confidence (default, standard analysis)
    • 700: high confidence (conservative)
    • 900: highest confidence (very stringent)
  • network_type: 'functional' (all evidence, default) or 'physical' (direct binding only)
  • add_nodes: Add N most connected proteins (0-10)

Output columns: Interaction pairs, combined confidence scores, and individual evidence scores (neighborhood, fusion, phylogenetic profile, coexpression, experimental, database, text-mining).
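Tabular network output can be filtered locally once it has been retrieved. The sketch below assumes tab-separated text with `preferredName_A`, `preferredName_B`, and `score` columns, and assumes `score` is reported as a 0-1 fraction that must be rescaled to compare with `required_score` (0-1000); check both assumptions against an actual response. The rows are synthetic and `filter_edges` is an illustrative name.

```python
import csv
import io

# Synthetic rows for illustration only; these are not real STRING scores.
SAMPLE_TSV = (
    "preferredName_A\tpreferredName_B\tscore\n"
    "GENE_A\tGENE_B\t0.950\n"
    "GENE_A\tGENE_C\t0.450\n"
    "GENE_B\tGENE_C\t0.720\n"
)


def filter_edges(tsv_text, required_score=700):
    """Keep edges whose combined score meets a 0-1000 threshold."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        # Assumption: 'score' is a 0-1 fraction, so rescale it to 0-1000.
        scaled = round(float(row["score"]) * 1000)
        if scaled >= required_score:
            kept.append((row["preferredName_A"], row["preferredName_B"], scaled))
    return kept


edges = filter_edges(SAMPLE_TSV, required_score=700)
print(edges)
```

Filtering locally lets one low-threshold download serve several stricter analyses without repeated API calls.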

3. Network Visualization (string_network_image)

Generate network visualization as PNG image.

When to use: Creating figures, visual exploration, presentations.

Usage:

```python
from scripts.string_api import string_network_image

# Get network image
proteins = ['TP53', 'MDM2', 'ATM', 'CHEK2', 'BRCA1']
img_data = string_network_image(proteins, species=9606, required_score=700)

# Save image
with open('network.png', 'wb') as f:
    f.write(img_data)

# Evidence-colored network
img = string_network_image(proteins, species=9606, network_flavor='evidence')

# Confidence-based visualization
img = string_network_image(proteins, species=9606, network_flavor='confidence')

# Actions network (activation/inhibition)
img = string_network_image(proteins, species=9606, network_flavor='actions')
```

Network flavors:

  • 'evidence': Colored lines show evidence types (default)
  • 'confidence': Line thickness represents confidence
  • 'actions': Shows activating/inhibiting relationships

4. Interaction Partners (string_interaction_partners)

Find all proteins that interact with given protein(s).

When to use: Discovering novel interactions, finding hub proteins, expanding networks.

Usage:

```python
from scripts.string_api import string_interaction_partners

# Get top 10 interactors of TP53
partners = string_interaction_partners('TP53', species=9606, limit=10)

# Get high-confidence interactors
partners = string_interaction_partners('TP53', species=9606,
                                       limit=20, required_score=700)

# Find interactors for multiple proteins
partners = string_interaction_partners(['TP53', 'MDM2'],
                                       species=9606, limit=15)
```

Parameters:

  • limit: Maximum number of partners to return (default: 10)
  • required_score: Confidence threshold (0-1000)

Use cases:

  • Hub protein identification
  • Network expansion from seed proteins
  • Discovering indirect connections
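Hub identification from a list of interaction pairs can be sketched as a simple degree count. This is a minimal illustration on a synthetic edge list, not STRING output; `rank_hubs` is a hypothetical helper, and degree is only one of several possible hub criteria.

```python
from collections import Counter

# Synthetic edge list for illustration; not real STRING output.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
]


def rank_hubs(edge_list, top_n=3):
    """Rank proteins by degree (number of distinct interaction partners)."""
    neighbors = {}
    for a, b in edge_list:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    degree = Counter({node: len(partners) for node, partners in neighbors.items()})
    return degree.most_common(top_n)


hubs = rank_hubs(edges)
print(hubs)
```

Using sets of neighbors rather than raw counts keeps duplicate rows (for example A-B and B-A) from inflating a protein's degree.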

5. Functional Enrichment (string_enrichment)

Perform enrichment analysis across Gene Ontology, KEGG pathways, Pfam domains, and more.

When to use: Interpreting protein lists, pathway analysis, functional characterization, understanding biological processes.

Usage:

```python
import io

import pandas as pd

from scripts.string_api import string_enrichment

# Enrichment for a protein list
proteins = ['TP53', 'MDM2', 'ATM', 'CHEK2', 'BRCA1', 'ATR', 'TP73']
enrichment = string_enrichment(proteins, species=9606)

# Parse the tab-separated result and keep significant terms
df = pd.read_csv(io.StringIO(enrichment), sep='\t')
significant = df[df['fdr'] < 0.05]
```

Enrichment categories:

  • Gene Ontology: Biological Process, Molecular Function, Cellular Component
  • KEGG Pathways: Metabolic and signaling pathways
  • Pfam: Protein domains
  • InterPro: Protein families and domains
  • SMART: Domain architecture
  • UniProt Keywords: Curated functional keywords

Output columns:

  • category: Annotation database (e.g., "KEGG Pathways", "GO Biological Process")
  • term: Term identifier
  • description: Human-readable term description
  • number_of_genes: Input proteins with this annotation
  • p_value: Uncorrected enrichment p-value
  • fdr: False discovery rate (corrected p-value)

Statistical method: Fisher's exact test with Benjamini-Hochberg FDR correction.

Interpretation: FDR < 0.05 indicates statistically significant enrichment.
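To make the correction step concrete, the Benjamini-Hochberg adjustment can be sketched in a few lines. STRING reports `fdr` itself, so this is only an illustration of the procedure on made-up p-values; `benjamini_hochberg` is an illustrative name, not part of the skill's scripts.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
fdr = benjamini_hochberg(p_values)
significant = [p for p, q in zip(p_values, fdr) if q < 0.05]
print(fdr)
print(significant)
```

In this example two terms with raw p-values below 0.05 no longer pass after correction, which is why filtering on `fdr` rather than `p_value` is the safer default.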

6. PPI Enrichment (string_ppi_enrichment)

Test if a protein network has significantly more interactions than expected by chance.

When to use: Validating whether proteins form a functional module, testing network connectivity.

Usage:

```python
import json

from scripts.string_api import string_ppi_enrichment

# Test network connectivity
proteins = ['TP53', 'MDM2', 'ATM', 'CHEK2', 'BRCA1']
result = string_ppi_enrichment(proteins, species=9606, required_score=400)

# Parse JSON result
data = json.loads(result)
# The JSON payload may be a one-element list of records; unwrap it if so
if isinstance(data, list):
    data = data[0]
print(f"Observed edges: {data['number_of_edges']}")
print(f"Expected edges: {data['expected_number_of_edges']}")
print(f"P-value: {data['p_value']}")
```

Output fields:

  • number_of_nodes: Proteins in network
  • number_of_edges: Observed interactions
  • expected_number_of_edges: Expected in random network
  • p_value: Statistical significance

Interpretation:

  • p-value < 0.05: Network is significantly enriched (proteins likely form a functional module)
  • p-value ≥ 0.05: No significant enrichment (proteins may be unrelated)
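The interpretation rule above can be wrapped in a small helper. This sketch assumes the result is JSON text carrying the fields listed under "Output fields", either as a single object or as a one-element list; the payload below is synthetic and `summarize_ppi_enrichment` is a hypothetical name.

```python
import json

# Synthetic payload shaped like the documented fields; the values are made up.
SAMPLE_JSON = json.dumps([{
    "number_of_nodes": 5,
    "number_of_edges": 9,
    "expected_number_of_edges": 2,
    "p_value": 0.0004,
}])


def summarize_ppi_enrichment(json_text, alpha=0.05):
    """Summarize observed vs. expected edges and apply the p-value cutoff."""
    data = json.loads(json_text)
    # Accept either a bare object or a one-element list of objects.
    record = data[0] if isinstance(data, list) else data
    observed = record["number_of_edges"]
    expected = record["expected_number_of_edges"]
    return {
        "observed": observed,
        "expected": expected,
        # Guard against division by zero when no edges are expected.
        "ratio": observed / expected if expected else None,
        "significant": float(record["p_value"]) < alpha,
    }


summary = summarize_ppi_enrichment(SAMPLE_JSON)
print(summary)
```

Reporting the observed-to-expected ratio alongside the p-value gives a sense of effect size, which the p-value alone does not convey.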

7. Homology Scores (string_homology)

Retrieve protein similarity and homology information.

When to use: Identifying protein families, paralog analysis, cross-species comparisons.

Usage:

```python
from scripts.string_api import string_homology

# Get homology between proteins
proteins = ['TP53', 'TP63', 'TP73']  # p53 family
homology = string_homology(proteins, species=9606)
```

Use cases:

  • Protein family identification
  • Paralog discovery
  • Evolutionary analysis

8. Version Information (string_version)

Get current STRING database version.

When to use: Ensuring reproducibility, documenting methods.

Usage:

```python
from scripts.string_api import string_version

version = string_version()
print(f"STRING version: {version}")
```

Common Analysis Workflows

Workflow 1: Protein List Analysis (Standard Workflow)

Use case: Analyze a list of proteins from an experiment (e.g., differential expression, proteomics).

```python
from scripts.string_api import (string_map_ids, string_network,
                                string_enrichment, string_ppi_enrichment,
                                string_network_image)

# Step 1: Map gene names to STRING IDs
gene_list = ['TP53', 'BRCA1', 'ATM', 'CHEK2', 'MDM2', 'ATR', 'BRCA2']
mapping = string_map_ids(gene_list, species=9606)

# Step 2: Get interaction network
network = string_network(gene_list, species=9606, required_score=400)

# Step 3: Test if network is enriched
ppi_result = string_ppi_enrichment(gene_list, species=9606)

# Step 4: Perform functional enrichment
enrichment = string_enrichment(gene_list, species=9606)

# Step 5: Generate network visualization
img = string_network_image(gene_list, species=9606,
                           network_flavor='evidence', required_score=400)
with open('protein_network.png', 'wb') as f:
    f.write(img)

# Step 6: Parse and interpret results
```

Workflow 2: Single Protein Investigation

Use case: Deep dive into one protein's interactions and partners.

```python
from scripts.string_api import (string_map_ids, string_interaction_partners,
                                string_network_image)

# Step 1: Map protein name
protein = 'TP53'
mapping = string_map_ids(protein, species=9606)

# Step 2: Get all interaction partners
partners = string_interaction_partners(protein, species=9606,
                                       limit=20, required_score=700)

# Step 3: Visualize expanded network
img = string_network_image(protein, species=9606, add_nodes=15,
                           network_flavor='confidence', required_score=700)
with open('tp53_network.png', 'wb') as f:
    f.write(img)
```

Workflow 3: Pathway-Centric Analysis

Use case: Identify and visualize proteins in a specific biological pathway.

```python
import io

import pandas as pd

from scripts.string_api import string_enrichment, string_network

# Step 1: Start with known pathway proteins
dna_repair_proteins = ['TP53', 'ATM', 'ATR', 'CHEK1', 'CHEK2',
                       'BRCA1', 'BRCA2', 'RAD51', 'XRCC1']

# Step 2: Get network
network = string_network(dna_repair_proteins, species=9606,
                         required_score=700, add_nodes=5)

# Step 3: Enrichment to confirm pathway annotation
enrichment = string_enrichment(dna_repair_proteins, species=9606)

# Step 4: Parse enrichment for DNA repair pathways
df = pd.read_csv(io.StringIO(enrichment), sep='\t')
dna_repair = df[df['description'].str.contains('DNA repair', case=False)]
```

Workflow 4: Cross-Species Analysis

Use case: Compare protein interactions across different organisms.

```python
from scripts.string_api import string_network

# Human network
human_network = string_network('TP53', species=9606, required_score=700)

# Mouse network
mouse_network = string_network('Trp53', species=10090, required_score=700)

# Yeast network (if ortholog exists)
yeast_network = string_network('gene_name', species=4932, required_score=700)
```

Workflow 5: Network Expansion and Discovery

Use case: Start with seed proteins and discover connected functional modules.

```python
import io

import pandas as pd

from scripts.string_api import (string_interaction_partners, string_network,
                                string_enrichment)

# Step 1: Start with seed protein(s)
seed_proteins = ['TP53']

# Step 2: Get first-degree interactors
partners = string_interaction_partners(seed_proteins, species=9606,
                                       limit=30, required_score=700)

# Step 3: Parse partners to get protein list
df = pd.read_csv(io.StringIO(partners), sep='\t')
all_proteins = list(set(df['preferredName_A'].tolist() +
                        df['preferredName_B'].tolist()))

# Step 4: Perform enrichment on expanded network
enrichment = string_enrichment(all_proteins[:50], species=9606)

# Step 5: Filter for interesting functional modules
enrichment_df = pd.read_csv(io.StringIO(enrichment), sep='\t')
modules = enrichment_df[enrichment_df['fdr'] < 0.001]
```

Common Species

When specifying species, use NCBI taxon IDs:

| Organism | Common Name | Taxon ID |
|---|---|---|
| Homo sapiens | Human | 9606 |
| Mus musculus | Mouse | 10090 |
| Rattus norvegicus | Rat | 10116 |
| Drosophila melanogaster | Fruit fly | 7227 |
| Caenorhabditis elegans | C. elegans | 6239 |
| Saccharomyces cerevisiae | Yeast | 4932 |
| Arabidopsis thaliana | Thale cress | 3702 |
| Escherichia coli | E. coli | 511145 |
| Danio rerio | Zebrafish | 7955 |

Full list available at: https://string-db.org/cgi/input?input_page_active_form=organisms

Understanding Confidence Scores

STRING provides combined confidence scores (0-1000) integrating multiple evidence types:

Evidence Channels

  1. Neighborhood (nscore): Conserved genomic neighborhood across species
  2. Fusion (fscore): Gene fusion events
  3. Phylogenetic Profile (pscore): Co-occurrence patterns across species
  4. Coexpression (ascore): Correlated RNA expression
  5. Experimental (escore): Biochemical and genetic experiments
  6. Database (dscore): Curated pathway and complex databases
  7. Text-mining (tscore): Literature co-occurrence and NLP extraction

Choose threshold based on analysis goals:

  • 150 (low confidence): Exploratory analysis, hypothesis generation
  • 400 (medium confidence): Standard analysis, balanced sensitivity/specificity
  • 700 (high confidence): Conservative analysis, high-confidence interactions
  • 900 (highest confidence): Very stringent, experimental evidence preferred

Trade-offs:

  • Lower thresholds: More interactions (higher recall, more false positives)
  • Higher thresholds: Fewer interactions (higher precision, more false negatives)
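The threshold trade-off can be made tangible by labelling scores with the tiers above and counting how many edges survive each cutoff. This is a small sketch on synthetic scores; `confidence_tier` and `edges_retained` are illustrative names, and the cutoffs are simply the four values listed in this section.

```python
# Tier cutoffs as listed in this section (0-1000 scale).
TIERS = [(900, "highest"), (700, "high"), (400, "medium"), (150, "low")]


def confidence_tier(score):
    """Return the tier label for a 0-1000 combined score."""
    if not 0 <= score <= 1000:
        raise ValueError("score must be between 0 and 1000")
    for cutoff, label in TIERS:
        if score >= cutoff:
            return label
    return "below low-confidence cutoff"


def edges_retained(scores, thresholds=(150, 400, 700, 900)):
    """Count how many scores meet or exceed each threshold."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}


# Synthetic scores for illustration only.
scores = [120, 180, 420, 650, 710, 905, 990]
print(confidence_tier(710))
print(edges_retained(scores))
```

Running the count on a real score column before fixing `required_score` shows how much of the network each cutoff would discard.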

Network Types

Functional Networks (Default)

Includes all evidence types (experimental, computational, text-mining). Represents proteins that are functionally associated, even without direct physical binding.

When to use:

  • Pathway analysis
  • Functional enrichment studies
  • Systems biology
  • Most general analyses

Physical Networks

Only includes evidence for direct physical binding (experimental data and database annotations for physical interactions).

When to use:

  • Structural biology studies
  • Protein complex analysis
  • Direct binding validation
  • When physical contact is required

API Best Practices

  1. Always map identifiers first: Use string_map_ids() before other operations for faster queries
  2. Use STRING IDs when possible: Use format 9606.ENSP00000269305 instead of gene names
  3. Specify species for networks >10 proteins: Required for accurate results
  4. Respect rate limits: Wait 1 second between API calls
  5. Use versioned URLs for reproducibility: Available in reference documentation
  6. Handle errors gracefully: Check for "Error:" prefix in returned strings
  7. Choose appropriate confidence thresholds: Match threshold to analysis goals
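Practices 4 and 6 can be combined in one wrapper that spaces calls out and turns "Error:" strings into exceptions. This is a minimal sketch: `make_throttled` and `StringApiError` are hypothetical names, the one-second default comes from the guidance above, and the lambda in the demo merely stands in for a real helper such as string_network.

```python
import time


class StringApiError(RuntimeError):
    """Raised when a wrapped helper returns text starting with 'Error:'."""


def make_throttled(func, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Wrap func so consecutive calls start at least min_interval seconds apart."""
    last_call = [None]  # mutable cell holding the time of the previous call

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_interval - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)
        last_call[0] = clock()
        result = func(*args, **kwargs)
        # Convert the documented "Error:" prefix into an exception.
        if isinstance(result, str) and result.startswith("Error:"):
            raise StringApiError(result)
        return result

    return wrapper


# Stand-in for a real helper; a short interval keeps the demo fast.
fetch = make_throttled(lambda name: f"data for {name}", min_interval=0.01)
print(fetch("TP53"))
print(fetch("MDM2"))
```

Injecting `clock` and `sleep` keeps the wrapper testable without real delays, and raising an exception prevents an error string from being parsed as if it were data.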

Detailed Reference

For comprehensive API documentation, complete parameter lists, output formats, and advanced usage, refer to references/string_reference.md. This includes:

  • Complete API endpoint specifications
  • All supported output formats (TSV, JSON, XML, PSI-MI)
  • Advanced features (bulk upload, values/ranks enrichment)
  • Error handling and troubleshooting
  • Integration with other tools (Cytoscape, R, Python libraries)
  • Data license and citation information

Troubleshooting

No proteins found:

  • Verify species parameter matches identifiers
  • Try mapping identifiers first with string_map_ids()
  • Check for typos in protein names

Empty network results:

  • Lower confidence threshold (required_score)
  • Check if proteins actually interact
  • Verify species is correct

Timeout or slow queries:

  • Reduce number of input proteins
  • Use STRING IDs instead of gene names
  • Split large queries into batches
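Splitting a large query into batches can be sketched with a small generator. The batch size of 3 and the placeholder names are arbitrary choices for illustration; `chunked` is a hypothetical helper, and a suitable batch size depends on the operation.

```python
def chunked(items, size):
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder identifiers for illustration only.
protein_ids = [f"PROTEIN_{i}" for i in range(1, 8)]
batches = list(chunked(protein_ids, 3))
for batch in batches:
    print(batch)
```

Batching suits per-protein operations such as identifier mapping. A network query run per batch only sees the proteins in that batch, so interactions between proteins in different batches would be missed.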

"Species required" error:

  • Add species parameter for networks with >10 proteins
  • Always include species for consistency

Results look unexpected:

  • Check STRING version with string_version()
  • Verify network_type is appropriate (functional vs physical)
  • Review confidence threshold selection

Additional Resources

For proteome-scale analysis or complete species network upload:

  • Visit https://string-db.org
  • Use "Upload proteome" feature
  • STRING will generate complete interaction network and predict functions

For bulk downloads of complete datasets:

Data License

STRING data is freely available under the Creative Commons BY 4.0 license:

  • Free for academic and commercial use
  • Attribution required when publishing
  • Cite latest STRING publication

Citation

When using STRING in publications, cite the most recent publication from: https://string-db.org/cgi/about
