esm is a protein design skill built on the ESM2 protein language model. It is aimed at AI agents, such as Claude Code, that work on bioinformatics and protein design, and it needs a PyTorch and CUDA environment (see Prerequisites).

How do I install esm?

Run the command `npx killer-skills add adaptyvbio/protein-design-skills/esm`. It works with Cursor, Windsurf, VS Code, Claude Code, and 15+ other IDEs.

What are the use cases for esm?

Key use cases include scoring sequence plausibility with pseudo-log-likelihood (PLL), generating embeddings for clustering and diversity analysis, and predicting variant effects. For structure prediction, use ESMFold instead.

Which IDEs are compatible with esm?

This skill is compatible with Cursor, Windsurf, VS Code, Claude Code, GitHub Copilot, JetBrains, Cline, Roo Code, and many more. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for esm?

Requires Python 3.8+. Needs PyTorch 1.10+ and CUDA 11.0+. Demands significant GPU VRAM (minimum 8GB) and RAM (minimum 16GB).

ESM2 Protein Language Model

Name: esm
Author: adaptyvbio

Prerequisites

Requirement	Minimum	Recommended
Python	3.8+	3.10
PyTorch	1.10+	2.0+
CUDA	11.0+	11.7+
GPU VRAM	8GB	24GB (A10G)
RAM	16GB	32GB

How to run

First time? See Installation Guide to set up Modal and biomodals.

Option 1: Modal

```bash
cd biomodals
modal run modal_esm2_predict_masked.py \
  --input-faa sequences.fasta \
  --out-dir embeddings/
```

GPU: A10G (24GB) | Timeout: 300s default

Option 2: Python API (recommended)

```python
import torch
import esm

# Load the 650M-parameter ESM2 model and its tokenizer alphabet
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model = model.eval().cuda()  # inference mode, on the GPU

# Process sequences: a list of (label, sequence) pairs
data = [("seq1", "MKTAYIAKQRQISFVK...")]
batch_labels, batch_strs, batch_tokens = batch_converter(data)

with torch.no_grad():
    results = model(batch_tokens.cuda(), repr_layers=[33])

# Get per-token embeddings from the final layer (layer 33)
embeddings = results["representations"][33]
```
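
For clustering or diversity analysis, per-token embeddings are usually collapsed into one vector per sequence. The following is a minimal sketch of mean pooling, assuming the representations have been converted to a NumPy array and that each row holds one BOS token, then the residues, then one EOS token, then padding. `mean_pool` is a hypothetical helper, not part of the esm package, and the random array only stands in for real model output.

```python
import numpy as np

def mean_pool(token_embeddings, seq_lengths):
    """Average per-residue embeddings into one vector per sequence.

    token_embeddings: array of shape (batch, max_len + 2, dim); position 0
        is assumed to be BOS, followed by residues, EOS, then padding.
    seq_lengths: number of residues in each sequence.
    """
    pooled = []
    for row, length in zip(token_embeddings, seq_lengths):
        # Residues occupy positions 1..length; skip BOS, EOS, and padding.
        pooled.append(row[1 : length + 1].mean(axis=0))
    return np.stack(pooled)

# Toy stand-in for results["representations"][33].cpu().numpy()
rng = np.random.default_rng(0)
fake = rng.normal(size=(2, 7, 4))  # 2 sequences, up to 5 residues, dim 4
vectors = mean_pool(fake, [5, 3])
print(vectors.shape)  # (2, 4)
```

With the 650M model the last dimension would be 1280 rather than 4.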

Key parameters

ESM2 Models

Model	Parameters	Speed	Quality
esm2_t6_8M	8M	Fastest	Fast screening
esm2_t12_35M	35M	Fast	Good
esm2_t33_650M	650M	Medium	Better
esm2_t36_3B	3B	Slow	Best

Output format

embeddings/
├── embeddings.npy       # (N, 1280) array
├── pll_scores.csv       # PLL for each sequence
└── metadata.json        # Sequence info

Sample output

Successful run

$ modal run modal_esm2_predict_masked.py --input-faa designs.fasta
[INFO] Loading ESM2-650M model...
[INFO] Processing 100 sequences...
[INFO] Computing pseudo-log-likelihood...

embeddings/pll_scores.csv:
sequence_id,pll,pll_normalized,length
design_0,-0.82,0.15,78
design_1,-0.95,0.08,85
design_2,-1.23,-0.12,72
...

Summary:
  Mean PLL: -0.91
  Sequences with PLL > 0: 42/100 (42%)

What good output looks like:

PLL_normalized: > 0.0 (more natural-like)
Embeddings shape: (N, 1280) for 650M model
Higher PLL = more natural sequence
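
Pseudo-log-likelihood masks one position at a time and adds up the log-probability the model assigns to the true residue at that position. The sketch below shows that loop with a toy scorer so it runs without a GPU. `masked_log_prob` is a hypothetical stand-in for a masked forward pass through ESM2, and averaging over length is an assumption made here for comparability; it is not a statement of how the skill computes `pll` or `pll_normalized`.

```python
import math
from typing import Callable

def pseudo_log_likelihood(
    sequence: str,
    masked_log_prob: Callable[[str, int], float],
) -> float:
    """Mean over positions of log P(residue_i | sequence with i masked)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for i in range(len(sequence)):
        total += masked_log_prob(sequence, i)
    return total / len(sequence)

def toy_scorer(sequence: str, position: int) -> float:
    # Hypothetical model: a uniform distribution over 20 amino acids.
    return math.log(1 / 20)

score = pseudo_log_likelihood("MKTAYIAK", toy_scorer)
print(round(score, 3))  # -2.996
```

A model that has learned nothing scores about -3.0 per residue under this definition, so values closer to zero indicate residues the model finds more predictable.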

Decision tree

Should I use ESM2?
│
├─ What do you need?
│  ├─ Sequence plausibility score → ESM2 PLL ✓
│  ├─ Embeddings for clustering → ESM2 ✓
│  ├─ Variant effect prediction → ESM2 ✓
│  └─ Structure prediction → Use ESMFold
│
├─ What model size?
│  ├─ Fast screening → esm2_t12_35M
│  ├─ Standard use → esm2_t33_650M ✓
│  └─ Best quality → esm2_t36_3B
│
└─ Use case?
   ├─ QC filtering → PLL > 0.0 threshold
   ├─ Diversity analysis → Mean-pooled embeddings
   └─ Mutation scanning → Per-position log-odds
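
Mutation scanning compares the model's probability for a mutant residue against the wild-type residue at the same position. Here is a minimal sketch of the log-odds calculation, assuming the per-position probabilities have already been obtained from the model. `log_odds` is a hypothetical helper and the numbers are made up for illustration.

```python
import math

def log_odds(position_probs: dict, wild_type: str, mutant: str) -> float:
    """log P(mutant) - log P(wild type) at one position.

    Positive values mean the model prefers the mutant residue.
    """
    return math.log(position_probs[mutant]) - math.log(position_probs[wild_type])

# Illustrative probabilities for a single position, not real model output.
probs_at_site = {"A": 0.50, "G": 0.25, "V": 0.125, "L": 0.125}

score_a_to_g = log_odds(probs_at_site, wild_type="A", mutant="G")
print(round(score_a_to_g, 3))  # -0.693
```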

PLL interpretation

Normalized PLL	Interpretation
> 0.2	Very natural sequence
0.0 - 0.2	Good, natural-like
-0.5 - 0.0	Acceptable
< -0.5	May be unnatural
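
The table can be applied programmatically when triaging many designs. This is a small sketch, not part of the skill itself. The table does not say which bin the boundary values belong to, so assigning exactly 0.2 and 0.0 to "Good, natural-like" and exactly -0.5 to "Acceptable" is an assumption.

```python
def interpret_normalized_pll(value: float) -> str:
    """Map a normalized PLL to the labels in the interpretation table."""
    if value > 0.2:
        return "Very natural sequence"
    if value >= 0.0:
        return "Good, natural-like"
    if value >= -0.5:
        return "Acceptable"
    return "May be unnatural"

# Normalized PLL values taken from the sample output
for seq_id, score in [("design_0", 0.15), ("design_1", 0.08), ("design_2", -0.12)]:
    print(seq_id, interpret_normalized_pll(score))
```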

Typical performance

Campaign Size	Time (A10G)	Cost (Modal)	Notes
100 sequences	5-10 min	~$1	Quick screen
1000 sequences	30-60 min	~$5	Standard
5000 sequences	2-3h	~$20	Large batch

Throughput: ~100-200 sequences/minute with 650M model.

Verify

```bash
wc -l embeddings/pll_scores.csv  # Should match input + 1 (header)
```
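
The same check can be done in Python by comparing the number of FASTA records with the number of CSV data rows. This sketch uses in-memory toy strings so it runs anywhere. The helper names are hypothetical and the sequences are placeholders, not real designs.

```python
import csv
import io

def count_fasta_records(fasta_text: str) -> int:
    """Count FASTA records by their '>' header lines."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))

def count_csv_rows(csv_text: str) -> int:
    """Count data rows, excluding the header."""
    return sum(1 for _ in csv.DictReader(io.StringIO(csv_text)))

# Toy inputs for illustration only
fasta_text = ">design_0\nMKTAYIAK\n>design_1\nMKVLA\n"
csv_text = (
    "sequence_id,pll,pll_normalized,length\n"
    "design_0,-0.82,0.15,8\n"
    "design_1,-0.95,0.08,5\n"
)

assert count_fasta_records(fasta_text) == count_csv_rows(csv_text)
print("row count matches")
```

For a real run, the two strings would come from reading the input FASTA file and `embeddings/pll_scores.csv`.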

Troubleshooting

OOM errors: Use a smaller model or process sequences in smaller batches.
Slow processing: Use esm2_t12_35M for speed.
Low PLL scores: May indicate unusual or designed sequences.
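
One way to keep memory bounded is to group sequences so that each padded batch stays under a token budget. The following is a sketch under stated assumptions: `batch_by_token_budget` is a hypothetical helper, the cost model (batch size times longest sequence) is a rough proxy for GPU memory, and the budget has to be tuned for the GPU and model in use.

```python
def batch_by_token_budget(records, max_tokens=4096):
    """Group (label, sequence) pairs so each padded batch stays under a budget.

    Padded cost is estimated as batch_size * longest_sequence_in_batch.
    A single sequence longer than the budget still gets a batch of its own.
    """
    batches, current, longest = [], [], 0
    # Sorting by length keeps similar lengths together and limits padding.
    for label, seq in sorted(records, key=lambda r: len(r[1])):
        new_longest = max(longest, len(seq))
        if current and (len(current) + 1) * new_longest > max_tokens:
            batches.append(current)
            current, new_longest = [], len(seq)
        current.append((label, seq))
        longest = new_longest
    if current:
        batches.append(current)
    return batches

records = [("a", "M" * 10), ("b", "M" * 20), ("c", "M" * 300), ("d", "M" * 15)]
batches = batch_by_token_budget(records, max_tokens=100)
print([[label for label, _ in batch] for batch in batches])
# [['a', 'd', 'b'], ['c']]
```

Each inner list can then be passed to the batch converter and model in turn.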

Error interpretation

Error	Cause	Fix
`RuntimeError: CUDA out of memory`	Sequence too long or large batch	Reduce batch size
`KeyError: representation`	Wrong layer requested	Use layer 33 for 650M model
`ValueError: sequence`	Invalid amino acid	Check for non-standard AAs
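
Checking for non-standard amino acids can be done before any GPU time is spent. This is a minimal sketch, assuming the 20 standard one-letter codes are the only accepted characters. That is a stricter rule chosen here for illustration, since the ESM alphabet itself also has tokens for some ambiguous codes such as X. `find_nonstandard` is a hypothetical helper.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def find_nonstandard(sequence: str):
    """Return (position, character) pairs that are not standard amino acids."""
    return [
        (i, ch)
        for i, ch in enumerate(sequence.upper())
        if ch not in STANDARD_AMINO_ACIDS
    ]

print(find_nonstandard("MKTAYIAK"))  # []
print(find_nonstandard("MKXAY*AK"))  # [(2, 'X'), (5, '*')]
```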

Next: Structure prediction with chai or boltz → protein-qc for filtering.
