bulk-inference — for Claude Code

Tags: counterfactual-chart-reasoning, community, ide skills, vllm_local, vllm-serve, openai, OPENAI_API_KEY, gemini

v1.0.0

About this skill

Best suited for: AI agents that need to run bulk inference over an input JSONL file with, at minimum, an image path field, a question/prompt field, and one or more ID fields. Summary: handles JSONL input/output, resume from interruption, and concurrent async requests. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.

Prerequisites

Input JSONL file with at minimum: an image path field, a question/prompt field, and one or more ID fields.
For vllm_local: running vLLM server(s) — use /vllm-serve first.
For openai: OPENAI_API_KEY env var set.
For gemini: GOOGLE_API_KEY env var set.
Once prerequisites are met, the skill gathers run parameters from the user (see Process in the upstream material below).

Author: pminhyung
Updated: 3/14/2026

Killer-Skills Review

Decision support comes first. Repository text comes second.

Reference-Only Page Review Score: 10/11

This page remains useful for teams, but Killer-Skills treats it as reference material instead of a primary organic landing page.

  • Original recommendation layer
  • Concrete use-case guidance
  • Explicit limitations and caution
  • Quality floor passed for review
Review Score: 10/11
Quality Score: 64
Canonical Locale: en
Detected Body Locale: en


Why use this skill

Recommended description: bulk-inference helps agents run inference over an input JSONL file with, at minimum, an image path field, a question/prompt field, and one or more ID fields. It handles JSONL input/output, resume from interruption, and concurrent async requests.

Best suited for

Ideal for AI agents that need to process an input JSONL file with, at minimum, an image path field, a question/prompt field, and one or more ID fields.

Actionable use cases for bulk-inference

Use case: running bulk inference over an input JSONL file with an image path field, a question/prompt field, and one or more ID fields
Use case: running against local vLLM server(s) — use /vllm-serve first
Use case: running against the OpenAI API with the OPENAI_API_KEY env var set

! Security and limitations

  • Limitation: requires repository-specific context from the skill documentation
  • Limitation: works best when the underlying tools and dependencies are already configured

Why this page is reference-only

  • Current locale does not satisfy the locale-governance contract.

Source Boundary

The section below is imported from the upstream repository and should be treated as secondary evidence. Use the Killer-Skills review above as the primary layer for fit, risk, and installation decisions.

After The Review

Decide The Next Action Before You Keep Reading Repository Material

Killer-Skills should not stop at opening repository instructions. It should help you decide whether to install this skill, when to cross-check against trusted collections, and when to move into workflow rollout.


FAQ & Installation Steps

These questions and steps mirror the structured data on this page for better search understanding.

? Frequently Asked Questions

What is bulk-inference?

bulk-inference is an AI agent skill for running bulk VLM inference over an input JSONL file with, at minimum, an image path field, a question/prompt field, and one or more ID fields. It handles JSONL input/output, resume from interruption, and concurrent async requests, and supports Claude Code, Cursor, and Windsurf workflows.

How do I install bulk-inference?

Run the command: npx killer-skills add pminhyung/counterfactual-chart-reasoning/bulk-inference. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for bulk-inference?

Key use cases include: running bulk inference over an input JSONL file with an image path field, a question/prompt field, and one or more ID fields; running against local vLLM server(s) started with /vllm-serve; and running against the OpenAI API with the OPENAI_API_KEY env var set.

Which IDEs are compatible with bulk-inference?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for bulk-inference?

Limitations: it requires repository-specific context from the skill documentation, and it works best when the underlying tools and dependencies are already configured.

How To Install

  1. Open your terminal

    Open the terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add pminhyung/counterfactual-chart-reasoning/bulk-inference. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

    The skill is now active. Your AI agent can use bulk-inference immediately in the current project.
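
For copy-paste convenience, here is the install command from step 2, verbatim from this page:

bash
npx killer-skills add pminhyung/counterfactual-chart-reasoning/bulk-inference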

! Reference-Only Mode

This page remains useful for installation and reference, but Killer-Skills no longer treats it as a primary indexable landing page. Read the review above before relying on the upstream repository instructions.

Upstream Repository Material


Upstream Source

bulk-inference

Handles JSONL input/output, resume from interruption, and concurrent async requests. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.

SKILL.md (read-only)

Bulk Inference

Purpose

Execute bulk VLM inference across multiple providers (vLLM local, OpenAI, Gemini) using scripts/inference_runner.py. Handles JSONL input/output, resume from interruption, and concurrent async requests.

Prerequisites

  • Input JSONL file with at minimum: an image path field, a question/prompt field, and one or more ID fields.
  • For vllm_local: running vLLM server(s) — use /vllm-serve first.
  • For openai: OPENAI_API_KEY env var set.
  • For gemini: GOOGLE_API_KEY env var set.
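
These requirements can be verified up front, before launching a run (they correspond to step 2 of the Process below). The following is a minimal pre-flight sketch in Python; the preflight helper is illustrative and not part of the skill, and it assumes the /health route that a vLLM OpenAI-compatible server normally exposes:

python
import os
import sys
import urllib.request

def preflight(provider: str, endpoints: list[str], input_path: str) -> None:
    # The input JSONL must exist and be readable.
    if not os.path.isfile(input_path):
        sys.exit(f"input JSONL not found: {input_path}")
    # Provider-specific requirements from the Prerequisites list.
    if provider == "openai" and not os.environ.get("OPENAI_API_KEY"):
        sys.exit("OPENAI_API_KEY env var is not set")
    if provider == "gemini" and not os.environ.get("GOOGLE_API_KEY"):
        sys.exit("GOOGLE_API_KEY env var is not set")
    if provider == "vllm_local":
        for url in endpoints:
            # Any 2xx response means the server is up; urlopen raises otherwise.
            with urllib.request.urlopen(f"{url.rstrip('/')}/health", timeout=5):
                pass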

Process

  1. Gather parameters from user:

    • --provider: vllm_local, openai, or gemini
    • --endpoints: server URLs (vllm_local) or API base URL
    • --model-id: HF model name or API model ID
    • --input: path to input JSONL
    • --output: path for output JSONL
    • --n-concurrent: requests per endpoint (vllm) or total (API), default 6 — see the concurrency sketch after the Process steps
    • --max-tokens: default 100
    • --temperature: default 0.0
    • Optional: --api-key-env, --reasoning-effort, --thinking-budget, --rate-limit-delay
    • Optional: --image-field, --question-field, --id-fields, --prompt-template
  2. Validate inputs — Confirm input JSONL exists and is readable. Check provider-specific requirements (API keys, server health).

  3. Run inference:

    bash
    python scripts/inference_runner.py \
      --provider {provider} \
      --endpoints {urls} \
      --model-id {model_id} \
      --input {input_jsonl} \
      --output {output_jsonl} \
      --n-concurrent {n} \
      --max-tokens {max_tokens} \
      --temperature {temp} \
      [--api-key-env {env_var}] \
      [--reasoning-effort {effort}] \
      [--thinking-budget {budget}] \
      [--rate-limit-delay {delay}] \
      [--no-resume] \
      [--image-field {field}] \
      [--question-field {field}] \
      [--id-fields {f1},{f2}] \
      [--prompt-template "Answer the question..."]
  4. Monitor output — The script prints a tqdm progress bar and final summary with total, success, errors, and throughput.

  5. Report results — After completion, report: output file path, total processed, success rate, error count.
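
As a rough illustration of how per-endpoint concurrency like --n-concurrent can be implemented with asyncio, here is a sketch; run_bulk and send_request are hypothetical names, not the runner's actual internals:

python
import asyncio

async def run_bulk(items, endpoints, n_concurrent, send_request):
    # One semaphore per endpoint caps in-flight requests at n_concurrent each.
    sems = {url: asyncio.Semaphore(n_concurrent) for url in endpoints}

    async def worker(item, url):
        async with sems[url]:
            return await send_request(url, item)

    # Round-robin items across endpoints; return_exceptions keeps one failed
    # item from aborting the whole run, matching the per-item error rule below.
    tasks = [asyncio.create_task(worker(item, endpoints[i % len(endpoints)]))
             for i, item in enumerate(items)]
    return await asyncio.gather(*tasks, return_exceptions=True)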

Input JSONL Format

Each line is a JSON object. Required fields are configurable via --image-field, --question-field, --id-fields. Defaults:

  • image_path — path to image file
  • question_string — prompt/question text
  • triplet_id, condition — composite ID for resume
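
For example, one input line using the default field names might look like this (the values are illustrative):

json
{"image_path": "charts/bar_0001.png", "question_string": "Which bar is tallest?", "triplet_id": "t-0001", "condition": "original"}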

Output JSONL Format

Each output line preserves ALL original input fields plus:

json
{"...original fields...", "model": "...", "raw_response": "...", "parsed_answer": "...", "error": null}

Rules

  • Resume is ON by default — interrupted runs continue from where they stopped.
  • Never modify the input JSONL file.
  • Append mode: output JSONL is opened in append mode, one line per completed item.
  • All errors are captured per-item; the runner does not abort on individual failures.
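
The resume rule implies that, before dispatching work, the runner can scan the existing output JSONL for composite IDs that are already done and skip those items. A minimal sketch of that idea (function names are illustrative, not the runner's actual internals):

python
import json

def completed_ids(output_path, id_fields):
    # Collect composite IDs already written to the output JSONL.
    done = set()
    try:
        with open(output_path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                done.add(tuple(rec[field] for field in id_fields))
    except FileNotFoundError:
        pass  # no previous run: nothing to skip
    return done

def pending_items(input_path, output_path, id_fields):
    # Yield input records whose composite ID is not yet in the output.
    done = completed_ids(output_path, id_fields)
    with open(input_path, encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            if tuple(rec[field] for field in id_fields) not in done:
                yield rec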

Related skills

Looking for an alternative to bulk-inference or another community skill for your workflow? Explore these related open-source skills.

View all

openclaw-release-maintainer

openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

333.8k
Artificial Intelligence

widget-generator

f

Generates customizable widget plugins for the prompts.chat feed system

149.6k
Artificial Intelligence

flags

vercel

React framework

138.4k
Browser

pr-review

pytorch

Tensors and dynamic neural networks in Python with strong GPU acceleration

98.6k
Developer