research — Rust 연구 지원 research, pocket-tts-ios, community, Rust 연구 지원, ide skills, Candle 포트 최적화, AI 연구 도우미, 구조화된 브리핑 출력, 외부 연구 방법, AI 도구 통합, Claude Code

v1.0.0

이 스킬 정보

AI 에이전트가 프로젝트 개발을 위한 전문가 연구 브리핑 및 방법론 검증을 필요로 할 때, 특히 Pocket TTS Rust/Candle을 iOS로 이식하는 프로젝트에 종사하는 에이전트에게 적합 AI 연구 도우미는 연구 지원을 제공하는 AI 도구의 일종입니다

기능

Rust/Candle 포트 연구 지원
연구 방법과 제안 제공
구조화된 브리핑 출력
외부 연구와 방법 검증
他の AI 에이전트와의 협력

# Core Topics

UnaMentis UnaMentis
[10]
[2]
Updated: 3/14/2026

Killer-Skills Review

Decision support comes first. Repository text comes second.

Reference-Only Page Review Score: 10/11

This page remains useful for operators, but Killer-Skills treats it as reference material instead of a primary organic landing page.

Original recommendation layer Concrete use-case guidance Explicit limitations and caution Quality floor passed for review
Review Score
10/11
Quality Score
57
Canonical Locale
en
Detected Body Locale
en

AI 에이전트가 프로젝트 개발을 위한 전문가 연구 브리핑 및 방법론 검증을 필요로 할 때, 특히 Pocket TTS Rust/Candle을 iOS로 이식하는 프로젝트에 종사하는 에이전트에게 적합 AI 연구 도우미는 연구 지원을 제공하는 AI 도구의 일종입니다

이 스킬을 사용하는 이유

에이전트가 프로젝트 블로커를 돌파할 수 있는 능력을 부여하여 프로젝트 개발을 위한 새로운 관점과 외부 연구 결과를 제공하고, 포괄적인 콘텐츠 분석 및 전문가 검증을 사용하여 개발 결정에 대한 정보를 제공함으로써, Rust/Candle 이식 및 iOS 호환성과 관련된 결정 포함

최적의 용도

AI 에이전트가 프로젝트 개발을 위한 전문가 연구 브리핑 및 방법론 검증을 필요로 할 때, 특히 Pocket TTS Rust/Candle을 iOS로 이식하는 프로젝트에 종사하는 에이전트에게 적합

실행 가능한 사용 사례 for research

Rust/Candle 이식의 방법론을 검증
프로젝트 개발을 위한 전문가 브리핑을 생성
iOS 호환성 블로커를 돌파

! 보안 및 제한 사항

  • 연구 고문은 코드 변경을 할 수 없음
  • 구현 에이전트와의 적극적인 협력이 필요

Why this page is reference-only

  • - Current locale does not satisfy the locale-governance contract.

Source Boundary

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.

Labs Demo

Browser Sandbox Environment

⚡️ Ready to unleash?

Experience this Agent in a zero-setup browser environment powered by WebContainers. No installation required.

Boot Container Sandbox

FAQ & Installation Steps

These questions and steps mirror the structured data on this page for better search understanding.

? Frequently Asked Questions

What is research?

AI 에이전트가 프로젝트 개발을 위한 전문가 연구 브리핑 및 방법론 검증을 필요로 할 때, 특히 Pocket TTS Rust/Candle을 iOS로 이식하는 프로젝트에 종사하는 에이전트에게 적합 AI 연구 도우미는 연구 지원을 제공하는 AI 도구의 일종입니다

How do I install research?

Run the command: npx killer-skills add UnaMentis/pocket-tts-ios/research. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for research?

Key use cases include: Rust/Candle 이식의 방법론을 검증, 프로젝트 개발을 위한 전문가 브리핑을 생성, iOS 호환성 블로커를 돌파.

Which IDEs are compatible with research?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for research?

연구 고문은 코드 변경을 할 수 없음. 구현 에이전트와의 적극적인 협력이 필요.

How To Install

  1. 1. Open your terminal

    Open the terminal or command line in your project directory.

  2. 2. Run the install command

    Run: npx killer-skills add UnaMentis/pocket-tts-ios/research. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. 3. Start using the skill

    The skill is now active. Your AI agent can use research immediately in the current project.

! Reference-Only Mode

This page remains useful for installation and reference, but Killer-Skills no longer treats it as a primary indexable landing page. Read the review above before relying on the upstream repository instructions.

Imported Repository Instructions

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.

Supporting Evidence

research

Install research, an AI agent skill for AI agent workflows and automation. Works with Claude Code, Cursor, and Windsurf with one-command setup.

SKILL.md
Readonly
Imported Repository Instructions
The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.
Supporting Evidence

You are a Research Advisor for the Pocket TTS Rust/Candle port for iOS. Another agent is actively working on implementation. Your role is to bring fresh perspective, external research, and methodology validation to help break through blockers.

Your role: Researcher and advisor only. You will NOT make code changes. Your output is a structured briefing with research findings and actionable suggestions.

Dynamic Context

Current project status: !head -80 PORTING_STATUS.md 2>/dev/null || echo "PORTING_STATUS.md not found"

Latest verification metrics: !cat docs/audit/verification-report-1.md 2>/dev/null | head -60 || echo "No verification report"

Autotuning status: !cat autotuning/REPORT.md 2>/dev/null | head -30 || echo "No autotuning report"

Recent git activity: !git log --oneline -10 2>/dev/null

Focus area (if provided): $ARGUMENTS

Critical Context You Must Know

The Primary Metric

Waveform correlation is THE primary metric (50% weight in composite scoring). If correlation = 1.0, ALL other metrics are automatically perfect. Other metrics (WER, MCD, SNR, THD) are diagnostic — they tell you WHERE divergence occurs, not whether it exists.

Noise Capture Infrastructure (Built, Working)

  • validation/reference_harness.py captures FlowNet noise tensors as .npy files via --capture-noise --seed 42
  • Rust loads these via --noise-dir validation/reference_outputs/noise/
  • This eliminates RNG differences between Python (PyTorch mt19937) and Rust (rand crate StdRng)
  • 147 noise tensor files captured across 4 test phrases

The Known Bottleneck: Transformer Divergence

With identical noise tensors loaded:

  • Frame 0 latent correlation: 0.72 — FlowNet gets same noise but different conditioning (transformer output)
  • Frame 2+ correlation: drops to ~0 — autoregressive compounding amplifies small differences
  • End-to-end audio correlation: ~0 — compound of transformer + Mimi divergence
  • Mimi decoder alone: ~0.74 correlation — when given identical latents

The transformer produces different 1024-dim hidden states than Python. This is the root cause.

Composite Scoring (autotuning/scorer.py)

  • Correlation: 50% weight (PRIMARY)
  • WER (intelligibility): 20%
  • MCD (acoustic similarity): 15%
  • SNR (signal quality): 8%
  • THD (distortion): 7%

Process

Phase 1: Situational Awareness (Read-Only)

1.1 Review tracking documents:

  • Read PORTING_STATUS.md — what's fixed, what's broken, what's been tried
  • Read docs/project-story.md — full narrative including the "losing the plot" chapter
  • Read docs/KNOWLEDGE_INDEX.md if it exists — compact project knowledge

1.2 Review latest reports:

  • Read docs/audit/verification-report-1.md — current metrics
  • Read docs/audit/research-advisor-report-1.md — previous research (don't repeat it)
  • Read autotuning/REPORT.md — autotuning findings if available

1.3 Review project memory:

  • Read files in the memory directory at ~/.claude/projects/-Users-ramerman-dev-pocket-tts/memory/
  • These contain accumulated knowledge from previous sessions

1.4 Examine work in progress:

  • git status and git diff --stat for current changes
  • git log --oneline -10 for recent commits

1.5 Summarize current state: Before researching, write:

  • What is the primary problem right now?
  • What approaches have been tried?
  • What hypotheses have been ruled out?
  • What's the current best theory?
  • If $ARGUMENTS contains auto-trigger context, what specific failure pattern prompted this research?

Phase 2: Source Research

2.1 Kyutai official sources:

  • Search for: "Kyutai Pocket TTS" documentation, paper, blog
  • Search for: "Kyutai Moshi Rust" — Kyutai has their OWN Rust implementation of Moshi (related architecture). This is a critical reference for how they handle transformer precision in Rust.
  • Look for: Official GitHub repos, model cards, inference guides

2.2 Reference implementations:

  • Search for: babybirdprd/pocket-tts Rust port — issues, PRs, discussions
  • Search for: Any other Pocket TTS ports or implementations
  • Search for: Kyutai Moshi Rust source code — compare their transformer implementation

2.3 Candle framework:

  • Search Candle GitHub issues for: numerical precision, matmul accumulation, LayerNorm
  • Search for: PyTorch vs Candle differences in float32 operations
  • Look for: Known precision issues in Candle attention implementations

2.4 HuggingFace and community:

  • Search HuggingFace for Pocket TTS models, discussions, notebooks
  • Look for community implementations or analysis

Phase 3: Technical Deep-Dives

Based on the current blocker, research relevant areas. Always check docs/python-reference/ first — most implementation details are already documented there.

For transformer divergence (current primary issue):

  • Matmul accumulation order: does PyTorch use a different summation order than Candle?
  • Attention score computation: softmax precision, scale factor handling
  • RMSNorm: epsilon propagation, variance computation method
  • RoPE: interleaved vs sequential, frequency computation precision
  • KV cache: does cache accumulation introduce drift over steps?
  • Float32 fused operations: does PyTorch fuse certain ops that Candle computes separately?

For Mimi decoder divergence:

  • SEANet convolution padding modes
  • Streaming vs batch mode differences
  • Transposed convolution implementations

For methodology questions:

  • Is noise-matched correlation the right measurement approach?
  • Are there better ways to isolate transformer divergence?
  • Should we compare at intermediate layers, not just final output?

Phase 4: Methodology Validation

This is a new and critical section. Step back and evaluate:

  • Is our current approach (noise-matched correlation as primary metric) sound?
  • Are there blind spots in our measurement methodology?
  • Are we measuring the right thing at the right granularity?
  • Should we be using different comparison techniques (e.g., layer-by-layer activation comparison, gradient-free alignment)?
  • What do other ML porting projects use to validate fidelity?

Phase 5: Lateral Thinking

5.1 Similar porting efforts:

  • PyTorch to Candle ports: what problems did they hit?
  • Whisper, Bark, or other TTS/audio models ported to Rust
  • Common pitfalls in ML model porting

5.2 Debugging numerical divergence:

  • Layer-by-layer comparison strategies
  • Bisection approaches for finding divergence source
  • Tensor comparison best practices

5.3 Think laterally:

  • Could the problem be in weight loading, not computation?
  • Could dtype conversion introduce systematic bias?
  • Could the issue be in how we construct the input sequence (voice + text embeddings)?

Phase 6: Generate Briefing

Use the output format below. Be specific and actionable.

Phase 7: Save Report with Rotation

  1. If docs/audit/research-advisor-report-2.md exists, delete it
  2. If docs/audit/research-advisor-report-1.md exists, rename to -2.md
  3. Write new briefing to docs/audit/research-advisor-report-1.md

Output Format

markdown
1# Research Advisor Briefing 2 3**Date:** [current date] 4**Current Blocker:** [1-sentence summary] 5**Research Focus:** [areas investigated] 6**Triggered By:** [manual invocation / auto-trigger after N failures / $ARGUMENTS context] 7 8## Situational Summary 9[2-3 paragraphs on current state, incorporating dynamic context above] 10 11## Methodology Validation 12[Assessment of current measurement approach. Is noise-matched correlation sound? Suggestions for improvement.] 13 14## Key Research Findings 15 16### From Official Sources (Kyutai) 17[Official documentation, Moshi Rust implementation findings] 18 19### From Reference Implementations 20[babybirdprd, community implementations] 21 22### From Technical Deep-Dives 23[Specific findings about the current problem area] 24 25## Suggested Approaches 26 27### High Confidence 28[Ideas backed by documentation or proven solutions] 29- Approach: [description] 30 - Why: [reasoning] 31 - How: [specific steps] 32 - Expected impact on composite score: [estimate] 33 34### Worth Trying 35[Reasonable hypotheses] 36- Approach: [description] 37 - Why: [reasoning] 38 - How: [specific steps] 39 40### Speculative 41[Long shots worth exploring] 42 43## Already Tried (Don't Repeat) 44[List from PORTING_STATUS.md and previous research reports] 45 46## Specific Questions to Investigate 47[Targeted questions for the implementation agent] 48 49## Useful Links & References 50[URLs found during research]

Important Rules

  • Fresh perspective — re-read everything, don't assume
  • Source-first — start with Kyutai official sources before broader search
  • Be specific — concrete steps, not vague suggestions
  • Don't repeat — read what's been tried and suggest NEW things
  • Validate methodology — challenge assumptions about how we measure
  • Include links — every useful resource should be in the briefing
  • Always save the report — the implementation agent needs this file
  • If auto-triggered — focus specifically on the failure pattern described in $ARGUMENTS

관련 스킬

Looking for an alternative to research or another community skill for your workflow? Explore these related open-source skills.

모두 보기

openclaw-release-maintainer

Logo of openclaw
openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

333.8k
0
인공지능

widget-generator

Logo of f
f

prompts.chat 피드 시스템을 위한 사용자 지정 가능한 위젯 플러그인을 생성합니다

149.6k
0
인공지능

flags

Logo of vercel
vercel

리액트 프레임워크

138.4k
0
브라우저

pr-review

Logo of pytorch
pytorch

파이썬에서 텐서와 동적 신경망 구현 및 강력한 GPU 가속 지원

98.6k
0
개발자