Category: provider
# Model Studio Qwen TTS
## Validation

```bash
mkdir -p output/alicloud-ai-audio-tts
python -m py_compile skills/ai/audio/alicloud-ai-audio-tts/scripts/generate_tts.py && echo "py_compile_ok" > output/alicloud-ai-audio-tts/validate.txt
```

Pass criteria: the command exits 0 and `output/alicloud-ai-audio-tts/validate.txt` is generated.
## Output And Evidence

- Save generated audio links, sample audio files, and request payloads to `output/alicloud-ai-audio-tts/`.
- Keep one validation log per execution.
## Critical model names

Use one of the recommended models:

- `qwen3-tts-flash`
- `qwen3-tts-instruct-flash`
- `qwen3-tts-instruct-flash-2026-01-26`
## Prerequisites

- Install the SDK (a venv is recommended to avoid PEP 668 restrictions):

```bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
```

- Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials` (the environment variable takes precedence).
## Normalized interface (tts.generate)

### Request

- `text` (string, required)
- `voice` (string, required)
- `language_type` (string, optional; default `Auto`)
- `instruction` (string, optional; recommended for instruct models)
- `stream` (bool, optional; default `false`)

### Response

- `audio_url` (string, when `stream=false`)
- `audio_base64_pcm` (string, when `stream=true`)
- `sample_rate` (int, 24000)
- `format` (string, `wav` or `pcm` depending on mode)
## Quick start (Python + DashScope SDK)

```python
import os
import dashscope

# Prefer env var for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].
# Beijing region; for Singapore use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    instruction="Warm and calm tone, slightly slower pace.",
    stream=False,
)

audio_url = response.output.audio.url
print(audio_url)
```
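To keep evidence in the skill's output directory, the returned `audio_url` can be downloaded and saved under `OUTPUT_DIR` (default `output/alicloud-ai-audio-tts`). This is a minimal sketch; the helper name `save_audio` is an assumption, not part of the SDK, and the audio URLs are typically time-limited, so download promptly.

```python
import os
import urllib.request
from pathlib import Path


def save_audio(audio_url: str, filename: str = "tts.wav") -> Path:
    """Download a generated audio URL into the skill's output directory."""
    base = Path(os.getenv("OUTPUT_DIR", "output/alicloud-ai-audio-tts"))
    out_dir = base / "audio"
    out_dir.mkdir(parents=True, exist_ok=True)
    dest = out_dir / filename
    urllib.request.urlretrieve(audio_url, dest)
    return dest
```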
## Streaming notes

- `stream=True` returns Base64-encoded PCM chunks at 24 kHz.
- Decode the chunks and play them, or concatenate them into a PCM buffer.
- The response contains `finish_reason == "stop"` when the stream ends.
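The decode-and-concatenate step above can be sketched with the standard library: collect the Base64 chunks, decode them into one PCM buffer, and wrap it in a WAV header at the documented 24 kHz rate. The 16-bit mono sample layout is an assumption here; verify it against the actual stream before reuse.

```python
import base64
import wave


def pcm_chunks_to_wav(b64_chunks: list[str], path: str,
                      sample_rate: int = 24000) -> None:
    """Decode Base64 PCM chunks and wrap them in a WAV container."""
    pcm = b"".join(base64.b64decode(chunk) for chunk in b64_chunks)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # assumption: mono
        wav.setsampwidth(2)   # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```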
## Operational guidance

- Keep requests concise; split long text into multiple calls if you hit size or timeout errors.
- Use a `language_type` consistent with the text to improve pronunciation.
- Use `instruction` only when you need explicit style/tone control.
- Cache by `(text, voice, language_type)` to avoid repeat costs.
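The caching suggestion above needs a stable key over the request fields that determine the audio. One sketch, hashing the tuple with a separator that cannot appear in normal text (the function name is illustrative):

```python
import hashlib


def tts_cache_key(text: str, voice: str, language_type: str = "Auto") -> str:
    """Stable cache key over (text, voice, language_type)."""
    # \x1f (unit separator) keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join((text, voice, language_type))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Use the key as a filename under the output directory and skip the API call when the file already exists.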
## Output location

- Default output: `output/alicloud-ai-audio-tts/audio/`
- Override the base directory with `OUTPUT_DIR`.
## Workflow
- Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
- Run one minimal read-only query first to verify connectivity and permissions.
- Execute the target operation with explicit parameters and bounded scope.
- Verify results and save output/evidence files.
## References

- `references/api_reference.md` for parameter mapping and a streaming example.
- Realtime mode is provided by `skills/ai/audio/alicloud-ai-audio-tts-realtime/`.
- Voice cloning/design are provided by `skills/ai/audio/alicloud-ai-audio-tts-voice-clone/` and `skills/ai/audio/alicloud-ai-audio-tts-voice-design/`.
- Source list: `references/sources.md`