alicloud-ai-audio-tts

v1.0.0

About this Skill

Suited to voice assistant agents that need text-to-speech audio generation through Alibaba Cloud's AI services. alicloud-ai-audio-tts is a Python-based skill that generates speech audio with Alibaba Cloud Model Studio Qwen TTS models; its generation script is validated with py_compile.

Features

Compiles TTS scripts using py_compile
Validates script compilation with output file generation
Generates text-to-speech audio files using Alibaba Cloud's AI services
Saves audio links, sample audio files, and request payloads to output directory
Utilizes bash commands for script execution and validation
Supports Python-based development for custom TTS applications

Author: cinience
Updated: 3/6/2026

Installation

npx killer-skills add cinience/alicloud-skills/alicloud-ai-audio-tts

Agent Capability Analysis

Core Value

Lets agents generate speech audio from Python through Alibaba Cloud's DashScope SDK, returning either a URL to the generated audio (non-streaming) or Base64-encoded PCM chunks (streaming).

Capabilities Granted for alicloud-ai-audio-tts

Generating audio responses for voice assistants
Creating text-to-speech audio for e-learning platforms
Automating audio content creation for podcasts

Prerequisites & Limits

  • Requires Alibaba Cloud account and AI credentials
  • Python 3.x compatibility required
  • Dependent on Alibaba Cloud's AI service availability

Category: provider

Model Studio Qwen TTS

Validation

bash
mkdir -p output/alicloud-ai-audio-tts
python -m py_compile skills/ai/audio/alicloud-ai-audio-tts/scripts/generate_tts.py && echo "py_compile_ok" > output/alicloud-ai-audio-tts/validate.txt

Pass criteria: command exits 0 and output/alicloud-ai-audio-tts/validate.txt is generated.
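
The same check can be run from Python when a shell is not convenient. The following is a minimal sketch, not part of the skill itself: `validate_script` is a hypothetical helper name, and it mirrors the shell command by writing `validate.txt` only when compilation succeeds.

```python
import py_compile
import sys
from pathlib import Path


def validate_script(script: Path, out_dir: Path) -> bool:
    """Byte-compile `script`; write validate.txt only if compilation succeeds.

    Hypothetical helper that mirrors the shell validation command.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    try:
        # doraise=True raises PyCompileError on a syntax error instead of
        # only printing a message; a missing file raises OSError.
        py_compile.compile(str(script), doraise=True)
    except (py_compile.PyCompileError, OSError) as exc:
        print(f"py_compile failed: {exc}", file=sys.stderr)
        return False
    (out_dir / "validate.txt").write_text("py_compile_ok\n", encoding="utf-8")
    return True
```

Calling `validate_script(Path("skills/ai/audio/alicloud-ai-audio-tts/scripts/generate_tts.py"), Path("output/alicloud-ai-audio-tts"))` and exiting with 0 only when it returns `True` would match the pass criteria above.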

Output And Evidence

  • Save generated audio links, sample audio files, and request payloads to output/alicloud-ai-audio-tts/.
  • Keep one validation log per execution.
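
One way to keep a per-execution record is to write a small JSON file for each run. This is a sketch under stated assumptions: the helper name `save_evidence`, the `run-<timestamp>.json` filename, and the record layout are illustrative and not defined by the skill.

```python
import json
import time
from pathlib import Path
from typing import Optional


def save_evidence(out_dir: Path, payload: dict, audio_url: Optional[str] = None) -> Path:
    """Write one JSON evidence file for this execution and return its path.

    Hypothetical helper; the record layout is an assumption.
    Do not put the API key into `payload`.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    # A timestamp in the filename keeps runs apart; two runs within the
    # same second would still overwrite each other.
    stamp = time.strftime("%Y%m%dT%H%M%S")
    record = {"timestamp": stamp, "request": payload, "audio_url": audio_url}
    path = out_dir / f"run-{stamp}.json"
    path.write_text(json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```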

Critical model names

Use one of the recommended models:

  • qwen3-tts-flash
  • qwen3-tts-instruct-flash
  • qwen3-tts-instruct-flash-2026-01-26

Prerequisites

  • Install SDK (recommended in a venv to avoid PEP 668 limits):
bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
  • Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials (env takes precedence).
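
The precedence rule (environment first, then the credentials file) might be implemented roughly as follows. This sketch assumes the credentials file is INI-formatted with the key under a `[default]` section; `resolve_api_key` is a hypothetical helper name.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the DashScope API key, preferring the environment variable.

    Hypothetical helper; assumes an INI-style credentials file.
    """
    env_key = os.environ.get("DASHSCOPE_API_KEY")
    if env_key:
        return env_key
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(path, encoding="utf-8")  # a missing file is silently skipped
    # fallback=None covers both a missing section and a missing option.
    return parser.get("default", "dashscope_api_key", fallback=None)
```

Returning `None` rather than raising lets the caller decide whether a missing key is fatal.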

Normalized interface (tts.generate)

Request

  • text (string, required)
  • voice (string, required)
  • language_type (string, optional; default Auto)
  • instruction (string, optional; recommended for instruct models)
  • stream (bool, optional; default false)

Response

  • audio_url (string, when stream=false)
  • audio_base64_pcm (string, when stream=true)
  • sample_rate (int, 24000)
  • format (string, wav or pcm depending on mode)
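
The normalized interface can be written down as two dataclasses. This is an illustrative sketch only: the class names `TtsRequest` and `TtsResponse` and the `to_kwargs` method are assumptions, and the comment about which mode yields which format is inferred from the field list above rather than stated by the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TtsRequest:
    """Request fields of the normalized tts.generate interface (hypothetical class)."""
    text: str
    voice: str
    language_type: str = "Auto"
    instruction: Optional[str] = None  # recommended for instruct models
    stream: bool = False

    def __post_init__(self) -> None:
        # text and voice are the two required fields.
        if not self.text.strip():
            raise ValueError("text is required")
        if not self.voice.strip():
            raise ValueError("voice is required")

    def to_kwargs(self) -> dict:
        """Build keyword arguments, omitting instruction when it is unset."""
        kwargs = {
            "text": self.text,
            "voice": self.voice,
            "language_type": self.language_type,
            "stream": self.stream,
        }
        if self.instruction:
            kwargs["instruction"] = self.instruction
        return kwargs


@dataclass
class TtsResponse:
    """Response fields of the normalized interface (hypothetical class)."""
    audio_url: Optional[str] = None         # set when stream=False
    audio_base64_pcm: Optional[str] = None  # set when stream=True
    sample_rate: int = 24000
    format: str = "wav"                     # "wav" or "pcm" depending on mode
```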

Quick start (Python + DashScope SDK)

python
import os
import dashscope

# Prefer env var for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].
# Beijing region; for Singapore use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    instruction="Warm and calm tone, slightly slower pace.",
    stream=False,
)

audio_url = response.output.audio.url
print(audio_url)
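
The quick start only prints the audio URL. To keep a sample audio file as evidence, the URL can be fetched with the standard library. A minimal sketch, assuming the URL is directly downloadable without extra headers; `download_audio` is a hypothetical helper name.

```python
import shutil
import urllib.request
from pathlib import Path


def download_audio(audio_url: str, dest: Path, timeout: float = 30.0) -> Path:
    """Stream the audio at `audio_url` into `dest` and return `dest`.

    Hypothetical helper; assumes a plain GET on the URL is sufficient.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(audio_url, timeout=timeout) as resp, open(dest, "wb") as fh:
        # copyfileobj streams in blocks, so large files are not held in memory.
        shutil.copyfileobj(resp, fh)
    return dest
```

For example, `download_audio(audio_url, Path("output/alicloud-ai-audio-tts/audio/line.wav"))` would store the result under the default output location.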

Streaming notes

  • stream=True returns Base64-encoded PCM chunks at 24 kHz.
  • Decode each chunk, then either play it immediately or append it to a PCM buffer.
  • The response contains finish_reason == "stop" when the stream ends.
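
Decoding and concatenating the chunks could look like the sketch below, which also wraps the raw PCM in a WAV header so ordinary players can open it. The 24000 Hz rate comes from the notes above; mono, 16-bit samples are assumptions, and how the Base64 string is read from each SDK chunk is left out because the source does not show it. `pcm_chunks_to_wav` is a hypothetical helper name.

```python
import base64
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks_b64: Iterable[str], sample_rate: int = 24000) -> bytes:
    """Decode Base64 PCM chunks, concatenate them, and wrap them in a WAV container."""
    pcm = b"".join(base64.b64decode(chunk) for chunk in chunks_b64)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # assumption: mono
        wav.setsampwidth(2)   # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

Each chunk is decoded on its own before joining, which is only correct if every chunk is a complete Base64 string.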

Operational guidance

  • Keep requests concise; split long text into multiple calls if you hit size or timeout errors.
  • Use language_type consistent with the text to improve pronunciation.
  • Use instruction only when you need explicit style/tone control.
  • Cache by (text, voice, language_type) to avoid repeat costs.
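
Two of these points lend themselves to small helpers: splitting long text and building a cache key. The sketch below is illustrative; `split_text`, `cache_key`, and the 300-character default are assumptions, not limits documented by the service. The splitter only handles sentences separated by whitespace after `.`, `!`, or `?`.

```python
import hashlib
import re
from typing import List


def split_text(text: str, max_chars: int = 300) -> List[str]:
    """Greedily pack whole sentences into pieces of at most max_chars.

    A single sentence longer than max_chars is kept as one oversized piece.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in filter(None, sentences):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def cache_key(text: str, voice: str, language_type: str = "Auto") -> str:
    """Stable key for the (text, voice, language_type) triple."""
    # The unit-separator character keeps field boundaries unambiguous.
    raw = "\x1f".join((text, voice, language_type))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

If `instruction` is used, it probably belongs in the key as well, since the same text may then produce different audio.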

Output location

  • Default output: output/alicloud-ai-audio-tts/audio/
  • Override base dir with OUTPUT_DIR.
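
Resolving the output directory might look like this. The sketch assumes that OUTPUT_DIR replaces the leading `output` component of the default path; the source does not say exactly which part of the path it overrides, and `audio_output_dir` is a hypothetical helper name.

```python
import os
from pathlib import Path


def audio_output_dir(create: bool = False) -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR if set.

    Assumption: OUTPUT_DIR stands in for the default base directory "output".
    """
    base = Path(os.environ.get("OUTPUT_DIR", "output"))
    path = base / "alicloud-ai-audio-tts" / "audio"
    if create:
        path.mkdir(parents=True, exist_ok=True)
    return path
```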

Workflow

  1. Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
  2. Run one minimal read-only query first to verify connectivity and permissions.
  3. Execute the target operation with explicit parameters and bounded scope.
  4. Verify results and save output/evidence files.

References

  • references/api_reference.md for parameter mapping and streaming example.

  • Realtime mode is provided by skills/ai/audio/alicloud-ai-audio-tts-realtime/.

  • Voice cloning/design are provided by skills/ai/audio/alicloud-ai-audio-tts-voice-clone/ and skills/ai/audio/alicloud-ai-audio-tts-voice-design/.

  • Source list: references/sources.md
