search-docs
Search local documentation in the docs/ folder for content related to a topic
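As an illustration of what such a search could look like, here is a minimal sketch, not the skill's actual implementation. The function name `search_docs`, the default `docs` directory, and the list of file suffixes are all assumptions made for this example; it simply does a case-insensitive substring match over text files.

```python
from pathlib import Path


def search_docs(topic, docs_dir="docs", suffixes=(".md", ".txt", ".rst")):
    """Return (path, line_number, line) for each line that mentions the topic.

    Hypothetical helper: the name, defaults, and suffix list are assumptions.
    """
    needle = topic.casefold()
    hits = []
    # Walk the documentation folder recursively, in a stable order.
    for path in sorted(Path(docs_dir).rglob("*")):
        # Skip directories and files that are not plain-text documentation.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            # Case-insensitive substring match on each line.
            if needle in line.casefold():
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    # Example: list every line under docs/ that mentions "streaming".
    for path, number, line in search_docs("streaming"):
        print(f"{path}:{number}: {line}")
```

A plain substring match keeps the sketch dependency-free; a real skill might rank results or match on headings instead.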
Pluggable sample-level metadata versioning for incremental multimodal pipelines.
The multimodal biomedical data atlas builder
Unified multimodal I/O with Gemini 3. Handle text/image/audio/video inputs, generate images (Imagen/Gemini), videos (Veo). Supports media resolution control, thinking levels, streaming.
Open Source framework for voice and multimodal conversational AI
Transformers.js lets you run state-of-the-art machine learning models directly in JavaScript. Supports NLP, computer vision, and audio tasks.
Use Transformers.js to run state-of-the-art machine learning models directly in JavaScript/TypeScript. Supports NLP (text classification, translation, summarization), computer vision (image classification, object detection), audio (speech recognition, audio classification), and multimodal tasks. Works in Node.js and browsers (with WebGPU/WASM) using pre-trained models from Hugging Face Hub.
DART contribution workflow - branching, PRs, code review, dual-PR for bugfixes
A TypeScript library for connecting videos in your Mux account to multi-modal LLMs.
A SurrealDB expert is a developer with advanced skills in multi-model databases and SurrealQL.
Multi-modal WhatsApp AI assistant that turns voice notes, images, links, and text into structured knowledge articles using a Claude agent.
ffmpeg is a multimedia framework that provides command-line tools and libraries for recording, converting, and streaming audio and video.