Killer-Skills

google-gemini-embeddings — setup and usage guide for the gemini-embedding-001 model

v1.0.0
GitHub

About this Skill

google-gemini-embeddings is a skill for generating text embeddings with the gemini-embedding-001 model. It is perfect for NLP agents needing advanced text embedding capabilities, covering SDK usage, REST API patterns, batch processing, and RAG integration.

Features

Generates text embeddings using the gemini-embedding-001 model
Supports SDK usage and REST API patterns for integration
Enables batch processing and RAG integration with Cloudflare Vectorize
Facilitates advanced use cases like semantic search and document clustering
Includes a quick start guide for rapid development and deployment

Author: ataschz
Updated: 3/6/2026

Quality Score: 54 (Top 5%, Excellent), based on code quality & docs
Installation

```bash
npx killer-skills add ataschz/tanstack-start-mastra-example/google-gemini-embeddings
```

Agent Capability Analysis

The google-gemini-embeddings MCP Server by ataschz is an open-source community integration for Claude and other AI agents, enabling seamless task automation and capability expansion.

Ideal Agent Persona

Perfect for NLP Agents needing advanced text embedding capabilities with the gemini-embedding-001 model

Core Value

Empowers agents to generate text embeddings using the gemini-embedding-001 model, leveraging SDK usage, REST API patterns, and advanced use cases like semantic search and document clustering with Cloudflare Vectorize

Capabilities Granted for google-gemini-embeddings MCP Server

Generating text embeddings for semantic search
Integrating RAG with Cloudflare Vectorize for advanced document clustering
Utilizing batch processing for efficient text analysis

⚠️ Prerequisites & Limits

  • Requires gemini-embedding-001 model access
  • Dependent on Cloudflare Vectorize for RAG integration
Project Files

  • SKILL.md (26.3 KB)
  • .cursorrules (1.2 KB)
  • package.json (240 B)

SKILL.md

Google Gemini Embeddings

A complete, production-ready guide to the Google Gemini embeddings API

This skill provides comprehensive coverage of the gemini-embedding-001 model for generating text embeddings, including SDK usage, REST API patterns, batch processing, RAG integration with Cloudflare Vectorize, and advanced use cases like semantic search and document clustering.


Table of Contents

  1. Quick Start
  2. gemini-embedding-001 Model
  3. Basic Embeddings
  4. Batch Embeddings
  5. Task Types
  6. RAG Patterns
  7. Error Handling
  8. Best Practices

1. Quick Start

Installation

Install the Google Generative AI SDK:

```bash
npm install @google/genai@^1.37.0
```

For TypeScript projects:

```bash
npm install -D typescript@^5.0.0
```

Environment Setup

Set your Gemini API key as an environment variable:

```bash
export GEMINI_API_KEY="your-api-key-here"
```

Get your API key from: https://aistudio.google.com/apikey

First Embedding Example

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: 'What is the meaning of life?',
  config: {
    taskType: 'RETRIEVAL_QUERY',
    outputDimensionality: 768
  }
});

console.log(response.embedding.values); // [0.012, -0.034, ...]
console.log(response.embedding.values.length); // 768
```

Result: A 768-dimension embedding vector representing the semantic meaning of the text.
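Once you have two such vectors, semantic closeness is usually measured with cosine similarity. A minimal helper (the `cosineSimilarity` function below is our own sketch, not part of the SDK):

```typescript
/**
 * Cosine similarity between two equal-length embedding vectors.
 * Returns a value in [-1, 1]; higher means more semantically similar.
 */
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}
```

For unit-length (normalized) vectors this reduces to a plain dot product, which is why the normalization requirement covered below matters.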


2. gemini-embedding-001 Model

Model Specifications

Current Model: gemini-embedding-001 (stable, production-ready)

  • Status: Stable
  • Experimental: gemini-embedding-exp-03-07 (deprecated October 2025, do not use)

Dimensions

The model supports flexible output dimensionality using Matryoshka Representation Learning:

| Dimension | Use Case | Storage | Performance |
|-----------|----------|---------|-------------|
| 768 | Recommended for most use cases | Low | Fast |
| 1536 | Balance between accuracy and efficiency | Medium | Medium |
| 3072 | Maximum accuracy (default) | High | Slower |
| 128-3071 | Custom (any value in range) | Variable | Variable |

Default: 3072 dimensions. Recommended: 768, 1536, or 3072 for optimal performance.
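Matryoshka Representation Learning concentrates the most important information in the leading dimensions, which is why a vector can be shortened without retraining. Requesting `outputDimensionality` from the API is the normal path; the sketch below (the helper name is ours) only illustrates the idea of truncating a full-size vector client-side and re-normalizing:

```typescript
// Illustrative only: shorten a Matryoshka-style embedding by keeping the
// leading dimensions, then re-normalize to unit length so cosine
// similarity remains meaningful.
function truncateEmbedding(vector: number[], targetDims: number): number[] {
  const truncated = vector.slice(0, targetDims);
  const magnitude = Math.sqrt(truncated.reduce((sum, v) => sum + v * v, 0));
  return truncated.map(v => v / magnitude);
}
```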

Context Window

  • Input Limit: 2,048 tokens per text
  • Input Type: Text only (no images, audio, or video)

Rate Limits

| Tier | RPM | TPM | RPD | Requirements |
|------|-----|-----|-----|--------------|
| Free | 100 | 30,000 | 1,000 | No billing account |
| Tier 1 | 3,000 | 1,000,000 | - | Billing account linked |
| Tier 2 | 5,000 | 5,000,000 | - | $250+ spending, 30-day wait |
| Tier 3 | 10,000 | 10,000,000 | - | $1,000+ spending, 30-day wait |

RPM = Requests Per Minute, TPM = Tokens Per Minute, RPD = Requests Per Day
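Since the free tier allows only 100 RPM, a client-side guard can avoid 429s before they happen. A sliding-window sketch (the class name and injectable clock are our own; the clock injection just makes it testable):

```typescript
// Sliding-window request limiter for a per-minute quota.
class RpmLimiter {
  private timestamps: number[] = [];

  constructor(
    private maxPerMinute: number,
    private now: () => number = Date.now // injectable for testing
  ) {}

  /** Milliseconds to wait before the next request fits the quota (0 = go now). */
  msUntilSlot(): number {
    const cutoff = this.now() - 60_000;
    this.timestamps = this.timestamps.filter(t => t > cutoff);
    if (this.timestamps.length < this.maxPerMinute) return 0;
    return this.timestamps[0] + 60_000 - this.now();
  }

  /** Call after each request is sent. */
  record(): void {
    this.timestamps.push(this.now());
  }
}
```

Before each embedding call, wait `msUntilSlot()` milliseconds, then call `record()`.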

Output Format

```typescript
{
  embedding: {
    values: number[] // Array of floating-point numbers
  }
}
```

3. Basic Embeddings

SDK Approach (Node.js)

Single text embedding:

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: 'The quick brown fox jumps over the lazy dog',
  config: {
    taskType: 'SEMANTIC_SIMILARITY',
    outputDimensionality: 768
  }
});

console.log(response.embedding.values);
// [0.00388, -0.00762, 0.01543, ...]
```

Fetch Approach (Cloudflare Workers)

For Workers/edge environments without SDK support:

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const apiKey = env.GEMINI_API_KEY;
    const text = "What is the meaning of life?";

    const response = await fetch(
      'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent',
      {
        method: 'POST',
        headers: {
          'x-goog-api-key': apiKey,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          content: {
            parts: [{ text }]
          },
          taskType: 'RETRIEVAL_QUERY',
          outputDimensionality: 768
        })
      }
    );

    const data = await response.json();

    // Response format:
    // {
    //   embedding: {
    //     values: [0.012, -0.034, ...]
    //   }
    // }

    return new Response(JSON.stringify(data), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
};
```

Response Parsing

```typescript
interface EmbeddingResponse {
  embedding: {
    values: number[];
  };
}

const response: EmbeddingResponse = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: 'Sample text',
  config: { taskType: 'SEMANTIC_SIMILARITY' }
});

const embedding: number[] = response.embedding.values;
const dimensions: number = embedding.length; // 3072 by default
```

Normalization Requirement

⚠️ CRITICAL: When using dimensions other than 3072, you MUST normalize embeddings before computing similarity. Only 3072-dimensional embeddings are pre-normalized by the API.

Why This Matters: Non-normalized embeddings have varying magnitudes that distort cosine similarity calculations, leading to incorrect search results.

Normalization Helper Function:

```typescript
/**
 * Normalize embedding vector for accurate similarity calculations.
 * REQUIRED for dimensions other than 3072.
 *
 * @param vector - Embedding values from API response
 * @returns Normalized vector (unit length)
 */
function normalize(vector: number[]): number[] {
  const magnitude = Math.sqrt(
    vector.reduce((sum, val) => sum + val * val, 0)
  );
  return vector.map(val => val / magnitude);
}

// Usage with 768 or 1536 dimensions
const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: {
    taskType: 'RETRIEVAL_QUERY',
    outputDimensionality: 768 // NOT 3072
  }
});

// ❌ WRONG - using raw values directly
const embedding = response.embedding.values;
await vectorize.insert([{ id, values: embedding }]);

// ✅ CORRECT - normalize first
const normalized = normalize(response.embedding.values);
await vectorize.insert([{ id, values: normalized }]);
```

Source: Official Embeddings Documentation


4. Batch Embeddings

Multiple Texts in One Request (SDK)

Generate embeddings for multiple texts simultaneously:

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const texts = [
  "What is the meaning of life?",
  "How does photosynthesis work?",
  "Tell me about the history of the internet."
];

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: texts, // Array of strings
  config: {
    taskType: 'RETRIEVAL_DOCUMENT',
    outputDimensionality: 768
  }
});

// Process each embedding
response.embeddings.forEach((embedding, index) => {
  console.log(`Text ${index}: ${texts[index]}`);
  console.log(`Embedding: ${embedding.values.slice(0, 5)}...`);
  console.log(`Dimensions: ${embedding.values.length}`);
});
```

Batch REST API (fetch)

Use the batchEmbedContents endpoint:

```typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:batchEmbedContents',
  {
    method: 'POST',
    headers: {
      'x-goog-api-key': apiKey,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      requests: texts.map(text => ({
        model: 'models/gemini-embedding-001',
        content: {
          parts: [{ text }]
        },
        taskType: 'RETRIEVAL_DOCUMENT'
      }))
    })
  }
);

const data = await response.json();
// data.embeddings: Array of {values: number[]}
```

Batch API Known Issues

⚠️ Ordering Bug (December 2025): Batch API may not preserve ordering with large batch sizes (>500 texts).

  • Symptom: Entry 328 appears at position 628 (silent data corruption)
  • Impact: Results cannot be reliably matched back to input texts
  • Workaround: Process smaller batches (<100 texts) or add unique IDs to verify ordering
  • Status: Acknowledged by Google, internal bug created (P0 priority)
  • Source: GitHub Issue #1207

⚠️ Memory Limit (December 2025): Large batches (>10k embeddings) can cause ERR_STRING_TOO_LONG crash.

  • Error: Cannot create a string longer than 0x1fffffe8 characters
  • Cause: API response includes excessive whitespace (~536MB limit)
  • Workaround: Limit to <5,000 texts per batch
  • Source: GitHub Issue #1205

⚠️ Rate Limit Anomaly (January 2026): Batch API may return 429 RESOURCE_EXHAUSTED even when under quota.

  • Status: Under investigation by Google team
  • Workaround: Implement exponential backoff and retry logic
  • Source: GitHub Issue #1264

Chunking for Rate Limits

When processing large datasets, chunk requests to stay within rate limits:

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function batchEmbedWithRateLimit(
  texts: string[],
  batchSize: number = 50, // REDUCED from 100 due to ordering bug
  delayMs: number = 60000 // 1 minute delay between batches
): Promise<number[][]> {
  const allEmbeddings: number[][] = [];

  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);

    console.log(`Processing batch ${i / batchSize + 1} (${batch.length} texts)`);

    const response = await ai.models.embedContent({
      model: 'gemini-embedding-001',
      contents: batch,
      config: {
        taskType: 'RETRIEVAL_DOCUMENT',
        outputDimensionality: 768
      }
    });

    allEmbeddings.push(...response.embeddings.map(e => e.values));

    // Wait before next batch (except after the last batch)
    if (i + batchSize < texts.length) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }

  return allEmbeddings;
}

// Usage
const embeddings = await batchEmbedWithRateLimit(documents, 50);
```

Performance Optimization

Tips:

  1. Use batch API when embedding multiple texts (single request vs multiple requests)
  2. Choose lower dimensions (768) for faster processing and less storage
  3. Implement exponential backoff for rate limit errors
  4. Cache embeddings to avoid redundant API calls

5. Task Types

The taskType parameter optimizes embeddings for specific use cases. Always specify a task type for best results.

Available Task Types (8 total)

| Task Type | Use Case | Example |
|-----------|----------|---------|
| RETRIEVAL_QUERY | User search queries | "How do I fix a flat tire?" |
| RETRIEVAL_DOCUMENT | Documents to be indexed/searched | Product descriptions, articles |
| SEMANTIC_SIMILARITY | Comparing text similarity | Duplicate detection, clustering |
| CLASSIFICATION | Categorizing texts | Spam detection, sentiment analysis |
| CLUSTERING | Grouping similar texts | Topic modeling, content organization |
| CODE_RETRIEVAL_QUERY | Code search queries | "function to sort array" |
| QUESTION_ANSWERING | Questions seeking answers | FAQ matching |
| FACT_VERIFICATION | Verifying claims with evidence | Fact-checking systems |

When to Use Which

RAG Systems (Retrieval Augmented Generation):

```typescript
// When embedding user queries
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY' } // ← Use RETRIEVAL_QUERY
});

// When embedding documents for indexing
const docEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: documentText,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // ← Use RETRIEVAL_DOCUMENT
});
```

Semantic Search:

```typescript
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { taskType: 'SEMANTIC_SIMILARITY' }
});
```

Document Clustering:

```typescript
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { taskType: 'CLUSTERING' }
});
```

Impact on Quality

Using the correct task type significantly improves retrieval quality:

```typescript
// ❌ BAD: No task type specified
const embedding1 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: userQuery
});

// ✅ GOOD: Task type specified
const embedding2 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY' }
});
```

Result: Using the right task type can improve search relevance by 10-30%.


6. RAG Patterns

RAG (Retrieval Augmented Generation) combines vector search with LLM generation to create AI systems that answer questions using custom knowledge bases.
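The ingestion pipeline below assumes documents are already split into chunks. A minimal fixed-size chunker with overlap (sizes here are illustrative; token-aware splitting is better in production given the 2,048-token input limit):

```typescript
// Split text into overlapping fixed-size chunks so context isn't lost
// at chunk boundaries.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap;
  }
  return chunks;
}
```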

Document Ingestion Pipeline

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Generate embeddings for chunks
async function embedChunks(chunks: string[]): Promise<number[][]> {
  const response = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    contents: chunks,
    config: {
      taskType: 'RETRIEVAL_DOCUMENT', // ← Documents for indexing
      outputDimensionality: 768 // ← Match Vectorize index dimensions
    }
  });

  return response.embeddings.map(e => e.values);
}

// Store in Cloudflare Vectorize
async function storeInVectorize(
  env: Env,
  chunks: string[],
  embeddings: number[][]
) {
  const vectors = chunks.map((chunk, i) => ({
    id: `doc-${Date.now()}-${i}`,
    values: embeddings[i],
    metadata: { text: chunk }
  }));

  await env.VECTORIZE.insert(vectors);
}
```

Query Flow (Retrieve + Generate)

```typescript
async function ragQuery(env: Env, userQuery: string): Promise<string> {
  // 1. Embed user query
  const queryResponse = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    content: userQuery,
    config: {
      taskType: 'RETRIEVAL_QUERY', // ← Query, not document
      outputDimensionality: 768
    }
  });

  const queryEmbedding = queryResponse.embedding.values;

  // 2. Search Vectorize for similar documents
  const results = await env.VECTORIZE.query(queryEmbedding, {
    topK: 5,
    returnMetadata: true
  });

  // 3. Extract context from top results
  const context = results.matches
    .map(match => match.metadata.text)
    .join('\n\n');

  // 4. Generate response with context
  const response = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: `Context:\n${context}\n\nQuestion: ${userQuery}\n\nAnswer based on the context above:`
  });

  return response.text;
}
```

Integration with Cloudflare Vectorize

Create Vectorize Index (768 dimensions for Gemini):

```bash
npx wrangler vectorize create gemini-embeddings --dimensions 768 --metric cosine
```

Bind in wrangler.jsonc:

```jsonc
{
  "name": "my-rag-app",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-25",
  "vectorize": {
    "bindings": [
      {
        "binding": "VECTORIZE",
        "index_name": "gemini-embeddings"
      }
    ]
  }
}
```

Complete RAG Worker:

See templates/rag-with-vectorize.ts for full implementation.


7. Error Handling

Common Errors

1. API Key Missing or Invalid

```typescript
// ❌ Error: API key not set
const ai = new GoogleGenAI({});

// ✅ Correct
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

if (!process.env.GEMINI_API_KEY) {
  throw new Error('GEMINI_API_KEY environment variable not set');
}
```

2. Dimension Mismatch

```typescript
// ❌ Error: Embedding has 3072 dims, Vectorize expects 768
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text
  // No outputDimensionality specified → defaults to 3072
});

await env.VECTORIZE.insert([{
  id: '1',
  values: embedding.embedding.values // 3072 dims, but index is 768!
}]);

// ✅ Correct: Match dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { outputDimensionality: 768 } // ← Match index dimensions
});
```

3. Rate Limiting

```typescript
// ❌ Error: 429 Too Many Requests
for (let i = 0; i < 1000; i++) {
  await ai.models.embedContent({ /* ... */ }); // Exceeds 100 RPM on free tier
}

// ✅ Correct: Implement rate limiting
async function embedWithRetry(text: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await ai.models.embedContent({
        model: 'gemini-embedding-001',
        content: text,
        config: { taskType: 'SEMANTIC_SIMILARITY' }
      });
    } catch (error: any) {
      if (error.status === 429 && attempt < maxRetries - 1) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
```

See references/top-errors.md for all 8 documented errors with detailed solutions.

Known Issues Prevention

This section documents additional issues discovered in production use (beyond basic errors above).

Issue #9: Normalization Required for Non-3072 Dimensions

Error: Incorrect similarity scores, no error thrown
Source: Official Embeddings Documentation
Why It Happens: Only 3072-dimensional embeddings are pre-normalized by the API. All other dimensions (128-3071) have varying magnitudes that distort cosine similarity.
Prevention: Always normalize embeddings when using dimensions other than 3072.

```typescript
function normalize(vector: number[]): number[] {
  const magnitude = Math.sqrt(vector.reduce((sum, val) => sum + val * val, 0));
  return vector.map(val => val / magnitude);
}

// When using 768 or 1536 dimensions
const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { outputDimensionality: 768 }
});

const normalized = normalize(response.embedding.values);
// Now safe for similarity calculations
```

Issue #10: Batch API Ordering Bug

Error: Silent data corruption, embeddings returned in the wrong order
Source: GitHub Issue #1207
Why It Happens: The batch API does not preserve ordering with large batch sizes (>500 texts). Example: entry 328 appears at position 628.
Prevention: Process smaller batches (<100 texts) or add unique identifiers to verify ordering.

```typescript
// Safer approach with verification
const taggedTexts = texts.map((text, i) => `[ID:${i}] ${text}`);
const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: taggedTexts,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});

// Verify ordering by parsing IDs if needed
```

Issue #11: Batch API Memory Limit

Error: Cannot create a string longer than 0x1fffffe8 characters
Source: GitHub Issue #1205
Why It Happens: The batch API response contains excessive whitespace, causing the response size to exceed Node.js's string limit (~536MB) with large payloads (>10k embeddings).
Prevention: Limit batches to <5,000 texts per request.

```typescript
// Safe batch size
async function batchEmbedSafe(texts: string[]) {
  const maxBatchSize = 5000;
  if (texts.length > maxBatchSize) {
    throw new Error(`Batch too large: ${texts.length} texts (max: ${maxBatchSize})`);
  }
  // Process batch...
}
```

Issue #12: LangChain Dimension Parameter Ignored (Community-sourced)

Error: Dimension mismatch, getting 3072 dimensions instead of the specified 768
Source: Medium Article
Verified: Multiple community reports
Why It Happens: LangChain's GoogleGenerativeAIEmbeddings class silently ignores the output_dimensionality parameter when passed to the constructor (Python SDK).
Prevention: Pass the dimension parameter to the embed_documents() method, not the constructor. JavaScript users should verify the new @google/genai SDK doesn't have similar behavior.

```python
# ❌ WRONG - parameter silently ignored
embeddings = GoogleGenerativeAIEmbeddings(
    model="gemini-embedding-001",
    output_dimensionality=768  # IGNORED!
)

# ✅ CORRECT - pass to method
embeddings = GoogleGenerativeAIEmbeddings(model="gemini-embedding-001")
result = embeddings.embed_documents(["text"], output_dimensionality=768)
```

Issue #13: Single Requests Use Batch Endpoint (Community-sourced)

Error: Hitting rate limits faster than expected with single text embeddings
Source: GitHub Issue #427 (Python SDK)
Verified: Official issue in the googleapis organization
Why It Happens: The embed_content() function internally calls the batchEmbedContents endpoint even for single texts, which consumes rate limits faster (the batch endpoint has different limits).
Prevention: Add delays between single embedding requests and implement exponential backoff for 429 errors.

```typescript
// Add delays to avoid rate limits
async function embedWithDelay(text: string, delayMs: number = 100) {
  const response = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    content: text,
    config: { taskType: 'SEMANTIC_SIMILARITY' }
  });
  await new Promise(resolve => setTimeout(resolve, delayMs));
  return response.embedding.values;
}
```

8. Best Practices

Always Do

Specify Task Type

```typescript
// Task type optimizes embeddings for your use case
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { taskType: 'RETRIEVAL_QUERY' } // ← Always specify
});
```

Match Dimensions with Vectorize

```typescript
// Ensure embeddings match your Vectorize index dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { outputDimensionality: 768 } // ← Match index
});
```

Implement Rate Limiting

```typescript
// Use exponential backoff for 429 errors
async function embedWithBackoff(text: string) {
  // Implementation from Error Handling section
}
```

Cache Embeddings

```typescript
// Cache embeddings to avoid redundant API calls
const cache = new Map<string, number[]>();

async function getCachedEmbedding(text: string): Promise<number[]> {
  if (cache.has(text)) {
    return cache.get(text)!;
  }

  const response = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    content: text,
    config: { taskType: 'SEMANTIC_SIMILARITY' }
  });

  const embedding = response.embedding.values;
  cache.set(text, embedding);
  return embedding;
}
```

Use Batch API for Multiple Texts

```typescript
// Single batch request vs multiple individual requests
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: texts, // Array of texts
  config: { taskType: 'RETRIEVAL_DOCUMENT' }
});
```

Never Do

Don't Skip Task Type

```typescript
// Reduces quality by 10-30%
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text
  // Missing taskType!
});
```

Don't Mix Different Dimensions

```typescript
// Can't compare embeddings with different dimensions
const emb1 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text1,
  config: { outputDimensionality: 768 }
});

const emb2 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text2,
  config: { outputDimensionality: 1536 } // Different dimensions!
});

// ❌ Can't calculate similarity between vectors of different dimensions
const similarity = cosineSimilarity(emb1.embedding.values, emb2.embedding.values);
```

Don't Use Wrong Task Type for RAG

```typescript
// Reduces search quality
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: query,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // Wrong! Should be RETRIEVAL_QUERY
});
```

Using Bundled Resources

Templates (templates/)

  • package.json - Package configuration with verified versions
  • basic-embeddings.ts - Single text embedding with SDK
  • embeddings-fetch.ts - Fetch-based for Cloudflare Workers
  • batch-embeddings.ts - Batch processing with rate limiting
  • rag-with-vectorize.ts - Complete RAG implementation with Vectorize

References (references/)

  • model-comparison.md - Compare Gemini vs OpenAI vs Workers AI embeddings
  • vectorize-integration.md - Cloudflare Vectorize setup and patterns
  • rag-patterns.md - Complete RAG implementation strategies
  • dimension-guide.md - Choosing the right dimensions (768 vs 1536 vs 3072)
  • top-errors.md - 8 common errors and detailed solutions

Scripts (scripts/)

  • check-versions.sh - Verify @google/genai package version is current

Official Documentation


Related Skills

  • google-gemini-api - Main Gemini API for text/image generation
  • cloudflare-vectorize - Vector database for storing embeddings
  • cloudflare-workers-ai - Workers AI embeddings (BGE models)

Success Metrics

Token Savings: ~60% compared to manual implementation
Errors Prevented: 13 documented errors with solutions (8 basic + 5 known issues)
Production Tested: ✅ Verified in RAG applications
Package Version: @google/genai@1.37.0
Last Updated: 2026-01-21
Changes: Added normalization requirement, batch API warnings (ordering bug, memory limits, rate limit anomaly), LangChain compatibility notes


License

MIT License - Free to use in personal and commercial projects.

