Firecrawl Property Extraction
Overview
Firecrawl turns a real estate listing URL into structured JSON for video generation. It handles JavaScript rendering, anti-bot measures, and image extraction automatically; data quality depends on the source site (see Supported Sites).
Quick Start
```typescript
import Firecrawl from '@mendable/firecrawl-js';
import { z } from 'zod';

const firecrawl = new Firecrawl({
  apiKey: process.env.FIRECRAWL_API_KEY
});

const result = await firecrawl.extract({
  urls: [listingUrl],
  prompt: 'Extract property details for video generation',
  schema: PropertySchema
});
```
Supported Sites
| Site | URL Pattern | Data Quality |
|---|---|---|
| Zillow | zillow.com/homedetails/* | Excellent |
| Redfin | redfin.com/*/home/* | Excellent |
| Realtor.com | realtor.com/realestateandhomes-detail/* | Excellent |
| Trulia | trulia.com/home/* | Good |
| Homes.com | homes.com/property/* | Good |
| MLS Sites | Varies by region | Good |
| Broker Sites | Any | Variable |
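A minimal sketch of how the table above could be applied in code, so a caller knows what data quality to expect before spending credits. The names `SITE_RULES` and `classifyListingUrl` are hypothetical, the Redfin pattern is assumed to be `redfin.com/*/home/*`, and any URL that matches no rule is treated conservatively as `Variable`, since MLS sites vary by region and cannot be recognized by a single pattern.

```typescript
// Hypothetical helper: map a listing URL to a row of the Supported Sites table.
type DataQuality = 'Excellent' | 'Good' | 'Variable';

interface SiteRule {
  site: string;
  pattern: RegExp; // tested against "host/path" with the protocol and "www." removed
  quality: DataQuality;
}

const SITE_RULES: SiteRule[] = [
  { site: 'Zillow', pattern: /^zillow\.com\/homedetails\/.+/i, quality: 'Excellent' },
  // Assumption: the Redfin pattern is redfin.com/*/home/*
  { site: 'Redfin', pattern: /^redfin\.com\/.+\/home\/.+/i, quality: 'Excellent' },
  { site: 'Realtor.com', pattern: /^realtor\.com\/realestateandhomes-detail\/.+/i, quality: 'Excellent' },
  { site: 'Trulia', pattern: /^trulia\.com\/home\/.+/i, quality: 'Good' },
  { site: 'Homes.com', pattern: /^homes\.com\/property\/.+/i, quality: 'Good' },
];

function classifyListingUrl(url: string): { site: string; quality: DataQuality } {
  // Strip the protocol and a leading "www." so the patterns stay short.
  const hostAndPath = url.replace(/^https?:\/\//i, '').replace(/^www\./i, '');
  const rule = SITE_RULES.find((r) => r.pattern.test(hostAndPath));
  // Unmatched URLs (regional MLS or broker sites) get the most cautious rating.
  return rule
    ? { site: rule.site, quality: rule.quality }
    : { site: 'Other (MLS or broker site)', quality: 'Variable' };
}
```

A caller might use the returned `quality` to decide whether to run stricter validation on the extracted fields.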
Property Schema
See rules/property-extraction.md for the complete schema.
```typescript
const PropertySchema = z.object({
  address: z.string(),
  city: z.string(),
  state: z.string(),
  zipCode: z.string(),
  price: z.number(),
  bedrooms: z.number(),
  bathrooms: z.number(),
  sqft: z.number(),
  lotSize: z.string().optional(),
  yearBuilt: z.number().optional(),
  propertyType: z.string(),
  description: z.string(),
  features: z.array(z.string()),
  images: z.array(z.string()),
  agent: z.object({
    name: z.string(),
    phone: z.string().optional(),
    brokerage: z.string().optional(),
  }).optional(),
});
```
Advanced Extraction
Competitor Analysis
```typescript
const CompetitorSchema = z.object({
  listings: z.array(z.object({
    address: z.string(),
    price: z.number(),
    daysOnMarket: z.number(),
    pricePerSqft: z.number(),
  })),
  marketTrends: z.object({
    medianPrice: z.number(),
    averageDaysOnMarket: z.number(),
    inventoryCount: z.number(),
  }),
});
```
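The schema above asks the extractor for both individual listings and aggregate trends. One way to sanity-check the aggregates is to recompute them from the listings. The sketch below assumes that `marketTrends` is meant to summarize the same `listings` array, which the schema itself does not state; the interface and function names are hypothetical and only mirror the schema's field names.

```typescript
// Plain TypeScript mirrors of the CompetitorSchema shapes (hypothetical names).
interface CompetitorListing {
  address: string;
  price: number;
  daysOnMarket: number;
  pricePerSqft: number;
}

interface MarketTrends {
  medianPrice: number;
  averageDaysOnMarket: number;
  inventoryCount: number;
}

// Recompute the aggregate fields from the listings themselves.
function summarizeListings(listings: CompetitorListing[]): MarketTrends {
  if (listings.length === 0) {
    return { medianPrice: 0, averageDaysOnMarket: 0, inventoryCount: 0 };
  }
  // Sort a copy of the prices numerically to find the median.
  const prices = listings.map((l) => l.price).sort((a, b) => a - b);
  const mid = Math.floor(prices.length / 2);
  const medianPrice =
    prices.length % 2 === 1 ? prices[mid] : (prices[mid - 1] + prices[mid]) / 2;
  const totalDays = listings.reduce((sum, l) => sum + l.daysOnMarket, 0);
  return {
    medianPrice,
    averageDaysOnMarket: totalDays / listings.length,
    inventoryCount: listings.length,
  };
}
```

If the extracted `marketTrends` differs sharply from the recomputed values, the page may report trends for a wider market than the listings shown, or the extraction may be unreliable.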
Market Data
Best Practices
- Rate Limiting: Max 10 requests/minute on standard plan
- Error Handling: Always wrap in try/catch
- Image Quality: Request high-res images when available
- Caching: Cache results for 24 hours to save credits
- Validation: Always validate extracted data with Zod
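The caching and rate-limiting points above can be sketched as two small in-memory helpers. This is an illustration under stated assumptions, not part of Firecrawl: `TtlCache` and `SlidingWindowLimiter` are hypothetical names, state lives in a single process (a serverless deployment would likely need a shared store instead), and the clock is injectable only to make the behavior easy to test.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical in-memory cache: entries expire after ttlMs (24 hours by default).
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number = DAY_MS, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it so the caller re-fetches
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Hypothetical limiter: at most maxRequests calls per sliding window
// (defaults follow the 10 requests/minute figure above).
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private maxRequests: number;
  private windowMs: number;
  private now: () => number;

  constructor(maxRequests: number = 10, windowMs: number = 60 * 1000, now: () => number = Date.now) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the request if a slot is free in the current window.
  tryAcquire(): boolean {
    const cutoff = this.now() - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(this.now());
    return true;
  }
}
```

A typical flow would check the cache by listing URL first, call `tryAcquire()` only on a miss, and store the validated result with `set` afterwards.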
API Integration
Next.js Route Handler
```typescript
// /app/api/scrape/route.ts
import Firecrawl from '@mendable/firecrawl-js';
// PropertySchema is the Zod schema defined under Property Schema.

export async function POST(request: Request) {
  const { url } = await request.json();

  const firecrawl = new Firecrawl({
    apiKey: process.env.FIRECRAWL_API_KEY!
  });

  // Run AI extraction against the submitted listing URL.
  const result = await firecrawl.extract({
    urls: [url],
    prompt: 'Extract property listing data',
    schema: PropertySchema,
  });

  return Response.json({
    success: true,
    data: result.data
  });
}
```
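The handler above passes `url` to Firecrawl without checking it. A minimal sketch of a guard that could run first is shown below; `validateListingUrl` and the `UrlCheck` type are hypothetical names, and the choice of a 400 status for bad input is an assumption rather than something the handler specifies.

```typescript
// Hypothetical result type: either a normalized URL or an error to send back.
type UrlCheck =
  | { ok: true; url: string }
  | { ok: false; status: number; error: string };

function validateListingUrl(body: unknown): UrlCheck {
  // The request body is untrusted, so read "url" defensively.
  const candidate = (body as { url?: unknown } | null)?.url;
  if (typeof candidate !== 'string' || candidate.trim() === '') {
    return { ok: false, status: 400, error: 'Body must include a "url" string' };
  }
  let parsed: URL;
  try {
    parsed = new URL(candidate); // throws on relative or malformed URLs
  } catch {
    return { ok: false, status: 400, error: 'url is not a valid absolute URL' };
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { ok: false, status: 400, error: 'Only http(s) URLs are supported' };
  }
  return { ok: true, url: parsed.toString() };
}

// Possible use inside the POST handler (sketch):
//   const check = validateListingUrl(await request.json());
//   if (!check.ok) {
//     return Response.json({ success: false, error: check.error }, { status: check.status });
//   }
//   ...then pass check.url to firecrawl.extract
```

Rejecting bad input before the extract call avoids spending credits on requests that cannot succeed.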
Credit Usage
| Operation | Credits |
|---|---|
| /scrape (single page) | 1 |
| /crawl (per page) | 1 |
| /extract (AI) | Token-based |
| /map (URL discovery) | 1 per 100 URLs |
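As a rough worked example of the table above, the sketch below estimates credits for a job that mixes the fixed-price operations. `JobPlan` and `estimateCredits` are hypothetical names, /extract is left out because its cost is token-based, and rounding partial blocks of 100 mapped URLs up to a full credit is an assumption the table does not confirm.

```typescript
// Hypothetical description of a planned job, using the fixed-price rows only.
interface JobPlan {
  scrapePages: number; // pages fetched via /scrape (1 credit each)
  crawlPages: number;  // pages fetched via /crawl (1 credit each)
  mappedUrls: number;  // URLs discovered via /map (1 credit per 100 URLs)
}

function estimateCredits(plan: JobPlan): number {
  // Assumption: a partial block of 100 mapped URLs is billed as a full credit.
  const mapCredits = Math.ceil(plan.mappedUrls / 100);
  return plan.scrapePages + plan.crawlPages + mapCredits;
}
```

For instance, 5 scraped pages, 20 crawled pages, and 250 mapped URLs come to 5 + 20 + 3 = 28 credits under these assumptions.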
Error Handling
```typescript
try {
  const result = await firecrawl.extract({ ... });
} catch (error) {
  // Caught values are `unknown` in strict TypeScript, so read the status defensively.
  const statusCode = (error as { statusCode?: number }).statusCode;
  if (statusCode === 429) {
    // Rate limited - wait and retry
  } else if (statusCode === 403) {
    // Site blocked - try alternative approach
  } else {
    // Log and return fallback
  }
}
```
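The "wait and retry" branch above could be filled in with exponential backoff. The following is a sketch, not Firecrawl's own retry mechanism: `withRetry`, `backoffDelayMs`, `classifyStatus`, and `getStatusCode` are hypothetical helpers, the delays (1 s base, doubling, capped at 30 s) and the limit of 3 attempts are illustrative defaults, and it assumes errors carry a numeric `statusCode` as in the snippet above.

```typescript
type ErrorAction = 'retry' | 'fallback' | 'fail';

// Map a status code to the three branches of the error handler above.
function classifyStatus(statusCode: number | undefined): ErrorAction {
  if (statusCode === 429) return 'retry';    // rate limited
  if (statusCode === 403) return 'fallback'; // site blocked
  return 'fail';
}

// Exponential backoff: baseMs, 2x, 4x, ... capped at maxMs (attempt is 0-based).
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Read a numeric statusCode from an unknown thrown value, if present.
function getStatusCode(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'statusCode' in error) {
    const code = (error as { statusCode: unknown }).statusCode;
    return typeof code === 'number' ? code : undefined;
  }
  return undefined;
}

// Retry only on 429; every other error is rethrown for the caller to handle.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const action = classifyStatus(getStatusCode(error));
      const isLastAttempt = attempt >= maxAttempts - 1;
      if (action !== 'retry' || isLastAttempt) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The extract call would be wrapped as `withRetry(() => firecrawl.extract(...))`, with 403 and other errors still reaching the surrounding `catch` for the fallback branches.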