Firecrawl Ruby SDK

Ruby SDK for the Firecrawl v2 web scraping API.

Prerequisites

Ruby >= 3.0

Installation

Add to your Gemfile:

gem "firecrawl-sdk", "~> 1.5"

Or install directly:

gem install firecrawl-sdk

Quick Start

require "firecrawl"

# Create a client
client = Firecrawl::Client.new(api_key: "fc-your-api-key")

# Or load from FIRECRAWL_API_KEY environment variable
client = Firecrawl::Client.from_env

# Scrape a single page
doc = client.scrape("https://example.com")
puts doc.markdown

Environment Setup

export FIRECRAWL_API_KEY="fc-your-api-key"
# Optional: custom API URL
export FIRECRAWL_API_URL="http://localhost:3002"
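
As a rough illustration of how these two variables might be consumed, the sketch below resolves them with plain `ENV`-style lookups. The helper name `resolve_firecrawl_config` and the constant `DEFAULT_API_URL` are hypothetical and not part of the SDK, and the default URL is an assumption; `Firecrawl::Client.from_env` may behave differently.

```ruby
# Hypothetical helper (not part of the SDK): resolve settings from any
# object that responds to #fetch like ENV or a Hash.
DEFAULT_API_URL = "https://api.firecrawl.dev" # assumed default, for illustration

def resolve_firecrawl_config(env = ENV)
  # The API key is required: fail loudly when it is missing.
  api_key = env.fetch("FIRECRAWL_API_KEY") do
    raise ArgumentError, "FIRECRAWL_API_KEY is not set"
  end
  # The API URL is optional: fall back to the assumed default.
  api_url = env.fetch("FIRECRAWL_API_URL", DEFAULT_API_URL)
  { api_key: api_key, api_url: api_url }
end

config = resolve_firecrawl_config({
  "FIRECRAWL_API_KEY" => "fc-your-api-key",
  "FIRECRAWL_API_URL" => "http://localhost:3002"
})
puts config[:api_url]
```

Passing a plain hash instead of `ENV` keeps the lookup easy to exercise in tests without touching the process environment.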

API Reference

Scrape

# Basic scrape
doc = client.scrape("https://example.com")
puts doc.markdown

# Scrape with options
doc = client.scrape("https://example.com",
  Firecrawl::Models::ScrapeOptions.new(
    formats: ["markdown", "html"],
    only_main_content: true,
    wait_for: 1000
  ))
puts doc.html

Video Extraction

Use the video format on supported video URLs, including YouTube and TikTok. The returned video field is a signed URL to the extracted video file.

doc = client.scrape("https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  Firecrawl::Models::ScrapeOptions.new(formats: ["video"]))

puts doc.video

Product Extraction

Use the product format on product pages to get structured product data (title, brand, category, and per-variant price, availability, and images). It is the deterministic counterpart to the LLM-based json format. The returned product field contains the extracted fields.

doc = client.scrape("https://example.com/products/widget",
  Firecrawl::Models::ScrapeOptions.new(formats: ["product"]))

puts doc.product

Parse

Upload a local file (HTML, PDF, DOCX, etc.) as multipart form data and parse it synchronously. Parse options intentionally exclude browser-only features: change tracking, screenshot, branding, audio, video, product, actions, wait_for, location, and mobile. The proxy option accepts only "auto" or "basic".

# From disk
file = Firecrawl::Models::ParseFile.from_path("./document.pdf")

# Or from memory
file = Firecrawl::Models::ParseFile.new(
  filename: "upload.html",
  content: "<html>hi</html>",
  content_type: "text/html"
)

doc = client.parse(file,
  Firecrawl::Models::ParseOptions.new(formats: ["markdown"]))
puts doc.markdown
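
The restrictions described above can be checked before uploading. The following is a minimal sketch of such a pre-flight check, not SDK code: `validate_parse_options` and both constants are hypothetical, and the format identifiers are taken from the prose, so their exact spelling in the API is an assumption.

```ruby
# Hypothetical pre-flight check (not part of the SDK).
ALLOWED_PARSE_PROXIES = %w[auto basic].freeze
# Identifiers assumed from the prose; the real API names may differ.
UNSUPPORTED_PARSE_FORMATS = %w[screenshot branding audio video product].freeze

def validate_parse_options(formats: ["markdown"], proxy: nil)
  # Reject any requested format that parse is documented to exclude.
  bad = formats & UNSUPPORTED_PARSE_FORMATS
  unless bad.empty?
    raise ArgumentError, "formats not supported by parse: #{bad.join(', ')}"
  end
  # A proxy value is optional, but if given it must be on the allow list.
  unless proxy.nil? || ALLOWED_PARSE_PROXIES.include?(proxy)
    raise ArgumentError, "proxy must be one of: #{ALLOWED_PARSE_PROXIES.join(', ')}"
  end
  true
end

validate_parse_options(formats: ["markdown"], proxy: "basic") # passes

begin
  validate_parse_options(formats: ["markdown", "video"])
rescue ArgumentError => e
  puts e.message
end
```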

Crawl

# Crawl with auto-polling (blocks until complete)
job = client.crawl("https://example.com",
  Firecrawl::Models::CrawlOptions.new(limit: 50))
job.data.each { |doc| puts doc.markdown }

# Async crawl
response = client.start_crawl("https://example.com",
  Firecrawl::Models::CrawlOptions.new(limit: 10))
puts response.id

# Check status
status = client.get_crawl_status(response.id)
puts status.status

# Cancel
client.cancel_crawl(response.id)
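
The blocking `crawl` call presumably wraps a start-then-poll loop around the async calls shown above. Here is a minimal sketch of that pattern using a stand-in client, so it runs without network access. `FakeClient`, `Status`, `poll_until_done`, the `"scraping"` and `"completed"` status strings, and the `interval` and `timeout` parameters are all assumptions for illustration, not the SDK's actual implementation.

```ruby
# Stand-in status object and client. The fake reports "scraping" twice,
# then "completed". Status strings are assumed, not confirmed SDK values.
Status = Struct.new(:status)

class FakeClient
  def initialize
    @calls = 0
  end

  def get_crawl_status(_id)
    @calls += 1
    Status.new(@calls < 3 ? "scraping" : "completed")
  end
end

# Hypothetical helper: poll until the job leaves the in-progress state,
# raising if the deadline passes first.
def poll_until_done(client, job_id, interval: 0.01, timeout: 1.0)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    status = client.get_crawl_status(job_id)
    return status unless status.status == "scraping"
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "job #{job_id} still running after #{timeout}s"
    end
    sleep interval
  end
end

final = poll_until_done(FakeClient.new, "job-123")
puts final.status
```

Using the monotonic clock for the deadline avoids surprises if the system wall clock is adjusted while the loop is waiting.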

Batch Scrape

urls = ["https://example.com/page1", "https://example.com/page2"]

# Batch scrape with auto-polling
job = client.batch_scrape(urls,
  Firecrawl::Models::BatchScrapeOptions.new(
    options: Firecrawl::Models::ScrapeOptions.new(formats: ["markdown"])
  ))
job.data.each { |doc| puts doc.markdown }

Map

# Discover URLs on a website
result = client.map("https://example.com")
result.links.each { |link| puts link["url"] }

# With options
result = client.map("https://example.com",
  Firecrawl::Models::MapOptions.new(limit: 100, search: "blog"))

Search

# Web search
results = client.search("firecrawl web scraping")
results.web&.each { |r| puts r["url"] }

# With options
results = client.search("latest news",
  Firecrawl::Models::SearchOptions.new(limit: 5, location: "US"))

Agent

# Run an AI agent task (blocks until complete)
status = client.agent(
  Firecrawl::Models::AgentOptions.new(
    prompt: "Find the pricing information",
    urls: ["https://example.com"]
  ))
puts status.data

Usage & Metrics

# Check concurrency
concurrency = client.get_concurrency
puts concurrency.concurrency

# Check credit usage
usage = client.get_credit_usage
puts usage.remaining_credits

Configuration

client = Firecrawl::Client.new(
  api_key: "fc-your-api-key",
  api_url: "https://api.firecrawl.dev",  # custom API URL
  timeout: 300,                           # HTTP timeout in seconds
  max_retries: 3,                         # automatic retries
  backoff_factor: 0.5                     # exponential backoff factor
)
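
To make `max_retries` and `backoff_factor` concrete, the sketch below computes a retry schedule under one common interpretation: the wait before retry n is `backoff_factor * 2**(n - 1)` seconds. This formula is an assumption; the SDK's actual schedule, including any jitter or cap, may differ. `backoff_delays` is a hypothetical helper.

```ruby
# Hypothetical helper: delays (in seconds) before each retry, assuming
# plain exponential backoff with no jitter.
def backoff_delays(max_retries:, backoff_factor:)
  (0...max_retries).map { |attempt| backoff_factor * (2**attempt) }
end

p backoff_delays(max_retries: 3, backoff_factor: 0.5)
# => [0.5, 1.0, 2.0]
```

Under this assumption, the settings shown above would add at most 3.5 seconds of waiting across all retries of a single request.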

Error Handling

begin
  doc = client.scrape("https://example.com")
rescue Firecrawl::AuthenticationError => e
  puts "Invalid API key: #{e.message}"
rescue Firecrawl::RateLimitError => e
  puts "Rate limited: #{e.message}"
rescue Firecrawl::JobTimeoutError => e
  puts "Job #{e.job_id} timed out after #{e.timeout_seconds}s"
rescue Firecrawl::FirecrawlError => e
  puts "Error (#{e.status_code}): #{e.message}"
end
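
The order of the `rescue` clauses above matters: Ruby tries them top to bottom, so specific errors must be listed before the base class. The sketch below demonstrates this with stand-in classes in a hypothetical `Demo` module. It assumes, as the example above suggests but does not state, that the specific errors inherit from `FirecrawlError`.

```ruby
# Stand-in error hierarchy (hypothetical; mirrors the assumed SDK layout).
module Demo
  class FirecrawlError < StandardError; end
  class RateLimitError < FirecrawlError; end
end

# Runs the block and reports which rescue clause handled the error.
def classify
  yield
  :ok
rescue Demo::RateLimitError
  :rate_limited
rescue Demo::FirecrawlError
  :other_firecrawl_error
end

p classify { raise Demo::RateLimitError, "slow down" } # => :rate_limited
p classify { raise Demo::FirecrawlError, "boom" }      # => :other_firecrawl_error
```

If the base-class clause came first, it would also catch `RateLimitError`, and the more specific handler would never run.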

Development

Building from Source

cd apps/ruby-sdk
bundle install

Running Tests

# Unit tests
bundle exec rake test

# With API key for E2E tests
FIRECRAWL_API_KEY=fc-your-key bundle exec rake test

License

MIT License - see LICENSE.
