MCP Firecrawl Tool
A server that provides tools to scrape websites and extract structured data from them using Firecrawl's APIs, supporting both basic website scraping in multiple formats and custom schema-based data extraction.
This is a simple MCP server that provides tools to scrape websites and extract structured data using Firecrawl's APIs.
Install dependencies:

```shell
npm install
```
Create a `.env` file in the root directory with the following variables:

```
FIRECRAWL_API_TOKEN=your_token_here
SENTRY_DSN=your_sentry_dsn_here
```
- `FIRECRAWL_API_TOKEN` (required): your Firecrawl API token
- `SENTRY_DSN` (optional): Sentry DSN for error tracking and performance monitoring
Start the server:

```shell
npm start
```
Alternatively, you can set environment variables directly when running the server:
```shell
FIRECRAWL_API_TOKEN=your_token_here npm start
```
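Since `FIRECRAWL_API_TOKEN` is required and `SENTRY_DSN` is optional, a server like this typically validates its configuration at startup. A minimal sketch of that pattern (the `loadConfig` helper is hypothetical, not part of this codebase):

```javascript
// Hypothetical config loader: fails fast when the required token is missing,
// and treats the Sentry DSN as optional.
function loadConfig(env = process.env) {
  const token = env.FIRECRAWL_API_TOKEN;
  if (!token) {
    throw new Error("FIRECRAWL_API_TOKEN is required");
  }
  return {
    firecrawlToken: token,
    sentryDsn: env.SENTRY_DSN || null, // optional: error tracking disabled if absent
  };
}
```

Failing fast here gives a clear startup error instead of a confusing API failure on the first tool call.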
The server exposes two tools:

1. `scrape-website`: basic website scraping with multiple format options
2. `extract-data`: structured data extraction based on prompts and schemas
This tool scrapes a website and returns its content in the requested formats.
Parameters:
- `url` (string, required): the URL of the website to scrape
- `formats` (array of strings, optional): the desired output formats; supported values are `"markdown"` (the default), `"html"`, and `"text"`
Example usage with MCP Inspector:
```shell
# Basic usage (defaults to markdown)
mcp-inspector --tool scrape-website --args '{
  "url": "https://example.com"
}'

# Multiple formats
mcp-inspector --tool scrape-website --args '{
  "url": "https://example.com",
  "formats": ["markdown", "html", "text"]
}'
```
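The `formats` rules above (optional, defaulting to markdown, limited to three values) can be captured in a small validation helper. The `normalizeFormats` function below is a hypothetical sketch, not the server's actual implementation:

```javascript
// Hypothetical helper mirroring the "formats" parameter rules:
// omitted or empty -> ["markdown"]; unknown values -> error.
const SUPPORTED_FORMATS = ["markdown", "html", "text"];

function normalizeFormats(formats) {
  if (formats === undefined || formats.length === 0) {
    return ["markdown"]; // documented default
  }
  const unknown = formats.filter((f) => !SUPPORTED_FORMATS.includes(f));
  if (unknown.length > 0) {
    throw new Error(`Unsupported format(s): ${unknown.join(", ")}`);
  }
  return formats;
}
```

Rejecting unknown formats up front keeps bad arguments from reaching the Firecrawl API.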
This tool extracts structured data from websites based on a provided prompt and schema.
Parameters:
- `urls` (array of strings, required): the URLs to extract data from
- `prompt` (string, required): a prompt describing what data to extract
- `schema` (object, required): a schema definition for the data to extract
The schema definition should be an object where keys are field names and values are types. Supported types are:

- `"string"`: for text fields
- `"boolean"`: for true/false fields
- `"number"`: for numeric fields
- Arrays: specified as `["type"]`, where `type` is one of the above
- Objects: nested objects with their own type definitions
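Because the type system above is recursive (arrays and objects can nest), it can be checked with a short recursive function. This `isValidSchemaType` helper is a hypothetical sketch of such a check, not part of the server:

```javascript
// Hypothetical recursive validator for the schema shape described above:
// a value is valid if it is "string" | "boolean" | "number", a one-element
// array of a valid type, or a nested object whose values are all valid.
function isValidSchemaType(value) {
  if (value === "string" || value === "boolean" || value === "number") {
    return true;
  }
  if (Array.isArray(value)) {
    return value.length === 1 && isValidSchemaType(value[0]);
  }
  if (value !== null && typeof value === "object") {
    return Object.values(value).every(isValidSchemaType);
  }
  return false;
}
```

For example, the nested `products` schema shown below passes this check, while a made-up type like `"date"` does not.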
Example usage with MCP Inspector:
```shell
# Basic example extracting company information
mcp-inspector --tool extract-data --args '{
  "urls": ["https://example.com"],
  "prompt": "Extract the company mission, whether it supports SSO, and whether it is open source.",
  "schema": {
    "company_mission": "string",
    "supports_sso": "boolean",
    "is_open_source": "boolean"
  }
}'

# Complex example with nested data
mcp-inspector --tool extract-data --args '{
  "urls": ["https://example.com/products", "https://example.com/pricing"],
  "prompt": "Extract product information including name, price, and features.",
  "schema": {
    "products": [{
      "name": "string",
      "price": "number",
      "features": ["string"]
    }]
  }
}'
```
Both tools return a descriptive error message if scraping or extraction fails, and automatically log errors to Sentry when a DSN is configured.
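That error-handling behavior can be sketched as a small wrapper: run the tool handler, return a readable error on failure, and forward the exception to an optional reporter (e.g. Sentry's `captureException`) when one is configured. The `runTool` helper below is a hypothetical illustration, not the server's actual code:

```javascript
// Hypothetical wrapper for tool handlers: success yields the result,
// failure yields the error message and notifies the reporter if set.
async function runTool(handler, { reporter = null } = {}) {
  try {
    return { ok: true, result: await handler() };
  } catch (err) {
    if (reporter) {
      reporter(err); // e.g. Sentry.captureException(err)
    }
    return { ok: false, error: err.message };
  }
}
```

Keeping the reporter optional means the same code path works whether or not `SENTRY_DSN` is set.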
If you encounter issues: