# Crawlab MCP Server
A Model Context Protocol server that allows AI applications to interact with Crawlab's functionality through natural language, enabling spider management, task execution, and file operations.
## Overview

This Model Context Protocol (MCP) server provides a standardized way for AI applications to access Crawlab's features through natural language, including:

- Spider management (create, read, update, delete)
- Task execution and monitoring
- File operations on spider code
## Architecture

The MCP Server/Client architecture facilitates communication between AI applications and Crawlab:
```mermaid
graph TB
    User[User] --> Client[MCP Client]
    Client --> LLM[LLM Provider]
    Client <--> Server[MCP Server]
    Server <--> Crawlab[Crawlab API]

    subgraph "MCP System"
        Client
        Server
    end

    subgraph "Crawlab System"
        Crawlab
        DB[(Database)]
        Crawlab <--> DB
    end

    %% Flow annotations
    LLM -.-> |Tool calls| Client
    Client -.-> |Executes tool calls| Server
    Server -.-> |API requests| Crawlab
    Crawlab -.-> |API responses| Server
    Server -.-> |Tool results| Client
    Client -.-> |Human-readable response| User

    classDef external fill:#f9f9f9,stroke:#333,stroke-width:1px;
    classDef internal fill:#d9edf7,stroke:#31708f,stroke-width:1px;
    class User,LLM,Crawlab,DB external;
    class Client,Server internal;
```
## Installation

You can install the MCP server as a Python package, which provides a convenient CLI:
```bash
# Install from source
pip install -e .

# Or install from GitHub (when available)
# pip install git+https://github.com/crawlab-team/crawlab-mcp-server.git
```
After installation, you can use the CLI:
```bash
# Start the MCP server
crawlab_mcp-mcp server [--spec PATH_TO_SPEC] [--host HOST] [--port PORT]

# Start the MCP client
crawlab_mcp-mcp client SERVER_URL
```
## Configuration

1. Copy the `.env.example` file to `.env`:

   ```bash
   cp .env.example .env
   ```

2. Edit the `.env` file with your Crawlab API details:

   ```
   CRAWLAB_API_BASE_URL=http://your-crawlab-instance:8080/api
   CRAWLAB_API_TOKEN=your_api_token_here
   ```
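To illustrate how these two settings are consumed, here is a minimal sketch of reading them at startup. The parser and the `get_crawlab_settings` helper are illustrative only (the real server may use a library such as python-dotenv); the variable names come from the `.env` example above.

```python
import os

def load_dotenv_file(path=".env"):
    """Minimal .env parser: KEY=VALUE lines, '#' comments and blanks ignored.

    Illustration only -- the actual server may rely on python-dotenv
    or similar instead of hand-rolled parsing.
    """
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

def get_crawlab_settings(env=None):
    """Read the two settings the MCP server needs, failing loudly if absent."""
    env = os.environ if env is None else env
    base_url = env.get("CRAWLAB_API_BASE_URL")
    token = env.get("CRAWLAB_API_TOKEN")
    if not base_url or not token:
        raise RuntimeError("CRAWLAB_API_BASE_URL and CRAWLAB_API_TOKEN must be set")
    # Normalize the base URL so paths can be appended consistently.
    return base_url.rstrip("/"), token
```

Failing fast on missing configuration keeps misconfiguration errors at startup rather than surfacing later as confusing API failures.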
## Running the Server

### Locally

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Run the server:

   ```bash
   python server.py
   ```
### With Docker

1. Build the Docker image:

   ```bash
   docker build -t crawlab-mcp-server .
   ```

2. Run the container:

   ```bash
   docker run -p 8000:8000 --env-file .env crawlab-mcp-server
   ```
### With Docker Compose

To add the MCP server to your existing Crawlab Docker Compose setup, add the following service to your `docker-compose.yml`:

```yaml
services:
  # ... existing Crawlab services

  mcp-server:
    build: ./backend/mcp-server
    ports:
      - "8000:8000"
    environment:
      - CRAWLAB_API_BASE_URL=http://backend:8000/api
      - CRAWLAB_API_TOKEN=your_api_token_here
    depends_on:
      - backend
```
## Usage

The MCP server enables AI applications to interact with Crawlab through natural language. Connect an MCP client to the running server (by default at `http://localhost:8000`). Following the architecture diagram above, here are example interactions with the system:
**Create a Spider:**

```
User: "Create a new spider named 'Product Scraper' for the e-commerce project"
  ↓
LLM identifies intent and calls the create_spider tool
  ↓
MCP Server executes the API call to Crawlab
  ↓
Spider is created and details are returned to the user
```

**Run a Task:**

```
User: "Run the 'Product Scraper' spider on all available nodes"
  ↓
LLM calls the run_spider tool with appropriate parameters
  ↓
MCP Server sends the command to Crawlab API
  ↓
Task is started and confirmation is returned to the user
```
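The flows above come down to one step on the server side: mapping the tool name chosen by the LLM, plus its arguments, onto a Crawlab API call. The sketch below shows that dispatch step under stated assumptions; the registry, handler signatures, and `FakeCrawlabAPI` stand-in are hypothetical, not the server's actual code.

```python
# Hypothetical sketch of the MCP server's dispatch step: the LLM picks a
# tool name and arguments, and the server routes them to a handler that
# talks to the Crawlab API. Handler signatures are illustrative only.

def create_spider(api, name, project=None):
    return api.post("/spiders", {"name": name, "project": project})

def run_spider(api, spider_id, mode="all-nodes"):
    return api.post(f"/spiders/{spider_id}/run", {"mode": mode})

# Registry mapping tool names (as exposed to the LLM) to handlers.
TOOLS = {
    "create_spider": create_spider,
    "run_spider": run_spider,
}

def execute_tool_call(api, tool_name, arguments):
    """Route one LLM tool call to its handler, failing loudly on unknown tools."""
    if tool_name not in TOOLS:
        raise ValueError(f"Unknown tool: {tool_name}")
    return TOOLS[tool_name](api, **arguments)

class FakeCrawlabAPI:
    """Stand-in for the real API client, used here for demonstration."""
    def post(self, path, body):
        return {"path": path, "body": body}
```

Keeping the tool registry explicit means the set of actions the LLM can trigger is auditable in one place, rather than scattered across the codebase.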
You can interact with the system using natural language commands such as "Create a new spider named 'Product Scraper'" or "Run the 'Product Scraper' spider on all available nodes".
## Available Tools

These are the underlying tools that power the natural language interactions:

- `spiders`: List all spiders
- `tasks`: List all tasks
- `get_spider`: Get details of a specific spider
- `create_spider`: Create a new spider
- `update_spider`: Update an existing spider
- `delete_spider`: Delete a spider
- `get_task`: Get details of a specific task
- `run_spider`: Run a spider
- `cancel_task`: Cancel a running task
- `restart_task`: Restart a task
- `get_task_logs`: Get logs for a task
- `get_spider_files`: List files for a spider
- `get_spider_file`: Get content of a specific file
- `save_spider_file`: Save content to a file
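Each of these tools ultimately becomes an authenticated HTTP request against `CRAWLAB_API_BASE_URL`. As a sketch of how such a request might be assembled, here is a small stdlib-only helper; the endpoint path and `Authorization` header scheme shown are assumptions for illustration, so consult the Crawlab API spec for the real contract.

```python
import json
import urllib.request

def build_request(base_url, token, method, path, body=None):
    """Assemble (but do not send) an authenticated request to the Crawlab API.

    The path layout and Authorization header used here are assumptions
    for illustration, not Crawlab's documented API contract.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", token)
    req.add_header("Content-Type", "application/json")
    return req
```

Separating request construction from sending keeps the URL and auth logic trivially testable without a live Crawlab instance.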