mcp-jinaai-reader

Local 2025-08-31 23:31:29 0

Integrates Jina.ai's Reader API with LLMs for efficient and structured web content extraction, optimized for documentation and web content analysis.



⚠️ Notice

This repository is no longer maintained.

The functionality of this tool is now available in mcp-omnisearch, which combines multiple MCP tools in one unified package.

Please use mcp-omnisearch instead.


A Model Context Protocol (MCP) server that connects LLMs to Jina.ai's Reader API. It exposes a single tool, read_url, which converts a web page into clean, structured text suited to documentation and general web content analysis.

Features

  • Advanced web content extraction through Jina.ai Reader API
  • Fast and efficient content retrieval
  • Complete text extraction with preserved structure
  • Clean format optimized for LLMs
  • Support for various content types including documentation
  • Built on the Model Context Protocol

Configuration

This server requires configuration through your MCP client. Here are examples for different environments:

Cline Configuration

Add this to your Cline MCP settings:

{
    "mcpServers": {
        "jinaai-reader": {
            "command": "npx",
            "args": ["-y", "mcp-jinaai-reader"],
            "env": {
                "JINAAI_API_KEY": "your-jinaai-api-key"
            }
        }
    }
}

Claude Desktop with WSL Configuration

For WSL environments, add this to your Claude Desktop configuration:

{
    "mcpServers": {
        "jinaai-reader": {
            "command": "wsl.exe",
            "args": [
                "bash",
                "-c",
                "JINAAI_API_KEY=your-jinaai-api-key npx mcp-jinaai-reader"
            ]
        }
    }
}
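If your client configuration already lists other servers, the new entry has to be merged under the existing mcpServers key rather than replacing the file. The following is a minimal sketch of that merge; the helper withServer and the "other-server" entry are hypothetical and exist only for illustration.

```typescript
// Shape of one server entry, as used in the configuration examples above.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpClientConfig {
  mcpServers?: Record<string, McpServerEntry>;
}

// Hypothetical helper: returns a new config with `entry` stored under `name`,
// keeping any servers that were already configured. The input is not mutated.
function withServer(
  config: McpClientConfig,
  name: string,
  entry: McpServerEntry,
): McpClientConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// "other-server" is a made-up placeholder for a server you already use.
const existing: McpClientConfig = {
  mcpServers: {
    "other-server": { command: "npx", args: ["-y", "other-server"] },
  },
};

const merged = withServer(existing, "jinaai-reader", {
  command: "wsl.exe",
  args: ["bash", "-c", "JINAAI_API_KEY=your-jinaai-api-key npx mcp-jinaai-reader"],
});

console.log(JSON.stringify(merged, null, 4));
```

The printed object can be pasted back into the client's settings file; both server entries are preserved.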

Environment Variables

The server requires the following environment variable:

  • JINAAI_API_KEY: Your Jina.ai API key (required)
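
Because the key is required, a wrapper or launcher script may want to fail fast with a clear message when it is missing. The sketch below shows one way to do that; requireApiKey is a hypothetical helper, and the source does not show how the server itself reads or validates the variable.

```typescript
// Hypothetical helper, not taken from the server's source code.
// Returns the trimmed key, or throws if the variable is unset or blank.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["JINAAI_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "JINAAI_API_KEY is not set; add it to the env block of your MCP client configuration.",
    );
  }
  return key;
}

// Example with an explicit environment object.
const apiKey = requireApiKey({ JINAAI_API_KEY: "your-jinaai-api-key" });
console.log(`Loaded a key of length ${apiKey.length}`);
```

In a Node.js entry point the same function could be called as requireApiKey(process.env).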

API

The server implements a single MCP tool with configurable parameters:

read_url

Convert any URL to LLM-friendly text using Jina.ai Reader.

Parameters:

  • url (string, required): URL to process
  • no_cache (boolean, optional): Bypass cache for fresh results. Defaults to false
  • format (string, optional): Response format ("json" or "stream"). Defaults to "json"
  • timeout (number, optional): Maximum time in seconds to wait for webpage load
  • target_selector (string, optional): CSS selector to focus on specific elements
  • wait_for_selector (string, optional): CSS selector to wait for specific elements
  • remove_selector (string, optional): CSS selector to exclude specific elements
  • with_links_summary (boolean, optional): Gather all links at the end of response
  • with_images_summary (boolean, optional): Gather all images at the end of response
  • with_generated_alt (boolean, optional): Add alt text to images lacking captions
  • with_iframe (boolean, optional): Include iframe content in response
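
MCP clients invoke a tool by sending a JSON-RPC 2.0 tools/call request. Most clients build this message for you, but seeing it spelled out can help when debugging. The sketch below types the parameters listed above and assembles such a request; buildReadUrlCall is a hypothetical helper, and the URL and selectors are placeholder values.

```typescript
// Arguments accepted by read_url, mirroring the parameter list above.
interface ReadUrlArgs {
  url: string;
  no_cache?: boolean;
  format?: "json" | "stream";
  timeout?: number; // seconds
  target_selector?: string;
  wait_for_selector?: string;
  remove_selector?: string;
  with_links_summary?: boolean;
  with_images_summary?: boolean;
  with_generated_alt?: boolean;
  with_iframe?: boolean;
}

// Hypothetical helper: builds the JSON-RPC 2.0 message an MCP client
// sends to call the read_url tool.
function buildReadUrlCall(id: number, args: ReadUrlArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "read_url", arguments: args },
  };
}

// Placeholder values: keep only the <main> element, drop navigation and
// footer, and collect all links at the end of the response.
const request = buildReadUrlCall(1, {
  url: "https://example.com/docs",
  target_selector: "main",
  remove_selector: "nav, footer",
  with_links_summary: true,
});

console.log(JSON.stringify(request, null, 2));
```

Omitted optional parameters are simply left out of arguments, so the documented defaults (no_cache false, format "json") apply.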

Development

Setup

  1. Clone the repository
  2. Install dependencies:
     npm install
  3. Build the project:
     npm run build
  4. Run in development mode:
     npm run dev

Publishing

  1. Update version in package.json
  2. Build the project:
     npm run build
  3. Publish to npm:
     npm publish

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see the LICENSE file for details.

Acknowledgments

[
  {
    "description": "Convert any URL to LLM-friendly text using Jina.ai Reader",
    "inputSchema": {
      "properties": {
        "format": {
          "default": "json",
          "description": "Response format (json or stream)",
          "enum": [
            "json",
            "stream"
          ],
          "type": "string"
        },
        "no_cache": {
          "default": false,
          "description": "Bypass cache for fresh results",
          "type": "boolean"
        },
        "remove_selector": {
          "description": "CSS selector to exclude specific elements",
          "type": "string"
        },
        "target_selector": {
          "description": "CSS selector to focus on specific elements",
          "type": "string"
        },
        "timeout": {
          "description": "Maximum time in seconds to wait for webpage load",
          "type": "number"
        },
        "url": {
          "description": "URL to process",
          "type": "string"
        },
        "wait_for_selector": {
          "description": "CSS selector to wait for specific elements",
          "type": "string"
        },
        "with_generated_alt": {
          "description": "Add alt text to images lacking captions",
          "type": "boolean"
        },
        "with_iframe": {
          "description": "Include iframe content in response",
          "type": "boolean"
        },
        "with_images_summary": {
          "description": "Gather all images at the end of response",
          "type": "boolean"
        },
        "with_links_summary": {
          "description": "Gather all links at the end of response",
          "type": "boolean"
        }
      },
      "required": [
        "url"
      ],
      "type": "object"
    },
    "name": "read_url"
  }
]
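
The schema above can be used to check arguments on the client side before a call is sent. The following is a minimal hand-written sketch covering only what this schema states (the required url, the primitive types, and the format enum); validateReadUrlArgs is a hypothetical helper, and a general-purpose JSON Schema validator would be the more complete choice.

```typescript
// Expected primitive type for each property in the read_url input schema.
const PROPERTY_TYPES: Record<string, "string" | "number" | "boolean"> = {
  url: "string",
  no_cache: "boolean",
  format: "string",
  timeout: "number",
  target_selector: "string",
  wait_for_selector: "string",
  remove_selector: "string",
  with_links_summary: "boolean",
  with_images_summary: "boolean",
  with_generated_alt: "boolean",
  with_iframe: "boolean",
};

// Allowed values for "format", taken from the schema's enum.
const FORMATS = ["json", "stream"];

// Hypothetical helper: returns a list of problems; an empty list means
// the arguments satisfy the checks implemented here.
function validateReadUrlArgs(args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  if (args["url"] === undefined) {
    errors.push("missing required property: url");
  }
  for (const [key, value] of Object.entries(args)) {
    const expected = PROPERTY_TYPES[key];
    // The schema does not forbid extra properties, so unknown keys are skipped.
    if (expected === undefined) continue;
    if (typeof value !== expected) {
      errors.push(`${key}: expected ${expected}, got ${typeof value}`);
    }
  }
  const format = args["format"];
  if (typeof format === "string" && !FORMATS.includes(format)) {
    errors.push(`format: must be one of ${FORMATS.join(", ")}`);
  }
  return errors;
}

console.log(validateReadUrlArgs({ url: "https://example.com", timeout: 10 }));
console.log(validateReadUrlArgs({ timeout: "10", format: "xml" }));
```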