google ocr mcp server


Cloud Platforms @Zerohertz/google-ocr-mcp-server

This server performs Optical Character Recognition (OCR) using the Google Cloud Vision API. It is built on the **FastMCP** framework, which supports building modular, extensible tools.

Components

Resources

The server implements a simple note storage system with:

Custom note:// URI scheme for accessing individual notes
Each note resource has a name, description and text/plain mimetype

Prompts

The server provides a single prompt:

summarize-notes: Creates summaries of all stored notes
Optional "style" argument to control detail level (brief/detailed)
Generates prompt combining all current notes with style preference

Tools

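The server implements one tool:

ocr: Performs OCR on an image file using the Google Cloud Vision API
Takes "path" (the absolute file path to the image) as a required string argument
Returns the extracted text; if SAVE_RESULTS is enabled, also saves the OCR results as a JSON file next to the input image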

Configuration

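The server is configured through environment variables, set in the "env" block of the client configuration shown under Quickstart:

GOOGLE_APPLICATION_CREDENTIALS: Path to the Google Cloud credentials JSON file
SAVE_RESULTS: If enabled, OCR results are saved as a JSON file in the same directory as the input image, with the same name but a .json extension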
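The two environment variables used in this README's configuration examples are GOOGLE_APPLICATION_CREDENTIALS and SAVE_RESULTS. Below is a minimal sketch of how a server could read them and derive the result file location described in the tool's notes. It assumes SAVE_RESULTS is a boolean-like string; the names `Settings`, `load_settings`, and `result_path_for` are hypothetical and not taken from the project's source.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Assumption: which strings count as "enabled" is not documented by the project.
TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    credentials_path: Optional[str]
    save_results: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the two variables from an environment mapping."""
    raw = env.get("SAVE_RESULTS", "false")
    return Settings(
        credentials_path=env.get("GOOGLE_APPLICATION_CREDENTIALS"),
        save_results=raw.strip().lower() in TRUE_VALUES,
    )


def result_path_for(image_path: str) -> Path:
    """Same directory and name as the image, with a .json extension."""
    return Path(image_path).with_suffix(".json")
```

Passing the environment as a parameter keeps the parsing easy to exercise with a plain dictionary instead of the real process environment.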

Quickstart

Install

Claude Desktop

On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%/Claude/claude_desktop_config.json
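The two locations above can be resolved programmatically. This is a small sketch using only the standard library; the function name `claude_config_path` is hypothetical, and other platforms are deliberately left unsupported because this README does not list a path for them.

```python
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform: str = sys.platform) -> Path:
    """Return the Claude Desktop config path for macOS or Windows."""
    if platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform.startswith("win"):
        appdata = os.environ.get("APPDATA")
        if not appdata:
            raise RuntimeError("APPDATA is not set")
        return Path(appdata) / "Claude" / CONFIG_NAME
    # Only the two platforms named in this README are handled.
    raise NotImplementedError(f"no known config location for {platform!r}")
```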

Development/Unpublished Servers Configuration

{
  "mcpServers": {
    "google-ocr-mcp-server": {
      "command": "uv",
      "args": ["run", "google-ocr-mcp-server"],
      "env": {
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/google-application-credentials.json",
        "SAVE_RESULTS": "false"
      }
    }
  }
}

Published Servers Configuration

{
  "mcpServers": {
    "google-ocr-mcp-server": {
      "command": "uvx",
      "args": ["google-ocr-mcp-server"],
      "env": {
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/google-application-credentials.json",
        "SAVE_RESULTS": "false"
      }
    }
  }
}
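Editing the configuration file by hand risks overwriting other servers that are already registered. The sketch below merges the published-server entry into an existing file instead. The helper name `add_server_entry` is hypothetical. Because environment variables are strings, the sketch writes SAVE_RESULTS as the string "true" or "false"; how the server interprets that value is an assumption.

```python
import json
from pathlib import Path

SERVER_NAME = "google-ocr-mcp-server"


def add_server_entry(config_path: Path, credentials: str, save_results: bool = False) -> dict:
    """Add or replace the OCR server entry, keeping other entries intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    servers = config.setdefault("mcpServers", {})
    servers[SERVER_NAME] = {
        "command": "uvx",
        "args": [SERVER_NAME],
        "env": {
            "GOOGLE_APPLICATION_CREDENTIALS": credentials,
            # Environment values are written as strings.
            "SAVE_RESULTS": "true" if save_results else "false",
        },
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For the development setup, the same approach applies with "command" set to "uv" and "args" set to ["run", "google-ocr-mcp-server"].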

Installing via Smithery

To install google-ocr-mcp-server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @Zerohertz/google-ocr-mcp-server --client claude

Development

Building and Publishing

To prepare the package for distribution:

Sync dependencies and update lockfile:

uv sync

Build package distributions:

uv build

This will create source and wheel distributions in the dist/ directory.

Publish to PyPI:

uv publish

Note: You'll need to set PyPI credentials via environment variables or command flags:

Token: --token or UV_PUBLISH_TOKEN
Or username/password: --username/UV_PUBLISH_USERNAME and --password/UV_PUBLISH_PASSWORD

Debugging

Since MCP servers run over stdio, debugging can be challenging. For the best debugging experience, we strongly recommend using the MCP Inspector.

You can launch the MCP Inspector via npx with this command, replacing the --directory path with the location of your local checkout:

npx @modelcontextprotocol/inspector uv --directory /Users/zerohertz/Downloads/google-ocr-mcp-server run google-ocr-mcp-server

Upon launching, the Inspector will display a URL that you can access in your browser to begin debugging.
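Part of what makes stdio servers awkward to debug is that stdout carries the protocol messages, so stray prints can corrupt the stream. A common workaround is to send diagnostics to stderr. The sketch below uses only the standard library logging module; the helper name `make_stderr_logger` is hypothetical, and this is not necessarily how this project configures its logging.

```python
import logging
import sys


def make_stderr_logger(name: str = "google-ocr-mcp-server") -> logging.Logger:
    """Return a logger that writes to stderr, leaving stdout untouched."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records away from root handlers that may use stdout
    return logger
```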

[
  {
    "description": "Perform Optical Character Recognition (OCR) on the provided image file. Args: path (str): The absolute file path to the image on which OCR will be performed. Returns: str: The extracted text from the image. Raises: Exception: If an error occurs during the OCR process, it will be logged. Notes: - The function uses Google Cloud Vision API for text detection. - If SAVE_RESULTS is enabled, the OCR results will be saved as a JSON file in the same directory as the input image, with the same name but a .json extension.",
    "inputSchema": {
      "properties": {
        "path": {
          "title": "Path",
          "type": "string"
        }
      },
      "required": [
        "path"
      ],
      "title": "ocrArguments",
      "type": "object"
    },
    "name": "ocr"
  }
]
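The listing above is the tool definition the server advertises: a single tool named ocr with one required string argument, path. As a rough illustration, the sketch below checks arguments against that schema and builds the JSON-RPC tools/call request an MCP client would send. The helper names `validate_arguments` and `build_ocr_call` are hypothetical, and the validation covers only the "required" and "type": "string" rules that appear in this schema, not JSON Schema in general.

```python
import json
from typing import Any, Dict

# Input schema copied from the tool listing above.
OCR_INPUT_SCHEMA: Dict[str, Any] = {
    "properties": {"path": {"title": "Path", "type": "string"}},
    "required": ["path"],
    "title": "ocrArguments",
    "type": "object",
}


def validate_arguments(arguments: Dict[str, Any], schema: Dict[str, Any] = OCR_INPUT_SCHEMA) -> None:
    """Raise if a required key is missing or a string property has another type."""
    for key in schema["required"]:
        if key not in arguments:
            raise ValueError(f"missing required argument: {key}")
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec and spec.get("type") == "string" and not isinstance(value, str):
            raise TypeError(f"argument {key!r} must be a string")


def build_ocr_call(path: str, request_id: int = 1) -> str:
    """Serialize a tools/call request for the ocr tool."""
    arguments = {"path": path}
    validate_arguments(arguments)
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ocr", "arguments": arguments},
    }
    return json.dumps(request)
```

In practice the MCP Inspector or a client library sends this request for you; the sketch only shows the shape of the message.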