IB MCP Cache Server
A Model Context Protocol (MCP) server that reduces token consumption by efficiently caching data between language model interactions. Works with any MCP client and any language model that uses tokens.
1. Clone the repository:
git clone git@github.com:ibproduct/ib-mcp-cache-server
cd ib-mcp-cache-server
2. Install dependencies:
npm install
3. Build the project:
npm run build
4. Add the server to your MCP client settings:
{
"mcpServers": {
"memory-cache": {
"command": "node",
"args": ["/path/to/ib-mcp-cache-server/build/index.js"]
}
}
}
The server starts automatically when you use your MCP client.

When the server is running properly, you'll see:
1. A message in the terminal: "Memory Cache MCP server running on stdio"
2. Improved performance when accessing the same data multiple times
3. No action required from you - the caching happens automatically

You can verify the server is running by:
1. Opening your MCP client
2. Looking for any error messages in the terminal where you started the server
3. Performing operations that would benefit from caching (like reading the same file multiple times)
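You can also launch the built server directly to confirm it starts; it should print the startup message above and then wait for an MCP client on stdio (use the same path you configured in your client settings):

node /path/to/ib-mcp-cache-server/build/index.js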
The server can be configured through config.json or environment variables. A config.json using the default values looks like this:
{
"maxEntries": 1000,
"maxMemory": 104857600,
"defaultTTL": 3600,
"checkInterval": 60000,
"statsInterval": 30000
}
Each setting controls the following:
- maxEntries (default: 1000): maximum number of items in the cache. When exceeded, the oldest unused items are removed first.
- maxMemory (default: 100MB, i.e. 104857600 bytes): maximum memory usage in bytes. When exceeded, the least recently used items are removed.
- defaultTTL (default: 1 hour, i.e. 3600 seconds): default time-to-live for cached entries. Prevents stale data from consuming memory.
- checkInterval (default: 1 minute, i.e. 60000 milliseconds): how often expired entries are cleaned up. Higher values reduce CPU usage.
- statsInterval (default: 30 seconds, i.e. 30000 milliseconds): how often cache statistics are updated.
The memory cache server reduces token consumption by automatically storing data that would otherwise need to be re-sent between you and the language model. You don't need to do anything special - the caching happens automatically when you interact with any language model through your MCP client.
Here are some examples of what gets cached:
When reading a file multiple times:
- First time: the full file content is read and cached
- Subsequent times: the content is retrieved from the cache instead of re-reading the file
- Result: fewer tokens used for repeated file operations

When performing calculations or analysis:
- First time: the full computation is performed and the results are cached
- Subsequent times: the results are retrieved from the cache if the input is the same
- Result: fewer tokens used for repeated computations

When the same data is needed multiple times:
- First time: the data is processed and cached
- Subsequent times: the data is retrieved from the cache until its TTL expires
- Result: fewer tokens used for accessing the same information
The server automatically manages the caching process by:
- Storing data when first encountered
- Serving cached data when available
- Removing old or unused data based on your settings
- Tracking effectiveness through statistics
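To make those eviction rules concrete, here is a minimal TTL-plus-LRU cache sketch in TypeScript. It is illustrative only and does not mirror the server's actual source; the class and field names are invented for the example, but the behavior (TTL expiry, eviction of the oldest unused entries when maxEntries or maxMemory is exceeded) follows the settings described in the Configuration section.

// Illustrative sketch only - not the server's real implementation.
interface CacheEntry {
  value: unknown;
  expiresAt: number; // epoch milliseconds after which the entry is stale
  size: number;      // approximate size in bytes
}

class MemoryCacheSketch {
  // A Map preserves insertion order, which we use as the LRU order.
  private entries = new Map<string, CacheEntry>();
  private usedBytes = 0;

  constructor(
    private maxEntries = 1000,
    private maxMemory = 104857600, // 100MB
    private defaultTTL = 3600      // seconds
  ) {}

  set(key: string, value: unknown, ttlSeconds = this.defaultTTL): void {
    const size = Buffer.byteLength(JSON.stringify(value));
    this.remove(key); // replace any existing entry under the same key
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000, size });
    this.usedBytes += size;
    this.evictIfNeeded();
  }

  get(key: string): unknown | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt < Date.now()) {
      this.remove(key); // TTL expired: drop the stale entry
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  private remove(key: string): void {
    const entry = this.entries.get(key);
    if (entry) {
      this.usedBytes -= entry.size;
      this.entries.delete(key);
    }
  }

  private evictIfNeeded(): void {
    // Evict the least recently used entries until both limits are satisfied.
    while (this.entries.size > this.maxEntries || this.usedBytes > this.maxMemory) {
      const oldest = this.entries.keys().next().value as string;
      this.remove(oldest);
    }
  }
}

A full implementation would also run a periodic sweep of expired entries (the checkInterval setting) and keep hit/miss counters for get_cache_stats.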
You can override config.json settings using environment variables in your MCP settings. For example, the configuration below raises the limits to 5000 entries and 200MB of memory, with a 2-hour TTL, a 2-minute cleanup interval, and a 1-minute stats interval:
{
"mcpServers": {
"memory-cache": {
"command": "node",
"args": ["/path/to/build/index.js"],
"env": {
"MAX_ENTRIES": "5000",
"MAX_MEMORY": "209715200",
"DEFAULT_TTL": "7200",
"CHECK_INTERVAL": "120000",
"STATS_INTERVAL": "60000"
}
}
}
}
You can also point the server at a custom config file by setting CONFIG_PATH in the same env block:
{
"env": {
"CONFIG_PATH": "/path/to/your/config.json"
}
}
The server will:
1. Look for config.json in its directory (or at CONFIG_PATH, if set)
2. Apply any environment variable overrides
3. Use default values if neither is specified
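As a rough sketch of that resolution order in TypeScript (the helper below is illustrative, not the server's actual code; only the environment variable names come from the documentation above):

import { existsSync, readFileSync } from "fs";
import { dirname, join } from "path";
import { fileURLToPath } from "url";

// Built-in defaults, overridden by config.json (or CONFIG_PATH), overridden by env vars.
const defaults = {
  maxEntries: 1000,
  maxMemory: 104857600,
  defaultTTL: 3600,
  checkInterval: 60000,
  statsInterval: 30000,
};

function loadConfig() {
  const serverDir = dirname(fileURLToPath(import.meta.url));
  const configPath = process.env.CONFIG_PATH ?? join(serverDir, "config.json");
  const fileConfig = existsSync(configPath)
    ? JSON.parse(readFileSync(configPath, "utf-8"))
    : {};

  // Only apply environment variables that are actually set.
  const envConfig: Record<string, number> = {};
  if (process.env.MAX_ENTRIES) envConfig.maxEntries = Number(process.env.MAX_ENTRIES);
  if (process.env.MAX_MEMORY) envConfig.maxMemory = Number(process.env.MAX_MEMORY);
  if (process.env.DEFAULT_TTL) envConfig.defaultTTL = Number(process.env.DEFAULT_TTL);
  if (process.env.CHECK_INTERVAL) envConfig.checkInterval = Number(process.env.CHECK_INTERVAL);
  if (process.env.STATS_INTERVAL) envConfig.statsInterval = Number(process.env.STATS_INTERVAL);

  // Later spreads win, giving environment variables the highest precedence.
  return { ...defaults, ...fileConfig, ...envConfig };
}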
To see the cache in action, try these scenarios:
1. File Reading Test: ask about the contents of a file, then ask the same question again. The second response should be faster as the file content is cached.
2. Data Analysis Test: run an analysis on some data, then request the same analysis again. The second analysis should use cached results.
3. Project Navigation Test: ask about a project's structure, then repeat the question; unchanged structure should be answered from the cache.
The cache is working when you notice:
- Faster responses for repeated operations
- Consistent answers about unchanged content
- No need to re-read files that haven't changed
The server exposes the following MCP tools, listed here with their input schemas:
[
{
"description": "Store data in the cache with optional TTL",
"inputSchema": {
"properties": {
"key": {
"description": "Unique identifier for the cached data",
"type": "string"
},
"ttl": {
"description": "Time-to-live in seconds (optional)",
"type": "number"
},
"value": {
"description": "Data to cache",
"type": "any"
}
},
"required": [
"key",
"value"
],
"type": "object"
},
"name": "store_data"
},
{
"description": "Retrieve data from the cache",
"inputSchema": {
"properties": {
"key": {
"description": "Key of the cached data to retrieve",
"type": "string"
}
},
"required": [
"key"
],
"type": "object"
},
"name": "retrieve_data"
},
{
"description": "Clear specific or all cache entries",
"inputSchema": {
"properties": {
"key": {
"description": "Specific key to clear (optional - clears all if not provided)",
"type": "string"
}
},
"type": "object"
},
"name": "clear_cache"
},
{
"description": "Get cache statistics",
"inputSchema": {
"properties": {},
"type": "object"
},
"name": "get_cache_stats"
}
]
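The language model normally calls these tools on its own, so you do not need to invoke them yourself. For illustration only, tool-call arguments shaped to match the schemas above might look like this in TypeScript (the key and value shown are invented for the example, not names the server expects):

// Hypothetical example arguments matching the schemas above.
const storeCall = {
  name: "store_data",
  arguments: {
    key: "project:file-list",             // any unique string
    value: ["src/index.ts", "README.md"], // any JSON-serializable data
    ttl: 600,                             // optional, in seconds
  },
};

const retrieveCall = {
  name: "retrieve_data",
  arguments: { key: "project:file-list" },
};

const clearCall = {
  name: "clear_cache",
  arguments: {}, // omit "key" to clear every entry
};

const statsCall = {
  name: "get_cache_stats",
  arguments: {},
};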