sourcesyncai mcp
A Model Context Protocol server that enables AI models to interact with SourceSync.ai's knowledge management platform for managing documents, ingesting content from various sources, and performing semantic searches.
A Model Context Protocol (MCP) server implementation for the SourceSync.ai API. This server allows AI models to interact with SourceSync.ai's knowledge management platform through a standardized interface.
# Install and run with your API key (optionally also set SOURCESYNC_TENANT_ID)
env SOURCESYNC_API_KEY=your_api_key npx -y sourcesyncai-mcp
To install sourcesyncai-mcp for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @pbteja1998/sourcesyncai-mcp --client claude
# Clone the repository
git clone https://github.com/yourusername/sourcesyncai-mcp.git
cd sourcesyncai-mcp
# Install dependencies
npm install
# Build the project
npm run build
# Run the server
node dist/index.js
To configure SourceSync.ai MCP in Cursor:
Go to Features > MCP Servers.
Click "+ Add New MCP Server".
Enter the following:
Name: sourcesyncai-mcp (or your preferred name)
Type: command
Command: env SOURCESYNC_API_KEY=your-api-key npx -y sourcesyncai-mcp
After adding, you can use SourceSync.ai tools with Cursor's AI features by describing your knowledge management needs.
Add this to your ./codeium/windsurf/model_config.json:
{
"mcpServers": {
"sourcesyncai-mcp": {
"command": "npx",
"args": ["-y", "sourcesyncai-mcp"],
"env": {
"SOURCESYNC_API_KEY": "your_api_key",
"SOURCESYNC_NAMESPACE_ID": "your_namespace_id",
"SOURCESYNC_TENANT_ID": "your_tenant_id"
}
}
}
}
To use this MCP server with Claude Desktop:
Locate the Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
Edit the configuration file to add the SourceSync.ai MCP server:
{
"mcpServers": {
"sourcesyncai-mcp": {
"command": "npx",
"args": ["-y", "sourcesyncai-mcp"],
"env": {
"SOURCESYNC_API_KEY": "your_api_key",
"SOURCESYNC_NAMESPACE_ID": "your_namespace_id",
"SOURCESYNC_TENANT_ID": "your_tenant_id"
}
}
}
}
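If the configuration file already lists other MCP servers, editing it by hand risks overwriting them. The following is a minimal sketch of merging the entry above into an existing file; the helper name `add_sourcesync_server` is hypothetical and not part of sourcesyncai-mcp, and the path you pass in should be the one for your platform.

```python
import json
from pathlib import Path


def add_sourcesync_server(config_path: Path, api_key: str,
                          namespace_id: str = "", tenant_id: str = "") -> dict:
    """Merge a sourcesyncai-mcp entry into a Claude Desktop config file."""
    # Start from the existing file so other configured servers are preserved.
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    env = {"SOURCESYNC_API_KEY": api_key}
    # Only write the optional variables when a value was supplied.
    if namespace_id:
        env["SOURCESYNC_NAMESPACE_ID"] = namespace_id
    if tenant_id:
        env["SOURCESYNC_TENANT_ID"] = tenant_id

    config.setdefault("mcpServers", {})["sourcesyncai-mcp"] = {
        "command": "npx",
        "args": ["-y", "sourcesyncai-mcp"],
        "env": env,
    }
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

Restart Claude Desktop after changing the file so that it picks up the new server entry.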
SOURCESYNC_API_KEY: Your SourceSync.ai API key (required)
SOURCESYNC_NAMESPACE_ID: Default namespace ID to use for operations
SOURCESYNC_TENANT_ID: Your tenant ID (optional)
Basic configuration with default values:
export SOURCESYNC_API_KEY=your_api_key
export SOURCESYNC_TENANT_ID=your_tenant_id
export SOURCESYNC_NAMESPACE_ID=your_namespace_id
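Before launching the server from a script, it can help to confirm that the required variable is present. The sketch below assumes only what is listed above (the API key is required, the other two are optional); `check_environment` is a hypothetical helper, not something the package provides.

```python
import os

REQUIRED = ("SOURCESYNC_API_KEY",)
OPTIONAL = ("SOURCESYNC_NAMESPACE_ID", "SOURCESYNC_TENANT_ID")


def check_environment(environ=None) -> dict:
    """Return the SourceSync settings that are set, or raise if a required one is missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    # Keep only variables that actually have a value.
    return {name: environ[name] for name in REQUIRED + OPTIONAL if environ.get(name)}
```

Calling `check_environment()` with no argument reads the current process environment; passing a dict makes the check easy to exercise in isolation.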
validate_api_key: Validate a SourceSync.ai API key
{
"name": "validate_api_key",
"arguments": {}
}
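The name/arguments objects shown in this document are the payload of a tool call. An MCP client normally wraps them for you; as a rough sketch, assuming the JSON-RPC 2.0 `tools/call` method defined by the Model Context Protocol, the wrapped message could be built like this (`tool_call_request` is a hypothetical helper):

```python
import itertools
import json
from typing import Optional

# Monotonic request ids, as JSON-RPC expects each request to carry its own id.
_request_ids = itertools.count(1)


def tool_call_request(name: str, arguments: Optional[dict] = None) -> str:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(message)
```

For example, `tool_call_request("validate_api_key")` produces the envelope for the call shown above.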
create_namespace: Create a new namespace
list_namespaces: List all namespaces
get_namespace: Get details of a specific namespace
update_namespace: Update a namespace
delete_namespace: Delete a namespace
{
"name": "create_namespace",
"arguments": {
"name": "my-namespace",
"fileStorageConfig": {
"provider": "S3_COMPATIBLE",
"config": {
"endpoint": "s3.amazonaws.com",
"accessKey": "your_access_key",
"secretKey": "your_secret_key",
"bucket": "your_bucket",
"region": "us-east-1"
}
},
"vectorStorageConfig": {
"provider": "PINECONE",
"config": {
"apiKey": "your_pinecone_api_key",
"environment": "your_environment",
"index": "your_index"
}
},
"embeddingModelConfig": {
"provider": "OPENAI",
"config": {
"apiKey": "your_openai_api_key",
"model": "text-embedding-3-small"
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "list_namespaces",
"arguments": {
"tenantId": "tenant_XXX"
}
}
{
"name": "get_namespace",
"arguments": {
"namespaceId": "namespace_XXX",
"tenantId": "tenant_XXX"
}
}
{
"name": "update_namespace",
"arguments": {
"namespaceId": "namespace_XXX",
"tenantId": "tenant_XXX",
"name": "updated-namespace-name"
}
}
{
"name": "delete_namespace",
"arguments": {
"namespaceId": "namespace_XXX",
"tenantId": "tenant_XXX"
}
}
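The tool schema listed later in this document marks name, fileStorageConfig, vectorStorageConfig, and embeddingModelConfig as required when creating a namespace. A small pre-flight check along these lines can catch an incomplete argument set before the call is made; the function name is hypothetical, and the check covers only the top-level keys, not the nested configuration objects.

```python
# Top-level fields the createNamespace schema lists under "required".
REQUIRED_CREATE_NAMESPACE = (
    "name",
    "fileStorageConfig",
    "vectorStorageConfig",
    "embeddingModelConfig",
)


def missing_namespace_fields(arguments: dict) -> list:
    """Return the required top-level fields that are absent from the arguments."""
    return [key for key in REQUIRED_CREATE_NAMESPACE if key not in arguments]
```

An empty result means every required top-level field is present; tenantId is optional and therefore not checked.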
ingest_text: Ingest text content
ingest_urls: Ingest content from URLs
ingest_sitemap: Ingest content from a sitemap
ingest_website: Ingest content from a website
ingest_notion: Ingest content from Notion
ingest_google_drive: Ingest content from Google Drive
ingest_dropbox: Ingest content from Dropbox
ingest_onedrive: Ingest content from OneDrive
ingest_box: Ingest content from Box
get_ingest_job_run_status: Get the status of an ingestion job run
{
"name": "ingest_text",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "TEXT",
"config": {
"name": "example-document",
"text": "This is an example document for ingestion.",
"metadata": {
"category": "example",
"author": "AI Assistant"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "ingest_urls",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "URLS_LIST",
"config": {
"urls": ["https://example.com/page1", "https://example.com/page2"],
"metadata": {
"source": "web",
"category": "documentation"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "ingest_sitemap",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "SITEMAP",
"config": {
"url": "https://example.com/sitemap.xml",
"metadata": {
"source": "sitemap",
"website": "example.com"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "ingest_website",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "WEBSITE",
"config": {
"url": "https://example.com",
"maxDepth": 3,
"maxLinks": 100,
"metadata": {
"source": "website",
"domain": "example.com"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "ingest_notion",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "NOTION",
"config": {
"connectionId": "your_notion_connection_id",
"metadata": {
"source": "notion",
"workspace": "My Workspace"
}
}
},
"tenantId": "your_tenant_id"
}
}
{
"name": "ingest_google_drive",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "GOOGLE_DRIVE",
"config": {
"connectionId": "connection_XXX",
"metadata": {
"source": "google_drive",
"owner": "user@example.com"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "ingest_dropbox",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "DROPBOX",
"config": {
"connectionId": "connection_XXX",
"metadata": {
"source": "dropbox",
"account": "user@example.com"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "ingest_onedrive",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "ONEDRIVE",
"config": {
"connectionId": "connection_XXX",
"metadata": {
"source": "onedrive",
"account": "user@example.com"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "ingest_box",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestConfig": {
"source": "BOX",
"config": {
"connectionId": "connection_XXX",
"metadata": {
"source": "box",
"owner": "user@example.com"
}
}
},
"tenantId": "tenant_XXX"
}
}
{
"name": "get_ingest_job_run_status",
"arguments": {
"namespaceId": "your_namespace_id",
"ingestJobRunId": "ingest_job_run_XXX",
"tenantId": "tenant_XXX"
}
}
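Ingestion runs as a job, so a caller typically starts it and then checks get_ingest_job_run_status until the job finishes. The sketch below shows one way to poll. Several things in it are assumptions: `call_tool` stands in for whatever function your MCP client uses to invoke a tool, the response is assumed to carry a `status` field, and the terminal status names are borrowed from the document ingestion statuses in the tool schema rather than from documented job-run states.

```python
import time

# Assumed terminal states, borrowed from the document ingestion status enum
# in the tool schema; the actual job-run status values may differ.
TERMINAL_STATUSES = {"SUCCESS", "FAILED", "CANCELLED"}


def wait_for_ingest(call_tool, namespace_id, ingest_job_run_id,
                    interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll get_ingest_job_run_status until a terminal status or the poll limit."""
    arguments = {"namespaceId": namespace_id, "ingestJobRunId": ingest_job_run_id}
    for _ in range(max_polls):
        result = call_tool("get_ingest_job_run_status", arguments)
        status = result.get("status")  # hypothetical response field
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError(
        f"ingest job {ingest_job_run_id} not finished after {max_polls} polls"
    )
```

Bounding the loop with `max_polls` keeps a stuck job from blocking the caller indefinitely; adjust `interval` to the size of the content being ingested.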
getDocuments: Retrieve documents with optional filters
updateDocuments: Update document metadata
deleteDocuments: Delete documents
resyncDocuments: Resync documents
fetchUrlContent: Fetch text content from document URLs
{
"name": "getDocuments",
"arguments": {
"namespaceId": "namespace_XXX",
"tenantId": "tenant_XXX",
"filterConfig": {
"documentTypes": ["PDF"]
},
"includeConfig": {
"parsedTextFileUrl": true
}
}
}
{
"name": "updateDocuments",
"arguments": {
"namespaceId": "namespace_XXX",
"tenantId": "tenant_XXX",
"documentIds": ["doc_XXX", "doc_YYY"],
"filterConfig": {
"documentIds": ["doc_XXX", "doc_YYY"]
},
"data": {
"metadata": {
"status": "reviewed",
"category": "technical"
}
}
}
}
{
"name": "deleteDocuments",
"arguments": {
"namespaceId": "namespace_XXX",
"tenantId": "tenant_XXX",
"documentIds": ["doc_XXX", "doc_YYY"],
"filterConfig": {
"documentIds": ["doc_XXX", "doc_YYY"]
}
}
}
{
"name": "resyncDocuments",
"arguments": {
"namespaceId": "namespace_XXX",
"tenantId": "tenant_XXX",
"documentIds": ["doc_XXX", "doc_YYY"],
"filterConfig": {
"documentIds": ["doc_XXX", "doc_YYY"]
}
}
}
{
"name": "fetchUrlContent",
"arguments": {
"url": "https://api.sourcesync.ai/v1/documents/doc_XXX/content?format=text",
"apiKey": "your_api_key",
"tenantId": "tenant_XXX"
}
}
semantic_search: Perform semantic search
hybrid_search: Perform hybrid search (semantic + keyword)
{
"name": "semantic_search",
"arguments": {
"namespaceId": "your_namespace_id",
"query": "example document",
"topK": 5,
"tenantId": "tenant_XXX"
}
}
{
"name": "hybrid_search",
"arguments": {
"namespaceId": "your_namespace_id",
"query": "example document",
"topK": 5,
"tenantId": "tenant_XXX",
"hybridConfig": {
"semanticWeight": 0.7,
"keywordWeight": 0.3
}
}
}
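This document does not say how SourceSync.ai combines the two weights in hybridConfig. Purely as an illustration of what such weights commonly mean, the sketch below computes a weighted average of a semantic score and a keyword score; the function and its normalization are assumptions, not the platform's actual ranking formula.

```python
def hybrid_score(semantic: float, keyword: float,
                 semantic_weight: float = 0.7, keyword_weight: float = 0.3) -> float:
    """Illustrative weighted average of two relevance scores (assumed formula)."""
    total = semantic_weight + keyword_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Dividing by the total keeps the result on the same scale as the inputs
    # even when the weights do not add up to exactly 1.
    return (semantic_weight * semantic + keyword_weight * keyword) / total
```

Under this reading, a higher semanticWeight favors results that are close in meaning, while a higher keywordWeight favors results containing the literal query terms.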
create_connection: Create a new connection to an external service
list_connections: List all connections
get_connection: Get details of a specific connection
update_connection: Update a connection
revoke_connection: Revoke a connection
{
"name": "create_connection",
"arguments": {
"tenantId": "tenant_XXX",
"namespaceId": "namespace_XXX",
"name": "My Connection",
"connector": "GOOGLE_DRIVE",
"clientRedirectUrl": "https://your-app.com/callback"
}
}
{
"name": "list_connections",
"arguments": {
"tenantId": "tenant_XXX",
"namespaceId": "namespace_XXX"
}
}
{
"name": "get_connection",
"arguments": {
"tenantId": "tenant_XXX",
"namespaceId": "namespace_XXX",
"connectionId": "connection_XXX"
}
}
{
"name": "update_connection",
"arguments": {
"tenantId": "tenant_XXX",
"namespaceId": "namespace_XXX",
"connectionId": "connection_XXX",
"name": "Updated Connection Name",
"clientRedirectUrl": "https://your-app.com/updated-callback"
}
}
{
"name": "revoke_connection",
"arguments": {
"tenantId": "tenant_XXX",
"namespaceId": "namespace_XXX",
"connectionId": "connection_XXX"
}
}
Here are some example prompts you can use with Claude or Cursor after configuring the MCP server:
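If you encounter issues connecting the SourceSync.ai MCP server:
Check that the server file has execute permissions (chmod +x dist/index.js).
Test the server by running it directly from the command line: node /path/to/sourcesyncai-mcp/dist/index.js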
For detailed logging, add the DEBUG environment variable:
src/index.ts: Main entry point and server setup
src/schemas.ts: Schema definitions for all tools
src/sourcesync.ts: Client for interacting with SourceSync.ai API
src/sourcesync.types.ts: TypeScript type definitions
# Build the project
npm run build
# Run tests
npm test
MIT
Document content retrieval workflow:
First, use getDocuments with includeConfig.parsedTextFileUrl: true to get documents with their content URLs.
Then, use fetchUrlContent to retrieve the actual content:
{
"name": "fetchUrlContent",
"arguments": {
"url": "https://example.com"
}
}
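The two-step retrieval workflow above can be sketched as a single helper. This is an outline under stated assumptions: `call_tool` is a stand-in for your MCP client's tool-invocation function, and the response shape (a `documents` list whose items carry `id` and `parsedTextFileUrl`) is hypothetical and should be checked against real responses.

```python
def fetch_document_texts(call_tool, namespace_id, document_ids):
    """Look up documents, then fetch the parsed text behind each content URL."""
    listing = call_tool("getDocuments", {
        "namespaceId": namespace_id,
        "filterConfig": {"documentIds": document_ids},
        # Ask the server to include the URL of each document's parsed text.
        "includeConfig": {"parsedTextFileUrl": True},
    })
    texts = {}
    for doc in listing.get("documents", []):  # hypothetical response shape
        url = doc.get("parsedTextFileUrl")
        if not url:
            continue  # skip documents that have no parsed text yet
        texts[doc["id"]] = call_tool("fetchUrlContent", {"url": url})
    return texts
```

Documents without a content URL are skipped rather than treated as errors, since a document that is still being processed may not have parsed text available.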
[
{
"description": "Validates the API key by attempting to list namespaces. Returns the list of namespaces if successful.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {},
"type": "object"
},
"name": "validateApiKey"
},
{
"description": "Creates a new namespace with the provided configuration. Requires a name, file storage configuration, vector storage configuration, and embedding model configuration.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"embeddingModelConfig": {
"anyOf": [
{
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"model": {
"enum": [
"text-embedding-3-small",
"text-embedding-3-large",
"text-embedding-ada-002"
],
"type": "string"
},
"provider": {
"const": "OPENAI",
"type": "string"
}
},
"required": [
"provider",
"model",
"apiKey"
],
"type": "object"
},
{
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"model": {
"enum": [
"embed-english-v3.0",
"embed-multilingual-v3.0",
"embed-english-light-v3.0",
"embed-multilingual-light-v3.0",
"embed-english-v2.0",
"embed-english-light-v2.0",
"embed-multilingual-v2.0"
],
"type": "string"
},
"provider": {
"const": "COHERE",
"type": "string"
}
},
"required": [
"provider",
"model",
"apiKey"
],
"type": "object"
},
{
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"model": {
"enum": [
"jina-embeddings-v3"
],
"type": "string"
},
"provider": {
"const": "JINA",
"type": "string"
}
},
"required": [
"provider",
"model",
"apiKey"
],
"type": "object"
}
]
},
"fileStorageConfig": {
"additionalProperties": false,
"properties": {
"bucket": {
"type": "string"
},
"credentials": {
"additionalProperties": false,
"properties": {
"accessKeyId": {
"type": "string"
},
"secretAccessKey": {
"type": "string"
}
},
"required": [
"accessKeyId",
"secretAccessKey"
],
"type": "object"
},
"endpoint": {
"type": "string"
},
"region": {
"type": "string"
},
"type": {
"enum": [
"S3_COMPATIBLE"
],
"type": "string"
}
},
"required": [
"type",
"bucket",
"region",
"endpoint",
"credentials"
],
"type": "object"
},
"name": {
"type": "string"
},
"tenantId": {
"type": "string"
},
"vectorStorageConfig": {
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"indexHost": {
"type": "string"
},
"provider": {
"enum": [
"PINECONE"
],
"type": "string"
}
},
"required": [
"provider",
"apiKey",
"indexHost"
],
"type": "object"
},
"webScraperConfig": {
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"provider": {
"enum": [
"FIRECRAWL",
"JINA",
"SCRAPINGBEE"
],
"type": "string"
}
},
"required": [
"provider",
"apiKey"
],
"type": "object"
}
},
"required": [
"name",
"fileStorageConfig",
"vectorStorageConfig",
"embeddingModelConfig"
],
"type": "object"
},
"name": "createNamespace"
},
{
"description": "Lists all namespaces available for the current API key and optional tenant ID.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"tenantId": {
"type": "string"
}
},
"type": "object"
},
"name": "listNamespaces"
},
{
"description": "Retrieves a specific namespace by its ID.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"type": "object"
},
"name": "getNamespace"
},
{
"description": "Updates an existing namespace with the provided configuration parameters.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"boxConfig": {
"additionalProperties": false,
"properties": {
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
}
},
"required": [
"clientId",
"clientSecret"
],
"type": "object"
},
"dropboxConfig": {
"additionalProperties": false,
"properties": {
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
}
},
"required": [
"clientId",
"clientSecret"
],
"type": "object"
},
"embeddingModelConfig": {
"anyOf": [
{
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"model": {
"enum": [
"text-embedding-3-small",
"text-embedding-3-large",
"text-embedding-ada-002"
],
"type": "string"
},
"provider": {
"const": "OPENAI",
"type": "string"
}
},
"required": [
"provider",
"model",
"apiKey"
],
"type": "object"
},
{
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"model": {
"enum": [
"embed-english-v3.0",
"embed-multilingual-v3.0",
"embed-english-light-v3.0",
"embed-multilingual-light-v3.0",
"embed-english-v2.0",
"embed-english-light-v2.0",
"embed-multilingual-v2.0"
],
"type": "string"
},
"provider": {
"const": "COHERE",
"type": "string"
}
},
"required": [
"provider",
"model",
"apiKey"
],
"type": "object"
},
{
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"model": {
"enum": [
"jina-embeddings-v3"
],
"type": "string"
},
"provider": {
"const": "JINA",
"type": "string"
}
},
"required": [
"provider",
"model",
"apiKey"
],
"type": "object"
}
]
},
"fileStorageConfig": {
"additionalProperties": false,
"properties": {
"bucket": {
"type": "string"
},
"credentials": {
"additionalProperties": false,
"properties": {
"accessKeyId": {
"type": "string"
},
"secretAccessKey": {
"type": "string"
}
},
"required": [
"accessKeyId",
"secretAccessKey"
],
"type": "object"
},
"endpoint": {
"type": "string"
},
"region": {
"type": "string"
},
"type": {
"enum": [
"S3_COMPATIBLE"
],
"type": "string"
}
},
"required": [
"type",
"bucket",
"region",
"endpoint",
"credentials"
],
"type": "object"
},
"googleDriveConfig": {
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
}
},
"required": [
"clientId",
"clientSecret",
"apiKey"
],
"type": "object"
},
"namespaceId": {
"type": "string"
},
"notionConfig": {
"additionalProperties": false,
"properties": {
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
}
},
"required": [
"clientId",
"clientSecret"
],
"type": "object"
},
"onedriveConfig": {
"additionalProperties": false,
"properties": {
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
}
},
"required": [
"clientId",
"clientSecret"
],
"type": "object"
},
"sharepointConfig": {
"additionalProperties": false,
"properties": {
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
}
},
"required": [
"clientId",
"clientSecret"
],
"type": "object"
},
"tenantId": {
"type": "string"
},
"vectorStorageConfig": {
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"indexHost": {
"type": "string"
},
"provider": {
"enum": [
"PINECONE"
],
"type": "string"
}
},
"required": [
"provider",
"apiKey",
"indexHost"
],
"type": "object"
},
"webScraperConfig": {
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"provider": {
"enum": [
"FIRECRAWL",
"JINA",
"SCRAPINGBEE"
],
"type": "string"
}
},
"required": [
"provider",
"apiKey"
],
"type": "object"
}
},
"type": "object"
},
"name": "updateNamespace"
},
{
"description": "Permanently deletes a namespace by its ID.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"type": "object"
},
"name": "deleteNamespace"
},
{
"description": "Ingests raw text content into the namespace. Supports optional metadata and chunk configuration.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"ingestConfig": {
"additionalProperties": false,
"properties": {
"chunkConfig": {
"additionalProperties": false,
"description": "Optional Chunk config. When not passed, default chunk config will be used.",
"properties": {
"chunkOverlap": {
"type": "number"
},
"chunkSize": {
"type": "number"
}
},
"required": [
"chunkSize",
"chunkOverlap"
],
"type": "object"
},
"config": {
"additionalProperties": false,
"properties": {
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
},
"name": {
"type": "string"
},
"text": {
"type": "string"
}
},
"required": [
"text"
],
"type": "object"
},
"source": {
"const": "TEXT",
"type": "string"
}
},
"required": [
"source",
"config"
],
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"ingestConfig"
],
"type": "object"
},
"name": "ingestText"
},
{
"description": "Ingests a file into the namespace. Supports various file formats with automatic parsing.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"chunkConfig": {
"additionalProperties": false,
"description": "Optional Chunk config. When not passed, default chunk config will be used.",
"properties": {
"chunkOverlap": {
"type": "number"
},
"chunkSize": {
"type": "number"
}
},
"required": [
"chunkSize",
"chunkOverlap"
],
"type": "object"
},
"file": {},
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"file"
],
"type": "object"
},
"name": "ingestFile"
},
{
"description": "Ingests content from a list of URLs. Supports scraping options and metadata.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"ingestConfig": {
"additionalProperties": false,
"properties": {
"chunkConfig": {
"additionalProperties": false,
"description": "Optional Chunk config. When not passed, default chunk config will be used.",
"properties": {
"chunkOverlap": {
"type": "number"
},
"chunkSize": {
"type": "number"
}
},
"required": [
"chunkSize",
"chunkOverlap"
],
"type": "object"
},
"config": {
"additionalProperties": false,
"properties": {
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
},
"scrapeOptions": {
"additionalProperties": false,
"properties": {
"excludeSelectors": {
"items": {
"type": "string"
},
"type": "array"
},
"includeSelectors": {
"items": {
"type": "string"
},
"type": "array"
}
},
"type": "object"
},
"urls": {
"items": {
"type": "string"
},
"type": "array"
}
},
"required": [
"urls"
],
"type": "object"
},
"source": {
"const": "URLS_LIST",
"type": "string"
}
},
"required": [
"source",
"config"
],
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"ingestConfig"
],
"type": "object"
},
"name": "ingestUrls"
},
{
"description": "Ingests content from a website using its sitemap.xml. Supports path filtering and link limits.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"ingestConfig": {
"additionalProperties": false,
"properties": {
"chunkConfig": {
"additionalProperties": false,
"description": "Optional Chunk config. When not passed, default chunk config will be used.",
"properties": {
"chunkOverlap": {
"type": "number"
},
"chunkSize": {
"type": "number"
}
},
"required": [
"chunkSize",
"chunkOverlap"
],
"type": "object"
},
"config": {
"additionalProperties": false,
"properties": {
"excludePaths": {
"items": {
"type": "string"
},
"type": "array"
},
"includePaths": {
"items": {
"type": "string"
},
"type": "array"
},
"maxLinks": {
"type": "number"
},
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
},
"url": {
"type": "string"
}
},
"required": [
"url"
],
"type": "object"
},
"source": {
"const": "SITEMAP",
"type": "string"
}
},
"required": [
"source",
"config"
],
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"ingestConfig"
],
"type": "object"
},
"name": "ingestSitemap"
},
{
"description": "Crawls and ingests content from a website recursively. Supports depth control and path filtering.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"ingestConfig": {
"additionalProperties": false,
"properties": {
"chunkConfig": {
"additionalProperties": false,
"description": "Optional Chunk config. When not passed, default chunk config will be used.",
"properties": {
"chunkOverlap": {
"type": "number"
},
"chunkSize": {
"type": "number"
}
},
"required": [
"chunkSize",
"chunkOverlap"
],
"type": "object"
},
"config": {
"additionalProperties": false,
"properties": {
"excludePaths": {
"items": {
"type": "string"
},
"type": "array"
},
"includePaths": {
"items": {
"type": "string"
},
"type": "array"
},
"maxDepth": {
"type": "number"
},
"maxLinks": {
"type": "number"
},
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
},
"url": {
"type": "string"
}
},
"required": [
"url"
],
"type": "object"
},
"source": {
"const": "WEBSITE",
"type": "string"
}
},
"required": [
"source",
"config"
],
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"ingestConfig"
],
"type": "object"
},
"name": "ingestWebsite"
},
{
"description": "Ingests all documents in the connector that are in backlog or failed status. No need to provide the document ids or file ids for the ingestion. Ids are already in the backlog when picked through the picker. If not, the user has to go through the authorization flow again, where they will be asked to pick the documents again.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"ingestConfig": {
"additionalProperties": false,
"properties": {
"chunkConfig": {
"additionalProperties": false,
"description": "Optional Chunk config. When not passed, default chunk config will be used.",
"properties": {
"chunkOverlap": {
"type": "number"
},
"chunkSize": {
"type": "number"
}
},
"required": [
"chunkSize",
"chunkOverlap"
],
"type": "object"
},
"config": {
"additionalProperties": false,
"properties": {
"connectionId": {
"type": "string"
},
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
}
},
"required": [
"connectionId"
],
"type": "object"
},
"source": {
"type": "string"
}
},
"required": [
"source",
"config"
],
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"ingestConfig"
],
"type": "object"
},
"name": "ingestConnector"
},
{
"description": "Checks the status of a previously submitted ingestion job.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"ingestJobRunId": {
"type": "string"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"ingestJobRunId"
],
"type": "object"
},
"name": "getIngestJobRunStatus"
},
{
"description": "Fetches documents from the namespace based on filter criteria. Supports pagination and including specific document properties.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"documentIds": {
"items": {
"type": "string"
},
"type": "array"
},
"filterConfig": {
"additionalProperties": false,
"properties": {
"documentConnectionIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentExternalIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIngestionSources": {
"items": {
"enum": [
"TEXT",
"URLS_LIST",
"SITEMAP",
"WEBSITE",
"LOCAL_FILE",
"NOTION",
"GOOGLE_DRIVE",
"DROPBOX",
"ONEDRIVE",
"BOX",
"SHAREPOINT"
],
"type": "string"
},
"type": "array"
},
"documentIngestionStatuses": {
"items": {
"enum": [
"BACKLOG",
"QUEUED",
"QUEUED_FOR_RESYNC",
"PROCESSING",
"SUCCESS",
"FAILED",
"CANCELLED"
],
"type": "string"
},
"type": "array"
},
"documentTypes": {
"items": {
"enum": [
"TEXT",
"URL",
"FILE",
"NOTION_DOCUMENT",
"GOOGLE_DRIVE_DOCUMENT",
"DROPBOX_DOCUMENT",
"ONEDRIVE_DOCUMENT",
"BOX_DOCUMENT",
"SHAREPOINT_DOCUMENT"
],
"type": "string"
},
"type": "array"
},
"metadata": {
"additionalProperties": {
"type": "string"
},
"type": "object"
}
},
"type": "object"
},
"includeConfig": {
"additionalProperties": false,
"properties": {
"documents": {
"type": "boolean"
},
"parsedTextFileUrl": {
"type": "boolean"
},
"rawFileUrl": {
"type": "boolean"
},
"stats": {
"type": "boolean"
},
"statsBySource": {
"type": "boolean"
},
"statsByStatus": {
"type": "boolean"
}
},
"type": "object"
},
"namespaceId": {
"type": "string"
},
"pagination": {
"additionalProperties": false,
"properties": {
"cursor": {
"type": "string"
},
"pageSize": {
"type": "number"
}
},
"type": "object"
},
"tenantId": {
"type": "string"
}
},
"required": [
"filterConfig"
],
"type": "object"
},
"name": "fetchDocuments"
},
{
"description": "Updates metadata for documents that match the specified filter criteria.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"data": {
"additionalProperties": false,
"properties": {
"$metadata": {
"additionalProperties": false,
"properties": {
"$append": {
"additionalProperties": {
"items": {
"type": "string"
},
"type": "array"
},
"type": "object"
},
"$remove": {
"additionalProperties": {
"items": {
"type": "string"
},
"type": "array"
},
"type": "object"
},
"$set": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
}
},
"type": "object"
},
"metadata": {
"additionalProperties": {
"type": "string"
},
"type": "object"
}
},
"type": "object"
},
"documents": {
"items": {
"additionalProperties": false,
"properties": {
"documentId": {
"type": "string"
},
"metadata": {
"additionalProperties": {
"type": "string"
},
"type": "object"
}
},
"required": [
"documentId"
],
"type": "object"
},
"type": "array"
},
"filterConfig": {
"additionalProperties": false,
"properties": {
"documentConnectionIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentExternalIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIngestionSources": {
"items": {
"enum": [
"TEXT",
"URLS_LIST",
"SITEMAP",
"WEBSITE",
"LOCAL_FILE",
"NOTION",
"GOOGLE_DRIVE",
"DROPBOX",
"ONEDRIVE",
"BOX",
"SHAREPOINT"
],
"type": "string"
},
"type": "array"
},
"documentIngestionStatuses": {
"items": {
"enum": [
"BACKLOG",
"QUEUED",
"QUEUED_FOR_RESYNC",
"PROCESSING",
"SUCCESS",
"FAILED",
"CANCELLED"
],
"type": "string"
},
"type": "array"
},
"documentTypes": {
"items": {
"enum": [
"TEXT",
"URL",
"FILE",
"NOTION_DOCUMENT",
"GOOGLE_DRIVE_DOCUMENT",
"DROPBOX_DOCUMENT",
"ONEDRIVE_DOCUMENT",
"BOX_DOCUMENT",
"SHAREPOINT_DOCUMENT"
],
"type": "string"
},
"type": "array"
},
"metadata": {
"additionalProperties": {
"type": "string"
},
"type": "object"
}
},
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"documents",
"filterConfig",
"data"
],
"type": "object"
},
"name": "updateDocuments"
},
{
"description": "Permanently deletes documents that match the specified filter criteria.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"documentIds": {
"items": {
"type": "string"
},
"type": "array"
},
"filterConfig": {
"additionalProperties": false,
"properties": {
"documentConnectionIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentExternalIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIngestionSources": {
"items": {
"enum": [
"TEXT",
"URLS_LIST",
"SITEMAP",
"WEBSITE",
"LOCAL_FILE",
"NOTION",
"GOOGLE_DRIVE",
"DROPBOX",
"ONEDRIVE",
"BOX",
"SHAREPOINT"
],
"type": "string"
},
"type": "array"
},
"documentIngestionStatuses": {
"items": {
"enum": [
"BACKLOG",
"QUEUED",
"QUEUED_FOR_RESYNC",
"PROCESSING",
"SUCCESS",
"FAILED",
"CANCELLED"
],
"type": "string"
},
"type": "array"
},
"documentTypes": {
"items": {
"enum": [
"TEXT",
"URL",
"FILE",
"NOTION_DOCUMENT",
"GOOGLE_DRIVE_DOCUMENT",
"DROPBOX_DOCUMENT",
"ONEDRIVE_DOCUMENT",
"BOX_DOCUMENT",
"SHAREPOINT_DOCUMENT"
],
"type": "string"
},
"type": "array"
},
"metadata": {
"additionalProperties": {
"type": "string"
},
"type": "object"
}
},
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"filterConfig"
],
"type": "object"
},
"name": "deleteDocuments"
},
{
"description": "Reprocesses documents that match the specified filter criteria. Useful for updating after schema changes.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"documentIds": {
"items": {
"type": "string"
},
"type": "array"
},
"filterConfig": {
"additionalProperties": false,
"properties": {
"documentConnectionIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentExternalIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIds": {
"items": {
"type": "string"
},
"type": "array"
},
"documentIngestionSources": {
"items": {
"enum": [
"TEXT",
"URLS_LIST",
"SITEMAP",
"WEBSITE",
"LOCAL_FILE",
"NOTION",
"GOOGLE_DRIVE",
"DROPBOX",
"ONEDRIVE",
"BOX",
"SHAREPOINT"
],
"type": "string"
},
"type": "array"
},
"documentIngestionStatuses": {
"items": {
"enum": [
"BACKLOG",
"QUEUED",
"QUEUED_FOR_RESYNC",
"PROCESSING",
"SUCCESS",
"FAILED",
"CANCELLED"
],
"type": "string"
},
"type": "array"
},
"documentTypes": {
"items": {
"enum": [
"TEXT",
"URL",
"FILE",
"NOTION_DOCUMENT",
"GOOGLE_DRIVE_DOCUMENT",
"DROPBOX_DOCUMENT",
"ONEDRIVE_DOCUMENT",
"BOX_DOCUMENT",
"SHAREPOINT_DOCUMENT"
],
"type": "string"
},
"type": "array"
},
"metadata": {
"additionalProperties": {
"type": "string"
},
"type": "object"
}
},
"type": "object"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"filterConfig"
],
"type": "object"
},
"name": "resyncDocuments"
},
{
"description": "Performs semantic search across the namespace to find relevant content based on meaning rather than exact keyword matches.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"filter": {
"additionalProperties": false,
"properties": {
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
}
},
"type": "object"
},
"namespaceId": {
"type": "string"
},
"query": {
"type": "string"
},
"scoreThreshold": {
"type": "number"
},
"searchType": {
"enum": [
"SEMANTIC",
"HYBRID"
],
"type": "string"
},
"tenantId": {
"type": "string"
},
"topK": {
"type": "number"
}
},
"required": [
"query"
],
"type": "object"
},
"name": "semanticSearch"
},
{
"description": "Performs a combined keyword and semantic search, balancing between exact matches and semantic similarity. Requires hybridConfig with weights for both search types.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"filter": {
"additionalProperties": false,
"properties": {
"metadata": {
"additionalProperties": {
"anyOf": [
{
"type": "string"
},
{
"items": {
"type": "string"
},
"type": "array"
}
]
},
"type": "object"
}
},
"type": "object"
},
"hybridConfig": {
"additionalProperties": false,
"properties": {
"keywordWeight": {
"type": "number"
},
"semanticWeight": {
"type": "number"
}
},
"required": [
"semanticWeight",
"keywordWeight"
],
"type": "object"
},
"namespaceId": {
"type": "string"
},
"query": {
"type": "string"
},
"scoreThreshold": {
"type": "number"
},
"searchType": {
"enum": [
"SEMANTIC",
"HYBRID"
],
"type": "string"
},
"tenantId": {
"type": "string"
},
"topK": {
"type": "number"
}
},
"required": [
"query",
"hybridConfig"
],
"type": "object"
},
"name": "hybridSearch"
},
{
"description": "Creates a new connection to a specific source. The connector parameter must be a valid SourceSync connector enum value. The optional clientRedirectUrl parameter specifies a custom redirect URL for the connection. The call returns an authorization URL to which you can redirect the user, who is then asked to pick the documents they want to ingest.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"clientRedirectUrl": {
"type": "string"
},
"connector": {
"enum": [
"NOTION",
"GOOGLE_DRIVE",
"DROPBOX",
"ONEDRIVE",
"BOX",
"SHAREPOINT"
],
"type": "string"
},
"name": {
"type": "string"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"name",
"connector"
],
"type": "object"
},
"name": "createConnection"
},
{
"description": "Lists all connections for the current namespace, optionally filtered by connector type.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"connector": {
"enum": [
"NOTION",
"GOOGLE_DRIVE",
"DROPBOX",
"ONEDRIVE",
"BOX",
"SHAREPOINT"
],
"type": "string"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"type": "object"
},
"name": "listConnections"
},
{
"description": "Retrieves details for a specific connection by its ID.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"connectionId": {
"type": "string"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"connectionId"
],
"type": "object"
},
"name": "getConnection"
},
{
"description": "Updates a connection to a specific source. The connector parameter must be a valid SourceSync connector enum value. The optional clientRedirectUrl parameter specifies a custom redirect URL for the connection. The call returns an authorization URL to which you can redirect the user, who is then asked to pick the documents they want to ingest. Use this to point the connection at a different source, to change the clientRedirectUrl, or to pick a different or new set of documents.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"clientRedirectUrl": {
"type": "string"
},
"connectionId": {
"type": "string"
},
"name": {
"type": "string"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"connectionId"
],
"type": "object"
},
"name": "updateConnection"
},
{
"description": "Revokes access for a specific connection, removing the integration with the external service.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"connectionId": {
"type": "string"
},
"namespaceId": {
"type": "string"
},
"tenantId": {
"type": "string"
}
},
"required": [
"connectionId"
],
"type": "object"
},
"name": "revokeConnection"
},
{
"description": "Fetches the content of a URL. Particularly useful for fetching parsed text file URLs.",
"inputSchema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"apiKey": {
"type": "string"
},
"tenantId": {
"type": "string"
},
"url": {
"format": "uri",
"type": "string"
}
},
"required": [
"url"
],
"type": "object"
},
"name": "fetchUrlContent"
}
]
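As an illustration of how the hybridSearch schema above constrains a tool call, here is a minimal Python sketch that assembles an arguments object and rejects keys the schema does not list. The helper name `build_hybrid_search_args` and the example weights are assumptions made for this sketch and are not part of the SourceSync.ai server or API. The schema only requires both weights to be numbers, so the sketch does not assume they must sum to 1.

```python
import json

# Optional top-level keys copied from the hybridSearch inputSchema above.
# "query" and "hybridConfig" are required and handled separately.
OPTIONAL_KEYS = {
    "filter", "namespaceId", "scoreThreshold",
    "searchType", "tenantId", "topK",
}


def build_hybrid_search_args(query, semantic_weight, keyword_weight, **optional):
    """Hypothetical helper: build arguments for a hybridSearch tool call."""
    if not isinstance(query, str):
        raise ValueError("query must be a string")
    for name, value in (("semanticWeight", semantic_weight),
                        ("keywordWeight", keyword_weight)):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")

    # The schema sets additionalProperties to false, so unknown keys are errors.
    unknown = set(optional) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unsupported keys: {sorted(unknown)}")

    args = {
        "query": query,
        "hybridConfig": {
            "semanticWeight": semantic_weight,
            "keywordWeight": keyword_weight,
        },
    }
    args.update(optional)
    return args


if __name__ == "__main__":
    # Example values are illustrative only.
    example = build_hybrid_search_args("refund policy", 0.7, 0.3, topK=5)
    print(json.dumps(example, indent=2))
```

Checking keys on the client side mirrors the `"additionalProperties": false` setting, so a misspelled option fails before the request is sent instead of being rejected by the server.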