V1.2.0 is now live

Stop Prompting.
Start Shipping.

IntentRouter is a cognitive gateway that turns LLM capabilities into infrastructure. Push logic to Git repositories, call one OpenAI-compatible endpoint, and execute on any provider with zero prompt drift. In essence, it's a Skills runtime.

IntentRouter Visualization

The "Prompt Spaghetti" Problem

Base prompts grow uncontrollably. Tokens are wasted on every request. Nobody knows exactly what's inside the system prompt anymore. Copy-paste engineering is not high-scale architecture.

  • Token Bloat:

    Thousands of repetitive tokens in every payload.

  • No Versioning:

    One tiny tweak can silently degrade performance.

Skills = Capabilities as Code

1. Push

Sync reasoning logic and policies via Git.

2. Call

Send messages to a single standard endpoint.

3. Run

Our JIT compiler matches intents, applies schemas, and routes to models.

One Unified Interface

Bring your own keys. Run any model. Zero lock-in.

POST /v1/chat/completions
// Client Request
{
  "provider": "anthropic",
  "skill": { "ref": "repo://user/agent@v1" }
}
// Service Response (with Metadata)
{
  "choices": [...],
  "skillz_metadata": {
    "trace_id": "a1-b2-c3",
    "duration_ms": 1240,
    "skill_version": "1.0.0"
  }
}

BYO Keys

We never store your secrets. You maintain direct relationships with LLM providers.

Model Agnostic

Switch between OpenAI, Anthropic, Gemini, and Ollama with a single parameter change.
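As a minimal sketch of what this looks like in practice: the same request body is reused across providers, with only the provider, model, and key fields changing. Field names follow the request examples on this page; the model names and helper function are illustrative, not part of the API.

```python
# Sketch: one payload shape, any provider. Only "provider", "model",
# and "api_key" change between calls (field names follow the request
# examples above; model names are illustrative).
def make_payload(provider: str, model: str, api_key: str) -> dict:
    return {
        "provider": provider,
        "model": model,
        "api_key": api_key,
        "messages": [
            {"role": "user", "content": "Summarize this support ticket."}
        ],
        "skill": {"ref": "local://examples/skills/customer-support-agent"},
    }

# Same skill, same messages -- different backend.
openai_call = make_payload("openai", "gpt-4", "YOUR_OPENAI_KEY")
anthropic_call = make_payload("anthropic", "claude-3-opus", "YOUR_ANTHROPIC_KEY")
```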

Slim Context

Only inject necessary logic. Drastically reduce your token consumption and latency.

Immutable Git

Load skills directly from Git repos per request. Pin versions via SHAs or tags for production stability.
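A quick sketch of what pinning might look like, extrapolating the `@` syntax from the `repo://user/agent@v1` example earlier on this page; the tag and SHA values below are hypothetical.

```python
# Sketch: skill refs at different levels of pinning (the "@" suffix
# syntax is taken from the "repo://user/agent@v1" example above;
# the specific tag and SHA are illustrative).
floating = {"ref": "repo://user/agent"}            # tracks the latest push
tagged = {"ref": "repo://user/agent@v1.0.0"}       # pinned to a release tag
pinned = {"ref": "repo://user/agent@3f2a9c1"}      # pinned to a commit SHA

# Production requests should use a tagged or SHA-pinned ref, so a
# later push to the repo can never silently change behavior.
```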

Full Request Observability

Every response includes skillz_metadata. Track intent matches, provider latencies, and total duration across your cognitive stack.

  • Trace ID: Universal request tracking.

  • Skill Version: Immutable SHA/tag used.

  • Provider Latency: 1,120 ms

  • System Overhead: 12 ms

  • Total Duration: 1,132 ms
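A short sketch of reading this metadata from a response. The response shape follows the service-response example earlier on this page; the concrete values are illustrative.

```python
# Sketch: pulling skillz_metadata out of a gateway response
# (shape taken from the response example above; values illustrative).
response = {
    "choices": [{"message": {"role": "assistant", "content": "..."}}],
    "skillz_metadata": {
        "trace_id": "a1-b2-c3",
        "duration_ms": 1132,
        "skill_version": "1.0.0",
    },
}

meta = response.get("skillz_metadata", {})
print(f"trace={meta['trace_id']} "
      f"skill=v{meta['skill_version']} "
      f"took={meta['duration_ms']}ms")
```

The trace ID can be forwarded to your own logging or APM system so a slow completion can be correlated back to the exact skill version that produced it.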
main.py
import requests

payload = {
    "provider": "openai",
    "model": "gpt-4",
    "api_key": "YOUR_OPENAI_KEY",
    "messages": [
        {"role": "user", "content": "Triage this support ticket: Refund request for #8812"}
    ],
    "skill": {
        "ref": "local://examples/skills/customer-support-agent"
    }
}

response = requests.post(
    "https://skill.intentrouter.com/v1/chat/completions",
    json=payload,
)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body

print(response.json())

Pre-built Skill Registry

Reference these production-ready skills directly in your API calls.

🎧

Customer Support

local://examples/skills/customer-support-agent

Full ticket triage, sentiment analysis, and solution routing logic.

🔍

Code Reviewer

local://examples/skills/code-review-assistant

Security-first linting and architectural review logic.

🗄️

SQL Architect

local://examples/skills/sql-query-builder

Natural language to secure, optimized SQL query generation.

📊

Data Analyst

local://examples/skills/data-analysis-assistant

Automated EDA and business metric extraction logic.

Live Playground

Test your skills in real-time. Results are fetched directly from the runtime gateway.


Frequently Asked Questions

Does IntentRouter store my API keys?

No. We act as a stateless processing layer. Your keys are passed in the request and used immediately to communicate with the provider, then discarded.

What models are supported?

Everything. Support includes OpenAI, Anthropic, Gemini, Groq, and local models via Ollama. If it has a chat completion interface, we can route to it.

How do I update a skill?

Just push to your Git repo. The runtime pulls the latest version (or a specific tag/commit if specified) when your request hits our endpoint.

Is there a latency overhead?

Minimal. We use a high-performance JIT compiler to prepare the skill logic before streaming it to the model. In many cases, the reduced token count leads to faster Time-To-First-Token (TTFT).