Context Gateway
A transparent proxy that sits between your AI agent and the LLM API, automatically compressing conversation history in the background so you never wait.
What is Context Gateway?
Context Gateway manages your conversation context automatically:
- Monitors token usage across the conversation
- Pre-computes summaries at 75% of context limit (configurable)
- Instant compaction when limit is reached — no waiting
- Compresses large tool outputs on the fly
- Logs compression events to logs/history_compaction.jsonl
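The threshold behavior above can be sketched in a few lines. This is an illustrative sketch only, not the gateway's actual code: the function name and the 200k-token limit are invented for the example, and the 75% threshold matches the configurable default mentioned above.

```python
# Illustrative sketch of the pre-compute/compact trigger described above.
# CONTEXT_LIMIT is an assumed model limit; names are invented for clarity.
CONTEXT_LIMIT = 200_000        # tokens the target model accepts (assumed)
CONTEXT_THRESHOLD = 0.75       # matches the configurable 75% default

def next_action(used_tokens: int) -> str:
    """Decide what the proxy does at the current conversation size."""
    if used_tokens >= CONTEXT_LIMIT:
        # A summary was already pre-computed, so the swap is instant.
        return "swap_in_precomputed_summary"
    if used_tokens >= CONTEXT_LIMIT * CONTEXT_THRESHOLD:
        # Past the threshold: start summarizing in the background.
        return "precompute_summary_in_background"
    return "pass_through"

print(next_action(100_000))  # pass_through
print(next_action(160_000))  # precompute_summary_in_background
print(next_action(205_000))  # swap_in_precomputed_summary
```

Because the summary is built in the background once the threshold is crossed, reaching the hard limit only swaps it in, which is why there is no wait.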
Installation
Quick Install
curl -fsSL https://compresr.ai/api/install | sh
Manual Install Options
- Download the binary directly from GitHub Releases
- Build from source (requires Go 1.21+):
go build -o context-gateway ./cmd/gateway
Quick Start
1. Launch the interactive wizard
context-gateway
2. Follow the wizard
The TUI wizard will guide you through:
- Select your agent (Claude Code, OpenClaw, OpenCode, or Custom)
- Enter your LLM provider API key (Anthropic, OpenAI, etc.)
- Enter your Compresr API key
- Configure compression settings (threshold, model, etc.)
3. Use your agent as usual
The gateway runs as a local proxy on http://localhost:8080. Your agent's API calls are routed through it automatically. No code changes needed.
Supported Agents
Works with any LLM provider (OpenAI, Anthropic, Ollama, Bedrock). Start or stop the gateway anytime — agents auto-detect it.
Claude Code
Codex
OpenHands
OpenClaw
Custom
Usage
# Terminal 1: start the gateway
context-gateway
# Terminal 2: use your agent as usual
claude # Claude Code
codex # Codex
openhands # OpenHands
OpenClaw Integration
# Install the OpenClaw plugin
openclaw plugin install context-gateway
# Start the gateway
context-gateway
# Running or new agents auto-detect the gateway
For custom deployments, point your agent's LLM API base URL to http://localhost:8080.
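For a custom agent, "pointing at the gateway" usually just means swapping the base URL. The sketch below builds (but does not send) a request aimed at the local proxy; the endpoint path and payload shape follow the Anthropic Messages API, and the model name is illustrative.

```python
# Hypothetical custom-agent snippet: only the base URL changes when routing
# through the gateway. Endpoint path/payload follow the Anthropic Messages
# API as an example; adjust for your provider.
import json
import urllib.request

GATEWAY_BASE_URL = "http://localhost:8080"   # the gateway's local proxy

def build_request(prompt: str) -> urllib.request.Request:
    """Prepare a chat request aimed at the gateway (not sent here)."""
    body = json.dumps({
        "model": "claude-sonnet-4",          # illustrative model name
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{GATEWAY_BASE_URL}/v1/messages",
        data=body,
        headers={"content-type": "application/json"},
    )

req = build_request("hello")
print(req.full_url)  # http://localhost:8080/v1/messages
```

Everything else about the agent's request handling stays the same; the gateway forwards to the real provider using the API key it was configured with.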
Configuration
Configuration is saved to ~/.config/context-gateway/.env after running the interactive wizard. You can edit or update your config at any time using the CLI:
Re-configure via CLI
context-gateway -c
Or edit it manually at ~/.config/context-gateway/.env:
Environment Variables
# Required
COMPRESR_API_KEY=cmp_your_api_key # Your Compresr API key
LLM_API_KEY=sk-xxx # Your LLM provider API key
# Agent Configuration
AGENT_TYPE=claude_code # claude_code | openclaw | opencode | custom
PROXY_PORT=8080 # Local proxy port (default: 8080)
# Compression Settings
CONTEXT_THRESHOLD=0.75 # Trigger compression at 75% of context limit
COMPRESSION_MODEL=espresso_v1 # Model used for history compression
TARGET_COMPRESSION_RATIO=0.5 # How aggressively to compress (0.2-0.9)
# Optional
SLACK_WEBHOOK_URL=https://hooks.slack.com/... # Slack notifications
LOG_LEVEL=info # debug | info | warn | error
Benefits
Zero latency
Compression happens in the background.
Transparent
No code changes needed.
Cost savings
Reduce token usage by 30-70%.
Observable
Full metrics in logs.
Logs & Monitoring
The gateway creates detailed logs for every compression event:
- logs/history_compaction.jsonl: When and how conversations are compressed
- logs/tool_output_compression.jsonl: Tool output compression metrics and results
- logs/telemetry.jsonl: Request/response timing and performance data
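Since these logs are JSON Lines (one event per line), they are easy to aggregate with a few lines of code. The sketch below sums token savings from compaction events, using the same field names as the example log entry shown below; the sample line is hard-coded here so the snippet is self-contained.

```python
# Hedged sketch: total tokens saved across compaction events. Field names
# match the example log entry in this document; the log format is JSON Lines.
import io
import json

# Stand-in for open("logs/history_compaction.jsonl") so this runs anywhere.
sample_log = io.StringIO(
    '{"event": "history_compaction", "original_tokens": 180000, '
    '"compressed_tokens": 54000}\n'
)

def total_savings(lines) -> int:
    """Sum tokens saved across all compaction events in a JSONL stream."""
    saved = 0
    for line in lines:
        event = json.loads(line)
        saved += event["original_tokens"] - event["compressed_tokens"]
    return saved

print(total_savings(sample_log))  # 126000
```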
Example log entry
{
"timestamp": "2026-03-06T14:30:00Z",
"event": "history_compaction",
"agent": "claude_code",
"original_tokens": 180000,
"compressed_tokens": 54000,
"compression_ratio": 0.7,
"model": "espresso_v1",
"latency_ms": 1200
}
Remote Deployment
Deploy the gateway as a service for team-wide usage:
Deploy as a service
# Using Docker
docker run -d \
-p 8080:8080 \
-e COMPRESR_API_KEY=cmp_your_api_key \
-e LLM_API_KEY=sk-xxx \
-e AGENT_TYPE=claude_code \
compresr/context-gateway:latest
# Or using the binary directly
COMPRESR_API_KEY=cmp_xxx LLM_API_KEY=sk-xxx context-gateway --port 8080
Environment Variables for Deployment
All configuration options from the ~/.config/context-gateway/.env file can be passed as environment variables. This makes it easy to deploy via Docker, Kubernetes, or any container orchestration platform.
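As a sketch of what env-first configuration looks like in practice, the snippet below reads a few of the variables listed above with their documented defaults. This is an illustrative pattern, not the gateway's actual code.

```python
# Illustrative env-first config loader using the variables and defaults
# documented above. Not the gateway's real implementation.
import os

def load_config(env=os.environ) -> dict:
    """Read gateway settings from environment variables, with defaults."""
    return {
        "proxy_port": int(env.get("PROXY_PORT", "8080")),
        "context_threshold": float(env.get("CONTEXT_THRESHOLD", "0.75")),
        "compression_model": env.get("COMPRESSION_MODEL", "espresso_v1"),
    }

print(load_config({}))  # all defaults when no variables are set
```

Because every setting resolves from the environment, the same image or binary works unchanged under Docker, Kubernetes, or a plain systemd unit.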