Three steps. Under two minutes. No SDK to install.
1. Get your API key
Sign up, then create an API key in your dashboard. Keys start with comp_ and look like comp_xxxxxxxxxxxx.
You'll also add your LLM provider key (OpenAI, Anthropic, etc.) in the dashboard. Compresh needs it to forward requests on your behalf.
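To keep the key out of your source code, one option is to read it from an environment variable and check the comp_ prefix before building a client. This is a minimal sketch: the variable name COMPRESH_API_KEY and the helper load_compresh_key are assumptions made for illustration, not something Compresh requires.

```python
import os

# Hypothetical variable name; Compresh does not mandate one.
ENV_VAR = "COMPRESH_API_KEY"


def load_compresh_key(env=None):
    """Return the Compresh key from the environment, or raise ValueError."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise ValueError(f"{ENV_VAR} is not set")
    # Compresh keys start with comp_; a provider key pasted here by
    # mistake is caught before any request is made.
    if not key.startswith("comp_"):
        raise ValueError(f"{ENV_VAR} does not look like a Compresh key")
    return key
```

The returned value can then be passed as api_key in step 2 instead of a hard-coded string.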
2. Point your client at Compresh
Change your base_url to https://api.compre.sh/v1. That's it.
Python:

    from openai import OpenAI

    client = OpenAI(
        api_key="comp_your_key",
        base_url="https://api.compre.sh/v1"
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"}
        ]
    )

Node.js:

    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: "comp_your_key",
      baseURL: "https://api.compre.sh/v1"
    });

    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" }
      ]
    });

curl:

    curl https://api.compre.sh/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer comp_your_key" \
      -d '{
        "model": "gpt-4o",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hello!"}
        ]
      }'

3. Done
Seriously — that's it. Compresh proxies your requests, compresses context on multi-turn conversations, and returns responses in the exact same format your app already expects.
Tip
Compression kicks in automatically as conversations deepen. Single-turn requests pass through untouched with negligible overhead.
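A multi-turn conversation looks the same from your side as it would against the provider directly: you resend the growing message history on each call. The sketch below shows one way to do that. The helper name ask is hypothetical, and the code only manages the history on the client; whatever compression happens takes place on Compresh's side and is not visible here.

```python
def ask(client, history, user_text, model="gpt-4o"):
    """Send one user turn and record the assistant reply in history.

    `client` is any object exposing the OpenAI chat completions
    interface, such as the client created in step 2.
    """
    history.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(model=model, messages=history)
    reply = response.choices[0].message.content
    # Keep the reply so the next request carries the full conversation.
    history.append({"role": "assistant", "content": reply})
    return reply


# Usage, assuming the client from step 2:
#   history = [{"role": "system", "content": "You are a helpful assistant."}]
#   ask(client, history, "Hello!")
#   ask(client, history, "Can you summarize what we've said so far?")
```

Because the history list is mutated in place, each later call sends every earlier turn along with the new one.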
What works out of the box
- Streaming — SSE responses work identically
- Function calling & tools — tool definitions and calls are preserved
- All OpenAI-compatible models — GPT-4o, GPT-4o-mini, o1, o3, etc.
- System prompts — compressed separately with their own strategy
Zero code changes beyond the base_url.
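As an illustration of the streaming case, the sketch below consumes a streamed response using the OpenAI SDK's stream=True option. The helper name stream_text is hypothetical, and the code assumes the streamed chunks have the same shape the OpenAI SDK normally returns, which is what the list above describes.

```python
def stream_text(client, messages, model="gpt-4o"):
    """Yield text fragments from a streamed chat completion.

    `client` is any object exposing the OpenAI chat completions
    interface, such as the client created in step 2.
    """
    stream = client.chat.completions.create(
        model=model, messages=messages, stream=True
    )
    for chunk in stream:
        # Some chunks carry no choices or no text (role or finish markers).
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            yield text


# Usage, assuming the client from step 2:
#   for piece in stream_text(client, [{"role": "user", "content": "Hello!"}]):
#       print(piece, end="", flush=True)
```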
Next steps
- Authentication — API key details, provider key setup, security
- Overview — How Episodic Memory Architecture works under the hood