Overview

Compresr reduces LLM token costs through intelligent context compression.

Compression Models

espresso_v1

General-purpose compression — no query needed. Removes redundant tokens while preserving meaning. Ideal for pre-compressing documents, system prompts, or any context you want to reuse across multiple queries.
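To build intuition for what query-free compression does, here is a toy, rule-based sketch: it collapses whitespace and drops exact-duplicate sentences. The real espresso_v1 model is learned and far more capable; `toy_compress` is an illustrative stand-in, not the product's algorithm.

```python
import re

def toy_compress(text: str) -> str:
    """Toy stand-in for general-purpose compression: collapse runs of
    whitespace and drop exact-duplicate sentences (case-insensitive).
    A learned model would also remove semantically redundant tokens."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    seen, kept = set(), []
    for s in sentences:
        key = s.lower()
        if key not in seen:
            seen.add(key)
            kept.append(re.sub(r"\s+", " ", s))
    return " ".join(kept)

doc = "The cache is warm.   The cache is warm. Latency drops after warmup."
print(toy_compress(doc))
```

Because no query is involved, the compressed text can be cached and reused across many downstream requests.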

latte_v1

Query-specific compression that preserves tokens relevant to a given query. Ideal for RAG pipelines and Q&A systems where you want to keep answer-relevant information while compressing the rest.
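A toy sketch of the query-aware idea: rank sentences by word overlap with the query and keep only the best-matching ones, in their original order. Again, this is an illustrative stand-in; the actual latte_v1 model scores relevance at the token level rather than with word overlap.

```python
import re

# Small stoplist so function words don't dominate the overlap score.
STOPWORDS = {"what", "is", "the", "of", "a", "an", "in", "to"}

def toy_query_compress(text: str, query: str, keep: int = 1) -> str:
    """Toy stand-in for query-aware compression: keep the `keep` sentences
    with the most content-word overlap with the query, in original order."""
    def words(s: str) -> set[str]:
        return {w for w in re.findall(r"[a-z]+", s.lower()) if w not in STOPWORDS}

    query_words = words(query)
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    ranked = sorted(range(len(sentences)),
                    key=lambda i: len(words(sentences[i]) & query_words),
                    reverse=True)[:keep]
    return " ".join(sentences[i] for i in sorted(ranked))

ctx = ("Paris is the capital of France. "
       "Berlin is the capital of Germany. "
       "Rome is the capital of Italy.")
print(toy_query_compress(ctx, "What is the capital of Germany?"))
```

In a RAG pipeline, this is the shape of the trade-off latte_v1 makes: answer-relevant context survives while off-query material is compressed away.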

Quick Start

  1. Get your API key from the Dashboard
  2. Install the SDK: pip install compresr
  3. Start compressing — see the Quick Start guide