Compression Models
Choose the right compression model based on your use case and performance requirements.
Agnostic Models
Use these models with CompressionClient for general-purpose compression when there is no specific question.
| Model | Description |
|---|---|
| A_CMPRSR_V1 | LLM-based abstractive compression (default). Best quality with semantic preservation. |
| A_CMPRSR_V1_FLASH | Fast extractive compression. Lower latency for speed-critical applications. |
Question-Specific Models
Use these models with QSCompressionClient to compress context based on a specific question.
| Model | Description |
|---|---|
| QS_CMPRSR_V1 | Question-specific abstractive compression (default). Best for RAG and Q&A systems. |
| QSR_CMPRSR_V1 | Question-specific extractive compression. Faster processing for high-throughput RAG. |
Model Selection Guide
When to use Agnostic
- System prompts and instructions
- Static documentation
- General context without specific queries
- Long-form content compression
When to use Question-Specific
- RAG (Retrieval-Augmented Generation)
- Q&A systems with user queries
- Dynamic question-based filtering
- Context-aware compression
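The guidance above can be sketched as a small selection helper. This is a minimal illustration, not part of any SDK: the function `choose_model` and its `prefer_speed` flag are hypothetical names introduced here, and the helper only returns the client and model names from the tables as strings. It does not import or call the compression clients, whose actual signatures are not shown in this document.

```python
from typing import Optional, Tuple

# Model identifiers copied from the tables above, keyed by compression style.
AGNOSTIC_MODELS = {
    "abstractive": "A_CMPRSR_V1",        # default, best quality
    "extractive": "A_CMPRSR_V1_FLASH",   # lower latency
}
QUESTION_SPECIFIC_MODELS = {
    "abstractive": "QS_CMPRSR_V1",       # default, best for RAG and Q&A
    "extractive": "QSR_CMPRSR_V1",       # faster, high-throughput RAG
}


def choose_model(question: Optional[str] = None,
                 prefer_speed: bool = False) -> Tuple[str, str]:
    """Return (client_name, model_name) for a compression request.

    Hypothetical helper: a non-empty question selects the question-specific
    family, otherwise the agnostic family; prefer_speed selects the
    extractive variant over the default abstractive one.
    """
    style = "extractive" if prefer_speed else "abstractive"
    if question is not None and question.strip():
        return "QSCompressionClient", QUESTION_SPECIFIC_MODELS[style]
    return "CompressionClient", AGNOSTIC_MODELS[style]
```

For example, `choose_model()` returns `("CompressionClient", "A_CMPRSR_V1")` for a static system prompt, while `choose_model("What is the refund policy?", prefer_speed=True)` returns `("QSCompressionClient", "QSR_CMPRSR_V1")` for a latency-sensitive RAG query.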