Compression Models

Choose the right compression model based on your use case and performance requirements.

Agnostic Models

Use with CompressionClient for general-purpose compression without a specific question.

Model               Description
A_CMPRSR_V1         LLM-based abstractive compression (default). Best quality with semantic preservation.
A_CMPRSR_V1_FLASH   Fast extractive compression. Lower latency for speed-critical applications.

Question-Specific Models

Use with QSCompressionClient to compress context based on a specific question.

Model           Description
QS_CMPRSR_V1    Question-specific abstractive compression (default). Best for RAG and Q&A systems.
QSR_CMPRSR_V1   Question-specific extractive compression. Faster processing for high-throughput RAG.

Model Selection Guide

When to use Agnostic

  • System prompts and instructions
  • Static documentation
  • General context without specific queries
  • Long-form content compression

When to use Question-Specific

  • RAG (Retrieval-Augmented Generation)
  • Q&A systems with user queries
  • Dynamic question-based filtering
  • Context-aware compression
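
The selection guide above can be condensed into a small lookup. The sketch below is illustrative only: choose_model is a hypothetical helper, not part of the SDK, and it returns the client and model names as plain strings rather than calling any API. Only the client and model names themselves come from this page.

```python
from typing import Optional, Tuple

# Model names as listed on this page; the "quality"/"speed" keys are
# labels invented for this sketch.
AGNOSTIC_MODELS = {"quality": "A_CMPRSR_V1", "speed": "A_CMPRSR_V1_FLASH"}
QUESTION_SPECIFIC_MODELS = {"quality": "QS_CMPRSR_V1", "speed": "QSR_CMPRSR_V1"}


def choose_model(question: Optional[str] = None,
                 prefer_speed: bool = False) -> Tuple[str, str]:
    """Hypothetical helper: return (client name, model name).

    A non-empty question selects the question-specific family;
    otherwise the agnostic family is used. prefer_speed picks the
    extractive (lower-latency) variant instead of the default
    abstractive one.
    """
    tier = "speed" if prefer_speed else "quality"
    # Treat a missing or whitespace-only question as "no question".
    if question is not None and question.strip():
        return "QSCompressionClient", QUESTION_SPECIFIC_MODELS[tier]
    return "CompressionClient", AGNOSTIC_MODELS[tier]


if __name__ == "__main__":
    # Static system prompt, quality first.
    print(choose_model())
    # High-throughput RAG with a user query.
    print(choose_model("What is the refund policy?", prefer_speed=True))
```

A whitespace-only question falls back to the agnostic family here; whether that is the right behavior for a given application is a design choice, not something this page specifies.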