Context Sufficiency (ctx_suff)
Metric Description
Context sufficiency measures whether the retrieved context is adequate for correctly addressing the user's query. It is crucial for RAG (Retrieval-Augmented Generation) systems, where the Large Language Model (LLM) is provided with relevant information from a knowledge base before generating its answer. If the context is incomplete or poorly matched, the model risks producing an incorrect or hallucinated response that is not grounded in the provided data.
The score runs from 0 (insufficient context) to 100 (fully sufficient context). The implementation uses an LLM-as-a-Judge approach.
How to interpret the score
- Closer to 100: the context contains enough relevant information to fully answer all aspects of the user's question.
- Closer to 0: the context provides little or no information needed to answer the user's question.
Context sufficiency evaluates whether the retrieved context is enough to answer the query—it does not evaluate the model's actual output. Pair this with context faithfulness and answer relevancy to assess the full RAG pipeline.
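To make the distinction concrete, here is an illustrative sketch (the rows are hypothetical, reusing the laptop example from the API section below). The same question is paired with two different retrieved contexts: only the first covers both parts of the question, so it would be expected to score near 100, while the second omits the rarity information and would be expected to score low.

```python
question = "What features does the new laptop have and are they rare?"

# Context covers both the features and their rarity -> likely sufficient.
sufficient_row = {
    "input": question,
    "context": [
        "The new laptop features a 14-inch OLED display, 32GB RAM, and an M3 chip.",
        "OLED screens are uncommon in mid-range laptops.",
    ],
}

# Context lists the features but says nothing about rarity, so part of
# the question cannot be answered from this context alone -> likely insufficient.
insufficient_row = {
    "input": question,
    "context": [
        "The new laptop features a 14-inch OLED display, 32GB RAM, and an M3 chip.",
    ],
}
```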
API usage
Prerequisites
After the environment variables are configured, the next step is to create a JSON payload for the custom-runs request. For a field-by-field description of the payload (top-level keys, evaluations, and each row in data), see Custom run request body.
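The environment variables used by the example script can be supplied via a local `.env` file loaded with `python-dotenv`. The variable names below match the script; the values are placeholders, not real credentials:

```shell
# .env -- placeholder values; substitute your own key and base URL
AEGIS_API_KEY=your-api-key-here
AEGIS_API_BASE_URL=https://example.invalid/api/v1
```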
Shortname: ctx_suff
Default threshold: 80
Inputs (each object in data)
- `input` (str, required): The user's question or instruction (what the context should help answer).
- `context` (str or list, required): The retrieved context or source documents (e.g., chunks from a knowledge base).
Evaluation metadata
On successful evaluation, the metric returns eval_metadata summarizing coverage gaps:
- `missing_details` (list[str]): The specific pieces of information the context lacked in order to fully answer the input.
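A minimal sketch of reading `missing_details` from a returned result. The surrounding response shape shown here is an assumption for illustration; consult the custom-run response documentation for the authoritative schema.

```python
# Hypothetical ctx_suff result payload (shape assumed for illustration).
result = {
    "metric": "ctx_suff",
    "score": 55,
    "eval_metadata": {
        "missing_details": [
            "No information about whether the listed features are rare.",
        ],
    },
}

# Defensive lookups so an absent eval_metadata yields an empty list.
missing = result.get("eval_metadata", {}).get("missing_details", [])
for detail in missing:
    print(f"- {detail}")
```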
Example
import json
import os

import requests
from dotenv import load_dotenv

load_dotenv(override=True)

_API_KEY = os.getenv("AEGIS_API_KEY")
_BASE_URL = os.getenv("AEGIS_API_BASE_URL")
_CUSTOM_RUN_URL = f"{_BASE_URL}/runs/custom"


def post_custom_run(payload: dict) -> requests.Response:
    """POST JSON payload to Aegis custom runs; returns the raw response."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {_API_KEY}",
    }
    return requests.post(
        _CUSTOM_RUN_URL,
        headers=headers,
        data=json.dumps(payload),
    )


if __name__ == "__main__":
    context = [
        "The new laptop features a 14-inch OLED display, 32GB RAM, and an M3 chip.",
        "OLED screens are uncommon in mid-range laptops.",
    ]
    data = [
        {
            "input": "What features does the new laptop have and are they rare?",
            "context": context,
        },
    ]
    payload = {
        "threshold": 80,
        "model_slug": "o4-mini",
        "is_blocking": True,
        "data_collection_id": None,
        "evaluations": [
            {
                "metrics": ["ctx_suff"],
                "threshold": 80,
                "model_slug": "o4-mini",
                "data": data,
            }
        ],
    }

    response = post_custom_run(payload)
    response.raise_for_status()
    print(json.dumps(response.json(), indent=2))