Skip to main content

Get run details

Endpoint: GET /runs/{run_id}

Description
Returns a single run you can access: yours or shared with your organization. The payload includes every evaluation and a compact representation of each record. If the id is not visible to your API key, the server responds with 404 (same as “not found”).

run_type and run_source are human-readable labels (for example "Dataset" and "API Call"), not internal enum names. user is the owning account's email, or null for org-only runs. author_email is the email of whoever created the run.

Each record inside evaluations is truncated to 100 characters per string field (prompt, input, context, output, golden_answer). metric is the metric shortname (for example ans_corr).

Parameters

  • Path: run_id — integer run id.

Error responses

  • 401 — Authentication failed.
  • 404 — Run with id run_id not found.
  • 500 — Error retrieving the run.

Responses

  • 200 — JSON object with the shape below.

Example response (200)

{
"id": 991,
"user": "analyst@acme.com",
"author_email": "analyst@acme.com",
"run_type": "Dataset",
"run_source": "API Call",
"dataset": "Support QA - March",
"data_collection": "Customer Support",
"org_id": null,
"number_of_metrics": 2,
"result": 83.5,
"threshold": 70,
"model_slug": "gpt-4o",
"alias": "support-march",
"aggregate_results": {
"ans_corr": 86,
"faith": 81
},
"total_cost": "0.001200000000000",
"started_at": "2026-04-01T09:12:00Z",
"finished_at": "2026-04-01T09:13:12Z",
"is_gte_threshold": true,
"evaluations": [
{
"id": 1542,
"run_id": 991,
"record_index": 0,
"metric": "ans_corr",
"record": {
"id": 1001,
"prompt": "What is 2+2?",
"input": null,
"context": null,
"output": "4",
"golden_answer": "4"
},
"result": 100,
"threshold": 70,
"explanation": "Exact match.",
"model_slug": "gpt-4o",
"metric_args": null,
"started_at": "2026-04-01T09:12:01Z",
"finished_at": "2026-04-01T09:12:02Z",
"evaluation_cost": "0.0004",
"is_success": true,
"is_gte_threshold": true,
"eval_metadata": null
}
]
}
{
"id": 0,
"user": "string | null",
"author_email": "string",
"run_type": "string",
"run_source": "string",
"dataset": "string | null",
"data_collection": "string | null",
"org_id": "integer | null",
"number_of_metrics": 0,
"result": "number | null",
"threshold": 0,
"model_slug": "string | null",
"alias": "string | null",
"aggregate_results": "object | null",
"total_cost": "number | string | null",
"started_at": "date",
"finished_at": "date | null",
"is_gte_threshold": "boolean | null",
"evaluations": []
}

Each element of evaluations has this shape:

{
"id": 0,
"run_id": 0,
"record_index": 0,
"metric": "string",
"record": {
"id": 0,
"prompt": "string | null",
"input": "string | null",
"context": "string | null",
"output": "string | null",
"golden_answer": "string | null"
},
"result": "number | null",
"threshold": 0,
"explanation": "string | null",
"model_slug": "string | null",
"metric_args": "object | null",
"started_at": "date",
"finished_at": "date | null",
"evaluation_cost": "string | null",
"is_success": "boolean | null",
"is_gte_threshold": "boolean | null",
"eval_metadata": "object | null"
}

metric_args is the per-metric argument map this evaluation actually ran with (dataset-level defaults plus any per-run override applied at run creation). eval_metadata is a metric-specific JSON object with extra signals or null.

curl

curl "https://api.aegisevals.ai/api/v1/runs/1" \
-H "Authorization: Bearer sk_00000000000000000000000000000000"