Get run details

Endpoint: GET /runs/{run_id}

Description
Returns a single run that you can access: one you own or one shared with your organization. The payload includes every evaluation in the run, each with a compact representation of its record. If the run id is not visible to your API key, the server responds with 404, the same response it gives for a run that does not exist.

run_type and run_source are human-readable labels (for example "Dataset" and "API Call"), not internal enum names. user is the owning account's email, or null for org-only runs. author_email is the email of whoever created the run.

Each string field of a record inside evaluations (prompt, input, context, output, golden_answer) is truncated to 100 characters. metric is the metric shortname (for example ans_corr).
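Because record fields are truncated, a client may want to flag values that were probably cut off before displaying or comparing them. The following is a minimal sketch, not part of the API: the function name and constants are hypothetical, and since this page does not say whether a marker is appended after truncation, any string at or above the limit is treated as possibly truncated.

```python
# Assumption: truncation caps each string field at 100 characters. Whether a
# marker such as "..." is appended is not documented, so this is a heuristic.
TRUNCATION_LIMIT = 100
RECORD_STRING_FIELDS = ("prompt", "input", "context", "output", "golden_answer")


def possibly_truncated_fields(record, limit=TRUNCATION_LIMIT):
    """Return the names of record string fields that may have been cut off."""
    flagged = []
    for field in RECORD_STRING_FIELDS:
        value = record.get(field)
        # Null fields and strings shorter than the limit cannot be truncated.
        if isinstance(value, str) and len(value) >= limit:
            flagged.append(field)
    return flagged
```

A string of exactly 100 characters can also be a complete value, so treat the result as a hint rather than proof.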

Parameters

Path: run_id — integer run id.

Error responses

401 — Authentication failed.
404 — Run with id run_id not found.
500 — Error retrieving the run.
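One way a client could turn these status codes into typed errors is sketched below. The exception classes and the function name are hypothetical and are not provided by the API or any SDK; only the three status codes and their meanings come from the list above.

```python
class RunApiError(Exception):
    """Base class for non-200 responses from GET /runs/{run_id} (hypothetical)."""


class AuthenticationFailed(RunApiError):
    """401: the API key was missing or rejected."""


class RunNotFound(RunApiError):
    """404: the run does not exist or is not visible to this API key."""


class RunRetrievalError(RunApiError):
    """500: the server failed while retrieving the run."""


_STATUS_TO_ERROR = {
    401: AuthenticationFailed,
    404: RunNotFound,
    500: RunRetrievalError,
}


def raise_for_run_status(status, run_id):
    """Return None for 200; raise the matching error for any other status."""
    if status == 200:
        return None
    # Statuses not listed on this page fall back to the base class.
    error_cls = _STATUS_TO_ERROR.get(status, RunApiError)
    raise error_cls(f"GET /runs/{run_id} returned HTTP {status}")
```

Because a 404 covers both a missing run and one that is not visible to the key, the client cannot distinguish the two cases, which is why a single RunNotFound class is used here.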

Responses

200 — JSON object with the shape below.

Example response (200)

{
  "id": 991,
  "user": "analyst@acme.com",
  "author_email": "analyst@acme.com",
  "run_type": "Dataset",
  "run_source": "API Call",
  "dataset": "Support QA - March",
  "data_collection": "Customer Support",
  "org_id": null,
  "number_of_metrics": 2,
  "result": 83.5,
  "threshold": 70,
  "model_slug": "gpt-4o",
  "alias": "support-march",
  "aggregate_results": {
    "ans_corr": 86,
    "faith": 81
  },
  "total_cost": "0.001200000000000",
  "started_at": "2026-04-01T09:12:00Z",
  "finished_at": "2026-04-01T09:13:12Z",
  "is_gte_threshold": true,
  "evaluations": [
    {
      "id": 1542,
      "run_id": 991,
      "record_index": 0,
      "metric": "ans_corr",
      "record": {
        "id": 1001,
        "prompt": "What is 2+2?",
        "input": null,
        "context": null,
        "output": "4",
        "golden_answer": "4"
      },
      "result": 100,
      "threshold": 70,
      "explanation": "Exact match.",
      "model_slug": "gpt-4o",
      "metric_args": null,
      "started_at": "2026-04-01T09:12:01Z",
      "finished_at": "2026-04-01T09:12:02Z",
      "evaluation_cost": "0.0004",
      "is_success": true,
      "is_gte_threshold": true,
      "eval_metadata": null
    }
  ]
}

The top-level object has this shape:

{
  "id": 0,
  "user": "string | null",
  "author_email": "string",
  "run_type": "string",
  "run_source": "string",
  "dataset": "string | null",
  "data_collection": "string | null",
  "org_id": "integer | null",
  "number_of_metrics": 0,
  "result": "number | null",
  "threshold": 0,
  "model_slug": "string | null",
  "alias": "string | null",
  "aggregate_results": "object | null",
  "total_cost": "number | string | null",
  "started_at": "date",
  "finished_at": "date | null",
  "is_gte_threshold": "boolean | null",
  "evaluations": []
}
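Since total_cost may arrive as a number, a string, or null, and timestamps arrive as strings, a client will usually normalize them before doing arithmetic. The sketch below is an illustration with hypothetical helper names. It assumes timestamps are ISO 8601 with a trailing "Z", as in the example response; the schema itself only says "date".

```python
from datetime import datetime
from decimal import Decimal


def parse_cost(value):
    """Convert total_cost (number, string, or null) to Decimal, or None."""
    if value is None:
        return None
    # Going through str() avoids binary-float artifacts for numeric input.
    return Decimal(str(value))


def parse_timestamp(value):
    """Parse a timestamp such as "2026-04-01T09:12:00Z"; None stays None."""
    if value is None:
        return None
    # Older Python versions reject a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def run_duration_seconds(run):
    """Return the run's wall-clock duration, or None if it has not finished."""
    started = parse_timestamp(run["started_at"])
    finished = parse_timestamp(run.get("finished_at"))
    if finished is None:
        return None
    return (finished - started).total_seconds()
```

Applied to the example response above, run_duration_seconds gives 72.0 and parse_cost gives a Decimal equal to 0.0012.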

Each element of evaluations has this shape:

{
  "id": 0,
  "run_id": 0,
  "record_index": 0,
  "metric": "string",
  "record": {
    "id": 0,
    "prompt": "string | null",
    "input": "string | null",
    "context": "string | null",
    "output": "string | null",
    "golden_answer": "string | null"
  },
  "result": "number | null",
  "threshold": 0,
  "explanation": "string | null",
  "model_slug": "string | null",
  "metric_args": "object | null",
  "started_at": "date",
  "finished_at": "date | null",
  "evaluation_cost": "string | null",
  "is_success": "boolean | null",
  "is_gte_threshold": "boolean | null",
  "eval_metadata": "object | null"
}

metric_args is the per-metric argument map this evaluation actually ran with (dataset-level defaults plus any per-run override applied at run creation). eval_metadata is either a metric-specific JSON object with extra signals or null.
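With the evaluation shape in hand, a client can summarize a run locally. The following sketch groups evaluations by metric shortname and lists the ones that fell below their threshold. The function names are hypothetical, and the averaging here is only an illustration: this page does not document how the server computes aggregate_results, so the two may differ.

```python
from collections import defaultdict


def mean_result_by_metric(run):
    """Average the non-null evaluation results for each metric shortname."""
    grouped = defaultdict(list)
    for evaluation in run.get("evaluations", []):
        # result is nullable, so skip evaluations without a score.
        if evaluation.get("result") is not None:
            grouped[evaluation["metric"]].append(evaluation["result"])
    return {metric: sum(values) / len(values) for metric, values in grouped.items()}


def below_threshold(run):
    """Return (record_index, metric) pairs where is_gte_threshold is false."""
    return [
        (evaluation["record_index"], evaluation["metric"])
        for evaluation in run.get("evaluations", [])
        # "is False" excludes null, which means no verdict is available.
        if evaluation.get("is_gte_threshold") is False
    ]
```

Remember that the record fields inside each evaluation are truncated, so use record_index or record.id, not the text, to join these results back to your dataset.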

curl

curl "https://api.aegisevals.ai/api/v1/runs/991" \
  -H "Authorization: Bearer sk_00000000000000000000000000000000"
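The same request can be made from Python using only the standard library. This is a sketch: the base URL and Bearer header are taken from the curl example above, while the function names and the default timeout are assumptions made for illustration.

```python
import json
import urllib.request

BASE_URL = "https://api.aegisevals.ai/api/v1"  # taken from the curl example


def build_run_request(run_id, api_key):
    """Build the GET request for a single run without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/runs/{int(run_id)}",  # run_id must be an integer
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def get_run(run_id, api_key, timeout=30):
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 401, 404, and 500 responses.
    """
    request = build_run_request(run_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling get_run(991, api_key) with a valid key returns the run as a dict whose keys match the response shape documented above.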