Introduction
The Aegis API Server is the HTTP API for programmatic access to Aegis. This section is organized into Introduction, Evaluations, and Data so you can quickly find endpoint behavior, request payloads, and response examples.
Base URL
All documented routes live under a single prefix:
https://api.aegisevals.ai/api/v1
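As a minimal sketch of how this prefix combines with the routes documented below, the helper `endpoint` joins path segments onto the base URL. The helper name is hypothetical and not part of Aegis; only the base URL and the route shapes come from this page.

```python
from urllib.parse import quote

BASE_URL = "https://api.aegisevals.ai/api/v1"


def endpoint(*segments: str) -> str:
    """Join path segments onto the base URL, percent-encoding each segment."""
    path = "/".join(quote(str(s).strip("/"), safe="") for s in segments)
    return f"{BASE_URL}/{path}"


# "123" is a placeholder run id used only for illustration.
print(endpoint("runs", "123", "download"))
```

Encoding each segment separately keeps an id that happens to contain a slash or space from changing the route.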
Core concepts
Before calling endpoints, these are the key terms used across the API:
- Evaluation: one scoring task where Aegis checks model output against one or more metrics (for example: correctness, safety, formatting).
- Run: a stored execution of an evaluation job. A run usually contains many evaluations and their metric scores.
- Dataset: uploaded CSV data that Aegis stores as rows, with chosen metrics and thresholds, so you can run evaluations repeatedly without resending the file.
- Dataset type: a lookup that classifies a dataset (for example custom user uploads versus proprietary catalog data). Types have integer ids used when filtering dataset lists.
- Model (catalog): a registered LLM in Aegis with id, slug, name, and a supplier. See Models for the list of valid slug values to use in runs and evaluations.
- Dataset run: a run created from a saved dataset that already exists in Aegis.
- Custom run: a run created by sending rows directly in the request body, without needing a pre-saved dataset.
- Data collection: a container that groups datasets and related runs so teams can organize and review evaluation work in one place.
Endpoint guide
Evaluate
POST /evaluate (Single Evaluation)
- Runs an evaluation request and returns the computed scoring output.
- Use this for quick or direct evaluation execution from your app/backend.
Runs
POST /runs/dataset (Create run from dataset)
- Starts a new run using a dataset already stored in Aegis.
- Use this for repeatable evaluations over curated data.
POST /runs/custom (Create custom run)
- Starts a new run by sending metric config and rows directly in the request.
- Use this when data is generated on the fly and not saved as a dataset first.
GET /runs/{run_id} (Get run)
- Retrieves one existing run with its summary and row-level evaluation data.
- Use this to inspect the results of a specific run.
GET /runs/{run_id}/download (Download run)
- Exports run results as CSV for analysis.
- Use this for reporting, sharing, and offline analysis.
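Once a run export has been downloaded, the CSV body can be loaded for offline analysis. The sketch below assumes the body is UTF-8 encoded with a header row. The column names in the sample are made up for illustration and are not the actual export schema.

```python
import csv
import io


def parse_run_csv(body: bytes) -> list[dict[str, str]]:
    """Decode a CSV response body and return one dict per data row."""
    text = body.decode("utf-8-sig")  # assumption: UTF-8, optional BOM tolerated
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Made-up sample standing in for the body of GET /runs/{run_id}/download.
sample = b"row,metric,score\n1,correctness,0.9\n2,correctness,0.4\n"
rows = parse_run_csv(sample)
print(len(rows), rows[0]["score"])
```

Every value arrives as a string, so convert scores to numbers before comparing them to thresholds.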
Dataset types
GET /dataset-types + GET /dataset-types/{dataset_type_id} (Get dataset types)
- Lists all dataset types or returns one type by id (name, label, description).
- Use this to discover type ids for dataset_type_id on Get datasets.
Models
Models (reference) — the supported LLM slugs, display names, and flags (thinking, latest) are listed in the docs; the API server does not expose a models HTTP endpoint. Use a documented slug in model_slug when you create runs or evaluations.
Datasets
POST /datasets (Create dataset)
- Uploads a CSV and creates a custom dataset with metric thresholds.
- Use this to persist evaluation data for dataset runs and data collections.
GET /datasets/all-partial, GET /datasets, GET /datasets/{dataset_id} (Get datasets)
- Lists dataset ids/names, paginated summaries, or one full dataset (records, runs, evaluations).
- Use this to discover datasets and inspect stored rows.
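A paginated listing request for GET /datasets might be assembled as follows. The query parameter names page, page_size, and dataset_type_id appear elsewhere in this section. The default values and the helper name `datasets_url` are assumptions for illustration.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.aegisevals.ai/api/v1"


def datasets_url(page: int = 1, page_size: int = 20,
                 dataset_type_id: Optional[int] = None) -> str:
    """Build the GET /datasets URL with pagination and an optional type filter."""
    # The defaults (1 and 20) are assumptions, not documented server defaults.
    params = {"page": page, "page_size": page_size}
    if dataset_type_id is not None:
        params["dataset_type_id"] = dataset_type_id
    return f"{BASE_URL}/datasets?{urlencode(params)}"


print(datasets_url(page=2, page_size=50, dataset_type_id=3))
```

Out-of-range values for page or page_size are rejected with 422, so validate them on the client where possible.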
PUT /datasets/{dataset_id} (Update dataset)
- Updates name, metrics, column mappings, or data collection membership.
- Use this to keep dataset information accurate over time.
DELETE /datasets/{dataset_id} (Delete dataset)
- Deletes a custom dataset you own.
GET /datasets/{dataset_id}/download (Download dataset)
- Exports dataset rows as CSV.
Data collections
POST /data-collections (Create data collection)
- Creates a new data collection container.
- Use this when grouping datasets or runs is necessary.
GET /data-collections + GET /data-collections/{data_collection_id} (Get data collections)
- Lists collections (paginated) and fetches one collection by id.
- Use this to browse collections or inspect a specific one.
PUT /data-collections/{data_collection_id} (Update data collection)
- Updates collection metadata (for example name, aliases, linked references).
- Use this to keep collection information accurate over time.
DELETE /data-collections/{data_collection_id} (Delete data collection)
- Permanently removes a data collection.
- Use this when a collection is no longer needed.
Getting started
- Use your organization’s Aegis web app to sign in and create an API key intended for integrations.
- Send the key on every request to the routes in this section, as described below.
Authentication
Every route under /runs, /evaluate, /data-collections, /dataset-types, and /datasets requires:
Authorization: Bearer <token>
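The header can be attached with Python's standard library as sketched below. The environment variable name AEGIS_API_KEY and the helper `authed_request` are assumptions for illustration. The sketch only builds the request and does not send it.

```python
import os
import urllib.request

BASE_URL = "https://api.aegisevals.ai/api/v1"


def authed_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Return a request for BASE_URL + path carrying the bearer token."""
    req = urllib.request.Request(f"{BASE_URL}{path}", method=method)
    req.add_header("Authorization", f"Bearer {token}")
    return req


# AEGIS_API_KEY is a hypothetical variable name; the fallback is a placeholder.
token = os.environ.get("AEGIS_API_KEY", "example-token")
req = authed_request("/dataset-types", token)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send the request.
```

Reading the key from the environment keeps it out of source control.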
Credits and billable work
Operations that enqueue real model work generally check your credit balance before proceeding. That includes starting a dataset run, a custom run, and calling POST /evaluate. If you lack credits, the server responds with 402 Payment Required (see HTTP status codes and error responses).
HTTP status codes and error responses
Error responses share one shape: a JSON object with a detail field.
- 200: Success for GET/PUT/DELETE where a body is returned.
- 201: Resource created (POST runs, POST evaluate, POST data collections, POST datasets).
- 204: Success with no body (DELETE data collection, DELETE dataset).
- 400: Bad request (invalid payload, inactive metrics, missing or inconsistent IDs, duplicate alias/name, database constraint, etc.).
- 401: Missing/invalid Authorization header, or API key not accepted.
- 402: Insufficient credits. Returned before POST /runs/dataset, POST /runs/custom, or POST /evaluate runs if your balance is too low.
- 404: Resource not found (run, metric, model, dataset, dataset type, data collection, no rows/CSV data, etc.).
- 422: Validation failed. Invalid JSON body or field types on a POST/PUT (e.g. POST /runs/dataset, POST /runs/custom), or out-of-range query params (e.g. page/page_size).
- 500: Unexpected server error.
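A client might turn these status codes into readable messages along the following lines. This is a sketch: the hint strings paraphrase the list above, the helper name `describe_error` is hypothetical, and the sample detail text is made up.

```python
import json

# Short hints paraphrasing the documented error status codes.
HINTS = {
    400: "bad request",
    401: "missing or rejected API key",
    402: "insufficient credits",
    404: "resource not found",
    422: "validation failed",
    500: "unexpected server error",
}


def describe_error(status: int, body: bytes) -> str:
    """Build a one-line message from a status code and a JSON error body."""
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, AttributeError):
        detail = None  # body was not a JSON object
    hint = HINTS.get(status, "unexpected status")
    if detail is None:
        return f"{status} ({hint})"
    return f"{status} ({hint}): {detail}"


# Made-up body; the real detail text may differ.
print(describe_error(402, b'{"detail": "Insufficient credits"}'))
```

With `urllib.request`, a failed call raises `urllib.error.HTTPError`, whose `code` attribute and `read()` method supply the two arguments.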