Get datasets
This page documents read endpoints for custom datasets:
- Partial list (summary fields per dataset): GET /datasets/all-partial
- Paginated list: GET /datasets
- One dataset by id: GET /datasets/{dataset_id}
Get datasets (partial)
Endpoint: GET /datasets/all-partial
Description
Returns a lightweight list of your non-proprietary datasets, newest first. Each element has: id, name, user, org_id, author_email, created_at, and updated_at (not full rows, metrics, or records).
Parameters
shared: optional boolean. false returns only your own datasets; true returns only datasets shared with you; omit to return all you can access.
Error responses
401: Authentication failed.
500: Server error.
Responses
200: JSON array of partial dataset objects (see Get dataset by id).
Example response (200)
[
  {
    "id": 3,
    "name": "support_qa_march.csv",
    "user": { "id": 7, "email": "analyst@acme.com" },
    "org_id": null,
    "author_email": "analyst@acme.com",
    "created_at": "2026-03-30T10:00:00Z",
    "updated_at": null
  }
]
curl
curl "https://api.aegisevals.ai/api/v1/datasets/all-partial" \
-H "Authorization: Bearer sk_00000000000000000000000000000000"
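The shared filter is tri-state: omitting it means something different from passing false. The following is a minimal Python sketch of how a client might build the request URL. It assumes the base URL from the curl example and that booleans are sent as lowercase true/false in the query string, which this page does not confirm; partial_datasets_url is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode

# Base URL taken from the curl example on this page.
BASE_URL = "https://api.aegisevals.ai/api/v1"


def partial_datasets_url(shared: Optional[bool] = None) -> str:
    """Build the GET /datasets/all-partial URL (hypothetical helper).

    shared=None  -> parameter omitted: all datasets you can access
    shared=False -> only your own datasets
    shared=True  -> only datasets shared with you
    """
    url = f"{BASE_URL}/datasets/all-partial"
    if shared is None:
        return url
    # Assumption: the API accepts lowercase "true"/"false" for booleans.
    return url + "?" + urlencode({"shared": "true" if shared else "false"})
```

Keeping None distinct from False avoids accidentally hiding shared datasets when the caller simply did not specify a filter.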
Get all datasets (paginated)
Endpoint: GET /datasets
Description
Returns a paginated list of dataset summaries. If you omit dataset_type_id, results are limited to the CUSTOM type (user-created datasets), not every dataset type. Pass a type id explicitly to list proprietary or other types your account may access.
Parameters
page: integer, default 1, minimum 1.
page_size: integer, default 10, between 1 and 10000.
dataset_type_id: optional integer. When set, only datasets with that type id are returned. When omitted, the request behaves as if the CUSTOM type id had been passed. Valid ids are returned by GET /dataset-types.
shared: optional boolean. false returns only your own datasets; true returns only datasets shared with you; omit to return all you can access.
search: optional string (max 255 characters). Case-insensitive substring match on dataset name; if the value is all digits, it also matches the dataset id exactly.
metric_shortnames: optional comma-separated list of metric shortnames (for example ans_corr,faith). Returns datasets whose selected_metrics includes at least one of the listed metrics.
data_collection_ids: optional comma-separated list of positive integers (for example 14,22). Returns datasets whose data_collection_id is one of the listed ids. Invalid tokens (non-integers, zero, or negative values) yield 422.
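The limits above can be checked on the client before a request is sent, which avoids a round trip that would end in 422. This is a rough Python sketch under that idea, not part of the API: build_dataset_query is a hypothetical helper, and it only mirrors the constraints stated in the parameter list.

```python
from typing import Dict, Iterable, Optional


def build_dataset_query(
    page: int = 1,
    page_size: int = 10,
    search: Optional[str] = None,
    metric_shortnames: Optional[Iterable[str]] = None,
    data_collection_ids: Optional[Iterable[int]] = None,
) -> Dict[str, str]:
    """Validate filters locally and return query parameters as strings."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= 10000:
        raise ValueError("page_size must be between 1 and 10000")
    params = {"page": str(page), "page_size": str(page_size)}
    if search is not None:
        if len(search) > 255:
            raise ValueError("search must be at most 255 characters")
        params["search"] = search
    names = list(metric_shortnames or [])
    if names:
        # Comma-separated; a dataset matches if it selects at least one of these.
        params["metric_shortnames"] = ",".join(names)
    ids = list(data_collection_ids or [])
    # The API answers 422 for zero, negative, or non-integer ids.
    if any(isinstance(i, bool) or not isinstance(i, int) or i <= 0 for i in ids):
        raise ValueError("data_collection_ids must be positive integers")
    if ids:
        params["data_collection_ids"] = ",".join(str(i) for i in ids)
    return params
```

The returned dict can be passed to urllib.parse.urlencode to form the query string.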
Error responses
401: Authentication failed.
404: Referenced dataset type not found (when dataset_type_id is set).
422: Invalid page, page_size, or data_collection_ids.
500: Server error.
Responses
200: JSON object with items and meta.
Example response (200)
{
  "items": [
    {
      "id": 3,
      "user": { "id": 7, "email": "analyst@acme.com" },
      "dataset_type_id": 1,
      "name": "support_qa_march.csv",
      "selected_metrics": {
        "1": { "threshold": 70, "metric_args": null },
        "2": { "threshold": 80, "metric_args": null }
      },
      "structure": ["prompt", "output", "golden_answer"],
      "column_mappings": null,
      "data_collection_id": 14,
      "org_id": null,
      "created_at": "2026-04-01T10:00:00Z",
      "updated_at": null
    }
  ],
  "meta": {
    "current_page": 1,
    "page_size": 10,
    "total_items": 1,
    "items_on_page": 1,
    "total_pages": 1,
    "has_next": false,
    "has_previous": false,
    "next_page": null,
    "previous_page": null
  }
}
The response has this shape:
{
  "items": [
    {
      "id": 0,
      "user": { "id": 0, "email": "string" },
      "dataset_type_id": 0,
      "name": "string",
      "selected_metrics": "object | null",
      "structure": ["string"],
      "column_mappings": "object | null",
      "data_collection_id": "integer | null",
      "org_id": "integer | null",
      "created_at": "date",
      "updated_at": "date | null"
    }
  ],
  "meta": {
    "current_page": 1,
    "page_size": 10,
    "total_items": 1,
    "items_on_page": 1,
    "total_pages": 1,
    "has_next": false,
    "has_previous": false,
    "next_page": "integer | null",
    "previous_page": "integer | null"
  }
}
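Since meta carries has_next and next_page, a client can walk every page without computing page numbers itself. Here is a small Python sketch of that loop. The fetch_page callable is a hypothetical, caller-supplied function that performs GET /datasets?page=N and returns the decoded JSON body; it is injected so the loop can be exercised without a network.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_datasets(
    fetch_page: Callable[[int], Page], start_page: int = 1
) -> Iterator[Dict[str, Any]]:
    """Yield every dataset summary, following meta.next_page until it runs out."""
    page = start_page
    while page is not None:
        body = fetch_page(page)
        yield from body["items"]
        meta = body["meta"]
        # Stop when the server reports that no further page exists.
        page = meta["next_page"] if meta["has_next"] else None
```

Because the function is a generator, callers can stop early (for example after the first match) without fetching the remaining pages.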
curl
curl "https://api.aegisevals.ai/api/v1/datasets?page=1&page_size=20" \
-H "Authorization: Bearer sk_00000000000000000000000000000000"
With filters:
curl "https://api.aegisevals.ai/api/v1/datasets?page=1&page_size=20&search=support&metric_shortnames=answer_correctness&data_collection_ids=14" \
-H "Authorization: Bearer sk_00000000000000000000000000000000"
Get dataset by id
Endpoint: GET /datasets/{dataset_id}
Description
Returns one dataset with dataset type, records, runs (your runs on this dataset), and evaluations. Evaluations are scoped to the most recent run that matches the API dataset run source and dataset run type (when such a run exists); otherwise the evaluations array may be empty even if older runs exist.
Parameters
- Path:
  dataset_id: integer.
Error responses
401: Authentication failed.
404: Dataset not found or not accessible.
500: Server error.
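Client code often wants to tell these failures apart, for instance to prompt for a new key on 401 but treat 404 as a missing or inaccessible dataset. The following is one possible Python sketch; the exception classes and check_status are hypothetical names, and only the status codes come from this page.

```python
class DatasetApiError(Exception):
    """Base class for errors raised by this sketch (hypothetical)."""


class AuthenticationFailed(DatasetApiError):
    """HTTP 401."""


class DatasetNotFound(DatasetApiError):
    """HTTP 404: dataset not found or not accessible."""


class ServerError(DatasetApiError):
    """HTTP 500."""


_ERRORS = {401: AuthenticationFailed, 404: DatasetNotFound, 500: ServerError}


def check_status(status_code: int) -> None:
    """Return on 200; raise a typed error for any other status code."""
    if status_code == 200:
        return
    # Codes not listed on this page fall back to the base class.
    error_cls = _ERRORS.get(status_code, DatasetApiError)
    raise error_cls(f"GET /datasets/{{dataset_id}} returned HTTP {status_code}")
```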
Responses
200: JSON with the full dataset object; see Example response (200) and field shapes below.
Example response (200)
{
  "id": 42,
  "user": { "id": 7, "email": "analyst@acme.com" },
  "dataset_type_id": 1,
  "name": "My evaluation set",
  "selected_metrics": {
    "1": { "threshold": 70, "metric_args": null },
    "2": { "threshold": 80, "metric_args": { "ignore_extra_keys": true } }
  },
  "structure": ["prompt", "output", "golden_answer"],
  "column_mappings": null,
  "data_collection_id": null,
  "org_id": null,
  "created_at": "2026-04-02T12:00:00Z",
  "updated_at": null,
  "dataset_type": {
    "id": 1,
    "name": "CUSTOM",
    "label": "Custom",
    "description": null
  },
  "records": [
    {
      "id": 1001,
      "user_id": 7,
      "dataset_id": 42,
      "prompt": "What is the refund policy?",
      "input": null,
      "context": null,
      "output": "Refunds are available within 30 days.",
      "golden_answer": "30-day refund window.",
      "created_at": "2026-04-02T12:00:00Z",
      "updated_at": null
    }
  ],
  "runs": [],
  "evaluations": []
}
The full dataset object has this shape:
{
  "id": 0,
  "user": { "id": 0, "email": "string" },
  "dataset_type_id": 0,
  "name": "string",
  "selected_metrics": "object | null",
  "structure": ["string"],
  "column_mappings": "object | null",
  "data_collection_id": "integer | null",
  "org_id": "integer | null",
  "created_at": "date",
  "updated_at": "date | null",
  "dataset_type": {
    "id": 0,
    "name": "string",
    "label": "string",
    "description": "string | null"
  },
  "records": [
    {
      "id": 0,
      "user_id": 0,
      "dataset_id": "integer | null",
      "prompt": "string | null",
      "input": "string | object | array | null",
      "context": "string | array | null",
      "output": "string | object | array | null",
      "golden_answer": "string | object | array | null",
      "created_at": "date",
      "updated_at": "date | null"
    }
  ],
  "runs": [],
  "evaluations": []
}
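In the example response, selected_metrics is an object keyed by strings such as "1" and "2", each holding a threshold and metric_args. A short Python sketch for turning that into a lookup table follows. It assumes the keys are metric ids (which this page does not state explicitly); thresholds_by_metric is a hypothetical helper name.

```python
from typing import Any, Dict


def thresholds_by_metric(dataset: Dict[str, Any]) -> Dict[int, float]:
    """Map each key of selected_metrics (as an int) to its threshold.

    Assumption: the string keys ("1", "2", ...) are metric ids.
    selected_metrics may be null, in which case the result is empty.
    """
    selected = dataset.get("selected_metrics") or {}
    return {int(key): entry["threshold"] for key, entry in selected.items()}
```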
Each element of runs has this shape:
{
  "id": 0,
  "user": { "id": 0, "email": "string" },
  "run_type_id": 0,
  "run_source_id": 0,
  "api_key_id": 0,
  "dataset_id": "integer | null",
  "data_collection_id": "integer | null",
  "org_id": "integer | null",
  "run_type": {
    "id": 0,
    "name": "string",
    "label": "string",
    "description": "string"
  },
  "run_source": {
    "id": 0,
    "name": "string",
    "label": "string",
    "description": "string"
  },
  "dataset": "DatasetResponse | null",
  "number_of_metrics": 0,
  "result": "number | null",
  "threshold": 0,
  "model_slug": "string | null",
  "alias": "string | null",
  "aggregate_results": "object | null",
  "started_at": "date",
  "finished_at": "date | null",
  "created_at": "date",
  "updated_at": "date | null",
  "is_gte_threshold": "boolean | null"
}
When dataset on a run response is not null, it matches DatasetResponse:
{
  "id": 0,
  "user": { "id": 0, "email": "string" },
  "dataset_type_id": 0,
  "name": "string",
  "structure": ["string"],
  "selected_metrics": "object | null",
  "column_mappings": "object | null",
  "data_collection_id": "integer | null",
  "org_id": "integer | null",
  "created_at": "date",
  "updated_at": "date | null"
}
Each element of evaluations has this shape:
{
  "id": 0,
  "user_id": 0,
  "run_id": 0,
  "record_id": 0,
  "metric_id": 0,
  "dataset_id": "integer | null",
  "result": "number | null",
  "threshold": 0,
  "explanation": "string | null",
  "model_slug": "string | null",
  "metric_args": "object | null",
  "started_at": "date",
  "finished_at": "date | null",
  "input_tokens": "integer | null",
  "output_tokens": "integer | null",
  "evaluation_cost": "string | null",
  "is_success": "boolean | null",
  "is_gte_threshold": "boolean | null",
  "created_at": "date",
  "updated_at": "date | null",
  "eval_metadata": "object | null"
}
metric_args reflects the per-metric arguments the evaluation actually ran with (resolved from dataset-level defaults plus any per-run override). eval_metadata holds metric-specific extra signals (or null).
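A common use of the evaluations array is a per-metric summary. The sketch below, in Python, groups evaluations by metric_id and computes the share that met their threshold. It assumes is_gte_threshold is true when result is at or above threshold, as the field name suggests; pass_rate_by_metric is a hypothetical helper, and entries with a null flag are skipped instead of being counted as failures.

```python
from collections import defaultdict
from typing import Any, Dict, List


def pass_rate_by_metric(evaluations: List[Dict[str, Any]]) -> Dict[int, float]:
    """Return, per metric_id, the fraction of scored evaluations that passed.

    Evaluations whose is_gte_threshold is null (None) are ignored.
    """
    passed: Dict[int, int] = defaultdict(int)
    scored: Dict[int, int] = defaultdict(int)
    for evaluation in evaluations:
        flag = evaluation.get("is_gte_threshold")
        if flag is None:
            continue
        scored[evaluation["metric_id"]] += 1
        if flag:
            passed[evaluation["metric_id"]] += 1
    return {metric_id: passed[metric_id] / total for metric_id, total in scored.items()}
```

Metrics with no scored evaluations are simply absent from the result, so an empty evaluations array yields an empty dict.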
curl
curl "https://api.aegisevals.ai/api/v1/datasets/3" \
-H "Authorization: Bearer sk_00000000000000000000000000000000"