XML Equal (xml_equal)
Metric description
XML equal compares output and golden answer as XML. Options cover child order, extra attributes, and whitespace in text nodes. Parsing uses the same family of XML safety limits as other XML structural metrics.
How to interpret the score
- 100: documents are equal under the chosen comparison rules.
- 0: parse failure or mismatch.
API usage
Prerequisites
After the environment variables are configured, the next step is to create a JSON payload for the custom-runs request. For a field-by-field description of the payload (top-level keys, evaluations, and each row in data), see Custom run request body.
Shortname: xml_equal
Default threshold: 100
Structural metrics run without an LLM (deterministic checks). Your run may still include model_slug where the API expects it; scoring does not depend on it for this category.
Inputs (each object in data)
- output (str, required): XML text.
- golden_answer (str, required): Reference XML text.
metric_args
- ignore_order (boolean, optional): Ignore order of child elements when comparing. Default: false.
- ignore_extra_attributes (boolean, optional): Ignore attributes present only on the output side. Default: false.
- ignore_whitespace (boolean, optional): Ignore whitespace differences in text content. Default: true.
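To make the three options concrete, here is a minimal local sketch of how such a comparison could behave. It is not the service's implementation: the function names (`elements_equal`, `xml_equal_score`) are hypothetical, whitespace handling is assumed to mean collapsing runs of whitespace and trimming the ends, and Python's standard `xml.etree.ElementTree` parser does not apply the XML safety limits mentioned above.

```python
import xml.etree.ElementTree as ET


def _text(value, ignore_whitespace):
    """Normalize element text; None is treated as an empty string."""
    value = value or ""
    # Assumption: "ignore whitespace" means collapse runs and trim the ends.
    return " ".join(value.split()) if ignore_whitespace else value


def elements_equal(out, gold, ignore_order=False,
                   ignore_extra_attributes=False, ignore_whitespace=True):
    """Recursively compare two elements (tail text is not compared here)."""
    opts = dict(ignore_order=ignore_order,
                ignore_extra_attributes=ignore_extra_attributes,
                ignore_whitespace=ignore_whitespace)
    if out.tag != gold.tag:
        return False
    if ignore_extra_attributes:
        # Every golden attribute must match; extras on the output are allowed.
        if any(out.attrib.get(k) != v for k, v in gold.attrib.items()):
            return False
    elif out.attrib != gold.attrib:
        return False
    if _text(out.text, ignore_whitespace) != _text(gold.text, ignore_whitespace):
        return False
    out_children, gold_children = list(out), list(gold)
    if len(out_children) != len(gold_children):
        return False
    if not ignore_order:
        return all(elements_equal(o, g, **opts)
                   for o, g in zip(out_children, gold_children))
    # Order-insensitive: greedily pair each golden child with an unused
    # output child. A simplification; it does not backtrack.
    remaining = out_children[:]
    for g in gold_children:
        match = next((o for o in remaining if elements_equal(o, g, **opts)), None)
        if match is None:
            return False
        remaining.remove(match)
    return True


def xml_equal_score(output, golden_answer, **options):
    """Return 100 when the documents are equal under the options, else 0."""
    try:
        out_root = ET.fromstring(output)
        gold_root = ET.fromstring(golden_answer)
    except ET.ParseError:
        return 0  # a parse failure scores 0
    return 100 if elements_equal(out_root, gold_root, **options) else 0
```

Under these assumptions, `xml_equal_score("<r><a/><b/></r>", "<r><b/><a/></r>")` scores 0 with the defaults and 100 with `ignore_order=True`. Note that `ignore_extra_attributes` is one-sided in this sketch, matching the description above: an attribute missing from the output still counts as a mismatch.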
Example
import json
import os

import requests
from dotenv import load_dotenv

load_dotenv(override=True)

_API_KEY = os.getenv("AEGIS_API_KEY")
_BASE_URL = os.getenv("AEGIS_API_BASE_URL")
_CUSTOM_RUN_URL = f"{_BASE_URL}/runs/custom"


def post_custom_run(payload: dict) -> requests.Response:
    """POST JSON payload to Aegis custom runs; returns the raw response."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {_API_KEY}",
    }
    return requests.post(
        _CUSTOM_RUN_URL,
        headers=headers,
        data=json.dumps(payload),
    )


if __name__ == "__main__":
    data = [
        {"output": "<r/>", "golden_answer": "<r/>"}
    ]
    payload = {
        "threshold": 100,
        "model_slug": "o4-mini",
        "is_blocking": True,
        "data_collection_id": None,
        "evaluations": [
            {
                "metrics": [
                    {
                        "metric": "xml_equal",
                        "metric_args": {"ignore_whitespace": True},
                    },
                ],
                "threshold": 100,
                "model_slug": "o4-mini",
                "data": data,
            }
        ],
    }
    response = post_custom_run(payload)
    response.raise_for_status()
    print(json.dumps(response.json(), indent=2))
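The example above sets only ignore_whitespace. As a hedged variation, the helper below assembles the same payload shape with all three metric_args exposed. `build_xml_equal_payload` is a hypothetical convenience function, not part of the API; the payload keys simply mirror the example, and the input check reflects the requirement that output and golden_answer are both strings.

```python
def build_xml_equal_payload(rows, ignore_order=False,
                            ignore_extra_attributes=False,
                            ignore_whitespace=True,
                            threshold=100, model_slug="o4-mini"):
    """Build a custom-run payload for xml_equal (hypothetical helper)."""
    for index, row in enumerate(rows):
        for key in ("output", "golden_answer"):
            # Both fields are required strings for this metric.
            if not isinstance(row.get(key), str):
                raise ValueError(f"data[{index}][{key!r}] must be a string")
    return {
        "threshold": threshold,
        "model_slug": model_slug,
        "is_blocking": True,
        "data_collection_id": None,
        "evaluations": [
            {
                "metrics": [
                    {
                        "metric": "xml_equal",
                        "metric_args": {
                            "ignore_order": ignore_order,
                            "ignore_extra_attributes": ignore_extra_attributes,
                            "ignore_whitespace": ignore_whitespace,
                        },
                    },
                ],
                "threshold": threshold,
                "model_slug": model_slug,
                "data": list(rows),
            }
        ],
    }
```

The returned dictionary can be passed to `post_custom_run` from the example, for instance `post_custom_run(build_xml_equal_payload(data, ignore_order=True))`.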