Security
These metrics cover attacks, exfiltration, and leakage in prompts and outputs. Each metric page lists the metric's shortname, its fields, and an example payload, plus optional metric_args when the metric supports them.
Metrics
The same pages appear under Security in the docs sidebar:
- Evasion Obfuscation (ev_obf)
- Instruction Integrity Subversion Attempts (instr_integ_subv_att)
- PII/PHI Leakage (pii_phi)
- PII/PHI Exfiltration Attempts (pii_phi_exfil_att)
- Role Hijacking (role_hijack)
- System Data Exfiltration Attempts (sys_data_exfil_att)
- System Data Leakage (sys_data_leak)
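
If you refer to these metrics from scripts, a small lookup table can catch mistyped shortnames early. The sketch below is only an illustration: the shortnames and display names are copied from the list above, while `SECURITY_METRICS` and `resolve_metric` are hypothetical names and not part of any documented SDK or API.

```python
# Shortname -> display name, copied from the Security metrics list.
# SECURITY_METRICS and resolve_metric are hypothetical helpers for
# illustration; they are not part of any documented client library.
SECURITY_METRICS = {
    "ev_obf": "Evasion Obfuscation",
    "instr_integ_subv_att": "Instruction Integrity Subversion Attempts",
    "pii_phi": "PII/PHI Leakage",
    "pii_phi_exfil_att": "PII/PHI Exfiltration Attempts",
    "role_hijack": "Role Hijacking",
    "sys_data_exfil_att": "System Data Exfiltration Attempts",
    "sys_data_leak": "System Data Leakage",
}


def resolve_metric(shortname: str) -> str:
    """Return the display name for a shortname, or raise ValueError."""
    try:
        return SECURITY_METRICS[shortname]
    except KeyError:
        known = ", ".join(sorted(SECURITY_METRICS))
        raise ValueError(
            f"unknown security metric {shortname!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(resolve_metric("role_hijack"))
```

The fields, payload shape, and any metric_args for each metric are described on the individual metric pages and are deliberately left out of this sketch.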