A behavioral baseline for a support agent, compared against one suspicious live trace. The point is not a single red flag but agreement among independent detectors covering tool usage, sequence, and volume.
Profile
2 tools
known safe actions in baseline traces
Critical
4 events
never-seen tools and extreme token spike
High
2 events
sequence anomalies from improbable transitions
Live trace
shell_exec → export_data
observed tool path in the suspicious execution
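One way to score "improbable transitions" is to estimate first-order transition probabilities from baseline traces and flag any live transition that falls below a cutoff. The sketch below is illustrative only: the baseline traces, the `min_prob` cutoff, and the position of `respond` at the end of the live trace are assumptions, not values taken from the profile.

```python
from collections import Counter

# Hypothetical baseline traces. The profile only tells us the known tools
# are search_kb and respond; the exact traces here are made up.
BASELINE_TRACES = [
    ["search_kb", "respond"],
    ["search_kb", "search_kb", "respond"],
    ["respond"],
]


def transition_counts(traces):
    """Count adjacent (previous_tool, next_tool) pairs across all traces."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def improbable_transitions(trace, counts, min_prob=0.05):
    """Return transitions whose estimated baseline probability is below min_prob.

    A transition out of a tool never seen in the baseline gets probability 0.
    """
    outgoing = Counter()
    for (prev, _nxt), n in counts.items():
        outgoing[prev] += n

    flagged = []
    for prev, nxt in zip(trace, trace[1:]):
        total = outgoing[prev]
        prob = counts[(prev, nxt)] / total if total else 0.0
        if prob < min_prob:
            flagged.append((prev, nxt))
    return flagged


counts = transition_counts(BASELINE_TRACES)
# Assumed ordering of the observed tools in the live trace.
live_trace = ["shell_exec", "export_data", "respond"]
flagged = improbable_transitions(live_trace, counts)
print(flagged)
```

Under these assumptions both live transitions are flagged, which would line up with the two sequence events reported for this trace, while a baseline-like path such as `search_kb → respond` passes.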
Detector agreement
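Each detector family trips on the same trace for a different reason. Because the families measure different things, their agreement is stronger evidence than any single alert, which is what you want from runtime monitoring.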
tool usage
2
sequence
2
volume
2
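A minimal sketch of how detector agreement could be checked, assuming each family reports an event count for the trace. The function names and the `min_families` threshold are hypothetical; only the three counts come from the panel above.

```python
# Event counts per detector family, as shown in the panel above.
EVENT_COUNTS = {"tool usage": 2, "sequence": 2, "volume": 2}


def families_tripped(event_counts):
    """Return the detector families that raised at least one event."""
    return sorted(name for name, n in event_counts.items() if n > 0)


def is_corroborated(event_counts, min_families=2):
    """True when at least min_families independent families agree.

    min_families=2 is an assumed threshold, not a documented setting.
    """
    return len(families_tripped(event_counts)) >= min_families


print(families_tripped(EVENT_COUNTS))
print(is_corroborated(EVENT_COUNTS))
```

Counting families rather than raw events is the design choice that matters here: six events from a single detector would not count as agreement, while one event from each of three detectors would.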
Incident detail
The showcase trace is flagged because it departs from the profile on several dimensions at once: tools, sequence, and token volume.
Observed tools
shell_exec, export_data, respond
Expected tools
search_kb, respond
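The never-seen-tool check can be sketched as a set comparison between the observed and expected tools listed above. The function name is hypothetical, and a real detector would presumably build the expected set from baseline traces rather than hard-code it.

```python
# Tool names taken from the incident detail above.
EXPECTED_TOOLS = {"search_kb", "respond"}
OBSERVED_TOOLS = ["shell_exec", "export_data", "respond"]


def never_seen_tools(observed, expected):
    """Return observed tools absent from the baseline, in first-seen order."""
    unknown = []
    for tool in observed:
        if tool not in expected and tool not in unknown:
            unknown.append(tool)
    return unknown


unknown_tools = never_seen_tools(OBSERVED_TOOLS, EXPECTED_TOOLS)
print(unknown_tools)
```

Here `respond` is in the baseline and passes, while `shell_exec` and `export_data` are reported as never seen.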
Token spike
5,400 live tokens vs 920 baseline
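To make the size of the spike concrete, the sketch below divides the live token count by the baseline figure. It assumes 920 is a typical per-trace value (for example a mean or median), and the `SPIKE_THRESHOLD` multiplier is a made-up cutoff for illustration only.

```python
LIVE_TOKENS = 5400
BASELINE_TOKENS = 920

# Hypothetical cutoff: flag traces using more than 3x the baseline.
SPIKE_THRESHOLD = 3.0


def token_spike_ratio(live_tokens, baseline_tokens):
    """Return how many times larger the live token count is than the baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return live_tokens / baseline_tokens


ratio = token_spike_ratio(LIVE_TOKENS, BASELINE_TOKENS)
is_spike = ratio > SPIKE_THRESHOLD
print(f"{ratio:.2f}x baseline, spike={is_spike}")
```

The live trace uses a little under six times the baseline token count, so it clears this assumed threshold by a wide margin.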
Policy outcome
Alerted immediately, ready for quarantine or block