- 50%+ Context Reduction: Significant reduction in observation size while maintaining performance
- 2 Benchmark Evaluations: Validated on WorkArena and WebArena benchmarks
Performance Metrics
- Efficiency: Over 50% reduction in observation size
- Accuracy: Matches performance of strong baselines
- Security: Significant reduction in attack success rates
- Robustness: Maintained performance in attack-free settings
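The metrics above can be made concrete with a small sketch. This is a hedged illustration and not the evaluation code behind the reported results. The `Episode` record, its field names, and the choice to measure observation size in generic "size units" (for example tokens) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    """Hypothetical per-task record; field names are illustrative assumptions."""
    raw_obs_size: int       # observation size before reduction (e.g. tokens)
    reduced_obs_size: int   # observation size after reduction
    task_success: bool      # did the agent complete the task?
    attack_present: bool    # was a banner/pop-up injection placed on the page?
    attack_succeeded: bool  # did the agent follow the injected instruction?


def observation_reduction(episodes: List[Episode]) -> float:
    """Fraction of observation size removed, aggregated over all episodes."""
    raw = sum(e.raw_obs_size for e in episodes)
    reduced = sum(e.reduced_obs_size for e in episodes)
    return 1.0 - reduced / raw


def attack_success_rate(episodes: List[Episode]) -> float:
    """Share of attacked episodes in which the attack succeeded."""
    attacked = [e for e in episodes if e.attack_present]
    if not attacked:
        return 0.0
    return sum(e.attack_succeeded for e in attacked) / len(attacked)


def task_success_rate(episodes: List[Episode], attack_present: bool) -> float:
    """Task success rate restricted to attacked or attack-free episodes."""
    subset = [e for e in episodes if e.attack_present == attack_present]
    if not subset:
        return 0.0
    return sum(e.task_success for e in subset) / len(subset)


# Made-up example data, not benchmark results.
episodes = [
    Episode(1000, 400, True, False, False),
    Episode(2000, 800, True, True, False),
    Episode(1000, 600, False, True, True),
]
reduction = observation_reduction(episodes)
asr = attack_success_rate(episodes)
clean_success = task_success_rate(episodes, attack_present=False)
```

Under these assumptions, the efficiency claim corresponds to `observation_reduction` exceeding 0.5, the security claim to a lower `attack_success_rate` than a baseline agent, and the robustness claim to `task_success_rate(..., attack_present=False)` staying level with the baseline.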
Evaluation Benchmarks
- WorkArena: Enterprise-focused web agent tasks
- WebArena: Realistic web-based scenarios
- Attack Scenarios: Banner and pop-up injection attacks
- Baseline Comparison: State-of-the-art web agents