682 Total Tasks
Comprehensive task coverage across multiple enterprise scenarios
5 Core Capabilities
Planning, Reasoning, Retrieval, Context, Problem-Solving
Task Categories
- Compositional Planning: Multi-step workflow coordination
- Arithmetic Reasoning: Mathematical and quantitative analysis
- Logical Reasoning: Complex inference and deduction
- Information Retrieval: Knowledge base search and synthesis
- Contextual Understanding: Enterprise domain comprehension
Model Evaluation
- LLMs: State-of-the-art language models
- VLMs: Vision-language multimodal models
- Human Baselines: Knowledge worker performance
- Fine-tuning Data: Thousands of action traces