This page is the curation log behind TabArena and
BeyondArena: one record per candidate tabular dataset, tracking whether it
belongs in the benchmark, why, how it should be split, and how it is processed. Curators triage
the backlog here (edit → commit → PR); an AI assistant can draft a provisional triage
(π€) that a human then verifies.
Goal: assemble a high-quality, representative collection of real-world tabular
ML tasks for an open, living benchmark, and keep the curation reasoning transparent and reproducible.
Links
tabarena.ai: the living benchmark & leaderboard