interp-lab attribution graph
the next token should be a physical measurement unit
Model: distilgpt2
Graph
Candidate Paths
Strong Features
Feature Cards
Agent Next Actions