Robot data model — real episodes & proposed wire format

Three real robot classes. Pick a robot + tick: the two cards below show the concrete messages the proposed SDK would send — the control-plane layout (once) and the data-plane sample (this tick). Descriptor colour = provenance.

▸ Proposed transfer protocol (v2) — control plane vs data plane
CONTROL PLANE — once, on Describe (bidi Connect stream)
  • SDK answers Describe with one SourceDiscovered{RobotLayout} per source; re-sent verbatim on every reconnect.
  • RobotLayout = source identity + rates + floating_base + an ordered list of groups; each group has a role and an ordered list of per-dimension channels (name·unit·dtype·rotation_type·range).
  • The vector order is fixed here: values[] = groups in declared order × channels in declared order.
  • Changes are rare/structural → a new Describe cycle, never per tick.
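A minimal sketch of how the control-plane layout above could be modelled, and how the fixed vector order falls out of it. The class and field names (`Channel`, `Group`, `RobotLayout`, `source_id`, `value_range`, `flat_index`), the boolean type of `floating_base`, and the toy two-joint arm are all assumptions made for illustration; they are not the proposed SDK's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Channel:
    name: str
    unit: str
    dtype: str = "float32"                 # assumed default, not from the spec
    rotation_type: Optional[str] = None    # e.g. "quat", "6D", "euler"
    value_range: Optional[Tuple[float, float]] = None  # the "range" attribute


@dataclass(frozen=True)
class Group:
    name: str                      # hypothetical group label
    role: str                      # "measured", "commanded" or "teleop_input"
    channels: Tuple[Channel, ...]  # ordered: declaration order is wire order


@dataclass(frozen=True)
class RobotLayout:
    source_id: str
    control_hz: float
    floating_base: bool            # assumed to be a flag; the real type may differ
    groups: Tuple[Group, ...]      # ordered: declaration order is wire order

    def flat_index(self) -> List[Tuple[str, str, str]]:
        """Return (role, group, channel) for every position i of values[]."""
        return [(g.role, g.name, c.name) for g in self.groups for c in g.channels]


# Toy example: a two-joint arm with measured and commanded positions.
joints = (Channel("joint_0.pos", "rad"), Channel("joint_1.pos", "rad"))
toy = RobotLayout(
    source_id="toy_arm",
    control_hz=50.0,
    floating_base=False,
    groups=(Group("arm", "measured", joints), Group("arm", "commanded", joints)),
)

index = toy.flat_index()
for i, entry in enumerate(index):
    print(i, entry)
```

Because the index is derived only from declaration order, sender and receiver agree on position `i` as long as both hold the same layout.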
DATA PLANE — every tick (client-streaming StreamSamples)
  • Per source per tick: one Sample{ timestamp_us, sequence, RobotPayload{ values[] } }.
  • values[i] ↔ channel i of the layout. Nothing self-describing on the wire — decoded entirely through the layout.
  • One flat vector carries measured + commanded + teleop_input together (no per-tick keys, no nesting).
  • Multi-rate (e.g. G1: control at 50 Hz, planner at 10 Hz): each source streams at its own control_hz; slower signals are either held or sent as their own source.
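As a rough illustration of the data-plane rule that values[i] maps to channel i of the layout, the sketch below decodes one flat sample back into per-role dictionaries. The `Sample` dataclass mirrors the fields named above, but the flattened `layout_index` list, the `decode` helper and the example numbers are hypothetical and only stand in for whatever the SDK would provide.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# Position i -> (role, channel name), as fixed once by the control plane.
LayoutIndex = List[Tuple[str, str]]


@dataclass
class Sample:
    timestamp_us: int
    sequence: int
    values: Sequence[float]   # the RobotPayload vector, flattened


def decode(layout_index: LayoutIndex, sample: Sample) -> Dict[str, Dict[str, float]]:
    """Rebuild {role: {channel: value}} from a flat sample using the layout."""
    if len(sample.values) != len(layout_index):
        raise ValueError(
            f"sample has {len(sample.values)} values, "
            f"layout declares {len(layout_index)} channels"
        )
    decoded: Dict[str, Dict[str, float]] = {}
    for (role, channel), value in zip(layout_index, sample.values):
        decoded.setdefault(role, {})[channel] = value
    return decoded


# Toy layout: one measured, one commanded and one teleop_input channel.
layout_index: LayoutIndex = [
    ("measured", "joint_0.pos"),
    ("commanded", "joint_0.pos"),
    ("teleop_input", "trigger"),
]
sample = Sample(timestamp_us=1_000_000, sequence=7, values=[0.5, 0.25, 1.0])
decoded = decode(layout_index, sample)
print(decoded)
```

The length check is the only validation possible on the data plane, since the payload itself carries no keys; a mismatch suggests the receiver holds a stale layout.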

Why grouped (not a flat joint list)

  • Humanoid ≠ arm+gripper: G1 is legs + waist + arms + hands + floating base (43-DOF).
  • rotation_type is first-class (quat/6D/euler).
  • teleop_input is a third role, alongside measured and commanded.
  • The number of names must equal the vector shape (G1 eef_state ships 4 names for 14 dims).
  • Action is heterogeneous (joint targets + latent token).
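The names-versus-shape requirement can be checked mechanically. The sketch below is one possible validation, assuming a feature is described by a list of names and a shape tuple; `check_names_match_shape` is a hypothetical helper, and the placeholder names stand in for the real G1 eef_state names, which are not reproduced here.

```python
from math import prod
from typing import Sequence


def check_names_match_shape(feature: str, names: Sequence[str], shape: Sequence[int]) -> None:
    """Raise ValueError unless there is exactly one name per dimension."""
    dims = prod(shape)
    if len(names) != dims:
        raise ValueError(
            f"{feature}: {len(names)} names for {dims} dims; "
            "every dimension needs its own channel name"
        )


# The G1 eef_state case: 4 names shipped for a 14-dim vector.
placeholder_names = ["n0", "n1", "n2", "n3"]   # placeholders, not the real names
try:
    check_names_match_shape("eef_state", placeholder_names, (14,))
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(err)
```

A layout built from such metadata would need the missing names filled in by a heuristic, which is what the "inferred" provenance marks.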
source — from the dataset's own metadata (real)
inferred — our heuristic, NOT in the source
none — source provides nothing
▸ full source metadata (real)

① Control plane — RobotLayout

sent once at Describe

    

② Data plane — Sample


    
Full channel table ▸ every channel with per-attribute provenance (source / inferred / none)
REAL · source data | CHANNEL DESCRIPTOR · colour = provenance
source field | idx | value @ tick | role | channel name | group | rotation | unit | range

Real episodes at product/recorder_data_samples/ (Franka, Vega = recorder parquet; G1 = LeRobot/GR00T). Extracted by extract_real_examples.py. Messages are illustrative of the v2 proposal in docs/spec_discrepancies.md §A — values are real; descriptor colour = source/inferred/none.