Embodied Dataset Pipeline

Unified three-track format (env / user / ai) | SSTable skeleton + Parquet data + per-episode MP4
14 datasets (4 sources) | 37,129 episodes | 183h total duration | 300+ GB disk usage

Dataset Overview

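Dataset                     | Source   | Episodes | Duration | State dim | Action dim | env.obs                           | ai.action                | Modalities
RH20T (cfg1~7 combined)     | RH20T    | 11,952   | 160.2h   | 8         | 8          | cartesian_position, force_torque  | cartesian_position       | video + audio + force
fuse                        | TFRecord | 24,081   | 19.6h    | 7         | 7          | cartesian_position, imu           | delta_cartesian_position | video + audio + IMU
aloha_sim_insertion         | LeRobot  | 50       | 0.1h     | 14        | 14         | joint_position                    | joint_position           | video
pusht                       | LeRobot  | 206      | 0.7h     | 2         | 2          | cartesian_position                | cartesian_position       | video
xarm_push_medium            | LeRobot  | 800      | 0.4h     | 4         | 3          | joint_position                    | delta_joint_position     | video
robomind_failure (tiangong) | RoboMind | 30       | 0.7h     | 14        | 14         | joint_position                    | joint_position           | multi-view video (quality=0)
robomind_puppet             | RoboMind | 5        | 0.1h     | 7         | 7          | joint_position                    | joint_position           | multi-view video
robomind_franka             | RoboMind | 5        | 0.1h     | 8         | 8          | joint_position                    | joint_position           | multi-view video

The RH20T row aggregates seven configurations: cfg1 · cfg2 · cfg3 · cfg4 · cfg5 · cfg6 · cfg7.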


Storage Architecture

sstable/
├── skeleton_episode/          Episode-level skeleton (SSTable/JSONL)
│   ├── pusht/                 one record = one Episode (list of env/user/ai messages)
│   ├── rh20t_cfg1/            holds every Ref (VideoRef / TimeseriesRef / AudioRef)
│   └── fuse/                  Pydantic-serialized, ~2-3KB/episode
│
├── skeleton_single_frame/     Frame-level skeleton (expanded from the episode skeleton)
│   ├── pusht/                 one record = one single-frame training sample
│   └── rh20t_cfg1/            from_ts/to_ts specify the time range
│
├── data/                      Parquet data tables (one row per episode, holding the full list)
│   ├── {dataset}/state/       joint state [T, state_dim]
│   ├── {dataset}/action/      action labels [T, action_dim]
│   ├── {dataset}/force/       force sensor [T, 6] @100Hz (RH20T)
│   ├── {dataset}/audio_env/   ambient audio, WAV bytes
│   └── {dataset}/imu/         IMU data (fuse)
│
├── videos/                    per-episode H.264 MP4
│   └── {dataset}/{cam}/episode_XXXXXX.mp4
│
└── meta/                      info.json + stats.json (mean/std/min/max)
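The layout above maps a (dataset, camera, episode) triple to a single MP4 file and a (dataset, signal) pair to a Parquet directory. A minimal sketch of that mapping follows. It assumes that XXXXXX is the zero-padded six-digit episode index, which the tree suggests but does not state; the helper names `video_path` and `timeseries_dir` and the camera name `cam0` are hypothetical and not part of modelbest_robo_dataset.

```python
from pathlib import PurePosixPath


def video_path(root: str, dataset: str, cam: str, episode_index: int) -> PurePosixPath:
    """Build {root}/videos/{dataset}/{cam}/episode_XXXXXX.mp4.

    Assumption: XXXXXX is the episode index, zero-padded to six digits.
    """
    if episode_index < 0:
        raise ValueError("episode_index must be non-negative")
    filename = f"episode_{episode_index:06d}.mp4"
    return PurePosixPath(root) / "videos" / dataset / cam / filename


def timeseries_dir(root: str, dataset: str, signal: str) -> PurePosixPath:
    """Build {root}/data/{dataset}/{signal}/, e.g. signal='state' or 'force'."""
    return PurePosixPath(root) / "data" / dataset / signal


if __name__ == "__main__":
    print(video_path("sstable", "pusht", "cam0", 42))
    print(timeseries_dir("sstable", "rh20t_cfg1", "force"))
```

Using PurePosixPath keeps the sketch independent of the host operating system, since no file is actually opened.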

Data Types (Pydantic)

Episode
├── messages: list[EnvMessage | UserMessage | AiMessage]
│
│   EnvMessage (environment perception)
│   ├── MetaContent: dataset_name, task, task_id, quality_rating, user_id, scene_id, ...
│   ├── VideoContent: ref → per-episode MP4 (from_ts, to_ts, fps, w, h)
│   ├── AudioContent: ref → Parquet binary column (row, from_ts, to_ts)
│   └── TimeseriesContent: ref → Parquet list column (row, from_ts, to_ts, fps)
│       name: TimeseriesName (env.obs.* / ai.action.*)
│
│   UserMessage (real-time human intervention)
│   ├── TextContent: spoken correction as text ("a bit to the left")
│   ├── AudioContent: voice command
│   └── TimeseriesContent: teleoperation actions
│
│   AiMessage (robot output / training labels)
│   ├── TextContent: CoT reasoning chain (optionally with twt per-word timestamps)
│   ├── AudioContent: voice reply
│   └── TimeseriesContent: joint actions (action chunk)
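A TimeseriesContent ref carries (row, from_ts, to_ts, fps), so resolving it means picking one Parquet row and then a window of samples inside that row's list. The sketch below illustrates the window arithmetic only. It uses a stdlib dataclass as a stand-in for the Pydantic model, and it assumes timestamps are in seconds, sample i sits at i / fps, and the window is half-open [from_ts, to_ts); none of these conventions is confirmed by the page. `sample_range` and `slice_window` are hypothetical helper names.

```python
import math
from dataclasses import dataclass
from typing import Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class TimeseriesRef:
    """Stand-in for the Pydantic ref; field names follow the tree above."""
    row: int        # row of the episode in the Parquet table
    from_ts: float  # window start in seconds (assumed unit)
    to_ts: float    # window end in seconds, exclusive (assumed)
    fps: float      # sampling rate in Hz


def sample_range(ref: TimeseriesRef) -> range:
    """Indices i whose timestamp i / fps falls in [from_ts, to_ts)."""
    if ref.fps <= 0 or ref.to_ts < ref.from_ts:
        raise ValueError("fps must be positive and to_ts >= from_ts")
    eps = 1e-9  # absorbs floating-point error in ts * fps
    start = max(0, math.ceil(ref.from_ts * ref.fps - eps))
    stop = max(start, math.ceil(ref.to_ts * ref.fps - eps))
    return range(start, stop)


def slice_window(series: Sequence[T], ref: TimeseriesRef) -> Sequence[T]:
    """Cut the referenced window out of one episode's full sample list."""
    idx = sample_range(ref)
    return series[idx.start:idx.stop]


if __name__ == "__main__":
    ref = TimeseriesRef(row=0, from_ts=0.5, to_ts=1.0, fps=10.0)
    print(list(sample_range(ref)))
```

Keeping only (row, from_ts, to_ts, fps) in the skeleton is what lets an episode record stay at a few kilobytes while the bulk arrays live in Parquet.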

Code Structure

File                                                | Purpose
modelbest_robo_dataset/data_types.py                | Pydantic type definitions (Episode, Ref, Content, Message, TimeseriesName)
modelbest_robo_dataset/writer.py                    | Generic writer: RawEpisode → SSTable/JSONL + Parquet + MP4 + stats
modelbest_robo_dataset/reader.py                    | Reader and verifier: SSTable/JSONL + Parquet + MP4 → training samples
modelbest_robo_dataset/validator.py                 | ActionValidator: state/action consistency checks
modelbest_robo_dataset/sources/base.py              | EpisodeSource interface + RawEpisode intermediate representation
modelbest_robo_dataset/sources/lerobot.py           | LeRobot v3.0 → RawEpisode
modelbest_robo_dataset/sources/rh20t.py             | RH20T → RawEpisode (including force + audio)
modelbest_robo_dataset/sources/fuse.py              | TFRecord/RLDS → RawEpisode (including IMU + audio)
modelbest_robo_dataset/sources/robomind.py          | RoboMIND HDF5 → RawEpisode (three variants: puppet / franka / tiangong)
modelbest_robo_dataset/embodied_dataset.py          | Training-side PyTorch Dataset wrapper; assembles video and timeseries samples by timestamp
modelbest_robo_dataset/scripts/convert.py           | Unified conversion entry point (--source lerobot/rh20t/fuse/robomind)
modelbest_robo_dataset/scripts/expand_skeleton.py   | Expands the episode skeleton into single-frame / context skeletons
modelbest_robo_dataset/scripts/generate_viz.py      | Generates the interactive HTML visualization page and datasets.json
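The table lists expand_skeleton.py as the step that turns one episode skeleton into per-frame records, each carrying a from_ts/to_ts range. A rough sketch of that idea is shown below. It assumes one record per frame with a window of exactly one frame period; the actual script may use different windows, and the context variant is not covered here. `FrameWindow` and `expand_episode` are hypothetical names.

```python
from typing import Iterator, NamedTuple


class FrameWindow(NamedTuple):
    """One single-frame skeleton record (hypothetical shape)."""
    episode_id: str
    frame_index: int
    from_ts: float  # seconds from episode start (assumed unit)
    to_ts: float


def expand_episode(episode_id: str, num_frames: int, fps: float) -> Iterator[FrameWindow]:
    """Yield one window per frame, covering [i / fps, (i + 1) / fps)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    for i in range(num_frames):
        yield FrameWindow(episode_id, i, i / fps, (i + 1) / fps)


if __name__ == "__main__":
    for window in expand_episode("pusht/episode_000000", num_frames=3, fps=10.0):
        print(window)
```

Because each record stores only a time range, the expansion never copies video or sensor data; the same refs in the episode skeleton can be re-sliced per frame.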

Conversion Pipeline

Raw data                EpisodeSource adapter        EmbodiedWriter
┌─────────────┐         ┌──────────────────┐         ┌─────────────────────┐
│ LeRobot     │ ──────→ │ LeRobotSource    │         │                     │
│ Parquet+MP4 │         │ list_episodes()  │ ──┐     │ write_episode(raw)  │
└─────────────┘         │ load_episode()   │   │     │ ├── video  → MP4    │
                        └──────────────────┘   │     │ ├── state  → Parquet│
┌─────────────┐         ┌──────────────────┐   ├───→ │ ├── action → Parquet│
│ RH20T       │ ──────→ │ RH20TSource      │   │     │ ├── force  → Parquet│
│ npy+MP4+WAV │         │ (copy MP4 as-is) │ ──┘     │ ├── audio  → Parquet│
└─────────────┘         └──────────────────┘         │ └── skeleton → SST  │
                        ┌──────────────────┐         │                     │
┌─────────────┐         │ FuseSource       │         │ finalize()          │
│ fuse        │ ──────→ │ (lazy TFRecord)  │ ──────→ │ ├── flush Parquet   │
│ TFRecord    │         └──────────────────┘         │ ├── write stats     │
└─────────────┘                                      │ └── close SSTable   │
                                                     └─────────────────────┘
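The diagram describes a two-stage flow: each source adapter exposes list_episodes() and load_episode(), and a single writer consumes the resulting RawEpisode objects through write_episode(raw) before finalize() flushes everything. The sketch below mirrors that control flow with in-memory stand-ins. Only the four method names come from the diagram; the RawEpisode fields, `InMemorySource`, `CountingWriter`, and `convert` are hypothetical, and the real classes in sources/base.py and writer.py will differ.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class RawEpisode:
    """Hypothetical minimal intermediate representation."""
    episode_id: str
    state: list[list[float]]   # [T, state_dim]
    action: list[list[float]]  # [T, action_dim]


class EpisodeSource(Protocol):
    def list_episodes(self) -> list[str]: ...
    def load_episode(self, episode_id: str) -> RawEpisode: ...


class InMemorySource:
    """Toy adapter standing in for LeRobotSource / RH20TSource / FuseSource."""

    def __init__(self, episodes: dict[str, RawEpisode]) -> None:
        self._episodes = episodes

    def list_episodes(self) -> list[str]:
        return sorted(self._episodes)

    def load_episode(self, episode_id: str) -> RawEpisode:
        return self._episodes[episode_id]


@dataclass
class CountingWriter:
    """Toy writer: records episode ids instead of writing MP4/Parquet/SSTable."""
    written: list[str] = field(default_factory=list)
    finalized: bool = False

    def write_episode(self, raw: RawEpisode) -> None:
        if self.finalized:
            raise RuntimeError("writer already finalized")
        self.written.append(raw.episode_id)

    def finalize(self) -> None:
        self.finalized = True  # the real writer flushes Parquet, stats, SSTable


def convert(source: EpisodeSource, writer: CountingWriter) -> int:
    """Stream episodes one at a time; always finalize, even on error."""
    try:
        for episode_id in source.list_episodes():
            writer.write_episode(source.load_episode(episode_id))
    finally:
        writer.finalize()
    return len(writer.written)
```

Loading one episode per iteration, rather than materializing the full list, matches the lazy TFRecord note in the diagram and keeps memory bounded for the larger sources.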