From 2fdcec6da0328a7c94e2dac65a28c7eab83b3fd3 Mon Sep 17 00:00:00 2001
From: olivame
Date: Mon, 19 Jan 2026 16:39:49 +0800
Subject: [PATCH] Delete README.md

---
 README.md | 228 ------------------------------------------------------
 1 file changed, 228 deletions(-)
 delete mode 100644 README.md

diff --git a/README.md b/README.md
deleted file mode 100644
index 15aa4be..0000000
--- a/README.md
+++ /dev/null
@@ -1,228 +0,0 @@

# UnifoLM-WMA-0: A World-Model-Action (WMA) Framework under UnifoLM Family

Project Page | Models | Dataset


🌎English | 🇨🇳中文

UnifoLM-WMA-0 is Unitree's open-source world-model–action architecture spanning multiple types of robotic embodiments, designed specifically for general-purpose robot learning. Its core component is a world model capable of understanding the physical interactions between robots and their environments. This world model provides two key functions: (a) Simulation Engine – it operates as an interactive simulator to generate synthetic data for robot learning; (b) Policy Enhancement – it connects with an action head and, by predicting future interactions with the world model, further optimizes decision-making performance.
## 🦾 Real-Robot Demonstrations

**Note: in each demo, the top-right window shows the world model's prediction of future action videos.**

## 🔥 News

* Sep 22, 2025: 🚀 We released the deployment code to support experiments with [Unitree](https://www.unitree.com/) robots.
* Sep 15, 2025: 🚀 We released the training and inference code along with the model weights of [**UnifoLM-WMA-0**](https://huggingface.co/collections/unitreerobotics/unifolm-wma-0-68ca23027310c0ca0f34959c).

## 📑 Open-Source Plan
- [x] Training
- [x] Inference
- [x] Checkpoints
- [x] Deployment

## ⚙️ Installation
```
conda create -n unifolm-wma python==3.10.18
conda activate unifolm-wma

conda install pinocchio=3.2.0 -c conda-forge -y
conda install ffmpeg=7.1.1 -c conda-forge

git clone --recurse-submodules https://github.com/unitreerobotics/unifolm-world-model-action.git
cd unifolm-world-model-action

# If you already cloned the repo without submodules:
git submodule update --init --recursive

pip install -e .

cd external/dlimp
pip install -e .
```
## 🧰 Model Checkpoints
| Model | Description | Link |
|---------|-------|------|
| $\text{UnifoLM-WMA-0}_{Base}$ | Fine-tuned on the [Open-X](https://robotics-transformer-x.github.io/) dataset. | [HuggingFace](https://huggingface.co/unitreerobotics/UnifoLM-WMA-0-Base) |
| $\text{UnifoLM-WMA-0}_{Dual}$ | Fine-tuned on five [Unitree open-source datasets](https://huggingface.co/collections/unitreerobotics/g1-dex1-datasets-68bae98bf0a26d617f9983ab) in both decision-making and simulation modes. | [HuggingFace](https://huggingface.co/unitreerobotics/UnifoLM-WMA-0-Dual) |

## 🛢️ Dataset
In our experiments, we consider the following five open-source datasets:
| Dataset | Robot | Link |
|---------|-------|------|
| Z1_StackBox | [Unitree Z1](https://www.unitree.com/z1) | [HuggingFace](https://huggingface.co/datasets/unitreerobotics/Z1_StackBox_Dataset/tree/v2.1) |
| Z1_DualArm_StackBox | [Unitree Z1](https://www.unitree.com/z1) | [HuggingFace](https://huggingface.co/datasets/unitreerobotics/Z1_Dual_Dex1_StackBox_Dataset/tree/v2.1) |
| Z1_DualArm_StackBox_V2 | [Unitree Z1](https://www.unitree.com/z1) | [HuggingFace](https://huggingface.co/datasets/unitreerobotics/Z1_Dual_Dex1_StackBox_Dataset_V2/tree/v2.1) |
| Z1_DualArm_Cleanup_Pencils | [Unitree Z1](https://www.unitree.com/z1) | [HuggingFace](https://huggingface.co/datasets/unitreerobotics/Z1_Dual_Dex1_CleanupPencils_Dataset/tree/v2.1) |
| G1_Pack_Camera | [Unitree G1](https://www.unitree.com/g1) | [HuggingFace](https://huggingface.co/datasets/unitreerobotics/G1_Dex1_MountCameraRedGripper_Dataset/tree/v2.1) |

To train on your own dataset, first make sure the data follows the [HuggingFace LeRobot V2.1](https://github.com/huggingface/lerobot) dataset format. Assume the dataset's source directory structure is as follows:
```
source_dir/
  ├── dataset1_name
  ├── dataset2_name
  ├── dataset3_name
  └── ...
```
Then, convert a dataset to the required format using the command below:
```
cd prepare_data
python prepare_training_data.py \
    --source_dir /path/to/your/source_dir \
    --target_dir /path/to/save/the/converted/data \
    --dataset_name "dataset1_name" \
    --robot_name "a tag of the robot in the dataset"  # e.g., Unitree Z1 Robot Arm or Unitree G1 Robot with Gripper
```
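If `source_dir` holds several datasets, the conversion can simply be repeated once per dataset. Below is a minimal batch-conversion sketch that calls the script above from the ```prepare_data``` directory; the dataset names, paths, and robot tag are placeholders for your own data.
```python
# batch_convert.py -- a minimal sketch, run from the prepare_data/ directory.
# Dataset names, paths, and the robot tag below are placeholders.
import subprocess

SOURCE_DIR = "/path/to/your/source_dir"
TARGET_DIR = "/path/to/save/the/converted/data"
DATASETS = ["dataset1_name", "dataset2_name", "dataset3_name"]  # hypothetical names

for name in DATASETS:
    # Invoke the provided conversion script once per dataset.
    subprocess.run(
        [
            "python", "prepare_training_data.py",
            "--source_dir", SOURCE_DIR,
            "--target_dir", TARGET_DIR,
            "--dataset_name", name,
            "--robot_name", "Unitree Z1 Robot Arm",  # tag of the robot in the dataset
        ],
        check=True,  # stop immediately if a conversion fails
    )
```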
The resulting data structure is shown below. (**Note**: model training only supports input from the main-view camera. If the dataset includes multiple views, remove the corresponding values from the ```data_dir``` column in the CSV file; a rough example is sketched after the directory tree below.)
```
target_dir/
  ├── videos
  │   ├── dataset1_name
  │   │   └── camera_view_dir
  │   │       ├── 0.mp4
  │   │       ├── 1.mp4
  │   │       └── ...
  │   └── ...
  ├── transitions
  │   ├── dataset1_name
  │   │   ├── meta_data
  │   │   ├── 0.h5
  │   │   ├── 1.h5
  │   │   └── ...
  │   └── ...
  └── dataset1_name.csv
```
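The exact layout of the generated CSV depends on your data, so the following is only a rough sketch of the clean-up step mentioned in the note above. It assumes each row's ```data_dir``` entry holds one video path and that the main-view directory name contains a recognizable keyword; the CSV path and the keyword are placeholders, not values defined by the repo.
```python
# filter_main_view.py -- a rough sketch, not part of the repository.
# Assumption: target_dir/dataset1_name.csv has a `data_dir` column with one
# video path per row, and main-view paths contain a known keyword.
import pandas as pd

csv_path = "/path/to/target_dir/dataset1_name.csv"  # placeholder path
main_view_keyword = "main"  # placeholder; use your actual camera_view_dir name

df = pd.read_csv(csv_path)

# Keep only rows whose data_dir points at the main-view camera directory.
df = df[df["data_dir"].str.contains(main_view_keyword, na=False)]

df.to_csv(csv_path, index=False)
```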
## 🚴‍♂️ Training
A. Our training strategy is outlined as follows:
- **Step 1**: Fine-tune a video generation model as the world model using the [Open-X](https://robotics-transformer-x.github.io/) dataset;
- **Step 2**: Post-train $\text{UnifoLM-WMA}$ in decision-making mode on the downstream task dataset;
- **Step 3**: Post-train $\text{UnifoLM-WMA}$ in simulation mode on the downstream task dataset.
**Note**: If you only require $\text{UnifoLM-WMA}$ to operate in a single mode, you may skip the corresponding step.

B. To train on a single dataset or on multiple datasets, follow the steps below:
- **Step 1**: The maximum number of DoF is assumed to be 16; if your robot has more than 16 DoF, update ```agent_state_dim``` and ```agent_action_dim``` in [configs/train/config.yaml](https://github.com/unitreerobotics/unifolm-wma/blob/working/configs/train/config.yaml);
- **Step 2**: Set up the input shapes for each modality in [configs/train/meta.json](https://github.com/unitreerobotics/unitree-world-model/blob/main/configs/train/meta.json);
- **Step 3**: Configure the training parameters in [configs/train/config.yaml](https://github.com/unitreerobotics/unitree-world-model/blob/main/configs/train/config.yaml). For ```pretrained_checkpoint```, we recommend using the $\text{UnifoLM-WMA-0}_{Base}$ checkpoint fine-tuned on the [Open-X](https://robotics-transformer-x.github.io/) dataset;
  ```yaml
  model:
    pretrained_checkpoint: /path/to/pretrained/checkpoint
    ...
    decision_making_only: True # Train the world model only in decision-making mode. If False, jointly train it in both decision-making and simulation modes.
    ...
  data:
    ...
    train:
      ...
      data_dir: /path/to/training/dataset/directory
      dataset_and_weights: # List each dataset name below; the weights must sum to 1.0 (a sanity-check sketch follows this section).
        dataset1_name: 0.2
        dataset2_name: 0.2
        dataset3_name: 0.2
        dataset4_name: 0.2
        dataset5_name: 0.2
  ```
- **Step 4**: Set the ```experiment_name``` and ```save_root``` variables in [scripts/train.sh](https://github.com/unitreerobotics/unitree-world-model/blob/main/scripts/train.sh);
- **Step 5**: Launch the training with the command:
```
bash scripts/train.sh
```
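As noted in the config snippet above, the ```dataset_and_weights``` entries must sum to 1.0. The small sketch below loads the config and verifies this before a run; it is not part of the repo, and it searches for the key recursively so the exact nesting of your YAML does not matter.
```python
# check_weights.py -- a small sanity check, not part of the repository.
import yaml

def find_key(node, key):
    """Recursively search a nested dict/list for the first occurrence of `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        for value in node.values():
            found = find_key(value, key)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_key(item, key)
            if found is not None:
                return found
    return None

with open("configs/train/config.yaml") as f:
    config = yaml.safe_load(f)

weights = find_key(config, "dataset_and_weights") or {}
total = sum(weights.values())
print(f"dataset_and_weights sums to {total}")
assert abs(total - 1.0) < 1e-6, "Dataset weights must sum to 1.0"
```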
## 🌏 Inference under Interactive Simulation Mode
To run the world model in interactive simulation mode, follow these steps:
- **Step 1**: (Skip this step if you only want to test with the provided examples.) Prepare your own prompts following the format used in [examples/world_model_interaction_prompts](https://github.com/unitreerobotics/unitree-world-model/tree/main/examples/world_model_interaction_prompts):
  ```
  world_model_interaction_prompts/
    ├── images
    │   ├── dataset1_name
    │   │   ├── 0.png       # Image prompt
    │   │   └── ...
    │   └── ...
    ├── transitions
    │   ├── dataset1_name
    │   │   ├── meta_data   # Used for normalization
    │   │   ├── 0.h5        # Robot state and action data; in interaction mode,
    │   │   │               # only used to retrieve the robot state corresponding
    │   │   │               # to the image prompt
    │   │   └── ...
    │   └── ...
    ├── dataset1_name.csv   # File for loading image prompts, text instructions, and corresponding robot states
    └── ...
  ```
- **Step 2**: Specify the correct paths for ```pretrained_checkpoint``` (e.g., $\text{UnifoLM-WMA-0}_{Dual}$) and ```data_dir``` in [configs/inference/world_model_interaction.yaml](https://github.com/unitreerobotics/unitree-world-model/blob/main/configs/inference/world_model_interaction.yaml);
- **Step 3**: Set the paths for ```checkpoint```, ```res_dir``` and ```prompt_dir``` in [scripts/run_world_model_interaction.sh](https://github.com/unitreerobotics/unitree-world-model/blob/main/scripts/run_world_model_interaction.sh), and specify all dataset names in ```datasets=(...)```. Then, launch the inference with the command:
  ```
  bash scripts/run_world_model_interaction.sh
  ```

## 🧠 Inference and Deployment under Decision-Making Mode

In this setup, inference is performed on a server, while a robot client gathers observations from the real robot and sends them to the server to query actions. The process unfolds through the following steps:

### Server Setup
- **Step-1**: Specify ```ckpt```, ```res_dir``` and ```datasets``` in [scripts/run_real_eval_server.sh](https://github.com/unitreerobotics/unifolm-world-model-action/blob/main/scripts/run_real_eval_server.sh);
- **Step-2**: Configure ```data_dir``` and ```dataset_and_weights``` in [configs/inference/world_model_decision_making.yaml](https://github.com/unitreerobotics/unifolm-world-model-action/blob/f12b4782652ca00452941d851b17446e4ee7124a/configs/inference/world_model_decision_making.yaml#L225);
- **Step-3**: Launch the server:
```
conda activate unifolm-wma
cd unifolm-world-model-action
bash scripts/run_real_eval_server.sh
```

### Client Setup
- **Step-1**: Follow the instructions in [unitree_deploy/README.md](https://github.com/unitreerobotics/unifolm-world-model-action/blob/main/unitree_deploy/README.md) to create the ```unitree_deploy``` conda environment, install the required packages, and launch the controllers or services on the real robot.
- **Step-2**: Open a new terminal and establish a tunnel connection from the client to the server (a small connectivity check is sketched at the end of this README):
```
ssh user_name@remote_server_IP -CNg -L 8000:127.0.0.1:8000
```
- **Step-3**: Run the ```unitree_deploy/robot_client.py``` script to start inference:
```
cd unitree_deploy
python scripts/robot_client.py --robot_type "g1_dex1" --action_horizon 16 --exe_steps 16 --observation_horizon 2 --language_instruction "pack black camera into box" --output_dir ./results --control_freq 15
```

## 📁 Codebase Architecture
Here is a high-level overview of the project's code structure and core components:
```
unitree-world-model/
  ├── assets          # Media assets such as GIFs, images, and demo videos
  ├── configs         # Configuration files for training and inference
  │   ├── inference
  │   └── train
  ├── examples        # Example inputs and prompts for running inference
  ├── external        # External packages
  ├── prepare_data    # Scripts for dataset preprocessing and format conversion
  ├── scripts         # Main scripts for training, evaluation, and deployment
  ├── src
  │   └── unitree_worldmodel   # Core Python package for the Unitree world model
  │       ├── data             # Dataset loading, transformations, and dataloaders
  │       ├── models           # Model architectures and backbone definitions
  │       ├── modules          # Custom model modules and components
  │       └── utils            # Utility functions and common helpers
  └── unitree_deploy  # Deployment code
```

## 🙏 Acknowledgement
Much of the code is inherited from [DynamiCrafter](https://github.com/Doubiiu/DynamiCrafter), [Diffusion Policy](https://github.com/real-stanford/diffusion_policy), [ACT](https://github.com/MarkFzp/act-plus-plus) and [HPT](https://github.com/liruiw/HPT).

## 📝 Citation
```
@misc{unifolm-wma-0,
      author = {Unitree},
      title  = {UnifoLM-WMA-0: A World-Model-Action (WMA) Framework under UnifoLM Family},
      year   = {2025},
}
```
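As referenced in **Client Setup / Step-2**, the minimal sketch below simply checks that the SSH tunnel to the inference server is open before you start ```robot_client.py```. It is an illustrative helper only, not part of the repository; the port matches the ```-L 8000:127.0.0.1:8000``` option of the tunnel command above.
```python
# check_tunnel.py -- a tiny connectivity check, not part of the repository.
# Verifies that the SSH tunnel forwarding 127.0.0.1:8000 to the inference
# server is open before starting the robot client.
import socket

HOST, PORT = "127.0.0.1", 8000  # must match the -L option of the ssh command

try:
    with socket.create_connection((HOST, PORT), timeout=3):
        print(f"Tunnel to {HOST}:{PORT} is open; the client can reach the server.")
except OSError as exc:
    print(f"Cannot reach {HOST}:{PORT}: {exc}. Is the SSH tunnel running?")
```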