From ca231f9f9d9ab041034b6d05e90b6e04bd6cff82 Mon Sep 17 00:00:00 2001
From: quentinll
Date: Thu, 26 Mar 2026 22:53:48 -0400
Subject: [PATCH] updating readme with hugging face datasets

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 9b1aa9e..6bc5813 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,7 @@
 **Abstract:** Joint Embedding Predictive Architectures (JEPAs) offer a compelling framework for learning world models in compact latent spaces, yet existing methods remain fragile, relying on complex multi-term losses, exponential moving averages, pretrained encoders, or auxiliary supervision to avoid representation collapse. In this work, we introduce LeWorldModel (LeWM), the first JEPA that trains stably end-to-end from raw pixels using only two loss terms: a next-embedding prediction loss and a regularizer enforcing Gaussian-distributed latent embeddings. This reduces tunable loss hyperparameters from six to one compared to the only existing end-to-end alternative. With ~15M parameters trainable on a single GPU in a few hours, LeWM plans up to 48× faster than foundation-model-based world models while remaining competitive across diverse 2D and 3D control tasks. Beyond control, we show that LeWM's latent space encodes meaningful physical structure through probing of physical quantities. Surprise evaluation confirms that the model reliably detects physically implausible events.

- [ Paper | Data | Website ]
+ [ Paper | Checkpoints | Data | Website ]


@@ -38,7 +38,7 @@ uv pip install stable-worldmodel[train,env]
 ## Data
-Datasets use the HDF5 format for fast loading. Download the data from the [Drive](https://drive.google.com/drive/folders/1r31os0d4-rR0mdHc7OlY_e5nh3XT4r4e?usp=sharing) and decompress with:
+Datasets use the HDF5 format for fast loading. Download the data from [HuggingFace](https://huggingface.co/collections/quentinll/lewm) and decompress with:
 ```bash
 tar --zstd -xvf archive.tar.zst