unifolm-world-model-action

Author	SHA1	Message	Date
qhy	ef56e5dcdb	Revert "TensorRT engines attempt did not pass the accuracy check; committing the code for now and will continue debugging later" This reverts commit `e1f8a83648`.	2026-02-19 20:22:19 +08:00
qhy	e1f8a83648	TensorRT engines attempt did not pass the accuracy check; committing the code for now and will continue debugging later	2026-02-18 18:22:12 +08:00
qhy	5e0e21d91b	Restore sh to its original version	2026-02-18 14:11:55 +08:00
qhy	d5bec53f61	All results after optimization	2026-02-11 19:21:06 +08:00
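qhy	508b91f5a2	Deferred decode: decode only the 1 frame that CLIP needs - world model is called with decode_video=False, skipping the full 16-frame decode - decode only the last frame for the CLIP embedding / observation queue - store raw latents and batch decode them after the loop ends to produce the final video - saves 15 VAE decodes per iteration, 120 in total over 8 iterations - skip saving wm tensorboard/mp4 for intermediate iterations. PSNR drops slightly	2026-02-11 17:07:33 +08:00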
qhy	3101252c25	No noticeable change in speed; PSNR improves significantly	2026-02-11 16:38:21 +08:00
qhy	f386a5810b	Follow-up to the previous commit	2026-02-11 16:24:40 +08:00
qhy	352a79035f	Backbone in fp16; this is the most sensitive change, psnr=25.21. Consider reverting the overly sensitive parts of the backbone to fp32	2026-02-11 16:23:21 +08:00
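qhy	9a08e27a19	KV fusion implemented. Summary of changes: slight speedup, PSNR rises slightly. attention.py, 3 changes: 1. __init__ adds a _kv_fused = False flag 2. new fuse_kv() method: merges to_k + to_v → to_kv, and also handles the _ip/_as/_aa auxiliary KV pairs 3. both branches of bmm_forward check _kv_fused and use to_kv().chunk(2, dim=-1) instead of calling to_k and to_v separately	2026-02-11 12:36:38 +08:00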
qhy	b558856e1e	fix bugs	2026-02-10 22:35:45 +08:00
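qhy	dcbcb2c377	- run state_unet on a separate CUDA stream - run action_unet concurrently on the default stream - use wait_stream to make sure both have finished before returning. The inputs of the two 1D UNets are fully independent; the shared hs_a and context_action are read-only. GPU utilization is only ~31% and small-tensor kernels do not saturate the GPU, so the two streams can genuinely run in parallel.	2026-02-10 21:41:48 +08:00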
qhy	ff43432ef9	Results	2026-02-10 20:01:25 +08:00
qhy	afa12ba031	Make the per-iteration save asynchronous	2026-02-10 19:54:53 +08:00
qhy	bf4d66c874	Skip model loading	2026-02-10 19:36:17 +08:00
qhy	9347a4ebe5	Implement context precomputation and caching, improving sampling efficiency. PSNR does not drop	2026-02-10 17:47:46 +08:00
qhy	223a50f9e0	Add a CrossAttention KV cache to reduce redundant computation and improve performance, psnr=25.1201 dB	2026-02-10 17:35:03 +08:00
qhy	2a6068f9e4	Remove one video VAE decode path	2026-02-10 17:13:45 +08:00
qhy	91a9b0febc	Optimize small-tensor allocation inside the DDIM loop; cache the attention mask on the GPU	2026-02-10 16:53:00 +08:00
qhy	ed637c972b	TF32 inference	2026-02-10 16:39:14 +08:00
olivame	fffc5a9956	init	2026-02-08 03:29:15 +00:00
UniGen-X	cbaebc016f	Update README.md	2025-10-01 10:13:04 +08:00
hengguo	05d2d82236	update readme	2025-09-23 16:57:54 +08:00
yuchen-x	50e8c3ed55	update readme_cn	2025-09-23 16:15:59 +08:00
yuchen-x	ddb5848d86	Merge branch 'main' of github.com:unitreerobotics/unifolm-world-model-action	2025-09-23 16:11:47 +08:00
yuchen-x	cb0cf4a353	updata readme_cn	2025-09-23 16:11:29 +08:00
hengguo	54f61a4336	update readme_cn	2025-09-23 16:00:53 +08:00
yuchen-x	118ada7c35	update readme	2025-09-23 15:22:27 +08:00
yuchen-x	8d5546d322	update readme	2025-09-23 15:21:12 +08:00
yuchen-x	eccd1680c1	update readme	2025-09-23 15:19:02 +08:00
yuchen-x	f12b478265	upload real-robot deployment code	2025-09-23 15:13:22 +08:00
yuchen-x	5dcd1ca503	fix a typo on COLUMNS definition	2025-09-22 17:33:28 +08:00
UniGen-X	7b4d383611	Update README.md	2025-09-21 17:34:40 +08:00
UniGen-X	733e228bb8	Update README.md	2025-09-21 17:19:28 +08:00
UniGen-X	e9c60f6e62	Update README_cn.md	2025-09-19 10:18:38 +08:00
UniGen-X	2d4d79ab3a	Update README.md	2025-09-19 10:18:07 +08:00
UniGen-X	1f21fe7fd8	Update README.md	2025-09-19 10:16:58 +08:00
UniGen-X	a57037ab03	Update README_cn.md	2025-09-17 11:45:09 +08:00
UniGen-X	be43dfef9d	Update README.md	2025-09-17 11:44:32 +08:00
UniGen-X	884dcce130	Update README_cn.md	2025-09-17 10:49:42 +08:00
UniGen-X	29f3101a1f	Update README.md	2025-09-17 10:49:20 +08:00
UniGen-X	712a289c28	Update README_cn.md	2025-09-17 10:44:52 +08:00
UniGen-X	c0fad43420	Update README.md	2025-09-17 10:44:30 +08:00
UniGen-X	3b83374922	Merge pull request #8 from hu-po/patch-1 Fix typo in Acknowledgement section of README	2025-09-17 10:31:31 +08:00
UniGen-X	6a33fddb99	Update README_cn.md	2025-09-17 10:29:47 +08:00
UniGen-X	a0010c4036	Update README.md	2025-09-17 10:27:02 +08:00
UniGen-X	a1f6430c38	Update README.md	2025-09-17 10:26:51 +08:00
hu-po	ee45fe75e2	Fix typo in Acknowledgement section of README	2025-09-16 13:22:33 -05:00
yuchen-x	9c9942f1d4	upload eval_utils.py file	2025-09-16 21:48:15 +08:00
UniGen-X	a457f2f049	Update README.md	2025-09-16 20:55:49 +08:00
UniGen-X	ea7260a7e0	Update README_cn.md	2025-09-15 18:28:52 +08:00

67 Commits