DREAMSTEER: Latent World Models Can Steer VLA Policies During Deployment with Zero Finetuning

Pretrained vision-language-action (VLA) policies show promising zero-shot generalization but often fail under deployment-time distribution shift, which reduces robustness and makes instruction following inconsistent. Prior work commonly addresses this by finetuning on in-distribution data, which assumes access to demonstrations collected on tasks in the target environment. In this work, we propose DreamSteer, a deployment-time steering framework for pretrained VLAs that requires no finetuning or parameter modification. The key insight is to use a latent world model and a value model to steer the pretrained VLA policy. During deployment, DreamSteer samples candidate action chunks from the VLA policy and from predefined motion primitives, imagines their outcomes with an action-conditioned latent world model, and ranks the imagined trajectories with a language-conditioned value model. Across four real-world manipulation benchmarks with unseen objects, DreamSteer improves task success rate from 23.75% to 66.25% and instruction-following accuracy from 38.75% to 56.25% over the base VLA policy.
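The sample, imagine, and rank loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `select_action_chunk`, the callables `sample_policy`, `world_model`, and `value_model`, and the choice to score only the final imagined latent are all hypothetical stand-ins for the VLA policy, the latent world model, and the language-conditioned value model.

```python
from typing import Callable, Sequence, Tuple

import numpy as np

ActionChunk = np.ndarray  # assumed shape: (horizon, action_dim)


def select_action_chunk(
    latent: np.ndarray,
    instruction: np.ndarray,
    sample_policy: Callable[[int], Sequence[ActionChunk]],
    primitives: Sequence[ActionChunk],
    world_model: Callable[[np.ndarray, np.ndarray], np.ndarray],
    value_model: Callable[[np.ndarray, np.ndarray], float],
    num_policy_samples: int = 4,
) -> Tuple[ActionChunk, float]:
    """Return the highest-scoring candidate chunk and its score."""
    # Candidates come from the policy and from predefined motion primitives.
    candidates = list(sample_policy(num_policy_samples)) + list(primitives)
    best_chunk, best_score = None, -np.inf
    for chunk in candidates:
        z = latent
        for action in chunk:
            # Imagine the outcome in latent space, one action at a time.
            z = world_model(z, action)
        # Assumption: score the final imagined latent against the instruction.
        score = value_model(z, instruction)
        if score > best_score:
            best_chunk, best_score = chunk, score
    return best_chunk, best_score


# Toy stand-ins (hypothetical): the latent is a 2-D position, an action is a
# displacement, and the value is the negative distance to a goal position.
def toy_world_model(z: np.ndarray, a: np.ndarray) -> np.ndarray:
    return z + a


def toy_value_model(z: np.ndarray, goal: np.ndarray) -> float:
    return -float(np.linalg.norm(z - goal))


goal = np.array([1.0, 1.0])
policy_chunks = [np.array([[0.5, 0.0], [0.5, 0.0]])]      # ends at (1, 0)
primitive_chunks = [
    np.array([[0.5, 0.5], [0.5, 0.5]]),                   # ends at (1, 1)
    np.array([[-0.5, 0.0], [-0.5, 0.0]]),                 # ends at (-1, 0)
]
chunk, score = select_action_chunk(
    np.zeros(2), goal, lambda n: policy_chunks[:n], primitive_chunks,
    toy_world_model, toy_value_model,
)
```

In the toy example the first primitive reaches the goal, so it outranks the policy sample. Because the policy and both models are only called, never updated, the sketch modifies no parameters, which is the property the abstract emphasizes.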

Out-of-distribution (OOD) task success, successes out of 20 trials per task.
Method	Phone	Mustard	Tape	Eraser	Average (%)	95% CI (%)
π₀	4/20	3/20	6/20	6/20	23.75	[15.84, 34.07]
π₀ + DreamSteer	7/20	6/20	11/20	10/20	42.50	[32.26, 53.43]
π₀ + primitives + random	0/20	0/20	0/20	0/20	0.00	[0.00, 4.58]
primitives + DreamSteer	0/20	0/20	0/20	0/20	0.00	[0.00, 4.58]
π₀ + primitives + DreamSteer	12/20	11/20	16/20	14/20	66.25	[55.39, 75.65]

Instruction-following (IF) accuracy, counts out of 20 trials per task.
Method	Sponge	Banana	Pencil case	Apple	Average (%)	95% CI (%)
π₀	8/20	9/20	6/20	8/20	38.75	[28.78, 49.73]
π₀ + primitives + DreamSteer	14/20	13/20	9/20	9/20	56.25	[45.34, 66.57]
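The Average column in the tables above pools the per-task counts over all 80 trials, for example (4 + 3 + 6 + 6) / 80 = 23.75% for π₀. The sketch below reproduces that arithmetic. The tables do not state how the 95% CI was computed; the Wilson score interval used here is an assumption. It matches the [0.00, 4.58] bound of the 0/80 rows, but the other rows need not agree exactly. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def success_rate(successes_per_task: Sequence[int], trials_per_task: int = 20) -> float:
    """Pooled success rate in percent over all tasks."""
    total_trials = trials_per_task * len(successes_per_task)
    return 100.0 * sum(successes_per_task) / total_trials


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval in percent (assumed method, not confirmed by the source)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return 100.0 * max(0.0, center - half), 100.0 * min(1.0, center + half)


base_rate = success_rate([4, 3, 6, 6])        # π₀ on the OOD tasks
steered_rate = success_rate([12, 11, 16, 14]) # π₀ + primitives + DreamSteer
zero_low, zero_high = wilson_interval(0, 80)  # rows with 0/20 on every task
```

Pooling over 80 trials rather than averaging four per-task intervals gives a single interval per method, which is how the single 95% CI entry per row reads.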

Think before acting

Plug-and-play steering

Handle deployment shift

Latent world model overview

Model architecture

Abstract

DreamSteer framework

Real robot experiments

Out-of-distribution (OOD) results

Pick up the phone and place it in the brown box

Pick up the mustard and place it in the brown box

Pick up the whiteboard eraser and place it in the black bowl

Pick up the blue tape and place it in the black bowl

Instruction following (IF) accuracy

Pick up the banana and place it in the black bowl

Pick up the sponge and place it in the black bowl

Pick up the apple and place it in the brown box

Pick up the pencil case and place it in the brown box

Quantitative results