WorldAct: Activating Monolithic 3D Worlds into Interactive-Ready Object-Centric Scenes
Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.
Abstract
The WorldAct framework transforms static generated 3D environments into editable, interactive scenes using a multimodal agent and geometric reconstruction techniques.
Recent 3D world modeling systems based on generative scene synthesis, such as Marble, can create coherent and explorable 3D environments, yet their outputs are typically static monolithic assets with limited editability and physical interaction. This restricts their use in immersive content creation and embodied simulation, where generated worlds must be actively modified and manipulated. To tackle this challenge, we present WorldAct, a framework that converts static generated 3D worlds into editable and interaction-ready scenes. WorldAct uses a multimodal agent to guide scene decomposition, identify actionable objects, reconstruct geometrically aligned object-level meshes for interaction, and restore the residual background via 3D inpainting. The resulting scenes support object-level editing, collision-aware manipulation, and embodied task execution while preserving global scene coherence. Experiments show that WorldAct enables richer interaction scenarios than the original generated scenes, suggesting a practical path toward editable and interactive 3D world models.
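To make the idea of an object-centric scene concrete, the following is a minimal sketch and not the authors' implementation. It assumes the output of such a pipeline can be represented as a restored background plus separately editable objects, and it uses axis-aligned bounding boxes as a stand-in for the reconstructed meshes. The names `SceneObject`, `ObjectCentricScene`, and `try_move` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SceneObject:
    """Hypothetical object-level asset; an axis-aligned box stands in for a mesh."""
    name: str
    min_corner: Vec3
    max_corner: Vec3

    def translated(self, offset: Vec3) -> "SceneObject":
        # Return a shifted copy so the original pose is kept if the edit is rejected.
        lo = tuple(a + d for a, d in zip(self.min_corner, offset))
        hi = tuple(a + d for a, d in zip(self.max_corner, offset))
        return SceneObject(self.name, lo, hi)

    def overlaps(self, other: "SceneObject") -> bool:
        # Two boxes overlap only if their intervals overlap on every axis.
        return all(
            self.min_corner[i] < other.max_corner[i]
            and other.min_corner[i] < self.max_corner[i]
            for i in range(3)
        )


@dataclass
class ObjectCentricScene:
    """Hypothetical container: a restored background plus editable objects."""
    background_id: str
    objects: Dict[str, SceneObject] = field(default_factory=dict)

    def add(self, obj: SceneObject) -> None:
        self.objects[obj.name] = obj

    def try_move(self, name: str, offset: Vec3) -> bool:
        """Move an object unless the new pose would collide with another object."""
        candidate = self.objects[name].translated(offset)
        for other_name, other in self.objects.items():
            if other_name != name and candidate.overlaps(other):
                return False  # reject the edit; the scene is left unchanged
        self.objects[name] = candidate
        return True


scene = ObjectCentricScene(background_id="inpainted_background")
scene.add(SceneObject("chair", (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)))
scene.add(SceneObject("table", (2.0, 0.0, 0.0), (4.0, 1.0, 2.0)))

blocked = scene.try_move("chair", (1.5, 0.0, 0.0))   # would overlap the table
allowed = scene.try_move("chair", (-1.0, 0.0, 0.0))  # moves into free space
```

In this sketch the first move is rejected because the chair would intersect the table, while the second succeeds. A monolithic asset offers no such per-object handle, which is the limitation the abstract describes.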
Get this paper in your agent:
hf papers read 2605.15843
curl -LsSf https://hf.co/cli/install.sh | bash