Hugging Face Daily Papers · 3 min read

PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2605.17916

Published on May 19, 2026 · Submitted by jjr5401 (JiaJinrang) on May 21, 2026

Authors: Jinrang Jia, Zhenjia Li, Yijiang Hu, Yifeng Shi

Abstract

AI-generated summary: PanoWorld generates consistent VR tours by combining 3D geometric guidance with dynamic visual memory, enabling high-quality multi-room panoramas with spatial coherence.

Generating a consistent whole-house VR tour from a floorplan and style reference requires both photorealistic panoramas and cross-view spatial coherence. Pure 2D generators produce appealing single panoramas but re-imagine geometry and materials when the viewpoint changes, whereas monolithic 3D generation becomes expensive and loses fine texture at multi-room scale. We introduce PanoWorld, a generative spatial world model that treats whole-house synthesis as autoregressive generation of node-based 360-degree panoramas, matching the discrete navigation used by real VR tour products. PanoWorld uses a floorplan-derived 3D shell as a global geometric proxy and a dynamic 3D Gaussian Splatting cache as renderable spatial memory. A feed-forward panoramic LRM designed for metric-scale multi-room 360-degree inputs lifts generated panoramas into local 3DGS updates, while Room-aware Group Attention suppresses cross-room feature interference. A topology-aware progressive caching strategy fuses these local updates without repeatedly reconstructing the full history. By decoupling shell-based geometry guidance from cache-rendered visual memory, PanoWorld preserves high-frequency 2D synthesis quality while improving cross-node layout and material consistency. The project link is https://jjrcn.github.io/PanoWorld-project-home/
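
The abstract describes three ideas that are easier to picture in code: generating panoramas node by node, conditioning each node on a cache of what was already generated, and restricting attention to tokens from the same room. The sketch below is only an illustration under stated assumptions and is not the authors' implementation. The breadth-first ordering, the dictionary used as a stand-in for the 3D Gaussian Splatting cache, the boolean same-room mask, and all function names (`generation_order`, `generate_tour`, `same_room_mask`) are hypothetical choices made for this example; the abstract does not specify any of them.

```python
from collections import deque


def generation_order(adjacency, start):
    """Breadth-first order over tour nodes (an assumed ordering).

    In a connected graph, every node after the first then has at least
    one neighbor that was generated before it.
    """
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def generate_tour(adjacency, start, generate_fn):
    """Autoregressive loop over nodes with an incrementally updated cache.

    `cache` is a plain dict standing in for renderable spatial memory.
    Each step reads only already-cached neighbors and adds one local
    entry, so earlier entries are never rebuilt.
    """
    cache = {}
    for node in generation_order(adjacency, start):
        context = {n: cache[n] for n in adjacency[node] if n in cache}
        cache[node] = generate_fn(node, context)
    return cache


def same_room_mask(room_ids):
    """Boolean mask: token i may attend to token j only if both share a room.

    This is one plausible reading of "suppresses cross-room feature
    interference", not the paper's definition of Room-aware Group Attention.
    """
    return [[a == b for b in room_ids] for a in room_ids]


if __name__ == "__main__":
    # Hypothetical four-node tour graph.
    adjacency = {
        "hall": ["kitchen", "bedroom"],
        "kitchen": ["hall"],
        "bedroom": ["hall", "bath"],
        "bath": ["bedroom"],
    }
    # Stub generator: records which cached neighbors conditioned each node.
    tour = generate_tour(adjacency, "hall", lambda node, ctx: sorted(ctx))
    print(tour)
    print(same_room_mask(["hall", "hall", "kitchen"]))
```

In this toy setup the stub generator simply returns the names of the neighbors it was conditioned on, which makes the autoregressive dependency visible: the first node sees an empty context, and each later node sees only neighbors that were generated earlier.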

Community

Paper author and submitter JiaJinrang posted a video:
https://cdn-uploads.huggingface.co/production/uploads/64796780bb9a5693c48c97c6/As_R58YJBKFaLEEl_Lyu2.mp4

