Function2Scene: 3D Indoor Scene Layout from Functional Specifications
Authors: Ruiqi Wang, Qimin Chen, Daniel Ritchie, Angel X. Chang, Manolis Savva, Kai Wang, Hao Zhang
Published: 2026-05-29 · arXiv: 2605.30819 · Project page: https://function2scene.github.io/
Abstract
AI-generated summary: Function2Scene generates 3D indoor layouts from functional descriptions by parsing user needs and applying design constraints through an iterative refinement process combining geometric analysis, language modeling, and visual assessment.
Most text-driven 3D indoor scene synthesis methods generate rooms from object-centric prompts, asking what furniture should be placed rather than how the space is used. Yet in real interior design, a layout is judged by how well it supports its occupants, e.g., their activities and physical needs. We introduce Function2Scene, a framework for generating 3D indoor layouts from functional specifications, i.e., natural-language design briefs describing who will use a room and what they need to do there. Given such a specification, our system parses occupant personas and activities, derives a customized set of functional design constraints from a taxonomy of 17 criteria spanning spatial, ergonomic, activity, and environmental considerations, and uses these constraints to guide layout generation. Rather than relying on an LLM to directly produce a final scene, Function2Scene performs iterative evaluation and refinement through a tool-augmented check-and-repair loop, combining geometric measurements, LLM-based contextual reasoning, and VLM-based visual assessment. Experiments on 30 professionally written interior-design cases show that Function2Scene produces layouts that better satisfy functional requirements than recent LLM-based scene synthesis baselines, with our results preferred in 94.3% of pairwise comparisons. Our work reframes text-driven indoor scene synthesis from placing plausible objects to designing spaces that support human use.
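The check-and-repair loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Box`, `Violation`, `min_gap_x`, `check_and_repair`) is hypothetical, the 0.9 m clearance is an illustrative value rather than one taken from the paper's taxonomy, and only a simple geometric check is shown. In the described system, LLM-based and VLM-based assessments would presumably sit alongside such geometric checks; they are omitted here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Box:
    """Axis-aligned furniture footprint on the floor plane, in meters."""
    x: float  # min-corner x
    y: float  # min-corner y
    w: float  # width along x
    d: float  # depth along y


@dataclass
class Violation:
    """A failed constraint plus a suggested translation for one object."""
    constraint: str
    obj: str
    dx: float
    dy: float


Layout = dict[str, Box]
Check = Callable[[Layout], Optional[Violation]]


def min_gap_x(left: str, right: str, min_gap: float) -> Check:
    """Hypothetical geometric check: require free space between two objects along x."""
    def check(layout: Layout) -> Optional[Violation]:
        a, b = layout[left], layout[right]
        gap = b.x - (a.x + a.w)
        if gap < min_gap - 1e-9:
            # Suggest pushing the right-hand object away by the missing distance.
            return Violation(f"gap({left},{right})>={min_gap}", right, min_gap - gap, 0.0)
        return None
    return check


def check_and_repair(layout: Layout, checks: list[Check], max_iters: int = 10):
    """Evaluate all checks, apply the first suggested repair, and repeat."""
    layout = dict(layout)  # copy the mapping; boxes are replaced, never mutated
    for _ in range(max_iters):
        violations = [v for c in checks if (v := c(layout)) is not None]
        if not violations:
            return layout, True
        v = violations[0]
        b = layout[v.obj]
        layout[v.obj] = Box(b.x + v.dx, b.y + v.dy, b.w, b.d)
    return layout, all(c(layout) is None for c in checks)


# Illustrative scene: the bed-to-desk passage starts too narrow.
initial = {
    "bed": Box(0.0, 0.0, 2.0, 1.6),
    "desk": Box(2.3, 0.0, 1.2, 0.6),
    "chair": Box(3.6, 0.0, 0.5, 0.5),
}
checks = [min_gap_x("bed", "desk", 0.9), min_gap_x("desk", "chair", 0.1)]
fixed, ok = check_and_repair(initial, checks)
```

Note that repairing the first violation (moving the desk) creates a second one (the desk now overlaps the chair), which the next iteration resolves. This is the reason for looping over evaluation and repair rather than checking once.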