Helix4D: Complex 4D Mesh Generation
Authors: Jiraphon Yenphraphai, Jianqi Chen, Jian Wang, Gordon Qian, Sergey Tulyakov, Rameen Abdal, Raymond A. Yeh, Peter Wonka, Chaoyang Wang
Organization: Snap Research
Published: 2026-05-25
arXiv: 2605.26109
Project page: https://snap-research.github.io/helix4d/
Abstract
AI-generated summary: Helix4D enables high-quality dynamic mesh generation by extending Trellis2's frame-local attention across frames and adding temporal information to its 3D positional encoding.
Current video-to-4D methods struggle with complex topology changes, transparent materials, thin structures, and inner surfaces. We present Helix4D, a dynamic mesh generation framework that inherits the expressive representation of Trellis2 and adapts it from image-to-3D to video-conditioned 4D generation. Our design arises from two key questions: (a) how to enable Trellis2's frame-local attention to share information across frames while preserving its pretrained quality on rare cases such as transparent objects and inner surfaces, and (b) how to inject temporal information into a purely 3D positional encoding without breaking pretrained capabilities. We address (a) with sliding-window cross-frame attention that is anchored on the first frame. The first frame is generated by the base Trellis2 model and injected into our model, letting it inherit Trellis2's quality on rare cases through cross-frame attention. We address (b) with a 4D temporal encoding that repurposes redundant low-frequency spatial RoPE bands for time, extending the encoding from 3D to 4D with no additional parameters. Extensive experiments show the effectiveness of Helix4D for high-quality dynamic mesh generation on ActionBench and on our own challenging complex-dynamics set.
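The two mechanisms in the abstract can be illustrated with a small sketch. This is not the Helix4D implementation: the abstract does not give the window size, whether the window is symmetric, how many RoPE bands are reassigned, or how time is scaled. The function names (`cross_frame_mask`, `rope_4d_angles`) and all parameters (`window`, `n_bands`, `n_time`, `base`) are hypothetical choices made only for illustration.

```python
import math


def cross_frame_mask(num_frames: int, window: int) -> list[list[bool]]:
    """Frame-level attention mask (assumed form, not taken from the paper).

    mask[q][k] is True when tokens of frame q may attend to tokens of frame k.
    A frame sees neighbours within `window` frames (symmetric window is an
    assumption) and always sees frame 0, the anchor frame.
    """
    mask = []
    for q in range(num_frames):
        row = []
        for k in range(num_frames):
            in_window = abs(q - k) <= window
            is_anchor = k == 0
            row.append(in_window or is_anchor)
        mask.append(row)
    return mask


def rope_frequencies(n_bands: int, base: float = 10000.0) -> list[float]:
    """Standard RoPE frequency schedule: band i rotates at base**(-i / n_bands)."""
    return [base ** (-i / n_bands) for i in range(n_bands)]


def rope_4d_angles(x: float, y: float, z: float, t: float,
                   n_bands: int = 8, n_time: int = 2,
                   base: float = 10000.0) -> list[float]:
    """Rotation angles for one token at position (x, y, z) and time t.

    Each spatial axis owns `n_bands` frequency bands. The `n_time`
    lowest-frequency bands of every axis are driven by t instead of the
    spatial coordinate (hypothetical split). The output length is the same
    as for a 3D-only encoding, so no new parameters or channels are needed.
    """
    freqs = rope_frequencies(n_bands, base)
    angles = []
    for coord in (x, y, z):
        for i, f in enumerate(freqs):
            if i >= n_bands - n_time:
                angles.append(t * f)      # low-frequency band reassigned to time
            else:
                angles.append(coord * f)  # band keeps its spatial meaning
    return angles


if __name__ == "__main__":
    for row in cross_frame_mask(num_frames=6, window=1):
        print("".join("x" if allowed else "." for allowed in row))
    print(len(rope_4d_angles(1.0, 2.0, 3.0, 4.0)))
```

In this sketch the mask is defined per frame pair; a real model would expand it to token granularity before passing it to the attention kernel. The angle layout keeps the higher-frequency spatial bands untouched, which is one plausible reading of "repurposes redundant low-frequency spatial RoPE bands for time".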
Community
A dynamic mesh generation framework that can model challenging 4D scenarios, including topology changes, deformation, shattering, melting, transparency, and thin structures.
Cite arxiv.org/abs/2605.26109 in a model, dataset, or Space README.md to link it from this page.