PhyGenHOI: Physically-Aware 4D Generation of Dynamic Human-Object Interactions
Authors: Omer Benishu, Gal Fiebelman, Sagie Benaim
Organization: The Hebrew University of Jerusalem
Published: 2026-05-28
Abstract
AI-generated summary: PhyGenHOI synthesizes physically accurate 4D human-object interactions by combining motion diffusion models with material point method simulations using 3D Gaussian representations.
We address the task of generating physically accurate and visually faithful 4D Human-Object Interaction (HOI). Given a static 3D human and target object represented as 3D Gaussian Splats (3DGS), our goal is to synthesize dynamic scenes where the human actively engages with the object through actions, such as punching or kicking, in accordance with a given input text. To this end, we introduce PhyGenHOI, a novel framework that couples generative human motion with an explicit physical object simulation. We model the human as a semantic agent driven by a Motion Diffusion Model (MDM) and the object as a physical agent simulated via the Material Point Method (MPM), utilizing 3D Gaussians as a unified, differentiable representation. We supervise their interaction through three coupled mechanisms: (1) A Windowed Attraction Loss that temporally synchronizes generative motion to intercept the object; (2) A Contact-Driven Re-simulation step that triggers physically consistent momentum transfer upon impact; and (3) A Masked Video-SDS objective that injects video-based priors to enhance contact fidelity. Experiments show PhyGenHOI generates physically consistent 4D HOI across diverse actions, humans, and objects, outperforming baselines. Project page and videos: https://omerbenishu.github.io/PhyGenHOI/
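The abstract names a Windowed Attraction Loss and a Contact-Driven Re-simulation step but does not give their formulas, so the following is only a minimal sketch of how such pieces might look. Everything here is an assumption made for illustration: the function names, the mean-squared-distance form of the loss, the fixed contact radius, and the finite-difference velocity estimate. It uses plain Python lists of 3D points rather than 3D Gaussians, and it does not implement MPM or MDM.

```python
import math

def windowed_attraction_loss(effector_traj, target, t_contact, half_width):
    """Hypothetical loss: mean squared distance between an end effector
    (e.g. a fist or foot) and the target point, averaged only over the
    frames inside a window centred on the intended contact frame."""
    lo = max(0, t_contact - half_width)
    hi = min(len(effector_traj) - 1, t_contact + half_width)
    total = 0.0
    for t in range(lo, hi + 1):
        total += sum((p - q) ** 2 for p, q in zip(effector_traj[t], target))
    return total / (hi - lo + 1)

def first_contact_frame(effector_traj, target, radius):
    """Return the first frame where the effector is within `radius` of the
    target, or None if it never gets that close (assumed contact test)."""
    for t, pos in enumerate(effector_traj):
        if math.dist(pos, target) <= radius:
            return t
    return None

def contact_velocity(effector_traj, t, dt):
    """Backward finite-difference velocity of the effector at frame t.
    In a pipeline like the one described, a value of this kind could seed
    the object simulation when it is restarted at impact (an assumption)."""
    if t == 0:
        return tuple(0.0 for _ in effector_traj[0])
    prev, cur = effector_traj[t - 1], effector_traj[t]
    return tuple((c - p) / dt for p, c in zip(prev, cur))

if __name__ == "__main__":
    # Toy trajectory: the effector moves along the x axis past the target.
    traj = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    target = (2.0, 0.0, 0.0)
    print(windowed_attraction_loss(traj, target, t_contact=2, half_width=1))
    hit = first_contact_frame(traj, target, radius=0.5)
    if hit is not None:
        print(hit, contact_velocity(traj, hit, dt=0.5))
```

Restricting the loss to a window leaves the rest of the generated motion unconstrained, which matches the abstract's description of synchronizing the motion to intercept the object rather than pulling the whole sequence toward it.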
Community
PhyGenHOI tackles the lack of physical realism in 4D generation. We model humans via MDM and objects as physical agents via MPM simulations, using 3DGS as a unified representation. With contact-driven re-simulation and Masked Video-SDS, we substantially improve contact fidelity and physical consistency for text-driven actions.
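This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
* HOIGS: Human-Object Interaction Gaussian Splatting (2026), https://huggingface.co/papers/2604.04016
* Real2Sim in HOI: Toward Physically Plausible HOI Reconstruction from Monocular Videos (2026), https://huggingface.co/papers/2605.14462
* Sketch2Motion: Text-driven 2D Sketch to 3D Animation via Diffusion-guided Skeleton Optimization (2026), https://huggingface.co/papers/2605.28394
* MaMi-HOI: Harmonizing Global Kinematics and Local Geometry for Human-Object Interaction Generation (2026), https://huggingface.co/papers/2605.05756
* THOM: Generating Physically Plausible Hand-Object Meshes From Text (2026), https://huggingface.co/papers/2604.02736
* InterPhys: Physics-aware Human Motion Synthesis in a Dynamic Scene (2026), https://huggingface.co/papers/2605.01036
* SAM3D-Phys: Towards Multi-Object Interactive Simulation in Real World (2026), https://huggingface.co/papers/2605.30239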
Cite arxiv.org/abs/2605.30268 in a model, dataset, or Space README.md to link it from this page.