UniSHARP: Universal Sharp Monocular View Synthesis
Authors: Meixi Song, Dizhe Zhang, Hao Ren, Ruiyang Zhang, Bo Du, Ming-Hsuan Yang, Lu Qi
Published: 2026-06-05 (arXiv: 2606.07514)
Code: https://github.com/Insta360-Research-Team/UniSHARP
Abstract
UniSHARP extends SHARP for universal monocular rendering across different camera systems by aligning images in an omnidirectional latent space through joint feature and Gaussian space alignment.
In this work, we extend SHARP, a popular photorealistic view synthesis method, to universal monocular rendering across a continuum of camera systems, from conventional perspective cameras to wide-field-of-view, fisheye, and omnidirectional panoramic settings. To overcome the pinhole-specific assumptions of SHARP, our key idea is to align images from these different camera systems in a unified omnidirectional latent space. To this end, we propose UniSHARP, which performs implicit alignment in both feature and Gaussian spaces. Specifically, Gaussian primitives are arranged along rays and radial distances in a ray-based universal representation, while 2D semantic and 3D spatial features extracted from UniK3D-inspired encoders are jointly decoded to generate the complete Gaussian cloud. To comprehensively evaluate our method, we construct a benchmark covering diverse imaging systems across various scenes. The benchmark is further stratified by field of view (FoV) to enable fine-grained assessment of the universal monocular rendering task. Extensive experiments on the proposed benchmark demonstrate the effectiveness of UniSHARP, which outperforms alternative methods by a large margin. The project page can be found at: https://insta360-research-team.github.io/Unisharp-website/
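To make the ray-based representation more concrete, the following is a minimal sketch of how Gaussian centers could be placed along per-pixel rays at a radial distance, for both a pinhole image and an equirectangular panorama. It is not the UniSHARP implementation: the function names (`pinhole_rays`, `equirect_rays`, `gaussian_centers`), the axis convention (x right, y down, z forward), the ideal distortion-free camera models, and the constant radial distance are all assumptions made for illustration. In the method described above, the radial distances and the remaining Gaussian attributes would be predicted by the network.

```python
import numpy as np

def pinhole_rays(height, width, fov_x_deg):
    """Unit ray directions for an ideal pinhole camera (assumed model)."""
    focal = 0.5 * width / np.tan(np.radians(fov_x_deg) / 2.0)
    # Pixel-center coordinates; both arrays have shape (height, width).
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    dirs = np.stack([(u - width / 2.0) / focal,
                     (v - height / 2.0) / focal,
                     np.ones_like(u)], axis=-1)
    return dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)

def equirect_rays(height, width):
    """Unit ray directions for a full 360x180 equirectangular panorama."""
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    lon = (u / width - 0.5) * 2.0 * np.pi   # longitude in (-pi, pi)
    lat = (0.5 - v / height) * np.pi        # latitude in (-pi/2, pi/2)
    # y points down, so positive latitude (up) maps to negative y.
    return np.stack([np.cos(lat) * np.sin(lon),
                     -np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)

def gaussian_centers(rays, radial_distance):
    """Place one Gaussian center per pixel at distance r along its ray."""
    return rays * radial_distance[..., None]

# Hypothetical usage: a tiny panorama with a constant radial distance of 2.
pano_rays = equirect_rays(4, 8)
pano_centers = gaussian_centers(pano_rays, np.full((4, 8), 2.0))
persp_rays = pinhole_rays(5, 5, 90.0)
print(pano_centers.shape)
```

Because both camera models reduce to a unit direction per pixel plus a scalar distance, the same `gaussian_centers` call works for either input, which is the property a ray-based universal representation relies on. A fisheye model would only need its own ray function under this sketch.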
Community
Given a single image from a perspective, wide-FoV, fisheye, or panoramic camera, UniSHARP predicts a 3D Gaussian point cloud and renders high-quality novel views.