Video Gen
80 articles archived under #video-gen

Hugging Face Daily Papers · research · 1mo ago
Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion
Abstract: Pantheon360 enables high-fidelity 360° video generation for digital twins by combining 3D-aware diffusion with explicit geometric caching to ensure spatial-temporal consistency.
AI-generated summary: Generating complete digital twins from videos requires precise camera…
14

Hugging Face Daily Papers · research · 1mo ago
Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration
Abstract: A multi-agent framework called Soap2Soap is presented for long-horizon video-to-video generation that maintains narrative structure and character identity across extended sequences through consistent semantic backbone and visual reference anchors.
AI-generated summary…
25

Hugging Face Daily Papers · research · 1mo ago
Geo-Align: Video Generation Alignment via Metric Geometry Reward
Abstract: Geo-Align presents a reinforcement learning framework for camera-controlled video re-rendering that improves generalization through scale-aware perceptual rewards and metric 3D estimation for camera trajectory extraction.
AI-generated summary: Camera-controlled video…
20

r/LocalLLaMA · community · 1mo ago
meituan-longcat/LongCat-Video-Avatar-1.5 · Hugging Face
🚀 Model Introduction: We are excited to announce the release of LongCat-Video-Avatar 1.5, an upgraded open-source framework that prioritizes extreme empirical optimization and production-readiness for audio-driven human video generation. Built upon the LongCat-Video foundation…
21

Hugging Face Daily Papers · research · 1mo ago
FlowLong: Inference-time Long Video Generation via Manifold-constrained Tweedie Matching
Abstract: A novel inference-time method for long video generation using overlapping sliding windows with Tweedie matching and stochastic early-phase sampling to improve temporal consistency and visual quality.
AI-generated summary: Extending the generation horizon of video…
11

Hugging Face Daily Papers · research · 1mo ago
Bernini: Latent Semantic Planning for Video Diffusion
Abstract: A unified video generation and editing framework combines multimodal large language models for semantic planning with diffusion models for pixel rendering, achieving state-of-the-art performance through semantic interface separation and enhanced positional embeddings.…
32

Hugging Face Daily Papers · research · 1mo ago
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos
Abstract: MIGA addresses long video generation challenges by reducing training-inference gaps and enhancing temporal consistency through dual consistency mechanisms.
AI-generated summary: Without incurring significant computational overhead, train-free long video generation aims…
5

Hugging Face Daily Papers · research · 1mo ago
Video Models Can Reason with Verifiable Rewards
Abstract: VideoRLVR optimizes video diffusion models for verifiable reasoning tasks using reinforcement learning with rule-based rewards, achieving better performance than supervised methods in constraint-satisfying video generation.
AI-generated summary: Video diffusion models…
11

Hugging Face Daily Papers · research · 1mo ago
CogOmniControl: Reasoning-Driven Controllable Video Generation via Creative Intent Cognition
Abstract: Diffusion models applied in compressed image space generate high-quality images with lower computational cost and support flexible inputs like text or boxes.
AI-generated summary: Recent diffusion models achieve strong photorealism and fluency in video generation, yet…
37

Hugging Face Daily Papers · research · 1mo ago
MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation
Abstract: MSAVBench presents the first comprehensive benchmark and adaptive evaluation framework for multi-shot audio-video generation, addressing limitations in existing benchmarks through diverse task settings and advanced evaluation mechanisms.
AI-generated summary: Video…
25

Hugging Face Daily Papers · research · 1mo ago
Echo-Forcing: A Scene Memory Framework for Interactive Long Video Generation
Abstract: Echo-Forcing addresses limitations in interactive long-video generation by decoupling historical memory and recent dynamics through hierarchical temporal memory, scene recall frames, and difference-aware memory decay mechanisms.
AI-generated summary: Autoregressive video…
5

Hugging Face · official-blog · 1mo ago
Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation
Published May 18, 2026. Ting-Yun Chang (ting-yunc, nvidia), Miguel Martin (miguelmartin-nv, nvidia), Jonathan Allen (nv-spectralflight, nvidia), Ke Ding, kding1…
11

Hugging Face Daily Papers · research · 1mo ago
OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation
Abstract: OmniHumanoid enables cross-embodiment video generation by factorizing motion transfer and embodiment-specific adaptation, allowing scalable adaptation to new humanoid embodiments using unpaired data.
AI-generated summary: Cross-embodiment video generation aims to…
7

TechCrunch — AI · news-outlet · 1mo ago
Runway started by helping filmmakers. Now it wants to beat Google at AI.
AI video generation startup Runway is betting that video generation is the path to world models. And that being an AI outsider is an advantage, not a liability.
20

Hugging Face Daily Papers · research · 1mo ago
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
Abstract: A novel causal consistency distillation method enables efficient frame-wise video generation with reduced latency and improved quality compared to existing chunk-wise approaches.
AI-generated summary: Real-time interactive video generation requires low-latency,…
6

Hugging Face Daily Papers · research · 1mo ago
Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video
Abstract: A novel approach called Warp-as-History enables camera-controlled video generation by transforming camera-induced warps into pseudo-history representations, achieving zero-shot capability without training or test-time optimization.
AI-generated summary: Camera-controlled…
18

Hugging Face Daily Papers · research · 1mo ago
PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation
Abstract: PhyMotion introduces a physics-grounded reward system for human motion generation that evaluates kinematic plausibility, contact consistency, and dynamic feasibility to improve video quality.
AI-generated summary: Generating realistic human motion is a central yet…
12

Hugging Face Daily Papers · research · 1mo ago
RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO
Abstract: RAVEN enables real-time video generation through causal autoregressive extrapolation with improved training alignment, while CM-GRPO enhances performance via reinforcement learning applied to consistency model sampling.
AI-generated summary: Causal autoregressive video…
20

Hugging Face Daily Papers · research · 1mo ago
RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data
Abstract: RoboEvolve combines vision-language and video generation models in a co-evolutionary framework to enable scalable robotic manipulation with improved data efficiency and continuous learning capabilities.
AI-generated summary: The scalability of robotic manipulation is…
29

The Information — AI · news-outlet · 1mo ago
Rent the Runway Cofounder to Step Down as CEO
Rent the Runway cofounder Jennifer Hyman will step down from the CEO role at the end of this week, the clothing rental company said Wednesday. Hyman will remain an advisor to the company through early 2027, and current board member and former Nordstrom executive Teri Bariquit…
35

Hugging Face Daily Papers · research · 1mo ago
FaithfulFaces: Pose-Faithful Facial Identity Preservation for Text-to-Video Generation
Abstract: FaithfulFaces is a pose-faithful facial identity preservation framework that improves identity consistency in text-to-video generation through pose-shared alignment and explicit Euler angle embeddings.
AI-generated summary: Identity-preserving text-to-video generation…
38

arXiv — NLP / Computation & Language · research · 1mo ago
PresentAgent-2: Towards Generalist Multimodal Presentation Agents
arXiv:2605.11363v1 · Announce Type: cross
Abstract: Presentation generation is moving beyond static slide creation toward end-to-end presentation video generation with research grounding, multimodal media, and interactive delivery. We introduce PresentAgent-2, an agentic framework…
30

Hugging Face Daily Papers · research · 1mo ago
CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives
Abstract: CausalCine enables interactive, multi-shot video generation by addressing limitations of autoregressive models through causal modeling, dynamic memory routing, and real-time distillation techniques.
AI-generated summary: Autoregressive video generation aims at real-time,…
38

Vercel — AI · dev-tools · 2mo ago
Seedance 2.0 Video Generation on AI Gateway
You can now access Bytedance's latest state-of-the-art video generation model, Seedance 2.0, via AI Gateway with no other provider accounts required. Seedance 2.0 is available on AI Gateway in two variants: Standard and Fast. Both share the same capabilities. Standard produces…
10

Google DeepMind · official-blog · 5mo ago
Veo 3.1 Ingredients to Video: More consistency, creativity and control
Our latest Veo update generates lively, dynamic clips that feel natural and engaging — and supports vertical video generation.
5

Google DeepMind · official-blog · 8mo ago
Behind “ANCESTRA”: combining Veo with live-action filmmaking
We partnered with Darren Aronofsky, Eliza McNitt and a team of more than 200 people to make a film using Veo and live-action filmmaking.
10

Google DeepMind · official-blog · 8mo ago
Introducing Veo 3.1 and advanced creative capabilities
We’re rolling out significant updates to Veo that give people even more creative control.
30

Google DeepMind · official-blog · 13mo ago
Fuel your creativity with new generative media models and tools
Introducing Veo 3 and Imagen 4, and a new tool for filmmaking called Flow.
12

Google DeepMind · official-blog · 14mo ago
Generate videos in Gemini and Whisk with Veo 2
Transform text-based prompts into high-resolution eight-second videos in Gemini Advanced and use Whisk Animate to turn images into eight-second animated clips.
26

Lil'Log (Lilian Weng) · research · 26mo ago
Diffusion Models for Video Generation
Diffusion models have demonstrated strong results on image synthesis in past years. Now the research community has started working on a harder task—using it for video generation. The task itself is a superset of the image case, since an image is a video of 1 frame, and it…
12

Page 2 of 2 · 80 articles