sensenova/SenseNova-U1-A3B-MoT · Hugging Face
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
🚀 SenseNova U1 is a new series of native multimodal models that unifies multimodal understanding, reasoning, and generation within a monolithic architecture. It marks a fundamental paradigm shift in multimodal AI: from modality integration to true unification. Rather than relying on adapters to translate between modalities, SenseNova U1 models think-and-act across language and vision natively. Unifying visual understanding and generation in an end-to-end architecture from pixel to word opens tremendous possibilities, enabling highly efficient and strong understanding, generation, and interleaved reasoning in a natively multimodal manner.
Two weeks ago, they released the 8B model mentioned in the table above.