Tag

Image Gen

13 articles archived under #image-gen · RSS

Hugging Face Daily Papers research 2h ago

Images in Sentences: Scaling Interleaved Instructions for Unified Visual Generation

Abstract: INSET is a unified multimodal model that embeds images as native vocabulary within textual instructions, enabling better handling of complex interleaved inputs through transformer-based contextual locality and supporting both image generation and editing tasks.…

34
r/MachineLearning community 5h ago

Image generation models running locally on limited resources [P]

I have a project that generates high-quality, free ebook covers from each book's content. On my machine with 16 GB of RAM and no GPU, I have tested the open-source Stable Diffusion models without any success. All return bad-quality covers with blurred faces and scenes that do not…

6
arXiv — Machine Learning research 19h ago

Efficient Adjoint Matching for Fine-tuning Diffusion Models

arXiv:2605.11480v1. Abstract: Reward fine-tuning has become a common approach for aligning pretrained diffusion and flow models with human preferences in text-to-image generation. Among reward-gradient-based methods, Adjoint Matching (AM) provides a principled…

30
OpenAI news 22d ago

Introducing ChatGPT Images 2.0

ChatGPT Images 2.0 introduces a state-of-the-art image generation model with improved text rendering, multilingual support, and advanced visual reasoning.

9
Smol AI News news-outlet 22d ago

GPT-Image-2

**OpenAI** launched **GPT-Image-2**, enhancing image generation with improved text rendering, layout fidelity, editing, multilingual support, and "thinking" capabilities. It supports generating slides, infographics, diagrams, UI mockups, and QR codes, and integrates with tools…

36
OpenAI news 27d ago

Codex for (almost) everything

The updated Codex app for macOS and Windows adds computer use, in-app browsing, image generation, memory, and plugins to accelerate developer workflows.

5
Hugging Face official-blog 2mo ago

PRX Part 3 — Training a Text-to-Image Model in 24h!

Published March 3, 2026. By David Bertoin, Roman Frigg, and Jon Almazán (Photoroom). Welcome back 👋 In the last two posts (Part 1 and…

23
Smol AI News news-outlet 2mo ago

Nano Banana 2 aka Gemini 3.1 Flash Image Preview: the new SOTA Imagegen model

**Google and DeepMind** launched **Nano Banana 2** (aka **Gemini 3.1 Flash Image Preview**), a leading image generation and editing model integrated across multiple Google products with features like **4K upscaling**, **multi-subject consistency**, and **real-time…

29
Hugging Face official-blog 3mo ago

Training Design for Text-to-Image Models: Lessons from Ablations

Published February 3, 2026. By David Bertoin, Roman Frigg, and Jon Almazán (Photoroom). Welcome back! This is the second part of our…

13
Hugging Face official-blog 5mo ago

Diffusers welcomes FLUX-2

Welcome FLUX.2, BFL’s new open image generation model 🤗 Published November 25, 2025. By YiYi Xu, Daniel Gu, Sayak Paul, Alvaro Somoza, Dhruv Nair, Aritra Roy Gosthipaty, Linoy Tsaban…

12
Google DeepMind official-blog 5mo ago

Build with Nano Banana Pro, our Gemini 3 Pro Image model

Here’s how developers can use Nano Banana Pro (Gemini 3 Pro Image), a powerful new image generation and editing model with advanced features and creative control. Alisa Fortin, Product…

10
Google DeepMind official-blog 14mo ago

Experiment with Gemini 2.0 Flash native image generation

Native image output is available in Gemini 2.0 Flash for developers to experiment with in Google AI Studio and the Gemini API.

5
Eugene Yan research 42mo ago

Text-to-Image: Diffusion, Text Conditioning, Guidance, Latent Space

The fundamentals of text-to-image generation, relevant papers, and experimenting with DDPM.

35
