Locally running model turns an Image into a Cute Controllable Character you can Play as
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
This is a sequel to my last post here! It meant a lot to get such positive feedback last time.

This is the 800M version of the previous model. It still has a LOT of issues, but the promise is the same: running comfortably on consumer GPUs. The context is increased to 12 latent frames. The weird flashes from last time are gone. Stability is much better, although consistency is horrible. I'm hoping to fix that in the next iteration. The 500M model gets over 60 fps on an RTX 5090 now.

The architecture is still the same; I mostly just fattened the MLP. Again, the denoiser is trained from scratch with diffusion forcing.

LLMs sample just 1 token every forward pass and add it to the KV cache, so the KV cache is where the "context" lives. Diffusion models work more based on guidance: noise in -> the model does a round of denoising.

So the idea in models like mine is causal diffusion. We run a denoising loop for each frame, but then add the finished frame to the KV cache too. So the KV cache is a store of all past frames.

However, because we only trained up to about 20-30 latent frames (approx. 80-120 pixel frames because of the pretrained VAE I use), I have to use a sliding window in the KV cache and evict intermediate, useless frames so the model still thinks "yes, I can work with the context I was trained with, not more."

I've been putting out a lot of videos, pretty much everything I try, on a subreddit I made called lucidmlx.
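A rough sketch of the sliding-window idea described above, for readers who want to see the control flow. This is not the model's actual code. The names `SlidingFrameCache`, `rollout`, and `toy_denoise` are hypothetical, a real KV cache holds per-layer key/value tensors rather than plain lists, and the post does not say which frames count as "useless". The sketch assumes the first frame (the input image) is kept as an anchor and the oldest of the remaining frames is evicted once the budget is full.

```python
import random
from collections import deque
from typing import Callable, List

Frame = List[float]  # stand-in for one latent frame (a real one is a tensor)


class SlidingFrameCache:
    """Hypothetical stand-in for a per-frame KV cache with a fixed frame budget."""

    def __init__(self, max_frames: int, keep_first: int = 1) -> None:
        if max_frames <= keep_first:
            raise ValueError("max_frames must be larger than keep_first")
        self.keep_first = keep_first
        self.anchor: List[Frame] = []  # assumption: early frames are never evicted
        # A bounded deque silently drops its oldest entry when it is full.
        self.recent: deque = deque(maxlen=max_frames - keep_first)

    def append(self, frame: Frame) -> None:
        if len(self.anchor) < self.keep_first:
            self.anchor.append(frame)
        else:
            self.recent.append(frame)

    def context(self) -> List[Frame]:
        """Frames the denoiser is allowed to attend to, oldest first."""
        return self.anchor + list(self.recent)


def toy_denoise(x: Frame, context: List[Frame]) -> Frame:
    """Toy denoising step: move halfway toward the most recent cached frame."""
    target = context[-1]
    return [xi + 0.5 * (ti - xi) for xi, ti in zip(x, target)]


def rollout(
    denoise_step: Callable[[Frame, List[Frame]], Frame],
    first_frame: Frame,
    n_frames: int,
    n_steps: int,
    cache: SlidingFrameCache,
    seed: int = 0,
) -> List[Frame]:
    """Causal generation: denoise each new frame against the cache, then cache it."""
    rng = random.Random(seed)
    cache.append(first_frame)
    frames = [first_frame]
    for _ in range(n_frames):
        x = [rng.gauss(0.0, 1.0) for _ in first_frame]  # start from noise
        for _ in range(n_steps):                        # per-frame denoising loop
            x = denoise_step(x, cache.context())
        cache.append(x)                                 # finished frame joins the context
        frames.append(x)
    return frames


# Budget of 12 frames mirrors the 12-latent-frame context mentioned in the post.
cache = SlidingFrameCache(max_frames=12)
frames = rollout(toy_denoise, [0.0, 0.0, 0.0], n_frames=40, n_steps=4, cache=cache)
print(len(frames), len(cache.context()))
```

The point of the design is that the context never grows past the length seen in training, no matter how long the rollout runs. A smarter eviction rule (for example, scoring frames by usefulness) would only change `append`.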