r/LocalLLaMA · May 27, 2026 · 1 min read

Finally pioneering beyond the local 256k context window frontier!

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

The autocompact threshold of 341.5k tokens is set manually, and I'll be slowly pushing it back now that I'm confident there's headroom for evicting key/value memory into the cache.

The question now is whether the proposed fix will complete in the remaining 16k tokens, as I'll be cross if the trial run also fails to produce a worthwhile outcome.
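
For anyone following the numbers, here is a rough sketch of the budget arithmetic. It assumes the 16k remaining is measured against the 341.5k autocompact threshold, which the post doesn't state outright. The function names and the example fix estimates are hypothetical and not part of any tool mentioned here.

```python
# Hypothetical helpers for reasoning about a context budget.
# Assumption: "remaining" means tokens left before autocompact triggers.

AUTOCOMPACT_THRESHOLD = 341_500  # manually set threshold from the post
REMAINING = 16_000               # tokens left, as stated in the post


def tokens_used(threshold: int, remaining: int) -> int:
    """Tokens already consumed, given the threshold and what is left."""
    return threshold - remaining


def fits(remaining: int, estimated_cost: int) -> bool:
    """True if a task with the estimated token cost fits in what is left."""
    return estimated_cost <= remaining


used = tokens_used(AUTOCOMPACT_THRESHOLD, REMAINING)
print(f"context used so far: {used} tokens")

# The estimates below are made up purely to show the check.
for estimate in (12_000, 20_000):
    print(f"fix needing {estimate} tokens fits: {fits(REMAINING, estimate)}")
```

Under that assumption the context already holds about 325.5k tokens, so the fix has to land in under 16k or autocompact kicks in first.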

Kudos to Apple, DeepSeek and oMLX.

submitted by /u/challis88ocarina
