I made a UI and server for using Anthropic's new Natural Language Autoencoders locally with llama.cpp
Mirrored from r/LocalLLaMA.
Anthropic's first open-weight models, the Natural Language Autoencoders, are just finetunes of popular open-weight models. They don't modify the architecture or modeling code, so inference with llama.cpp is mostly trivial. I packaged every NLA feature (namely activation extraction, activation explanation, activation reconstruction, and explanation-edit steering) into a custom llama.cpp server. It comes with a Mikupad UI for token-level activation explanation and steering. I'm currently working on a LoRA version so we can load a single model into memory instead of needing all three models (base, actor, and critic) loaded at once. Stay tuned!
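The post doesn't spell out how explanation-edit steering works under the hood, but the feature list suggests one plausible mechanic: the critic reconstructs an activation vector from an explanation, and steering shifts the live hidden state by the difference between the edited and original reconstructions. A toy sketch of that arithmetic (all names and values are hypothetical, not from the actual server):

```python
# Conceptual sketch of explanation-edit steering. This is an assumption
# about the mechanics, not the server's actual implementation: the critic
# reconstructs an activation vector from an explanation string, and we
# nudge the hidden state by the reconstruction delta.

def steer(hidden, recon_original, recon_edited, alpha=1.0):
    """Shift a hidden-state vector toward an edited explanation's
    reconstruction: h + alpha * (recon_edited - recon_original)."""
    return [h + alpha * (e - o)
            for h, o, e in zip(hidden, recon_original, recon_edited)]

hidden = [0.2, -0.5, 1.0]       # activation at some layer/token position
recon_orig = [0.1, -0.4, 0.9]   # critic's reconstruction of the original explanation
recon_edit = [0.6, -0.4, 0.1]   # reconstruction after the user edits the explanation

steered = steer(hidden, recon_orig, recon_edit, alpha=0.8)
print(steered)
```

With `alpha` controlling steering strength, `alpha=0` leaves the hidden state untouched and larger values push generation harder toward the edited explanation.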