r/LocalLLaMA

Could you help me test MTP for GLM-4.7-Flash?


Some of you may remember old models from GLM: GLM Air or GLM Flash. I know they’re outdated, but I have a soft spot for them, so I am currently working on enabling MTP for them in llama.cpp.

If you know how to compile llama.cpp from source and have the hardware to run GLM-4.7-Flash, could you test this out and let me know if it works for you (and what's the speed gain with MTP), or if you encounter any issues?
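If it helps to report results in a comparable way, here is a minimal sketch for turning two throughput measurements into a speed gain. It assumes you have already measured generation speed in tokens per second once without MTP and once with it; the function names are hypothetical and not part of llama.cpp.

```python
def mtp_speedup(baseline_tps: float, mtp_tps: float) -> float:
    """Return the MTP speed gain as a ratio (1.0 means no change)."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return mtp_tps / baseline_tps


def format_report(baseline_tps: float, mtp_tps: float) -> str:
    """Build a one-line summary suitable for pasting into a comment."""
    ratio = mtp_speedup(baseline_tps, mtp_tps)
    percent = (ratio - 1.0) * 100.0
    return (
        f"baseline {baseline_tps:.1f} t/s, MTP {mtp_tps:.1f} t/s, "
        f"{ratio:.2f}x ({percent:+.0f}%)"
    )


if __name__ == "__main__":
    # Illustrative numbers only, not real measurements.
    print(format_report(20.0, 30.0))
```

Including the quant, hardware, and prompt length alongside the summary line makes results from different machines easier to compare.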

https://huggingface.co/jacek2024/GLM-4.7-Flash-MTP-GGUF

(If you need a smaller quant, let me know.)

submitted by /u/jacek2023