Fine-Tuning TranslateGemma-4B to improve bi-directional English & Welsh translations on an H200 GPU!
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Open source repo: https://github.com/grctest/finetuned-gemmatranslate-cy
A 5% run of the fine-tuning took 40 minutes and cost a couple of dollars, enough to prove the process works.
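The bi-directional part of the setup can be sketched as duplicating each sentence pair into both translation directions before training. This is a minimal, hypothetical sketch: the prompt template, field names, and example pairs are assumptions for illustration, not taken from the linked repo.

```python
# Hypothetical sketch: build a bi-directional English<->Welsh dataset
# for supervised fine-tuning. Each sentence pair yields two records,
# one per translation direction. Prompt wording is an assumption.

def make_records(pairs):
    """Emit one prompt/completion record per direction for each pair."""
    records = []
    for en, cy in pairs:
        records.append({
            "prompt": f"Translate from English to Welsh:\n{en}",
            "completion": cy,
        })
        records.append({
            "prompt": f"Translate from Welsh to English:\n{cy}",
            "completion": en,
        })
    return records

pairs = [("Good morning", "Bore da"), ("Thank you", "Diolch")]
records = make_records(pairs)
print(len(records))  # two records per sentence pair
```

Records in this prompt/completion shape can then be fed to a standard supervised fine-tuning loop; training on both directions at once is what lets a single adapter handle English→Welsh and Welsh→English.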
Looking forward to Flash Attention v4 leaving beta so I can test fine-tuning performance on a B200 in the cloud; that still seems to be a few months away.
What languages would you train TranslateGemma to translate? I was originally thinking about Klingon, but the available datasets seemed a bit lacking.