I fine-tuned an LLM to be C-3PO to test which training data format works best for persona injection [P]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
| Tested three formats: chat demos, first-person statements ("I am C-3PO..."), and synthetic Wikipedia-style docs. Same model, same LoRA config, 500 examples each. First-person statements won on generalization, which I didn't expect. The synthetic doc model was the weirdest result: it knew C-3PO was anxious but only expressed it 37% of the time. Knowing a trait vs feeling it are apparently different things in weight space. Code and GitHub repo link are included inside! [link] [comments] |
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.