r/LocalLLaMA · · 1 min read

I built a tool to turn your Claude Code sessions into fine-tuning data for local models

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

If you use Claude Code, every session is already sitting on disk as a .jsonl file under ~/.claude/projects/. It has real coding conversations: multi-turn edits, tool calls, reasoning traces. That's training data you already generated for free.

The problem is the format is not what any fine-tuning framework expects. So I built claude_converter to bridge that gap.

What it does:

  • Converts Claude Code .jsonl sessions into the messages format that apply_chat_template() consumes directly
  • Outputs are compatible with TRL/SFTTrainer, Axolotl, and LLaMA-Factory (sharegpt format)
  • Ships a clean_messages() helper to strip <tool_use>, <tool_result>, and <thinking> blocks before training
  • Includes an inspect_session() CLI-style function with token counts and block breakdowns so you know what you're working with before you train on it
  • Zero dependencies

Quick example:

```python import glob from datasets import Dataset from trl import SFTTrainer, SFTConfig from claude_converter import session_to_messages, clean_messages

all_messages = [] for path in glob.glob("~/.claude/projects/*/.jsonl", recursive=True): msgs = clean_messages(session_to_messages(path)) if len(msgs) >= 2: all_messages.append({"messages": msgs})

dataset = Dataset.from_list(all_messages) ```

One caveat worth calling out: raw sessions include failed attempts, retries, and dead ends. Don't train on everything blindly. Filter to sessions where the final assistant turn actually solved the problem.

Repo: https://github.com/FredyRivera-dev/claude_converter

uv pip install claude-converter

Happy to answer questions about the format or the conversion logic.

submitted by /u/F4k3r22
[link] [comments]

Discussion (0)

Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.

Sign in →

No comments yet. Sign in and be the first to say something.

More from r/LocalLLaMA