Hugging Face · December 4, 2025 · 21 min read

We Got Claude to Fine-Tune an Open Source LLM

#model-release #open-source #security

Mirrored from Hugging Face for archival readability. Support the source by reading on the original site.

Published December 4, 2025

We gave Claude the ability to fine-tune language models using a new tool called Hugging Face Skills. Claude doesn't just write training scripts: it submits jobs to cloud GPUs, monitors progress, and pushes finished models to the Hugging Face Hub. This tutorial shows you how it works and how to use it yourself.

Claude Code can use "skills"—packaged instructions, scripts, and domain knowledge—to accomplish specialized tasks. The hf-llm-trainer skill teaches Claude everything it needs to know about training: which GPU to pick for your model size, how to configure Hub authentication, when to use LoRA versus full fine-tuning, and how to handle the dozens of other decisions that go into a successful training run.

With this skill, you can tell Claude things like:

Fine-tune Qwen3-0.6B on the dataset open-r1/codeforces-cots

And Claude will:

Validate your dataset format
Select appropriate hardware (t4-small for a 0.6B model)
Use and update a training script with Trackio monitoring
Submit the job to Hugging Face Jobs
Report the job ID and estimated cost
Check on progress when you ask
Help you debug if something goes wrong

The model trains on Hugging Face GPUs while you do other things. When it's done, your fine-tuned model appears on the Hub, ready to use.

This isn't a toy demo. The skill supports the same training methods used in production: supervised fine-tuning, direct preference optimization, and reinforcement learning with verifiable rewards. You can train models from 0.5B to 70B parameters, convert them to GGUF for local deployment, and run multi-stage pipelines that combine different techniques.

Setup and Install

Before starting, you'll need:

A Hugging Face account with a Pro or Team / Enterprise plan (Jobs require a paid plan)
A write-access token from huggingface.co/settings/tokens
A coding agent like Claude Code, OpenAI Codex, or Google's Gemini CLI

Hugging Face skills are compatible with Claude Code, Codex, and Gemini CLI, with integrations on the way for Cursor, Windsurf, and Continue.

Claude Code

Register the repository as a plugin marketplace:

/plugin marketplace add huggingface/skills

To install a skill, run:

/plugin install <skill-folder>@huggingface-skills

For example:

/plugin install hf-llm-trainer@huggingface-skills

Codex

Codex will identify the skills via the AGENTS.md file. You can verify the instructions are loaded with:

codex --ask-for-approval never "Summarize the current instructions."

For more details, see the Codex AGENTS guide.

Gemini CLI

This repo includes gemini-extension.json to integrate with the Gemini CLI.
Install locally:

gemini extensions install . --consent

or use the GitHub URL:

gemini extensions install https://github.com/huggingface/skills.git --consent

See Gemini CLI extensions docs for more help.

Connect to Hugging Face

You have to authenticate to your Hugging Face account with a write-access token so that the job can create a model repo.

Set up your token:

hf auth login
# or set the token as an environment variable:
export HF_TOKEN=hf_your_write_access_token_here

Configure the Hugging Face MCP Server to use your write token by sending it in either the HF_TOKEN or the Authorization: Bearer HTTP header.

For Claude Code:

claude mcp add --transport http hf-skills "https://huggingface.co/mcp?bouquet=skills" --header "Authorization: Bearer $HF_TOKEN"

Your First Training Run

Let's walk through a complete example. We'll fine-tune a small model to see the full workflow, then explore more advanced capabilities.

Instruct the coding agent to fine-tune

Start with a simple, clear instruction to fine-tune a specific model:

Fine-tune Qwen3-0.6B on the open-r1/codeforces-cots dataset for instruction following.

The coding agent analyzes your request and prepares a training configuration. For a 0.6B model on a demo dataset, it selects t4-small—enough GPU for this model size and the cheapest option available.

The open-r1/codeforces-cots dataset contains Codeforces problems and solutions. It is a good choice for instruction tuning a model to solve hard coding problems.

This works for vision language models too! You can simply run "Fine-tune Qwen/Qwen3-VL-2B-Instruct on llava-instruct-mix"

Review Before Submitting

Before your coding agent submits anything, you'll see the configuration:

I'll fine-tune Qwen/Qwen3-0.6B on open-r1/codeforces-cots using SFT.

Configuration:
- Hardware: t4-small (~$0.75/hour)
- Estimated time: ~20 minutes
- Estimated cost: ~$0.30
- Output: username/qwen-codeforces-cots-sft

The model will be pushed to Hub automatically. Should I submit?

This is your chance to adjust anything. Change the output repo name, pick different hardware, or ask Claude to modify training parameters. Once you approve, the agent submits the job.

For example, you can ask the agent to try a test run:

Do a quick test run on 100 examples.
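The cost estimate in the configuration summary follows from the hourly rate and the expected run time. A minimal sketch of that arithmetic, using the figures quoted above (the function name `estimate_cost` is hypothetical and not part of the skill):

```python
def estimate_cost(hourly_rate_usd: float, minutes: float) -> float:
    """Approximate job cost: hourly rate times run time in hours."""
    return hourly_rate_usd * minutes / 60.0


# Figures taken from the example configuration above.
cost = estimate_cost(hourly_rate_usd=0.75, minutes=20)
print(f"~${cost:.2f}")
```

For a 20-minute run at about $0.75 per hour this works out to $0.25, in line with the roughly $0.30 the agent quotes.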

Track Progress

After submission, you get job details:

✅ Job submitted successfully!

Job ID: abc123xyz
Monitor: https://huggingface.co/jobs/username/abc123xyz

Expected time: ~20 minutes
Estimated cost: ~$0.30

View real-time metrics at: https://huggingface.co/spaces/username/trackio

The skill includes Trackio integration, so you can watch training loss decrease in real time. Jobs run asynchronously, so you can close your terminal and come back later. When you want an update:

How's my training job doing?

Then the agent fetches the logs and summarizes progress.

Use Your Model

When training completes, your model is on the Hub:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("username/qwen-codeforces-cots-sft")
tokenizer = AutoTokenizer.from_pretrained("username/qwen-codeforces-cots-sft")

That's the full loop. You described what you wanted in plain English, and the agent handled GPU selection, script generation, job submission, authentication, and persistence. The whole thing cost about thirty cents.

Training Methods

The skill supports three training approaches. Understanding when to use each one helps you get better results.

Supervised Fine-Tuning (SFT)

SFT is where most projects start. You provide demonstration data—examples of inputs and desired outputs—and training adjusts the model to match those patterns.

Use SFT when you have high-quality examples of the behavior you want. Customer support conversations, code generation pairs, domain-specific Q&A—anything where you can show the model what good looks like.

Fine-tune Qwen3-0.6B on my-org/support-conversations for 3 epochs.

The agent validates the dataset, selects hardware (for example, a10g-large with LoRA for a 7B model), and configures training with checkpoints and monitoring.

For models larger than 3B parameters, the agent automatically uses LoRA (Low-Rank Adaptation) to reduce memory requirements. This makes training 7B or 13B models feasible on single GPUs while preserving most of the quality of full fine-tuning.
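To see why LoRA reduces memory requirements, it helps to count trainable parameters. LoRA freezes a weight matrix of shape d_out × d_in and trains two small matrices of shapes d_out × r and r × d_in instead. The sketch below uses illustrative sizes (a 4096 × 4096 projection and rank 16); these numbers are assumptions for the example, not values the skill is known to use.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_in + d_out)


# Illustrative sizes only: one 4096 x 4096 projection with LoRA rank 16.
full = full_param_count(4096, 4096)
lora = lora_param_count(4096, 4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  fraction: {lora / full:.2%}")
```

With these sizes the adapter trains well under 1% of the parameters in that matrix, which is where most of the savings in gradient and optimizer memory come from.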

Direct Preference Optimization (DPO)

DPO trains on preference pairs—responses where one is "chosen" and another is "rejected." This aligns model outputs with human preferences, typically after an initial SFT stage.

Use DPO when you have preference annotations from human labelers or automated comparisons. DPO optimizes directly for the preferred response without needing a separate reward model.

Run DPO on my-org/preference-data to align the SFT model I just trained.
The dataset has 'chosen' and 'rejected' columns.

DPO is sensitive to dataset format. It requires columns named exactly chosen and rejected, optionally alongside a prompt column with the input. The agent validates this first and shows you how to map columns if your dataset uses different names.

You can run DPO using Skills on vision language models too! Try it out with openbmb/RLAIF-V-Dataset. Claude will apply minor modifications but will succeed in training.
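To make the preference objective in this section concrete, here is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python. The inputs are summed log-probabilities of each response under the model being trained and under a frozen reference model; the function name and the default beta of 0.1 are assumptions for illustration, and a real training run would use the trainer from TRL rather than this code.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair, given sequence log-probabilities."""
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: the loss equals log(2).
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
# Policy has shifted probability toward the chosen response: the loss is lower.
print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

The loss depends only on log-probability differences, which is why no separate reward model is needed.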

Group Relative Policy Optimization (GRPO)

GRPO is a reinforcement learning method that has proven effective on verifiable tasks like solving math problems, writing code, or any task with a programmatic success criterion.

Train a math reasoning model using GRPO on the openai/gsm8k dataset based on Qwen3-0.6B.

The model generates responses, receives rewards based on correctness, and learns from the outcomes. This is more complex than SFT or DPO, but the configuration is similar.
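The loop described above can be sketched in a few lines: score each sampled response with a programmatic check, then compare every reward against the rest of its group. This is a simplified illustration, not the skill's or TRL's implementation. The reward rule (exact match on the final token) and the use of the population standard deviation are assumptions made for the example.

```python
import statistics


def correctness_reward(completion: str, reference_answer: str) -> float:
    """Return 1.0 if the last whitespace-separated token equals the reference answer."""
    tokens = completion.strip().split()
    return 1.0 if tokens and tokens[-1] == reference_answer else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Four hypothetical samples for one prompt whose reference answer is "42".
completions = ["The answer is 42", "I think it is 41", "So we get 42", "40"]
rewards = [correctness_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards)
print([round(a, 3) for a in advantages])
```

Correct samples get a positive advantage and incorrect ones a negative advantage, so the policy update pushes toward responses that beat the group average.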

Hardware and Cost

The agent selects hardware based on your model size, but understanding the tradeoffs helps you make better decisions.

Model Size to GPU Mapping

For tiny models under 1B parameters, t4-small works well. These models train quickly—expect $1-2 for a full run. This is perfect for educational or experimental runs.

For small models (1-3B), step up to t4-medium or a10g-small. Training takes a few hours and costs $5-15.

For medium models (3-7B), you need a10g-large or a100-large with LoRA. Full fine-tuning doesn't fit, but LoRA makes these very trainable. Budget $15-40 for production.

For large models (above 7B), this Hugging Face Skills job setup is not suitable.
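The tiers in this section can be summarized as a small lookup. The sketch below only restates the guidance above; the function name `suggest_flavors` is hypothetical, and the exact cut-offs at 1B, 3B, and 7B are a reading of the ranges in the text rather than rules taken from the skill.

```python
def suggest_flavors(params_billion: float) -> list[str]:
    """Map a model size in billions of parameters to the hardware tiers listed above."""
    if params_billion < 1:
        return ["t4-small"]
    if params_billion <= 3:
        return ["t4-medium", "a10g-small"]
    if params_billion <= 7:
        # Per the text, these sizes rely on LoRA rather than full fine-tuning.
        return ["a10g-large", "a100-large"]
    raise ValueError("Models above 7B are outside the tiers described in this section")


for size in (0.6, 1.7, 7.0):
    print(size, suggest_flavors(size))
```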

Demo vs Production

When testing a workflow, start small:

Do a quick test run to SFT Qwen3-0.6B with 100 examples of my-org/support-conversations.

The coding agent configures minimal training—enough to verify your pipeline works without real cost.

For production, be explicit:

SFT Qwen3-0.6B for production on the full my-org/support-conversations.
Checkpoints every 500 steps, 3 epochs, cosine learning rate.

Always run a demo before committing to a multi-hour production job. A $0.50 demo that catches a format error saves a $30 failed run.

Dataset Validation

Dataset format is the most common source of training failures. The agent can validate datasets before you spend GPU time.

Check if my-org/conversation-data works for SFT training.

The agent runs a quick inspection on CPU (fractions of a penny) and reports:

Dataset validation for my-org/conversation-data:

SFT: ✓ READY
  Found 'messages' column with conversation format

DPO: ✗ INCOMPATIBLE
  Missing 'chosen' and 'rejected' columns
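A report like the one above boils down to checking which columns a dataset has. The following is a simplified sketch of that check, assuming only the column names mentioned in this article ('messages' for SFT, 'chosen' and 'rejected' for DPO). The function `check_formats` is hypothetical, and the skill's actual validation may inspect more than column names.

```python
def check_formats(column_names: list[str]) -> dict[str, bool]:
    """Report which training methods a dataset's columns are compatible with."""
    cols = set(column_names)
    return {
        "SFT": "messages" in cols,
        "DPO": {"chosen", "rejected"} <= cols,
    }


# Columns of a hypothetical conversation dataset.
report = check_formats(["messages", "source"])
for method, ready in report.items():
    print(f"{method}: {'READY' if ready else 'INCOMPATIBLE'}")
```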

If your dataset needs transformation, the agent can show you how:

My DPO dataset uses 'good_response' and 'bad_response' instead
of 'chosen' and 'rejected'. How do I fix this?

The agent provides mapping code and can incorporate it directly into your training script.
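The mapping itself is a column rename. Here is a minimal sketch on plain Python dictionaries, using the column names from the question above; the helper name `rename_columns` and the sample row are made up for illustration. With the datasets library, the `rename_columns` method on a dataset plays the same role.

```python
def rename_columns(rows: list[dict], mapping: dict[str, str]) -> list[dict]:
    """Return new rows with keys renamed according to mapping; other keys are kept."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]


# Hypothetical preference row using non-standard column names.
rows = [{"prompt": "2 + 2 = ?", "good_response": "4", "bad_response": "5"}]
fixed = rename_columns(rows, {"good_response": "chosen", "bad_response": "rejected"})
print(fixed)
```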

Monitoring Training

Real-time monitoring helps you catch problems early. The skill configures Trackio by default—after submitting a job, you can watch metrics at:

https://huggingface.co/spaces/username/trackio

This shows training loss, learning rate, and validation metrics. A healthy run shows steadily decreasing loss.

Ask the agent about status anytime:

What's the status of my training job?

Job abc123xyz is running (45 minutes elapsed)

Current step: 850/1200
Training loss: 1.23 (↓ from 2.41 at start)
Learning rate: 1.2e-5

Estimated completion: ~20 minutes
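The completion estimate in a status report like this can be sanity-checked from the step counter, assuming steps take roughly constant time. The helper below is a hypothetical sketch, not something the skill exposes.

```python
def remaining_minutes(step: int, total_steps: int, elapsed_minutes: float) -> float:
    """Extrapolate time left from the average time per completed step."""
    minutes_per_step = elapsed_minutes / step
    return (total_steps - step) * minutes_per_step


# Figures from the status report above: step 850 of 1200 after 45 minutes.
print(f"~{remaining_minutes(850, 1200, 45):.1f} minutes remaining")
```

For the report above this gives about 18.5 minutes, consistent with the quoted estimate of roughly 20 minutes.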

If something goes wrong, the agent helps diagnose. Out of memory? The agent suggests reducing batch size or upgrading hardware. Dataset error? The agent identifies the mismatch. Timeout? The agent recommends a longer duration or faster training settings.

Converting to GGUF

After training, you might want to run your model locally. The GGUF format works with llama.cpp and tools built on it, such as LM Studio and Ollama.

Convert my fine-tuned model to GGUF with Q4_K_M quantization.
Push to username/my-model-gguf.

The agent submits a conversion job that merges LoRA adapters, converts to GGUF, applies quantization, and pushes to Hub.

Then use it locally:

llama-server -hf <username>/<model-name>:<quantization>

# For example, to run the Qwen3-1.7B-GGUF model on your local machine:
llama-server -hf unsloth/Qwen3-1.7B-GGUF:Q4_K_M

What's Next

We've shown that coding agents like Claude Code, Codex, or Gemini CLI can handle the full lifecycle of model fine-tuning: validating data, selecting hardware, generating scripts, submitting jobs, monitoring progress, and converting outputs. This turns what used to be a specialized skill into something you can do through conversation.

Some things to try:

Fine-tune a model on your own dataset
Build a preference-aligned model with SFT → DPO
Train a reasoning model with GRPO on math or code
Convert a model to GGUF and run it with Ollama

The skill is open source. You can extend it, customize it for your workflows, or use it as a starting point for other training scenarios.

Resources

SKILL.md — Full skill documentation
Training Methods — SFT, DPO, GRPO explained
Hardware Guide — GPU selection and costs
TRL Documentation — The underlying training library
Hugging Face Jobs — Cloud training infrastructure
Trackio — Real-time training monitoring

Particularly interested in model alignment, reasoning capabilities, and the intersection of NLP with computer vision.\r\n","plan":"team"}},"createdAt":"2026-01-07T16:48:03.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Right","html":"Right\n","updatedAt":"2026-01-07T16:48:03.609Z","author":{"_id":"695d3df489b85dc68f206309","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png","fullname":"go go","name":"cveavy","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":14,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://www.gravatar.com/avatar/c50c76459362c26ca625f024fb5c1950?d=retro&size=100","fullname":"Futurepath Solutions","name":"futurepathsolutions","type":"org","isHf":false,"details":"Passionate about advancing the frontiers of artificial intelligence through research in large language models, multi-modal architectures, and efficient training methodologies. Particularly interested in model alignment, reasoning capabilities, and the intersection of NLP with computer vision.\r\n","plan":"team"}}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.36209210753440857},"editors":["cveavy"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png"],"reactions":[],"isReport":false,"parentCommentId":"693a796c693e8158df69033e"}}]},{"id":"693b6d2b4db5ca8e59e9a716","author":{"_id":"68aba5d3d466d2506c935465","avatarUrl":"/avatars/45986a2f84b844e06250fe416681a52c.svg","fullname":"Deep","name":"illiliiiiil","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":1,"isUserFollowing":false},"createdAt":"2025-12-12T01:17:31.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"Is it possible to use it even in privately uploaded datasets?","html":"Is it possible to use it even in privately uploaded 
datasets?\n","updatedAt":"2025-12-12T01:17:31.985Z","author":{"_id":"68aba5d3d466d2506c935465","avatarUrl":"/avatars/45986a2f84b844e06250fe416681a52c.svg","fullname":"Deep","name":"illiliiiiil","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":1,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9924927949905396},"editors":["illiliiiiil"],"editorAvatarUrls":["/avatars/45986a2f84b844e06250fe416681a52c.svg"],"reactions":[],"isReport":false}},{"id":"6944b178c6953b50365d3dec","author":{"_id":"66a18c2696a2ff2a7c4ba554","avatarUrl":"/avatars/ddc40046800db4fb8a9b780b0aec3b1e.svg","fullname":"Ed Dan","name":"Ed13210","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2025-12-19T01:59:20.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"how many tokens will a session incur?","html":"how many tokens will a session incur?\n","updatedAt":"2025-12-19T01:59:20.972Z","author":{"_id":"66a18c2696a2ff2a7c4ba554","avatarUrl":"/avatars/ddc40046800db4fb8a9b780b0aec3b1e.svg","fullname":"Ed Dan","name":"Ed13210","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.7473177313804626},"editors":["Ed13210"],"editorAvatarUrls":["/avatars/ddc40046800db4fb8a9b780b0aec3b1e.svg"],"reactions":[],"isReport":false}},{"id":"695e8dda3543fcff39fac85b","author":{"_id":"695d3df489b85dc68f206309","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png","fullname":"go go","name":"cveavy","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":14,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://www.gravatar.com/avatar/c50c76459362c26ca625f024fb5c1950?d=retro&size=100","fullname":"Futurepath 
Solutions","name":"futurepathsolutions","type":"org","isHf":false,"details":"Passionate about advancing the frontiers of artificial intelligence through research in large language models, multi-modal architectures, and efficient training methodologies. Particularly interested in model alignment, reasoning capabilities, and the intersection of NLP with computer vision.\r\n","plan":"team"}},"createdAt":"2026-01-07T16:46:18.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"This is genuinely game-changing for AI teams working with limited MLOps resources. Having Claude automatically handle hardware selection, job orchestration, and monitoring removes so much friction from the fine-tuning process - I've seen too many projects stall because teams get bogged down in the infrastructure complexity rather than focusing on model performance. The business impact here is huge: instead of needing dedicated DevOps engineers to manage training pipelines, data scientists can now iterate much faster on custom models. The fact that it supports the full production stack (SFT, DPO, RLHF) means you're not just prototyping but actually building deployment-ready models. What really excites me is the cost optimization angle - automatic hardware matching means you're not overpaying for compute while still getting reasonable training times. The multi-stage pipeline support is particularly valuable for enterprise use cases where you need that SFT → DPO → RLHF workflow for safety and alignment. This could democratize custom model development for mid-market companies who previously couldn't justify the engineering overhead. Looking forward to testing this on some internal projects where we've been manually managing these workflows.","html":"This is genuinely game-changing for AI teams working with limited MLOps resources. 
Having Claude automatically handle hardware selection, job orchestration, and monitoring removes so much friction from the fine-tuning process - I've seen too many projects stall because teams get bogged down in the infrastructure complexity rather than focusing on model performance. The business impact here is huge: instead of needing dedicated DevOps engineers to manage training pipelines, data scientists can now iterate much faster on custom models. The fact that it supports the full production stack (SFT, DPO, RLHF) means you're not just prototyping but actually building deployment-ready models. What really excites me is the cost optimization angle - automatic hardware matching means you're not overpaying for compute while still getting reasonable training times. The multi-stage pipeline support is particularly valuable for enterprise use cases where you need that SFT → DPO → RLHF workflow for safety and alignment. This could democratize custom model development for mid-market companies who previously couldn't justify the engineering overhead. Looking forward to testing this on some internal projects where we've been manually managing these workflows.\n","updatedAt":"2026-01-07T16:46:18.224Z","author":{"_id":"695d3df489b85dc68f206309","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png","fullname":"go go","name":"cveavy","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":14,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://www.gravatar.com/avatar/c50c76459362c26ca625f024fb5c1950?d=retro&size=100","fullname":"Futurepath Solutions","name":"futurepathsolutions","type":"org","isHf":false,"details":"Passionate about advancing the frontiers of artificial intelligence through research in large language models, multi-modal architectures, and efficient training methodologies. 
Particularly interested in model alignment, reasoning capabilities, and the intersection of NLP with computer vision.\r\n","plan":"team"}}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9416612386703491},"editors":["cveavy"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png"],"reactions":[],"isReport":false}},{"id":"69622c8b5d4f5276ab3cef27","author":{"_id":"68823d50ca5db489fd00d58b","avatarUrl":"/avatars/822c4cf4f7f3a0b464924457f2e051c4.svg","fullname":"Akili","name":"akiliaiafrica","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false},"createdAt":"2026-01-10T10:40:11.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"I think this document needs to be updated. The skills name has changed based on what I see on the huggingface github repo. ","html":"I think this document needs to be updated. The skills name has changed based on what I see on the huggingface github repo. 
\n","updatedAt":"2026-01-10T10:40:11.295Z","author":{"_id":"68823d50ca5db489fd00d58b","avatarUrl":"/avatars/822c4cf4f7f3a0b464924457f2e051c4.svg","fullname":"Akili","name":"akiliaiafrica","type":"user","isPro":false,"isHf":false,"isHfAdmin":false,"isMod":false,"isUserFollowing":false}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.9689856767654419},"editors":["akiliaiafrica"],"editorAvatarUrls":["/avatars/822c4cf4f7f3a0b464924457f2e051c4.svg"],"reactions":[],"isReport":false},"replies":[{"id":"6964d1ddaa865b63109b575c","author":{"_id":"695d3df489b85dc68f206309","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png","fullname":"go go","name":"cveavy","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":14,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://www.gravatar.com/avatar/c50c76459362c26ca625f024fb5c1950?d=retro&size=100","fullname":"Futurepath Solutions","name":"futurepathsolutions","type":"org","isHf":false,"details":"Passionate about advancing the frontiers of artificial intelligence through research in large language models, multi-modal architectures, and efficient training methodologies. 
Particularly interested in model alignment, reasoning capabilities, and the intersection of NLP with computer vision.\r\n","plan":"team"}},"createdAt":"2026-01-12T10:50:05.000Z","type":"comment","data":{"edited":false,"hidden":false,"latest":{"raw":"correct","html":"correct\n","updatedAt":"2026-01-12T10:50:05.552Z","author":{"_id":"695d3df489b85dc68f206309","avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png","fullname":"go go","name":"cveavy","type":"user","isPro":true,"isHf":false,"isHfAdmin":false,"isMod":false,"followerCount":14,"isUserFollowing":false,"primaryOrg":{"avatarUrl":"https://www.gravatar.com/avatar/c50c76459362c26ca625f024fb5c1950?d=retro&size=100","fullname":"Futurepath Solutions","name":"futurepathsolutions","type":"org","isHf":false,"details":"Passionate about advancing the frontiers of artificial intelligence through research in large language models, multi-modal architectures, and efficient training methodologies. Particularly interested in model alignment, reasoning capabilities, and the intersection of NLP with computer vision.\r\n","plan":"team"}}},"numEdits":0,"identifiedLanguage":{"language":"en","probability":0.6915013790130615},"editors":["cveavy"],"editorAvatarUrls":["https://cdn-avatars.huggingface.co/v1/production/uploads/no-auth/kJYw9Ts14b1MrcNlX8cv3.png"],"reactions":[],"isReport":false,"parentCommentId":"69622c8b5d4f5276ab3cef27"}}]}],"status":"open","isReport":false,"pinned":false,"locked":false,"collection":"community_blogs"},"contextAuthors":["burtenshaw","evalstate"],"primaryEmailConfirmed":false,"discussionRole":0,"acceptLanguages":["en"],"withThread":true,"cardDisplay":false,"repoDiscussionsLocked":false}">

ermiaazarkhalili

Dec 4, 2025

Wow 😮
You're awesome 😎

dinoamino

Dec 4, 2025

Is this still usable without a Pro account? Will it be able to output everything up to "Submit the job to Hugging Face Jobs"?

clem

Dec 4, 2025

So cool!

kylechristophermoore

Dec 5, 2025

Is there data privacy when doing this?

Is it posted privately to a personal/team hub?

Could this be done locally without the push to the repo?

yukiarimo

Dec 5, 2025

Another agentic way of wasting tokens

jzhang533

Dec 5, 2025

Is it possible to use this inside VS Code's Copilot extension?

aprotopopov

Dec 5, 2025 · edited Dec 5, 2025

Skill documentation is not available at the provided link - https://github.com/huggingface/skills/blob/main/hf-llm-trainer/SKILL.md

evalstate

Article author Dec 5, 2025

Ah, we moved a couple of bits around in the repo -- link for that is here: https://github.com/huggingface/skills/blob/main/hf-llm-trainer/skills/model-trainer/SKILL.md -- I'll update the article 👍.

AUsername111

Dec 6, 2025

This is so cool. Many thanks.

Roman1902

Dec 8, 2025

"Really fascinating read! I found the explanation of Hugging Face’s “Skills Training” initiative — how it lets you use a coding‑agent (like Claude Code or other supported agents) to fine‑tune large language models, submit GPU jobs, monitor progress and push trained models to the Hub — particularly eye‑opening. The combination of high‑level instructions, hardware selection, monitoring, and automation makes the complex process of model training much more approachable, even for developers who may not be ML‑infrastructure experts.

I also recently read a related guide: https://mobisoftinfotech.com/resources/blog/ai‑development/llm‑api‑pricing‑guide
— which gives practical advice on LLM API usage, token‑based pricing, and how to plan costs when working with LLMs.

Putting your article’s look into empowering accessible LLM fine‑tuning together with the cost‑management strategies from that guide gives a well‑rounded perspective: it helps developers understand not just what is possible now with modern tools, but also how to build and deploy responsibly, balancing capability and cost."

DanteWu

Dec 9, 2025

Slop alert

julienjouganous

Dec 8, 2025

Great work and great article!
Regarding the maximum model size we can train with this approach: the beginning of the article mentions "models from 0.5B to 70B parameters", but at the end you write that "For large models (7B+), this HF skills job is not suitable". Which order of magnitude is correct?
I suspect the maximum is 7B. If that's the case, do you plan to support training larger models?
Thanks!

DAMIENE

Dec 9, 2025

Is the trained model now open source and/or available to the public?

sigridjineth

Dec 11, 2025

https://huggingface.co/blog/sionic-ai/claude-code-skills-training

Nice work on the demo getting Claude Code to fine-tune an open LLM. The researchers at Sionic AI already do most of their work with Claude Code: it writes training scripts, debugs CUDA errors, and runs hyperparameter searches overnight. For the actual work of building models, Claude has become the default partner. But there was one thing it couldn't do: remember what teammates learned last week.

Check out how we do it here :D

cveavy

Jan 7

Right

illiliiiiil

Dec 12, 2025

Is it possible to use it with privately uploaded datasets?

Ed13210

Dec 19, 2025

How many tokens will a session use?

cveavy

Jan 7

This is genuinely game-changing for AI teams working with limited MLOps resources. Having Claude automatically handle hardware selection, job orchestration, and monitoring removes so much friction from the fine-tuning process - I've seen too many projects stall because teams get bogged down in the infrastructure complexity rather than focusing on model performance. The business impact here is huge: instead of needing dedicated DevOps engineers to manage training pipelines, data scientists can now iterate much faster on custom models. The fact that it supports the full production stack (SFT, DPO, RLHF) means you're not just prototyping but actually building deployment-ready models. What really excites me is the cost optimization angle - automatic hardware matching means you're not overpaying for compute while still getting reasonable training times. The multi-stage pipeline support is particularly valuable for enterprise use cases where you need that SFT → DPO → RLHF workflow for safety and alignment. This could democratize custom model development for mid-market companies who previously couldn't justify the engineering overhead. Looking forward to testing this on some internal projects where we've been manually managing these workflows.