DeepSpec - a deepseek-ai Collection

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

DeepSpec

DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It contains data preparation utilities, draft model implementations, training code, and evaluation scripts.
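For readers new to the technique, here is a minimal sketch of one greedy draft-then-verify loop, the general idea behind speculative decoding. It is not DeepSpec's implementation: the function names, the toy models, and the parameter k (number of drafted tokens per step) are all hypothetical, and real systems verify the drafted tokens in a single batched forward pass of the target model, typically with a sampling-based acceptance rule rather than exact greedy matching.

```python
from typing import Callable, List

# A "model" here is just a function from a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     prefix: List[int], k: int) -> List[int]:
    """Run one draft-then-verify step and return the tokens accepted in it."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The target model checks the proposals in order.
    accepted: List[int] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            # First mismatch: keep the target's own token and stop this step.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every proposal matched, so the target contributes one extra token.
    accepted.append(target(prefix + accepted))
    return accepted


def generate(target: NextToken, draft: NextToken, prompt: List[int],
             max_new_tokens: int, k: int = 4) -> List[int]:
    """Generate tokens with repeated speculative steps, truncated to the budget."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(target, draft, out, k))
    return out[: len(prompt) + max_new_tokens]


# Toy stand-ins (hypothetical): the draft agrees with the target except at
# every fourth position, where it guesses 0.
def toy_target(prefix: List[int]) -> int:
    return (sum(prefix) * 31 + len(prefix)) % 50


def toy_draft(prefix: List[int]) -> int:
    return toy_target(prefix) if len(prefix) % 4 else 0
```

Because every accepted token is either confirmed or supplied by the target, the output of this greedy variant matches plain greedy decoding with the target alone. The draft model only affects how many tokens are accepted per step, which is why draft quality determines the speedup.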

Released Checkpoints

The checkpoints below are the ones used for Table 1 in the paper. Each checkpoint was trained on open-perfectblend data generated by its corresponding target model in non-thinking mode, and is the direct output of the corresponding training configuration under config/.

| Algorithm | Qwen/Qwen3-4B | Qwen/Qwen3-8B | Qwen/Qwen3-14B | google/gemma-4-12B-it |
| --- | --- | --- | --- | --- |
| Eagle3 | deepseek-ai/eagle3_qwen3_4b_ttt7 | deepseek-ai/eagle3_qwen3_8b_ttt7 | deepseek-ai/eagle3_qwen3_14b_ttt7 | deepseek-ai/eagle3_gemma4_12b_ttt7 |
| DFlash | deepseek-ai/dflash_qwen3_4b_block7 | deepseek-ai/dflash_qwen3_8b_block7 | deepseek-ai/dflash_qwen3_14b_block7 | deepseek-ai/dflash_gemma4_12b_block7 |
| DSpark | deepseek-ai/dspark_qwen3_4b_block7 | deepseek-ai/dspark_qwen3_8b_block7 | deepseek-ai/dspark_qwen3_14b_block7 | deepseek-ai/dspark_gemma4_12b_block7 |
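If you want to pick a checkpoint programmatically, the table above can be encoded as a small lookup. This is only a sketch: the repository IDs are copied from the table, but the `CHECKPOINTS` dictionary and the `draft_repo_for` helper are hypothetical names and are not part of DeepSpec.

```python
from typing import Dict, Tuple

# (algorithm, target model) -> draft checkpoint ID, copied from the table above.
CHECKPOINTS: Dict[Tuple[str, str], str] = {
    ("Eagle3", "Qwen/Qwen3-4B"): "deepseek-ai/eagle3_qwen3_4b_ttt7",
    ("Eagle3", "Qwen/Qwen3-8B"): "deepseek-ai/eagle3_qwen3_8b_ttt7",
    ("Eagle3", "Qwen/Qwen3-14B"): "deepseek-ai/eagle3_qwen3_14b_ttt7",
    ("Eagle3", "google/gemma-4-12B-it"): "deepseek-ai/eagle3_gemma4_12b_ttt7",
    ("DFlash", "Qwen/Qwen3-4B"): "deepseek-ai/dflash_qwen3_4b_block7",
    ("DFlash", "Qwen/Qwen3-8B"): "deepseek-ai/dflash_qwen3_8b_block7",
    ("DFlash", "Qwen/Qwen3-14B"): "deepseek-ai/dflash_qwen3_14b_block7",
    ("DFlash", "google/gemma-4-12B-it"): "deepseek-ai/dflash_gemma4_12b_block7",
    ("DSpark", "Qwen/Qwen3-4B"): "deepseek-ai/dspark_qwen3_4b_block7",
    ("DSpark", "Qwen/Qwen3-8B"): "deepseek-ai/dspark_qwen3_8b_block7",
    ("DSpark", "Qwen/Qwen3-14B"): "deepseek-ai/dspark_qwen3_14b_block7",
    ("DSpark", "google/gemma-4-12B-it"): "deepseek-ai/dspark_gemma4_12b_block7",
}


def draft_repo_for(algorithm: str, target_model: str) -> str:
    """Return the released draft checkpoint ID for an algorithm/target pair."""
    try:
        return CHECKPOINTS[(algorithm, target_model)]
    except KeyError:
        raise ValueError(
            f"no released checkpoint for {algorithm!r} with {target_model!r}"
        ) from None
```

Assuming the checkpoints are hosted on the Hugging Face Hub under these IDs, the returned string could then be passed to something like `huggingface_hub.snapshot_download(repo_id=...)` to fetch the weights.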

Important

If you cite these results in a new paper, align your setup with the training settings in this repository; otherwise, the comparison is not meaningful. For domain-specific use, fine-tune the draft model again for better results, especially if the target model is expected to run in thinking mode.

Supported Algorithms

Currently, DeepSpec includes three draft models: DSpark, DFlash and Eagle3.

Hugging Face: https://huggingface.co/collections/deepseek-ai/deepspec

GitHub: https://github.com/deepseek-ai/DeepSpec

submitted by /u/pmttyji