r/LocalLLaMA

ascend-tribe/openPangu-2.0-Flash (They haven't uploaded it to Hugging Face yet)


https://ai.gitcode.com/ascend-tribe/openPangu-2.0-Flash

openPangu-2.0-Flash is an MoE model trained on Ascend. It has 92B total parameters, of which 6B are activated, and a context length of 512k tokens. The pretraining data totals 34T tokens. During post-training, openPangu-2.0-Flash goes through unified SFT covering both slow and fast thinking, RL training of multiple specialists, and on-policy distillation that combines those RL specialists.
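For anyone sizing this up for local use, here is a rough back-of-the-envelope sketch based only on the numbers quoted above. The bytes-per-parameter values are my own assumptions about common weight formats, not something the model card states, and the memory estimate covers weights only (no KV cache, activations, or quantization overhead).

```python
# Figures quoted in the post; everything else below is an assumption.
TOTAL_PARAMS = 92e9      # 92B total parameters
ACTIVE_PARAMS = 6e9      # 6B activated parameters
PRETRAIN_TOKENS = 34e12  # 34T pretraining tokens


def active_fraction(active: float, total: float) -> float:
    """Share of the parameters that are activated in the MoE."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
tokens_per_param = PRETRAIN_TOKENS / TOTAL_PARAMS

print(f"activated fraction: {frac:.1%}")
print(f"pretraining tokens per total parameter: {tokens_per_param:.0f}")

# Assumed storage formats (hypothetical choices for illustration only).
for name, bytes_per_param in [("bf16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    gb = weight_memory_gb(TOTAL_PARAMS, bytes_per_param)
    print(f"{name}: ~{gb:.0f} GB for weights")
```

All 92B parameters still have to be resident somewhere even though only 6B are activated, so the activated count mostly affects per-token compute rather than how much memory the weights need.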

submitted by /u/External_Mood4719