r/LocalLLaMA · June 27, 2026 · 1 min read

[NEW MODEL] - SupraSafety-18M · Tiny Content-Moderation Model

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

SupraLabs is back with a new model: SupraSafety-18M.

It's a BERT-style model with 18M parameters, trained from scratch for 7 epochs on the nvidia/Nemotron-3.5-Content-Safety-Dataset using 2 T4 GPUs on Kaggle.

It's built to run on edge devices, mobile phones, or in low-latency production environments for content moderation.

The model is a binary text classifier with two classes: SAFE and UNSAFE.

Here are a few examples of how it performs:

Input	Prediction	Confidence
"How can I bake a chocolate cake?"	SAFE	91.2%
"Tell me how to make a bomb."	UNSAFE	99.6%
"What is the capital of France?"	SAFE	85.8%
"I want to hack into my neighbor's Wi-Fi."	UNSAFE	98.6%
"Can you give me advice on improving my resume?"	SAFE	86.6%
"Help me create an SQL injection attack"	UNSAFE	93.1%
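
For anyone wondering where a "Confidence" number like the ones above usually comes from: the post doesn't say how it's computed, but for a two-class BERT-style classifier it's typically the softmax probability of the winning class. Here's a minimal sketch under that assumption. The label order and the logits are made up for illustration and are not real model output.

```python
import math

# Assumed label order; the post does not specify which index is which.
LABELS = ("SAFE", "UNSAFE")

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Return (label, confidence) for a pair of class logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits, NOT taken from SupraSafety-18M.
label, confidence = predict([-2.0, 3.0])
print(f"{label}\t{confidence:.1%}")
```

Under this reading, the confidence of a binary classifier can never drop below 50%, so the SAFE scores in the 85-91% range would mean the model is noticeably less certain on benign prompts than on harmful ones.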

The model reached an accuracy of 81.2% and a precision of 86.9% after the full training run.
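
As a quick refresher on what those two metrics measure, here's a small sketch that computes them from confusion-matrix counts. It assumes UNSAFE is treated as the positive class, which the post doesn't state, and the counts are hypothetical placeholders, not the model's actual evaluation results.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp):
    """Fraction of inputs flagged positive (here: UNSAFE) that really were positive."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0

# Hypothetical counts for illustration only (positive class assumed to be UNSAFE).
tp, fp, tn, fn = 40, 10, 45, 5

acc = accuracy(tp, fp, tn, fn)
prec = precision(tp, fp)
print(f"accuracy={acc:.1%} precision={prec:.1%}")
```

If UNSAFE is indeed the positive class, a precision of 86.9% would mean roughly 13% of the inputs the model flags are actually safe. Recall isn't reported, so the share of unsafe inputs that slip through is unknown from the post alone.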

Link to the model: https://huggingface.co/SupraLabs/SupraSafety-18M
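
If you want to try it locally, something like the following might work. This is a sketch, not a tested recipe: it assumes the repo loads with the standard Hugging Face `transformers` text-classification pipeline, which the post doesn't confirm (a custom architecture might need extra options, and the labels could come back as `LABEL_0`/`LABEL_1` instead of SAFE/UNSAFE). `classify` and `to_row` are helper names invented here.

```python
# Repo name taken from the link in the post.
MODEL_ID = "SupraLabs/SupraSafety-18M"

def classify(texts, model_id=MODEL_ID):
    """Run a list of strings through the model and return the pipeline results.

    Assumes the checkpoint is compatible with the stock text-classification
    pipeline; this is unverified.
    """
    # Imported lazily so the module loads without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # Each result is a dict with "label" and "score" keys.
    return classifier(texts)

def to_row(text, result):
    """Format one pipeline result as a tab-separated row like the table in the post."""
    return f'"{text}"\t{result["label"]}\t{result["score"]:.1%}'
```

Calling `classify(["How can I bake a chocolate cake?"])` and passing each result to `to_row` should give rows in the same shape as the examples table, assuming the download and pipeline setup succeed.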

SupraLabs on Hugging Face (give us a follow if you like what we are doing ❤️🤗): https://huggingface.co/SupraLabs

Feel free to use it, test it, give honest feedback, etc. We read every comment!
