Neuron Populations Exhibit Divergent Selectivity with Scale [R]

Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.

Hi! We just released a paper where we study “Rosetta Neurons”: universal neurons across different neural networks, and their relationship to scaling laws, specialization, and monosemanticity. Would love to kick off a discussion and get the community's thoughts.

Main Findings: We find that the number of universal Rosetta Neurons follows a sublinear power law in model scale: larger models have more of them, but they make up a shrinking fraction of all neurons. Rosetta Neurons also become more selective/monosemantic and more specialized with scale. Finally, we can use a single Rosetta Neuron to filter data for continued pretraining and nearly match oracle data filtering.
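To make the "sublinear power law" point concrete: if the number of universal neurons grows like c * N^alpha with 0 < alpha < 1, where N is the total neuron count, then the count rises with N while the fraction c * N^(alpha - 1) falls. Below is a minimal sketch of checking this with a log-log least-squares fit. All numbers are synthetic and chosen only for illustration (they are not results from the paper), and `fit_power_law` is a hypothetical helper, not part of the released code.

```python
import math

def fit_power_law(sizes, counts):
    """Least-squares fit of log(count) = log(c) + alpha * log(size).

    Returns (c, alpha). Hypothetical helper for illustration only.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(k) for k in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    c = math.exp(mean_y - alpha * mean_x)
    return c, alpha

# Synthetic, illustrative values (ASSUMPTION: not taken from the paper).
total_neurons = [1e4, 1e5, 1e6, 1e7]
TRUE_C, TRUE_ALPHA = 2.0, 0.6
universal_counts = [TRUE_C * n ** TRUE_ALPHA for n in total_neurons]

c, alpha = fit_power_law(total_neurons, universal_counts)

# Fraction of all neurons that are universal at each scale.
fractions = [u / n for u, n in zip(universal_counts, total_neurons)]

for n, u, f in zip(total_neurons, universal_counts, fractions):
    print(f"N={n:.0e}  universal={u:.1f}  fraction={f:.5f}")
print(f"fitted alpha={alpha:.3f} (sublinear if 0 < alpha < 1)")
```

With a fitted exponent below 1, the absolute count increases at every scale while the fraction decreases, which is the pattern described above.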

Paper: https://arxiv.org/abs/2606.03990

Summary thread: https://x.com/_AmilDravid/status/2062959617941074069?s=20

Code: https://github.com/avdravid/rosetta-neuron-scaling

Project page: https://avdravid.github.io/rosetta-neuron-scaling/

https://preview.redd.it/sus4wqc9g38h1.png?width=1806&format=png&auto=webp&s=4aac2b2209779cb05e1c73cdaadac860318f0162

submitted by /u/avd4292
