Hacker News — AI on Front Page · · 9 min read

The Smallest Brain You Can Build: A Perceptron in Python

Mirrored from Hacker News — AI on Front Page for archival readability. Support the source by reading on the original site.

230 pts · 39 comments on Hacker News

A perceptron is the smallest brain you can build. One number goes in. One yes-or-no answer comes out. That is the whole thing.

It sounds too simple to matter. But this tiny idea is the seed of every neural network running today. In this post we build a perceptron from scratch in Python, and we watch it learn, live, in your browser. No heavy math. No big libraries. Just a weight, a bias, and a loop.

I am not a native English speaker, and I am still learning this field myself. So I will explain it the way I needed someone to explain it to me. Slowly, and from the ground up.

What is a perceptron?#

In 1958, a researcher named Frank Rosenblatt built a machine he called the perceptron.

It was inspired by a single brain cell, a neuron. A neuron takes in signals, and if those signals are strong enough, it fires. Rosenblatt copied that idea in math:

output = 1   if (w · x + b) > 0
         0   otherwise

Here x is the input, w is the weight, and b is the bias. Do not worry about those words yet. We will meet each of them by building something real.

Think like a human first#

Before a machine decides anything, let us watch a human decide. Meet John Doe. He has a job offer, and he must answer one question: should he take it?

John does not flip a coin. He weighs things. Some factors matter to him more than others.

Factor (input)ValueHow much John cares (weight)
Extra payhigha lot
Stays in the same cityno, he must movea lot

John multiplies each factor by how much he cares about it, then adds everything up. If the total is high enough, he says yes. If not, he says no.

That is a perceptron. The factors are the inputs. How much he cares is the weight. And “high enough” is a threshold he carries in his head. Hold on to that threshold. Later we will give it a name: the bias.

How John Doe decides: each input is multiplied by a weight, the results are summed with a bias, and the total becomes one yes-or-no answer.

The simplest possible decision: is this number positive?#

Let us shrink the problem until almost nothing is left. One input. One question.

Is this number positive?

That is it. Feed the machine a number. It should answer True for positive and False for negative.

The machine makes its guess like this:

prediction = (weight * value + bias) > 0

Multiply the input by the weight, add the bias, and check if the result is above zero. If yes, it predicts True. If no, it predicts False. This little formula is the classifier, also called the decision function.

At the start, the weight and bias are just random numbers. So the machine guesses badly. Now comes the only clever part: it learns from its mistakes.

if prediction != result:
    error = result - prediction      # True - False = 1, False - True = -1
    weight += learning_rate * error * value
    bias   += learning_rate * error

When the guess is wrong, we nudge the weight and bias in the right direction. The error tells us which way to nudge. The learning rate decides how big each nudge is. We do this for every example, then repeat the whole pass again. One full pass over the data is called an epoch. Repeating epochs is training.

Here is that exact machine. Press Train and watch it learn. Each green dot is a positive number (True), each red dot is negative (False), and the blue dashed line is where it has decided to split them.

epoch 0 weight 0 bias 0 boundary  accuracy 0%

It snaps into place almost immediately. Look at the readout: the boundary lands right around 0, and the bias settles near 0 too.

That is not an accident. For this problem, we never needed the bias at all. Which is strange, because bias is supposed to be important. To see why it matters, we need a harder question.

What is a decision boundary?#

That blue line has a name: the decision boundary. It is the exact point where the machine flips from saying False to saying True.

We can compute it. The boundary sits where w · x + b = 0. Solve for x:

decision_boundary = -bias / weight

For “is this number positive,” the boundary should be at 0. And it is. Now watch what happens when the right answer is not at zero.

Why do we need bias? The student-pass example#

New problem. Same machine. We give it exam scores from 0 to 100, and we ask:

Did the student pass?

The rule is simple: a score of 50 or higher passes. So the decision boundary should sit at 50, not at 0.

Let us try to solve it the way we solved the last one, using the weight only. In the demo below, turn off “Use bias” and press Train.

epoch 0 weight 0 bias 0 boundary  accuracy 0%
Use bias

Watch the accuracy. It climbs to around 50 percent and then gets stuck. It cannot do better, no matter how long you train it.

Here is why. With no bias, the formula is just weight * score. Every exam score is a positive number. So if the weight is positive, the machine calls every student a pass. If the weight is negative, it fails everyone. The boundary is glued to 0, and it cannot move. A line forced through zero simply cannot separate “below 50” from “50 and up.”

Now turn “Use bias” back on and press Train again. The accuracy climbs all the way to 100 percent, and the boundary slides over and parks near 50.

That is the whole job of the bias. The weight sets the steepness. The bias moves the boundary left or right so it can sit wherever the answer actually is. Remember decision_boundary = -bias / weight. With a bias, the boundary can be anything. Without one, it is stuck at zero forever.

The one sentence to remember: when your inputs sit far from zero, you need a bias to move the line to them.

How does a perceptron learn? Epochs and learning rate#

You saw two dials while training: epochs and learning rate.

An epoch is one full pass over all the data. The machine rarely gets everything right in a single pass, so we go again, and again. More epochs means more chances to fix mistakes. That is why accuracy climbs as you keep training.

The learning rate is the size of each correction. In the code it is the learning_rate multiplier:

weight += learning_rate * error * value

Small steps are careful but slow. Big steps are fast but can overshoot and bounce around. Choosing it well is part of the craft. Here we used 0.1, which is gentle enough to stay stable.

Why do we normalize data?#

There is a quiet problem hiding in the pass example. Look at that update line again:

weight += learning_rate * error * value

The correction is multiplied by value. For exam scores, value can be as large as 100. So a single wrong guess can throw the weight by a huge amount. The machine still learns, but it lurches around instead of settling smoothly.

The fix is normalization: shrink the inputs to a small, tidy range before training. The simplest version is to divide every score by the largest possible score, so 0 to 100 becomes 0 to 1.

In the demo below, first press Train with normalization off and watch the accuracy line jump around on its way up. Then turn “Normalize data” on, reset, and train again. Same machine, same answer, but it gets there in a fraction of the epochs, and the climb is smooth.

accuracy per epoch
epoch 0 weight 0 bias 0 boundary  accuracy 0%
Normalize data

One honest note. With a single input like this, normalization mostly buys you speed and calm. It becomes essential when your inputs live on very different scales. Think back to John Doe: his pay was measured in thousands of dollars, but “same city” was just a 0 or a 1. Without normalization, the dollars would drown out everything else, and the machine would basically ignore the city. Putting both on the same scale lets each factor get a fair say. (Dividing by the max is the easy version; a common general method is to subtract the mean and divide by the spread, called standardization.)

The full perceptron in Python#

Here is the complete program for “is this number positive,” with nothing hidden. It is short enough to read in one sitting.

import random

learning_rate = 0.1
EPOCHS = 100

weight = random.uniform(-1, 1)
bias   = random.uniform(-1, 1)

# positive numbers are True, negative numbers are False
data  = [(i * 0.1, True)  for i in range(1, 501)]
data += [(i * 0.1, False) for i in range(-500, 0)]
random.shuffle(data)

for epoch in range(EPOCHS):
    for value, result in data:
        prediction = (weight * value + bias) > 0
        if prediction != result:
            error = result - prediction          # +1 or -1
            weight += learning_rate * error * value
            bias   += learning_rate * error

decision_boundary = -bias / weight
print(f"weight = {weight:.3f}")
print(f"bias   = {bias:.3f}")
print(f"decision boundary = {decision_boundary:.3f}")

To turn this into the student-pass machine, you change two things: make the data exam scores with result = score >= 50, and, if you want to feel the pain of a missing bias, freeze the bias at 0. Everything else stays the same.

Acknowledgments#

The core inspiration for this post came from the fantastic video ChatGPT is made from 100 million of these [The Perceptron] by Welch Labs. If you are a visual learner and want to see the rich history and hardware behind these concepts, I highly recommend watching it!

What’s next?#

You just built a working perceptron. It takes an input, weighs it, adds a bias, and decides. It learns from its own mistakes, one epoch at a time.

A single neuron can only draw one straight line. The magic starts when you stack them: the output of one neuron becomes the input of the next. Layer enough of them together and you get a neural network that can learn shapes far more tangled than a single line. But every one of those neurons is doing exactly what you just watched. A weight, a bias, a decision.

If you want the non-technical story of how I ended up writing code in Canada at all, I wrote about it here: The Outsider Who Shipped Anyway.

Thanks for building this with me. Now go change the numbers and break it. That is the fastest way to learn.

Discussion (0)

Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.

Sign in →

No comments yet. Sign in and be the first to say something.

More from Hacker News — AI on Front Page