Humanizing Artificial Intelligence

Parameters

In an earlier lesson, I wrote that parameters define how much a model can “learn”.

If a language model were a brain, parameters would be its neurons - or, more precisely, the strengths of the connections between them - tiny adjustable values that help it make decisions.

The more parameters a model has, the more patterns it can learn - and the more detailed its responses can be.

Think of it like one US dollar.

Your model is $1.

That dollar is made up of 100 pennies.

Each penny is an individual unit that can take part in a transaction.

In this example, your model can make 100 transactions.

In an LLM, those transactions are lessons learned. Each parameter is a numeric value inside the model that helps decide how it responds to input.

When the model is training, it adjusts these values based on what it’s learning.
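To make that concrete, here's a minimal sketch in plain Python (no ML libraries). It's a toy "model" with a single parameter, trained to learn the rule y = 2x. The names (`w`, `learning_rate`, and so on) are illustrative, not from any real LLM - but the core idea is the same: training nudges a value until the model's answers match the examples.

```python
def predict(w, x):
    # The model's "decision": multiply the input by its one parameter.
    return w * x

def train(examples, steps=100, learning_rate=0.01):
    w = 0.0  # the parameter starts out knowing nothing
    for _ in range(steps):
        for x, target in examples:
            error = predict(w, x) - target
            # Nudge the parameter to shrink the error.
            # This nudging is what "learning" means here.
            w -= learning_rate * error * x
    return w

examples = [(1, 2), (2, 4), (3, 6)]  # data following y = 2x
w = train(examples)
print(round(w, 2))  # w ends up very close to 2.0
```

A real LLM does the same kind of nudging, just across billions of parameters at once instead of one.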

Modern LLMs have far more than 100 parameters, though. They typically have billions (yes, billions) of parameters. ChatGPT-style models have hundreds of billions. Some cutting-edge models are reaching trillions. That's a lot of learning power.
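If billions sounds implausible, a quick back-of-the-envelope calculation shows how fast parameters pile up. Each layer of a model connects many units to many others, and every connection is a parameter. The width used below is the one reported for GPT-3; the rest is simple arithmetic, not a real model's full accounting.

```python
# One weight per connection between units in a single dense layer.
hidden_size = 12_288  # width reported for GPT-3
params_per_dense_layer = hidden_size * hidden_size

print(f"{params_per_dense_layer:,}")  # about 151 million in ONE layer
```

Stack close to a hundred layers like that (plus other components), and hundreds of billions stops sounding like an exaggeration.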

So how do we actually teach them?

There are two primary ways: Pretraining & Fine-Tuning.