Humanizing Artificial Intelligence

Statistical Models

Some smart people thought it would be better to give computers lots of text and let them learn from it.

The idea was: if a computer read enough content, it could figure out language patterns on its own.

This is how the idea of language models began.

In theory, the more a model reads, the more it understands language.

The more it understands, the better it can generate language.

For example, take this well-known pangram:

“The quick brown fox jumps over the lazy dog.”

If a model sees this sentence often, it will probably guess the word “dog” if you ask it to complete this sentence:

“The quick brown fox jumps over the lazy ___.”

Its ability to predict accurately comes down to statistics and probability.

Because the model has seen the full sentence so many times, it can calculate the probability of each possible next word and pick the most likely one.
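
To make that concrete, here is a minimal sketch of the counting idea in Python. The tiny corpus, the word-pair counting, and the predict helper are all hypothetical, made up for illustration; they simply show one way a statistical model could turn counts into next-word probabilities.

```python
from collections import Counter, defaultdict

# A toy corpus (hypothetical): the model only "knows" what it has seen.
corpus = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy cat",
]

# Count which word follows each previous word.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        next_word_counts[prev][nxt] += 1

def predict(prev_word):
    """Return each candidate next word with its estimated probability."""
    counts = next_word_counts[prev_word]
    total = sum(counts.values())
    return {word: round(count / total, 2) for word, count in counts.items()}

print(predict("lazy"))  # {'dog': 0.67, 'cat': 0.33} -> "dog" is the most probable guess
```

Because “dog” followed “lazy” more often than “cat” did in the toy corpus, it gets the higher probability, which is exactly the fill-in-the-blank guess described above.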

This shift from writing rules to using statistics and patterns was a BIG step forward for AI.

It led to a predictive approach, which is what most modern AI uses today: predicting or generating what comes next.

But to get to where we are today, computers needed more power.

As that power became available, these models evolved into something called Neural Networks.