Rules
When most of us learned to talk, we started with simple words—like “Mama” or “Dada.”
Then we began to put those simple words together in short sentences like:
“The elegant interplay between chaos and order in dynamical systems can be succinctly captured by the equation ( x_{n+1} = r x_n (1 - x_n) ), which illustrates how small changes in initial conditions can lead to vastly different outcomes, a phenomenon known as sensitive dependence on initial conditions.”
Just kidding.
We probably started with something more like: “Want milk.”
As we grew, we learned more words. We learned more ways to combine them - using rules.
Over time, we became fluent in our language.
This is how early NLP researchers planned to teach computers language.
They taught them rules about nouns, verbs, tenses, and grammar.
This early version of NLP is called Symbolic NLP. It was based on the idea that we could teach computers language by programming these rules into them.
The rules were like if-this-then-that statements.
For example: If the word “walk” comes after the word “I” then “walk” is a verb.
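That kind of rule can be sketched in code. Below is a toy illustration (not a real tagger - the rules and tag names are invented for this example) of what an if-this-then-that Symbolic NLP rule looks like:

```python
def tag_words(sentence):
    """Tag each word using simple hand-written if-this-then-that rules."""
    words = sentence.lower().split()
    tags = []
    for i, word in enumerate(words):
        # Rule: if the word "walk" comes after the word "I", "walk" is a verb.
        if word == "walk" and i > 0 and words[i - 1] == "i":
            tags.append((word, "VERB"))
        # Rule: a few known pronouns.
        elif word in {"i", "you", "we"}:
            tags.append((word, "PRONOUN"))
        # No rule matched - a symbolic system needs a rule for everything.
        else:
            tags.append((word, "UNKNOWN"))
    return tags

print(tag_words("I walk home"))
# [('i', 'PRONOUN'), ('walk', 'VERB'), ('home', 'UNKNOWN')]
```

Notice the last line: every word not covered by a rule falls through to `UNKNOWN`. That gap is exactly why hand-writing rules for all of language becomes overwhelming.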
The Problem: Human language is very complex.
It has tons of exceptions and different ways to say the same thing.
Think about how hard it would be to write a rule to recognize sarcasm. That sounds super easy, right?
As you can imagine, teaching all the possible rules for all the different ways we speak takes a lot of work.
Maybe too much.
So some researchers decided to try a different method - one that focused on what computers do best: logic, statistics, and pattern recognition.
Let’s explore their approach.