Published April 28th, 2020 by Assaf Trafikant

Artificial Intelligence Vs. Machine Learning Vs. Deep Learning

It’s time to make sense of the terminology. AI, ML, and DL, from easy to hard, from top to bottom. Almost.
Over the past year, I’ve been working with several artificial intelligence-based systems (or machine learning, to be exact), along with some other buzzwords you’ve probably heard. These terms come up all the time in most of the professional groups I’m in, and not a day goes by without reading one or two articles on the subject. Sometimes it’s all bullshit; other times, it’s the real deal.

I’m going to try and make sense of the terminology, going from easy to hard, or more precisely – from (relatively) simple to complex.
I have to admit that even though I’ve been immersed in these topics for years, I struggled to turn it all into something clear and coherent. So here I am, setting myself up and hoping for success. Let me know in the comments.

The Basics

The entire field falls under the broad title of Artificial Intelligence (AI), but if you want to understand the terminology and when to use it, we’ll focus on the software industry and how it treats these terms.

In short: system designers will always describe their systems using the most precise and narrowest term – that is, the sub-group within the world of AI that best describes the idea behind their system.

Let’s look at an example: I developed a backgammon app and injected dozens of rules that would help it beat you. In the best (and most presumptuous) case, I’d call it an AI-based app. More honestly, I’d admit there’s nothing special about it and that it’s nothing more than a mathematical app (a.k.a. hard computing).
But if the app can identify your playing style, anticipate your moves, and change its strategy as it goes, then you are most likely to fall victim to an hour-long lecture filled with AI terminology and acronyms. This is also known as “soft computing”: developing code in a world of uncertainty about the input, the output, and even the algorithm itself.

So let’s take it from the top, and focus on the three main layers (even though there are tons more!).

Artificial Intelligence (AI)

Artificial intelligence represents the notion that machines, programs, or technological mechanisms imitate human thinking. An app that plays chess against you is a great example.
On the other end, we could also look at the robotic vacuum that can avoid falling down the stairs or calculate the size of the room and optimal cleaning route, and can even return to its dock before its battery runs out.

An app that adds doggy ears to your head and a virtual tongue that sticks out when you open your mouth, Netflix and Spotify recommendations, Alexa, Siri, autonomous cars, and face recognition tools – these are all great examples of AI from the past few years. And yes, a WordPress plugin that can read an article you wrote and immediately generate ten tags – that’s also AI (even if it doesn’t seem like it).

As I said, developers of such systems tend to describe them using the most precise and narrowest term. They won’t necessarily use the name AI, which is considered too generic and general. So when do they use it?
Usually when the system is repetitive and routine: it doesn’t learn new things and pretty much does the same thing over and over again. An example is the gate to your parking garage that photographs your license plate, cross-checks it with a database, and decides whether or not to open the gate. There’s input, and there’s output. And it’s always the same output for the same input.
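To make that “same output for the same input” idea concrete, here’s a minimal sketch of such a gate in Python. Everything here is invented for illustration (and a real gate would also need the plate-recognition step); the point is that the rule is fixed and never changes on its own.

```python
# A minimal sketch of rule-based, non-learning AI: a parking gate that checks a
# plate against a fixed list of authorized plates. All names and numbers are
# made up for illustration.

AUTHORIZED_PLATES = {"12-345-67", "98-765-43"}

def should_open_gate(plate_number):
    # No learning involved: the decision rule never changes on its own.
    return plate_number in AUTHORIZED_PLATES

print(should_open_gate("12-345-67"))  # True – always, for this input
print(should_open_gate("11-111-11"))  # False – always, for this input
```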

Now that we know that this is all AI, let’s move down a level.

Machine Learning (ML)

Let’s pretend I launched a Netflix-like streaming system, and as part of improving the service and increasing viewing time, I decided to develop a recommendation engine. To do that, I analyze my users’ viewing history – which films they watch, what genre, and when.

When I just started and had maybe ten users and 100 movies, it was no big deal. I manually analyzed the datasheets, wrote two or three rules for a recommendation engine, and everything was fine—regular input and output.

I made a great AI-based system for myself. If you watched “Jurassic Park” and “Indiana Jones,” my system would recommend “Poltergeist.” If you watched “A Walk in the Clouds” and “The Lake House,” I would probably recommend “The Bridges of Madison County.”
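Just for illustration, those “two or three rules” might look something like this in Python (the titles are the examples above; nothing here is a real system):

```python
# A hand-written recommendation "engine": a few hard-coded rules, no learning.

RULES = [
    ({"Jurassic Park", "Indiana Jones"}, "Poltergeist"),
    ({"A Walk in the Clouds", "The Lake House"}, "The Bridges of Madison County"),
]

def recommend(watched):
    for required_titles, suggestion in RULES:
        if required_titles <= watched:   # the viewer watched everything the rule needs
            return suggestion
    return None                          # no rule matched – and this is exactly what doesn't scale

print(recommend({"Jurassic Park", "Indiana Jones", "Titanic"}))  # Poltergeist
```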

But at some point, it became an impossible task. My movie database grew, I added TV shows, and I wanted to consider films that the viewer watched all the way through vs. those they quit halfway, and by the 500th user, I was ready to hang myself. I couldn’t handle the amount of data, and the number of rules I had to create was endless and too complex to manage – and that’s before making any recommendations.

Another example:
I thought it would be easy to compete with Waze, so I developed my very own navigation app and called it Trafikant. I defined a rule where if a user leaves point A and wants to get to point B, the app would take them via a particular route that I chose for them.

However, I didn’t take into account the hour, the traffic, other vehicles, roadblocks, accidents, and police. The number of factors I should have taken into account is phenomenal, and at this point, not only is it certain that there’s a better route I didn’t think of, but I also have no way of predicting the user’s time of arrival. I lost that user pretty quickly.

ML systems are there to help me with these two challenges. The first – a mass of data and parameters; the second – prediction. ML mechanisms analyze insane amounts of data and try to reach some sort of conclusion. If it’s a navigation app, the system will examine all of these factors and try to calculate the estimated duration of the trip.

Let’s say that it predicted a 20-minute journey. If the journey ended up taking 30 minutes, the algorithm would attempt to find out which factor had changed during the trip and why its prediction failed (“oops, I forgot that this 4-lane road becomes single-lane and that creates a delay, and that delay is a regular occurrence and not a random traffic jam. I’ll take that into account next time”).

Given enough of these events, the algorithm ‘realizes’ that it’s wrong, simply corrects itself, and adds other factors to the calculation – number of lanes, maybe outside temperature… And so the algorithm gets another input and produces another output, over and over: it checks the result, identifies where it went wrong, changes itself, adjusts the weight it gives to different factors, and gets better with every trip.
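Here is a toy version of that self-correcting loop, just to make the idea tangible: predict a trip duration from a few weighted factors, compare with what actually happened, and nudge the weights so the next prediction is a little better. The factor names, numbers, and learning rate are all invented for illustration.

```python
# Predict trip duration from weighted factors, then adjust the weights based on
# the prediction error (a bare-bones form of online learning).

weights = {"distance_km": 1.0, "traffic_level": 2.0, "lanes": -0.5}

def predict_minutes(trip):
    return sum(weights[factor] * value for factor, value in trip.items())

def learn_from_trip(trip, actual_minutes, learning_rate=0.005):
    error = predict_minutes(trip) - actual_minutes       # positive = over-estimated
    for factor, value in trip.items():
        # Shift each weight slightly in the direction that shrinks the error.
        weights[factor] -= learning_rate * error * value

trip = {"distance_km": 12, "traffic_level": 3, "lanes": 2}
print(round(predict_minutes(trip), 1))   # first guess: 17.0 minutes
learn_from_trip(trip, actual_minutes=30)
print(round(predict_minutes(trip), 1))   # closer to the actual 30 after one correction
```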

The input remains seemingly the same, but the output changes. It’s enough to leave at 9 am for the algorithm to give a different result than it would at 6 am. And if you tell the app that you’re taking a motorcycle and not a car, everything changes accordingly.

ML systems, of course, also serve Facebook’s and Google’s ad networks. Each tries, in its own way, to predict which of the users who clicked an ad are likely to make a purchase. Both platforms try to identify intent.

How do they do it? At first, they simply guessed according to factors fed by humans, like Google deciding that those who watch unboxing videos on YouTube have a significant intent to make a purchase. Over time, if the user indeed makes a purchase, the algorithm gets a point. If not, the algorithm gets ‘marks off.’
The more points – good and bad – it receives, the more the algorithm can improve itself, give greater weight to good factors, and ‘weaken’ less significant ones. But wait – who told the system to look at unboxing videos in the first place?

The truth is, no one did. Someone, a human, told the system to look at all of the videos a user watches on YouTube and determine – based on the video, the audio, the description, keywords, etc. – what kind of video it is.
Maybe after millions of videos, the algorithm starts to recognize a relationship between certain types of videos and actions such as website purchases. Google feeds the algorithm with all of the user’s activities: the emails they read, where they go, the photos they upload to the cloud, their text messages – any accessible information.

Everything is poured into a massive database from which Google tries to compile profiles and match each personal profile with its likelihood of making a purchase, or of taking any other action it wishes to identify.

This miraculous machine keeps learning new things and tries to make connections, predict outcomes, check and see if it was successful – and, if not, correct itself over and over again until it hits the target. This machine has no sentiments; all information is accepted, and if it finds a proven connection between your shoe size and videos of a cat sniffing its balls, it will use it – even if it doesn’t make any sense.

At this point, I want to remind you of the phrase: correlation is not causation. The machine can find connections between certain behaviors, but that doesn’t mean that behavior A is causing outcome B. Only after millions of iterations and tests can the machine treat a correlation as something close to causation.
Note the target or goal. The algorithm doesn’t wake up one day and decide what your app should do. That’s determined by the system’s creator – calculating ride times, designing an optimal path between points A and B, etc. You and I, the system’s creators, define the goal.
Google’s goal is for users to make a purchase, and everything eventually comes down to that, because Google is – first and foremost – an advertising system.

By the way, the definition of an optimal route is also human-made. The machine doesn’t know what optimal is. It’s just a word. That’s why we have to help it by telling it that optimal means minimum time, few stops, few traffic lights, etc. In short, the goal is determined by the person, not the machine. The machine only pursues the goal that was set for it.
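As a sketch of what “telling the machine what optimal means” can look like, here’s a hypothetical cost function that weights those human-chosen factors. The weights and routes are arbitrary numbers, purely for illustration.

```python
# "Optimal" only exists once a human defines it – here, as a weighted mix of
# travel time, stops, and traffic lights.

WEIGHTS = {"minutes": 1.0, "stops": 2.0, "traffic_lights": 0.5}

def route_cost(route):
    return sum(WEIGHTS[factor] * route[factor] for factor in WEIGHTS)

routes = [
    {"name": "highway", "minutes": 25, "stops": 1, "traffic_lights": 2},
    {"name": "city",    "minutes": 18, "stops": 6, "traffic_lights": 9},
]

best = min(routes, key=route_cost)
print(best["name"])  # "highway" – the machine only optimizes the definition we gave it
```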

Another, less successful example: Netflix’s recommendation engine. For each movie, someone uploaded into the system its duration, genre, sub-genre, cast, year of production, and dozens of other parameters. The ML mechanism is supposed to take all of this information, combine it with your viewing habits, and recommend movies accordingly. But that doesn’t always work, and Netflix’s recommendation system is considered a flop in industry terms.
If I watched “Back to the Future” 1 and 2, recommending number 3 is a no-brainer. I don’t need a complicated algorithm for that. This is an excellent example of an ML system that could have been fantastic but simply doesn’t work well (if at all).

Let’s sum up: ML tackles two challenges – coping with masses of data, and prediction. As far as the latter goes, Netflix is floundering. I think Spotify does an excellent job of handling these two challenges, and successfully tailors new material to my taste out of a selection of over 50,000,000 tracks.

Some ML mechanisms are based on organized and labeled data, as in Netflix, with all of the features of the movies as well as those of the viewers (country, age, time of viewing, etc.). Still, there are also ML mechanisms with a bit more leeway that are based on incomplete information (we know everything about the movies, but nothing about you).

Others don’t necessarily try to build a recommendation engine, but rather try to find patterns in the data, deviations, and so on.
So this thing we call ML is made up of different algorithms: some are adept at text analysis, others focus on audio processing, others analyze browsing history or use the web page you are browsing to determine the site’s category and subject.

Dozens, if not hundreds, of these mechanisms run alongside each other and make up a full picture. That’s how most big ad networks work. The smarter Google’s or Facebook’s machine is, the better it will be at presenting the right ad to the right user at the right time and on the right device.
Ready to go down a level?

Deep Learning (DL)

Whereas ML learns from organized, labeled data, DL mechanisms (a sub-group of ML) learn from raw, unlabeled inputs, which the system tries to categorize and classify on its own. How does it work? Much like the human brain.

The human brain and its neural network are made up of billions of neurons. Each of these has receptors that receive input, an information-processing mechanism, and the ability to transmit the processed information onward (output) to other cells or neurons. One cell can collect and process data from several different neurons at the same time and, according to the input, decide whether or not to take action, then transmit the ‘conclusion’ onward.

The great thing about neurons is that it takes millions of them to reach one conclusion. Each layer of neurons processes the information in a certain way, transmits it onward to the next layer for more precise processing, and so forth, like a production line where each employee adds a single unit of information and passes it on.

To keep it as simple as possible, let’s pretend we’re holding an apple. One layer of neurons will decipher the color and pass that information on (“red”). A subsequent layer will interpret the texture (“firm”). Another will crack the taste, or the weight, and so on. Each layer processes the information according to its ingrained mechanisms. Slowly, all of this information accumulates until it is finally presented as a conclusion: I’m holding a red apple.

If the apple were blue, the color information from the first layer, together with the shape information from the second layer, would reach a third layer that would say, “hold on – there’s no such thing as a blue apple,” and pass it on to yet another layer to check whether this is something that isn’t an apple at all.
By the way, I once held a white apple, and my brain told me it wasn’t an apple. After biting into it, I realized it was a new variety of white apple that had just reached the stores. The new information was “saved” for future processing, e.g., for the next white apple that comes along.
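If you want to see the “layers” idea in (very simplified) code, here’s a tiny, untrained forward pass through two layers using NumPy. The weights are random, so it doesn’t actually recognize apples; it only shows information flowing from one layer to the next.

```python
import numpy as np

# A bare-bones forward pass through a two-layer network. The weights are random
# and untrained, so the output is meaningless – the point is only the structure:
# each layer transforms the previous layer's output and passes it on.

rng = np.random.default_rng(0)

x = rng.random(4)                            # stand-in input (a few raw feature values)

W1, b1 = rng.random((8, 4)), rng.random(8)   # first layer: low-level features
W2, b2 = rng.random((3, 8)), rng.random(3)   # second layer: combine into 3 scores

hidden = np.maximum(0, W1 @ x + b1)          # ReLU activation
scores = W2 @ hidden + b2                    # e.g. scores for {apple, pear, something else}
probs = np.exp(scores) / np.exp(scores).sum()  # softmax: turn scores into probabilities

print(probs.round(3))                        # three numbers that sum to 1
```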

Now, want to hear something cool?

Remember how websites used to show mangled text that we had to decipher to prove we’re human and get into the site (fewer of them use it now)? Google moved on to photos (“mark all the people” or “mark all the cars”), and eventually all we were left with was a small checkbox to tick.
What happened was that at first, Google used us to “teach” the machine where the car was in the image (early on, a person on the other end would check to see if we were right; the machine was still young and couldn’t tell for itself). Out of billions of users who were tested, time and time again, we taught the machine where there was or wasn’t a car in the picture.

At the same time, an exciting process took place: Google’s system ran DL mechanisms on those pictures to analyze them. The next time we see a picture of a street and the captcha asks us to mark vehicles, the machine has already done the job on its own and is simply waiting for us to mark them ourselves.

If we and the machine marked the same images, the machine gets the positive affirmation it needs, and we can enter the website. If there’s no match, two things happen: behind the scenes, the machine will try to figure out why there’s a gap, and might fix the algorithm (or decide that we were wrong); on our end, we’ll be given a new captcha challenge.

As time progressed, its level of precision increased, and it no longer needed training. Now Google has a mechanism that can identify vehicles in images – one that can also recognize faces, signs, numbers, streetlights, bridges, etc. So one network identified circles, another singled out bright spots, a different one identified four wheels, and so on (I’m being simplistic here; the networks’ level of processing is much more abstract).

Today, when you’re driving an autonomous car, its computer activates all of these mechanisms in real time. It can identify vehicles and dozens of other objects on the road in a fraction of a second.

So, that’s it? I hope I got the main points across, but if not – let me know. After all, it’s Human Learning…
