Artificial Intelligence & Mathematics: Algebra

Summary

Topic: Mathematics used in AI: Algebra

Summary: Algebra is an essential mathematical enhancer for AI. It allows us to represent the real world in mathematical structures that can be easily manipulated by the computer.

Keywords: AI; Algebra;

Author: Sylvain LIÈGE

Note: This Paper was NOT written by AI, although AI might be used for research purposes.

1 Introduction

It is no secret that AI makes extensive use of mathematics. What people talking about AI often don’t understand is that the “Intelligence” part of AI in fact comes from mathematics. What I mean here is that unless we consider mathematics to be “intelligence”, we have a semantic problem with AI. This does not take anything away from the marvels that AI can do for us, but I doubt that the “I” of AI was chosen wisely at the time.

Anyway, this paper introduces some (and I really mean “only some”) of the mathematics at work every time you use an application that relies on AI.

2 Linear Algebra

By definition, linear algebra is the branch of mathematics concerning linear equations such as: a₁x₁+⋯+a_nx_n=b. The coefficients of this equation can be represented by means of a Vector [a₁, a₂, …, a_n]. A Vector can be seen as a list of numbers. Each number has a specific meaning in the real world.
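To make this concrete, here is a minimal sketch in Python, using only the standard library. The numbers in a and x are made up for illustration; the point is only that the left-hand side of the equation is the sum of the element-wise products of two vectors (the so-called dot product).

```python
# Hypothetical coefficients a and unknowns x for a1*x1 + a2*x2 + a3*x3 = b.
a = [2.0, -1.0, 3.0]
x = [1.0, 4.0, 2.0]


def dot(u, v):
    """Return the sum of the element-wise products of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))


# Left-hand side of the linear equation: 2*1 + (-1)*4 + 3*2
b = dot(a, x)
print(b)  # 4.0
```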

In AI, we use vectors to represent all sorts of things, like:

Images: Each pixel in an image can be represented as a number, and the whole image as a big vector. In fact, it is easier to think of it as a Matrix, i.e. a 2-dimensional Vector (see image). But of course, if you put all the columns of this matrix one after the other, you get… a huge vector.

In this simple case, we could have each pixel represented by 0 when black and 1 when white. Should we have colours, we can of course use numbers beyond 0 and 1.

Text: Words can be represented as vectors, where each number describes a characteristic of the word, so that words with similar meanings end up with similar vectors.

Let’s use an example of a word vector to make it more concrete.

Imagine we want to represent the word “cat” as a vector. We might assign numbers to different characteristics of the word, such as:

Furriness: 5
Smallness: 4
Whiskers: 3
Meowiness: 5

We could represent the word “cat” as the vector: [5, 4, 3, 5].

Similarly, we could represent the word “dog” as a vector: [4, 3, 3, 2].

By representing words as vectors, we can calculate how similar or different two words are. For example, if two words have similar vectors, they likely have similar meanings. This is a fundamental concept in natural language processing, where computers can understand and process human language.
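One common way to measure how similar two vectors are is the cosine similarity: the closer it is to 1, the more the two vectors point in the same direction. The sketch below applies it to the “cat” and “dog” vectors above. The “car” vector is an assumption I added purely for contrast (it is not part of the example above), and real word vectors are learned from data and have hundreds of dimensions, not four hand-picked ones.

```python
import math

# Order of the numbers: [furriness, smallness, whiskers, meowiness]
cat = [5, 4, 3, 5]
dog = [4, 3, 3, 2]
car = [0, 1, 0, 0]  # hypothetical vector, added only for contrast


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    return dot / (norm_u * norm_v)


print(round(cosine_similarity(cat, dog), 3))  # 0.955
print(round(cosine_similarity(cat, car), 3))  # 0.462
```

With these numbers, “cat” comes out much closer to “dog” than to “car”, which matches our intuition.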

Data: Any kind of data can be turned into a vector, making it easier for computers to understand and process. This is an essential aspect of AI: we can convert almost anything into numbers that can be stacked into Vectors or Matrices and manipulated easily by the computer.

By using vectors and matrices (which are like grids of numbers), AI can recognize patterns, make predictions, and solve complex problems. For example, a computer can learn to recognize a cat in a picture by comparing the vector representation of the image to a database of cat images.
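The image case described earlier can be sketched the same way. The tiny 3×3 black-and-white image below is invented for illustration (0 for black, 1 for white), and flatten_by_columns is a hypothetical helper name. It simply puts the columns of the matrix one after the other to obtain a single vector.

```python
# Hypothetical 3x3 black-and-white image: 0 = black pixel, 1 = white pixel.
image = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]


def flatten_by_columns(matrix):
    """Put the columns of a rectangular matrix one after the other."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


vector = flatten_by_columns(image)
print(vector)  # [0, 0, 0, 1, 1, 1, 1, 0, 0]
```

A real photograph works the same way, only with millions of pixels and, for colours, several numbers per pixel.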

3 The Data Scientists

In an AI project, someone has to convert the real-world data into these Vectors and Matrices. It is rarely an obvious job. Someone has to decide how we convert an image taken from a car into a meaningful Vector, or how to convert the company documentation into meaningful text. They also have to decide how much information is really needed.

Let’s take an example:

Imagine you’re building a model to predict house prices.

You have a dataset containing information about houses, including:

House features: Square footage, number of bedrooms, number of bathrooms, lot size, etc.
Location features: Neighborhood, city, proximity to schools, distance to the city center, etc.
Owner information: Age, occupation, income level, etc.

While all this information might seem relevant, not all of it will be equally important for predicting house prices. For example, the owner’s occupation might not have a significant impact on the house’s value.

Choosing the right data features involves selecting the most relevant information that will contribute to the model’s accuracy. In this case, features like square footage, number of bedrooms, bathrooms, lot size, and location would be more relevant than the owner’s occupation.

In the end, the data could look something like this:

[Square footage; bedrooms; bathrooms; lot size; Zip code; City (code); ….]

Real data for model entry would look like:

[3250; 4; 2; 50000; 54365; 348; ….]

As you can see, we convert every piece of information into numbers that can then be manipulated easily by the computer.
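The conversion step can be sketched as follows. This is only an illustration under stated assumptions: the field names, the CITY_CODES lookup table, the city name “Exampleville” and the to_feature_vector helper are all hypothetical. Only the six listed features and the numeric values come from the example above. Note how the owner’s occupation is present in the raw record but never reaches the vector.

```python
# Features kept for the model, in the order used in the example above.
FEATURES = ["square_footage", "bedrooms", "bathrooms",
            "lot_size", "zip_code", "city_code"]

# Hypothetical lookup table turning a city name into a numeric code.
CITY_CODES = {"Exampleville": 348}


def to_feature_vector(record):
    """Keep only the selected features and return them as a list of numbers."""
    encoded = dict(record)
    encoded["city_code"] = CITY_CODES[record["city"]]
    return [float(encoded[name]) for name in FEATURES]


# Raw record as it might come out of a database (made-up field names).
house = {
    "square_footage": 3250,
    "bedrooms": 4,
    "bathrooms": 2,
    "lot_size": 50000,
    "zip_code": 54365,
    "city": "Exampleville",
    "owner_occupation": "teacher",  # judged not relevant, so it is dropped
}

vector = to_feature_vector(house)
print(vector)  # [3250.0, 4.0, 2.0, 50000.0, 54365.0, 348.0]
```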

All in all, the data must enter the system in a format the computer can understand, and the data must be relevant to the problem. This is the job of the Data Scientist. It is not the only job of this role, but it is one of them.

4 Where is the Intelligence?

Well, so far, we have not encountered anything that can be described as “Intelligence”. What is clever, though, is the idea of representing the world, as complex as it can be, in a simple way that is easily used for computation. But that is still human.