How do machines recognize human emotions, and why should I care?

ROIPRESS / EXPERTS / TECHNOLOGY / AI / NEUROMARKETING - In order to answer these questions, we should start with another one: how do you endow a machine with emotions? Everyone knows that one of the big differences between humans and machines or robots is that only people can feel emotions.

A machine is just a combination of zeros and ones, programmed by humans to perform different tasks, such as traffic light control, mathematical calculations, video games, etc. Even so, equipped with so-called artificial intelligence, machines have beaten the world chess champion (1997) and, more recently, a top professional at the game of Go (2016).

This exponential growth of artificial intelligence (AI) capabilities has given rise to new, very popular keywords such as #deepLearning, #machineLearning, #AI, or #NeuralNetworks. The technological advances in neural networks over the last five years have been incredible, both mathematically and in their use on computers, making it possible to solve problems that were previously out of reach or that required excessive processing time, from weeks to months.

This is what has allowed machines, through AI, to identify human emotions. These machines cannot (at the moment) feel emotions as we humans do, but they are able to understand our emotional state through images and videos. But how is that possible?

Just as some humans do, the machine uses the information provided by micro expressions on people's faces.

The potential of universal micro expressions in human facial features.

Micro expressions are very fast movements (on the order of ¼ second) that humans cannot control. They involve one or more facial muscles that are directly connected to the brain. These micro expressions have been studied by the scientific community, including Paul Ekman, who in 1978 classified the different facial muscle movements and determined that all humans express seven basic emotions in an identical way, regardless of origin or culture. These expressions are: happiness, surprise, sadness, fear, anger, disgust and contempt.

Humans have been trained from a very young age to see and understand those expressions on the faces of those around us and to act accordingly. For example, if you travel abroad, even if you do not speak the local language, you instinctively understand that it is better not to approach someone like the person in the image on the left to ask a question, and you may even keep your distance. However, you would approach the person on the right and ask him a question without feeling any risk.

OK, the machines that use micro expressions are inspired by Ekman's theories, with some help from AI. But this does not explain how they work.

Let's answer that question. The machines use convolutional neural networks, built on mathematical functions such as Leaky ReLU, to deduce human emotions from an image or video.
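To take some of the mystery out of that name, here is a minimal sketch of the Leaky ReLU function in Python. The default slope of 0.01 is an assumption (a common choice in practice), not a value taken from any particular emotion-recognition system.

```python
def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Return x unchanged when it is positive, otherwise a small fraction of x.

    negative_slope=0.01 is an assumed, commonly used default.
    """
    return x if x > 0 else negative_slope * x


# Positive inputs pass through untouched.
print(leaky_relu(2.0))
# Negative inputs are shrunk rather than set to zero, hence "leaky".
print(leaky_relu(-2.0))
```

The function simply lets strong positive signals through and keeps only a faint trace of negative ones.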

If I don't have a PhD in math or computer engineering, and I'm not an expert in neuroscience either, can I understand how these machines work?

Don't worry. In less than five minutes, after reading the following paragraphs, you will be able to answer this question for yourself. To help us in this process, let us use an example.

What is the main emotion that a person can express?

Let's take the following 4 images from a video as an example, where a person is expressing a deep emotion. They cover the first 30 tenths of a second from the birth of that emotion. As a person, you should be able to guess what the main emotion is, right?

If you look at the two images on the left, do you think he is a happy man, about to burst out laughing?

But if you now look at the two images on the right, do you think that he is a sad man who is about to cry? Or is he angry and about to start screaming? You're a human with biological neurons, and it's not that obvious, is it?

Let's see how the machine faces this challenge using neural networks, that is, AI, and let's take this opportunity to explain how it works in a simple way. You can test whether the explanation is really simple, because even a child over 10 years old should understand it.

What is a neural network?

Above these lines we have a neuron. It receives information from the arrow on the left. Mathematical calculations process that information, and the result is transmitted via the arrow on the right.
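As a rough sketch of what those "mathematical calculations" can look like, an artificial neuron typically multiplies each input by a weight, adds everything up with a bias, and passes the total through an activation function. The names `neuron`, `weights` and `bias`, and all the numbers, are illustrative assumptions.

```python
def neuron(inputs, weights, bias, negative_slope=0.01):
    """One artificial neuron: weighted sum of the inputs, then Leaky ReLU."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Leaky ReLU activation: keep positive totals, shrink negative ones.
    return total if total > 0 else negative_slope * total


# Information arrives on the left (inputs), the result leaves on the right.
print(neuron([1.0, 2.0], weights=[0.5, 0.25], bias=0.0))
```

Training a network means adjusting the weights and the bias until the outputs become useful.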

Under these lines we have a network of 6 neurons. They all receive information, process it and transmit it. They are all interconnected, as human biological neurons are, with the difference that in this case they are zeros and ones in a computer.

Below this paragraph you can see an example of a neural network composed of several "groups" of neurons organized in columns called "hidden layers". Each column processes information and sends it to the column on its right, until a final result is reached. As an example, let us take image #2 to show, step by step, how the machine works through its neural network.
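The idea of columns passing information from left to right can be sketched in a few lines of Python. This is a toy network with made-up weights (three neurons, then two, then one), not the network used by any real emotion-recognition machine.

```python
def leaky_relu(x, negative_slope=0.01):
    return x if x > 0 else negative_slope * x


def layer_forward(inputs, weights, biases):
    """Run one column: each row of `weights` belongs to one neuron."""
    return [
        leaky_relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Send the values through every column, left to right."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values


# Hypothetical weights: columns of 3, 2 and 1 neurons.
layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 1.0, 1.0], [1.0, 0.0, 1.0]], [0.0, 0.0]),
    ([[0.5, 0.25]], [0.0]),
]
print(network_forward([1.0, 2.0], layers))
```

Each column only ever sees the output of the column on its left, which is exactly the chain described above.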

The first step is to provide the machine with the image (the first arrow on the left). The first neuron converts this image into numerical tables that represent the amount of red, green, and blue in each pixel of the image. Keep in mind that the screens of our devices are made up of many points (pixels), each in turn made up of a certain amount of red, green and blue, whose mixture gives the final color of the point. In the second step, these numerical tables are sent to the second column of neurons, whose task is to identify the faces in the image.
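To make the "numerical table" concrete, here is a small sketch with a hypothetical 2x2 image. Each pixel is a (red, green, blue) triple from 0 to 255; the scaling to the 0 to 1 range is an assumed, commonly used preparation step, not something the description above specifies.

```python
# A hypothetical 2x2 image: one (red, green, blue) triple per pixel.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 0)],
]


def split_channels(image):
    """Return three tables (red, green, blue) with one number per pixel."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue


def normalize(table):
    """Scale 0-255 values to the 0.0-1.0 range (an assumed convention)."""
    return [[value / 255 for value in row] for row in table]


red, green, blue = split_channels(image)
print(red)
print(normalize(red))
```

From here on, the machine no longer "sees" a picture, only these tables of numbers.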

The third step is to transmit the face to the third column of neurons, which is responsible for detecting strategic points on the face. Most neural networks identify between 68 and 105 points on a face.

The fourth step is to identify parts of the face such as the jaw, mouth, eyes or eyebrows.
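Once the points are known, grouping them into parts of the face is mostly bookkeeping. The sketch below assumes the widely used 68-point layout (as in the iBUG 300-W annotations); the machine described here may number its points differently.

```python
# Index ranges of the assumed 68-point layout (0-based).
FACE_PARTS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}


def group_landmarks(points):
    """Split a list of 68 (x, y) points into named parts of the face."""
    if len(points) != 68:
        raise ValueError("expected exactly 68 landmark points")
    return {
        part: [points[i] for i in indices]
        for part, indices in FACE_PARTS.items()
    }


# Placeholder coordinates, only to show the grouping.
points = [(i, i) for i in range(68)]
parts = group_landmarks(points)
print({part: len(found) for part, found in parts.items()})
```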

The fifth step is to identify the amount of each emotion corresponding to each part of the face. A smile-shaped mouth gives us information related to happiness. Raised eyebrows can be associated with surprise, etc.

The machine will eventually deduce that image #2 contains a human face which it associates with 65% happiness and 31% anger. According to the machine, in this image, the man is about to laugh, and therefore expresses happiness. In order to give a relevant result, the machine will analyze a succession of images from a video, as a person would do during a dialogue with another human, which is not a static action either.
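How does a network end up with percentages? The text above does not say, but a common choice in classification networks is the softmax function, which turns raw scores into shares that add up to 100%. The scores below are made up for illustration and are not meant to reproduce the figures for image #2.

```python
import math


def softmax(scores):
    """Turn raw scores into shares between 0 and 1 that sum to 1."""
    top = max(scores.values())  # subtracting the max keeps exp() stable
    exps = {name: math.exp(value - top) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


# Hypothetical raw scores from the last column of neurons.
raw_scores = {"happiness": 2.0, "anger": 1.2, "sadness": -1.0}
shares = softmax(raw_scores)
for name, share in shares.items():
    print(f"{name}: {round(100 * share)}%")
```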

Let's go back to our example, and see what the machine would deduce based on these 4 images.

What is the main emotion of this man according to the machine?

As we saw previously, the machine will follow four steps, one for each column of artificial neurons. First, it will isolate the face in each of the four images.

Then it will identify 68 points on each face and deduce where each part of the face is located (here we group steps 2 and 3).

Finally, the machine identifies the amount of each emotion in each of the images:

The machine will combine the results from each image and deduce the main emotion that the person felt. According to the machine, this man is 66% sad. As a human, surely you came to the same conclusion (but perhaps without assigning a percentage).
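One simple way to combine the per-image results is to average each emotion over all the images and keep the strongest one. This is only a sketch of that idea; the exact method behind the 66% figure is not described, and the values below are invented for illustration.

```python
def main_emotion(frames):
    """Average each emotion over all frames and return the strongest one."""
    if not frames:
        raise ValueError("at least one frame is required")
    totals = {}
    for frame in frames:
        for emotion, share in frame.items():
            totals[emotion] = totals.get(emotion, 0.0) + share
    averages = {emotion: total / len(frames) for emotion, total in totals.items()}
    strongest = max(averages, key=averages.get)
    return strongest, averages


# Hypothetical shares for four successive images of the same face.
frames = [
    {"happiness": 0.75, "sadness": 0.25},
    {"happiness": 0.5, "sadness": 0.5},
    {"happiness": 0.25, "sadness": 0.75},
    {"happiness": 0.0, "sadness": 1.0},
]
strongest, averages = main_emotion(frames)
print(strongest, averages)
```

Looking at several images in a row is what lets the machine see past a misleading first frame.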

If you have come this far, you will be able to briefly explain how a machine can deduce emotions. Throw in some fancy words like convolutional neural networks or Leaky ReLU and you'll be considered a scientist for a few minutes.

But surely you are wondering: what is it for? What can a machine achieve with that information?

Why should machines care about human emotions?

While we wait for machines to eventually feel emotions and express them as humans do, wouldn't it be great if they could at least understand ours?

For example, to avoid sending us a “Don't forget to read your child a story tonight” reminder when we are away on business or isolated abroad due to COVID-19, far away from loved ones.

If machines could understand the sadness one feels on a night like that, perhaps they could offer us a video call instead, or suggest looking through a family photo album.

In the business world, this can have many beneficial implications for humans. In the health sector, for example, the machine can support patients with autism by helping them understand other people's emotions.

In the world of advertising, the machine could analyze the emotions of users and tell brands whether the consumer is neutral or favorable to a new product or service, or what impact certain messages or slogans have.

For the movie industry, it could improve the quality of movies by checking that humorous moments are actually funny or that a dramatic moment really evokes that feeling in viewers, and with what intensity.

Yes, the applications are endless.

However, as an expert in mathematics, computer science, neuroscience and non-verbal communication, I believe that using AI on micro facial expressions alone can lead to significant errors and biases.

The limit of micro expressions

For us humans, the face is one of the main cues for understanding how other people feel, but it is by no means the only one. Every second, and without realizing it, we notice whether the person we are talking to is looking directly at us or averting their gaze, and we analyze their gestures, the speed of their speech and their tone of voice.

The distance at which the other person stands also indicates the kind of relationship we have with them (a close friend will stand near us, while a stranger or a work contact will stand further away). The machine therefore cannot be limited to data collected from facial expressions and must build intelligence capable of analyzing many other criteria.

The winning combination of humans + machines.

What use is artificial intelligence applied to emotions in a professional context?

Can the analysis of non-verbal communication alone be enough to deduce emotions and quantify their intensity? As an expert, my direct answer is NO.

It depends in particular on the context in which the person is interacting. For example, a genuine laugh among friends is ordinary and acceptable. On the other hand, if that same laugh occurs during a professional meeting where your client is expressing disagreement with you, it will be very unwelcome and the consequences could be disastrous, even though the machine would perceive both situations as an expression of happiness.

For professional use, machines will need human knowledge to get their analysis right, particularly regarding the context of the situations where the information is received, and many other criteria that are very complex for a machine to automate.

If we use this alliance of humans and machines intelligently, human capacities will improve, making it easier to facilitate exchanges between men and women, companies and clients, or governments and citizens.

I am convinced that the human/machine alliance focused on the analysis of human emotions will improve interactions between people and improve the relevance of relationships between companies and customers.

Reprinted in: La Voz de La Empresa
