How to tell if artificial intelligence is working the way we want it to

Credit: Pixabay/CC0 Public Domain

About a decade ago, deep-learning models started achieving superhuman results on all sorts of tasks, from beating world-champion board game players to outperforming doctors at diagnosing breast cancer.

These powerful deep-learning models are usually based on artificial neural networks, which were first proposed in the 1940s and have become a popular type of machine learning. A computer learns to process data using layers of interconnected nodes, or neurons, that mimic the human brain.

As the field of machine learning has grown, artificial neural networks have grown along with it.

Deep-learning models are now often composed of millions or billions of interconnected nodes in many layers that are trained to perform detection or classification tasks using vast amounts of data. But because the models are so enormously complex, even the researchers who design them don't fully understand how they work. This makes it hard to know whether they are working correctly.

For instance, maybe a model designed to help physicians diagnose patients correctly predicted that a skin lesion was cancerous, but it did so by focusing on an unrelated mark that happens to occur frequently when there is cancerous tissue in a photo, rather than on the cancerous tissue itself. This is known as a spurious correlation. The model gets the prediction right, but it does so for the wrong reason. In a real clinical setting where the mark does not appear on cancer-positive images, this could result in missed diagnoses.

With so much uncertainty swirling around these so-called "black-box" models, how can one unravel what's going on inside the box?

This puzzle has led to a new and rapidly growing area of study in which researchers develop and test explanation methods (also called interpretability methods) that seek to shed some light on how black-box machine-learning models make predictions.

What are explanation methods?

At their most basic level, explanation methods are either global or local. A local explanation method focuses on explaining how the model made one specific prediction, while global explanations seek to describe the overall behavior of an entire model. This is often done by developing a separate, simpler (and hopefully understandable) model that mimics the larger, black-box model.
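As a rough illustration of the surrogate idea, the sketch below fits a straight line to the outputs of a stand-in black box and measures how closely the line tracks it. The `black_box` function, the grid of probe inputs, and the fidelity score (R² measured against the black box's own outputs) are all assumptions chosen for illustration, not anything taken from the researchers' work.

```python
import math

def black_box(x):
    # Hypothetical stand-in for an opaque model: nonlinear in its single input.
    return math.tanh(2.0 * x) + 0.1 * x * x

def fit_linear_surrogate(xs, ys):
    # Ordinary least squares for y ~ slope * x + intercept (closed form, one feature).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fidelity(xs, ys, slope, intercept):
    # R^2 of the surrogate against the black box's outputs, not against ground truth.
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [i / 10.0 for i in range(-30, 31)]  # probe inputs from -3.0 to 3.0
ys = [black_box(x) for x in xs]          # the black box's own predictions
slope, intercept = fit_linear_surrogate(xs, ys)
score = fidelity(xs, ys, slope, intercept)
print(f"surrogate: y = {slope:.3f} * x + {intercept:.3f}, fidelity R^2 = {score:.3f}")
```

A fidelity noticeably below 1 means the simple model misses part of the black box's behavior, which hints at why global explanations are hard to build for strongly nonlinear models.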

But because deep-learning models work in fundamentally complex and nonlinear ways, developing an effective global explanation model is particularly challenging. This has led researchers to focus more on local explanation methods instead, explains Yilun Zhou, a graduate student in the Interactive Robotics Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL) who studies models, algorithms, and evaluations in interpretable machine learning.

The most popular types of local explanation methods fall into three broad categories.

The first and most widely used type of explanation method is known as feature attribution. Feature attribution methods show which features were most important when the model made a specific decision.

Features are the input variables that are fed to a machine-learning model and used in its prediction. When the data are tabular, features are drawn from the columns in a dataset (they are transformed using a variety of techniques so the model can process the raw data). For image-processing tasks, on the other hand, every pixel in an image is a feature. If a model predicts that an X-ray image shows cancer, for instance, the feature attribution method would highlight the pixels in that specific X-ray that were most important for the model's prediction.

Essentially, feature attribution methods show what the model pays the most attention to when it makes a prediction.
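One simple way to compute such an attribution is occlusion: blank out one pixel at a time and record how much the model's score drops. The sketch below assumes a tiny 3x3 "image" and a hypothetical `toy_model` that was deliberately built to lean on a corner pixel (a stand-in for a watermark); neither comes from the article, and occlusion is only one of many attribution techniques.

```python
def toy_model(image):
    # Hypothetical classifier score. By construction it leans on the top-left
    # pixel (a stand-in for a watermark) more than on the center "lesion" pixel.
    return 0.8 * image[0][0] + 0.2 * image[1][1]

def occlusion_attribution(model, image, baseline=0.0):
    # Attribute to each pixel the drop in score observed when that pixel
    # alone is replaced by the baseline value.
    original = model(image)
    attributions = []
    for r, row in enumerate(image):
        attr_row = []
        for c, _ in enumerate(row):
            occluded = [list(rw) for rw in image]  # copy so the input is untouched
            occluded[r][c] = baseline
            attr_row.append(original - model(occluded))
        attributions.append(attr_row)
    return attributions

image = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
attributions = occlusion_attribution(toy_model, image)
for row in attributions:
    print([round(a, 2) for a in row])
```

In this toy setup the largest attribution lands on the corner pixel rather than the center one, which is the kind of pattern that would suggest the model relies on something other than the tissue itself.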

"Using this feature attribution explanation, you can check to see whether a spurious correlation is a concern. For instance, it will show if the pixels in a watermark are highlighted or if the pixels in an actual tumor are highlighted," says Zhou.

A second type of explanation method is known as a counterfactual explanation. Given an input and a model's prediction, these methods show how to change that input so it falls into another class. For instance, if a machine-learning model predicts that a borrower would be denied a loan, the counterfactual explanation shows what factors need to change so her loan application is accepted. Perhaps her credit score or income, both features used in the model's prediction, need to be higher for her to be approved.
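The loan example can be sketched as a small search problem. The code below assumes a hypothetical `loan_model` decision rule and a brute-force search over small increases to credit score and income; the weights, threshold, step sizes, and the "fewest steps" distance measure are all illustrative assumptions, not a real lending model or a method named in the article.

```python
from itertools import product

def loan_model(credit_score, income_k):
    # Hypothetical decision rule standing in for a trained model:
    # approve when a weighted sum clears a fixed threshold.
    return credit_score + 2 * income_k >= 800

def find_counterfactual(credit_score, income_k, max_steps=20):
    # Brute-force search over small increases to both features, returning the
    # approved candidate that needs the fewest total steps (a crude distance).
    # One step is +10 credit-score points or +2 thousand in income.
    best = None
    for score_steps, income_steps in product(range(max_steps + 1), repeat=2):
        candidate = (credit_score + 10 * score_steps, income_k + 2 * income_steps)
        if loan_model(*candidate):
            cost = score_steps + income_steps
            if best is None or cost < best[0]:
                best = (cost, candidate)
    return None if best is None else best[1]

applicant = (640, 40)  # credit score 640, income 40 thousand
print("approved as-is:", loan_model(*applicant))
print("counterfactual:", find_counterfactual(*applicant))
```

For this made-up applicant, the search suggests raising the credit score to 720 while leaving income unchanged. A practical method would also need to restrict the search to changes a person can realistically make.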

"The good thing about this explanation method is it tells you exactly how you need to change the input to flip the decision, which could have practical usage. For someone who is applying for a mortgage and didn't get it, this explanation would tell them what they need to do to achieve their desired outcome," he says.

The third category of explanation methods is known as sample importance explanations. Unlike the others, this method requires access to the data that were used to train the model.

A sample importance explanation will show which training sample a model relied on most when it made a specific prediction; ideally, this is the most similar sample to the input data. This type of explanation is particularly useful if one observes a seemingly irrational prediction. There may have been a data entry error that affected a particular sample that was used to train the model. With this knowledge, one could fix that sample and retrain the model to improve its accuracy.
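A minimal sketch of this idea, assuming the model is a 1-nearest-neighbor classifier, for which the training sample a prediction rests on is simply the closest one. The training set, its deliberately mislabeled row, and the `predict_with_evidence` helper are hypothetical; deep models need more involved techniques that this sketch does not cover.

```python
import math

# Hypothetical training set: (features, label). The third row carries a
# deliberate data-entry error: its features sit among the "benign" points.
training_data = [
    ((1.0, 1.2), "benign"),
    ((0.9, 0.8), "benign"),
    ((1.1, 1.0), "malignant"),  # mislabeled on purpose for the illustration
    ((4.0, 4.2), "malignant"),
    ((3.8, 4.1), "malignant"),
]

def predict_with_evidence(query):
    # 1-nearest-neighbor prediction that also returns the index of the
    # training sample the prediction rests on.
    distances = [math.dist(query, features) for features, _ in training_data]
    nearest = min(range(len(distances)), key=distances.__getitem__)
    return training_data[nearest][1], nearest

label, index = predict_with_evidence((1.1, 1.05))
print(f"prediction: {label}, driven by training sample #{index}: {training_data[index]}")
```

Here the surprising "malignant" prediction traces back to the mislabeled third sample, which is the cue to correct that row and retrain.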

How are explanation methods used?

One motivation for developing these explanations is to perform quality assurance and debug the model. With more understanding of how features impact a model's decision, for instance, one could identify that a model is working incorrectly and intervene to fix the problem, or toss the model out and start over.

Another, more recent, area of research is exploring the use of machine-learning models to discover scientific patterns that humans haven't uncovered before. For instance, a cancer-diagnosing model that outperforms clinicians could be faulty, or it could actually be picking up on some hidden patterns in an X-ray image that represent an early pathological pathway for cancer and that were either unknown to human doctors or thought to be irrelevant, Zhou says.

It's still very early days for that area of research, however.

Words of warning

While explanation methods can sometimes be useful for machine-learning practitioners when they are trying to catch bugs in their models or understand the inner workings of a system, end users should proceed with caution when trying to use them in practice, says Marzyeh Ghassemi, an assistant professor and head of the Healthy ML Group in CSAIL.

As machine learning has been adopted in more disciplines, from health care to education, explanation methods are being used to help decision makers better understand a model's predictions so they know when to trust the model and use its guidance in practice. But Ghassemi warns against using these methods in that way.

"We have found that explanations make people, both experts and nonexperts, overconfident in the ability or the advice of a specific recommendation system. I think it is very important for humans not to turn off that internal circuitry asking, 'let me question the advice that I am given,'" she says.

Other recent work also shows that explanations make people overconfident, she adds, citing recent studies by Microsoft researchers.

Far from a silver bullet, explanation methods have their share of problems. For one, Ghassemi's recent research has shown that explanation methods can perpetuate biases and lead to worse outcomes for people from disadvantaged groups.

Another pitfall of explanation methods is that it is often impossible to tell if the explanation method is correct in the first place. One would need to compare the explanations to the actual model, but since the user doesn't know how the model works, this is circular logic, Zhou says.

He and other researchers are working on improving explanation methods so they are more faithful to the actual model's predictions, but Zhou cautions that even the best explanation should be taken with a grain of salt.

"In addition, people generally perceive these models to be human-like decision makers, and we are prone to overgeneralization. We need to calm people down and hold them back to really make sure that the generalized model understanding they build from these local explanations are balanced," he adds.

Zhou's most recent research seeks to do just that.

What's next for machine-learning explanation methods?

Rather than focusing on providing explanations, Ghassemi argues that the research community needs to do more work to study how information is presented to decision makers so they understand it, and that more regulation needs to be put in place to ensure machine-learning models are used responsibly in practice. Better explanation methods alone aren't the answer.

"I have been excited to see that there is a lot more recognition, even in industry, that we can't just take this information and make a pretty dashboard and assume people will perform better with that. You need to have measurable improvements in action, and I'm hoping that leads to real guidelines about improving the way we display information in these deeply technical fields, like medicine," she says.

And in addition to new work focused on improving explanations, Zhou expects to see more research related to explanation methods for specific use cases, such as model debugging, scientific discovery, fairness auditing, and safety assurance. By identifying fine-grained characteristics of explanation methods and the requirements of different use cases, researchers could establish a theory that would match explanations with specific scenarios, which could help overcome some of the pitfalls that come from using them in real-world scenarios.
