OptiLIME: Optimized LIME Explanations for Diagnostic Computer Algorithms

Giorgio Visani, Enrico Bagli and Federico Chesani

Abstract of OptiLIME: Optimized LIME Explanations for Diagnostic Computer Algorithms

Local Interpretable Model-Agnostic Explanations (LIME) is a popular method for interpreting any kind of Machine Learning (ML) model. It explains one ML prediction at a time by learning a simple linear model around the prediction. The model is trained on randomly generated data points, sampled from the training dataset distribution and weighted according to their distance from the reference point, i.e. the one being explained by LIME. Feature selection is applied to keep only the most important variables, and their coefficients are regarded as the explanation. LIME is widespread across different domains, although its instability (a single prediction may obtain different explanations) is one of its major shortcomings. This instability is due to the randomness in the sampling step and results in a lack of reliability in the retrieved explanations, making LIME adoption problematic. In Medicine especially, clinical professionals’ trust is mandatory to determine the acceptance of an explainable algorithm, considering the importance of the decisions at stake and the related legal issues. In this paper, we highlight a trade-off between an explanation’s stability and its adherence, namely how much it resembles the ML model. Exploiting this finding, we propose a framework to maximise stability while retaining a predefined level of adherence. OptiLIME provides freedom to choose the best adherence-stability trade-off level and, more importantly, it clearly highlights the mathematical properties of the retrieved explanation. As a result, the practitioner is provided with tools to decide whether the explanation is reliable, according to the problem at hand. We extensively test OptiLIME on a toy dataset, to present the geometrical findings visually, and on a medical dataset. In the latter, we show how the method produces explanations that are meaningful from both a medical and a mathematical standpoint.

Introduction to OptiLIME: Optimized LIME Explanations for Diagnostic Computer Algorithms

Nowadays Machine Learning (ML) is pervasive and widespread across multiple domains. Medicine is no exception; on the contrary, it is considered one of the greatest challenges of Artificial Intelligence. The idea of exploiting computers to provide assistance to medical personnel is not new, and historical overviews of the topic trace it back to the early ‘60s. More recently, computer algorithms have proven useful for representing patients and medical concepts, predicting outcomes and discovering new phenotypes. An accurate overview of ML successes in health-related environments is provided by Topol.

Unfortunately, ML methods are hardly perfect and, especially in the medical field where human lives are at stake, Explainable Artificial Intelligence (XAI) is urgently needed. Medical education, research and accountability (“who is accountable for wrong decisions?”) are some of the main topics XAI tries to address. To achieve explainability, quite a few techniques have been proposed in recent literature. These approaches can be grouped according to different criteria, such as i) model agnostic or model specific, ii) local, global or example based, iii) intrinsic or post-hoc, iv) perturbation or saliency based. Among them, model agnostic approaches are quite popular in practice, since they are designed to be effective on any type of ML model.

LIME is a well-known instance-based, model agnostic algorithm. The method generates data points, sampled from the training dataset distribution and weighted according to their distance from the instance being explained. Feature selection is applied to keep only the most important variables, and a linear model is trained on the weighted dataset. The model coefficients are regarded as the explanation. LIME has already been employed several times in medicine, for example on Intensive Care data and cancer data.
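The steps just described (sampling, distance weighting, feature selection, weighted linear fit) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the reference LIME implementation: the function name `lime_style_explanation`, the Gaussian sampling per feature, the exponential kernel with a `kernel_width` parameter, and the "keep the largest standardized coefficients" selection rule are all assumptions made for the example.

```python
import numpy as np

def lime_style_explanation(predict_fn, X_train, x_ref, n_samples=1000,
                           kernel_width=1.0, n_features=2, seed=0):
    """Fit a weighted linear surrogate around x_ref (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)

    # 1) Sample points from a Gaussian fitted to each training feature
    #    (an assumed stand-in for "the training dataset distribution").
    Z = rng.normal(mu, sigma, size=(n_samples, X_train.shape[1]))
    y = predict_fn(Z)

    # 2) Weight each sample by its proximity to the reference point
    #    (assumed exponential kernel on the standardized distance).
    dist = np.linalg.norm((Z - x_ref) / sigma, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)

    def wls(cols):
        # Weighted least squares with an intercept in column 0.
        A = np.column_stack([np.ones(n_samples), Z[:, cols]])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        return coef

    # 3) Feature selection: keep the largest standardized coefficients.
    full = wls(np.arange(X_train.shape[1]))
    keep = np.sort(np.argsort(-np.abs(full[1:] * sigma))[:n_features])

    # 4) Refit on the selected features; the coefficients are the explanation.
    coef = wls(keep)
    return dict(zip(keep.tolist(), coef[1:].tolist()))

# Hypothetical black-box model: feature 2 is irrelevant by construction.
def black_box(X):
    return 3.0 * X[:, 0] - 2.0 * X[:, 1]

X_train = np.random.default_rng(42).normal(size=(200, 3))
explanation = lime_style_explanation(black_box, X_train, X_train[0])
print(explanation)
```

Because the hypothetical `black_box` is itself linear, the surrogate recovers its coefficients regardless of the seed; with a non-linear model, changing `seed` changes the sampled points and therefore the explanation, which is the source of the instability discussed next.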

The technique is known to suffer from instability, mainly caused by the randomness introduced in the sampling step. Stability is a desirable property for an interpretable model, whereas the lack of it reduces trust in the explanations retrieved, especially in the medical field.

In our contribution, we review the geometrical idea on which LIME is based. Relying on statistical theory and simulations, we highlight a trade-off between the explanation’s stability and adherence, namely how much LIME’s simple model resembles the ML model. Exploiting this finding, we propose OptiLIME: a framework to maximise stability while retaining a predefined level of adherence. OptiLIME both i) provides freedom to choose the best adherence-stability trade-off level and ii) clearly highlights the mathematical properties of the explanation retrieved. As a result, the practitioner is provided with tools to decide whether each explanation is reliable, according to the problem at hand.
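To make the two quantities concrete, the sketch below measures them for a local linear surrogate and then picks a setting that satisfies an adherence floor. It is a simplified stand-in, not the OptiLIME procedure: it assumes that the trade-off is controlled by a kernel width, measures adherence as the weighted R² of the surrogate, measures instability as the spread of the coefficients over repeated runs, and replaces any optimiser with a plain grid search. All function names and the toy `black_box` model are hypothetical.

```python
import numpy as np

def local_fit(predict_fn, x_ref, kernel_width, n_samples=500, seed=0):
    """Weighted linear fit around x_ref; returns (coefficients, weighted R^2)."""
    rng = np.random.default_rng(seed)
    # Assumption: features are standard normal, so samples are drawn from N(0, 1).
    Z = rng.normal(0.0, 1.0, size=(n_samples, x_ref.size))
    y = predict_fn(Z)
    w = np.exp(-np.sum((Z - x_ref) ** 2, axis=1) / kernel_width ** 2)
    A = np.column_stack([np.ones(n_samples), Z])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    resid = y - A @ coef
    y_bar = np.average(y, weights=w)
    r2 = 1.0 - np.sum(w * resid ** 2) / np.sum(w * (y - y_bar) ** 2)
    return coef[1:], r2

def stability_adherence(predict_fn, x_ref, kernel_width, n_repeats=20):
    """Mean adherence and coefficient spread over repeated explanations."""
    runs = [local_fit(predict_fn, x_ref, kernel_width, seed=s)
            for s in range(n_repeats)]
    coefs = np.array([c for c, _ in runs])
    adherence = float(np.mean([r for _, r in runs]))
    instability = float(coefs.std(axis=0).mean())  # lower means more stable
    return adherence, instability

def pick_kernel_width(predict_fn, x_ref, widths, min_adherence=0.9):
    """Grid search: lowest instability among widths meeting the adherence floor."""
    scored = [(kw, *stability_adherence(predict_fn, x_ref, kw)) for kw in widths]
    feasible = [s for s in scored if s[1] >= min_adherence]
    if not feasible:
        return None, scored
    return min(feasible, key=lambda s: s[2])[0], scored

# Hypothetical non-linear model, explained at the origin.
def black_box(X):
    return np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2

widths = [0.3, 1.0, 3.0]
best, scored = pick_kernel_width(black_box, np.zeros(2), widths)
for kw, adherence, instability in scored:
    print(f"width={kw}: adherence={adherence:.3f}, instability={instability:.4f}")
print("selected width:", best)
```

In this toy setting a narrow kernel keeps the surrogate close to the model near the reference point, while a wide kernel forces one line through a curved surface and adherence drops. Reporting both numbers for each candidate is what lets a practitioner judge whether a given explanation is reliable enough for the problem at hand.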

We test the validity of the framework on a medical dataset, where the method produces explanations that are meaningful from both a medical and a mathematical standpoint. In addition, a toy dataset is employed to present the geometrical findings visually. The code used for the experiments is available at https://github.com/giorgiovisani/LIME_stability.
