Artificial Intelligence in medical diagnosis


AI in medical diagnosis is transforming healthcare, and explainable AI is what makes it trustworthy.

AI predictions are often treated as black boxes. With explainable AI, we can open those boxes.

Earlier, we learned about hypothyroid diagnosis using neural networks here. In this tutorial, we build on the same example to understand explainable AI (XAI): we will find the important features, i.e., which ones contributed to the model's prediction.

Since we use the same example, we also use the same pre-trained model. You can refer to this project on GitHub here.


In this tutorial, we use the DALEX library to explain the trained model.

Above all, this XAI approach is model agnostic: it works on linear as well as non-linear models, and it lets us explain the use of AI in medical diagnosis.


  • Deep Learning: A machine learning technique that enables learning through neural networks loosely modelled on the human brain.
  • Hypothyroid: Also called underactive thyroid; a condition in which a person’s body doesn’t produce enough thyroid hormones.
  • SHAPLEY values: A game-theoretic measure of the contribution of each feature to a prediction.


  • Programming knowledge in Python.
  • Basic knowledge of Jupyter Notebook, Deep Learning, Keras.

How can we explain AI in medical diagnosis with a pre-trained model?

We can explain pre-trained models using libraries like DALEX and SHAP. Internally, they use concepts such as game theory, additive modelling, and partial dependence.

A model can be explained at two levels: global interpretation and local interpretation.

In brief, global interpretations describe the model's behaviour over the whole dataset, while a local interpretation explains the prediction made for a single instance of the dataset.

In general, we should prefer global interpretations. Local interpretation works more like a postmortem. For example, suppose we get a hypothyroid prediction for person A, but healthcare experts disagree with it for the same person A. Then we need to find out the reason behind that particular prediction.

In this tutorial, we are using DALEX.

Step 1. Install the DALEX library as a Python package.

!pip install dalex

Step 2. Load data and perform preprocessing.

We preprocess the data first. I am not explaining these steps in detail here, but you can refer to them here.

import numpy as np
import pandas as pd

# Load dataset from csv using pandas
dataset = pd.read_csv('data/hypothyroid.csv')
# Renaming the first column as target
dataset = dataset.rename(columns={dataset.columns[0]: "target"})
dataset["target"] = dataset["target"].map({"negative": 0, "hypothyroid": 1})
# Replacing the categorical values with binary values
dataset = dataset.replace({'f': 0, 't': 1, 'y': 1, 'n': 0, 'M': 0, 'F': 1})
# Replacing ? with NaN values
dataset.replace(to_replace='?', inplace=True, value=np.nan)
# Count the number of null values per column
print(dataset.isnull().sum())
# Dropping the TBG column as it contains an extremely high number of null values
dataset.drop('TBG', axis=1, inplace=True)
# Selecting columns with data type 'object'
columns = dataset.columns[dataset.dtypes.eq('object')]
# Convert them to numeric values
dataset[columns] = dataset[columns].apply(pd.to_numeric, errors='coerce')
# Replacing null values with the mean
dataset['Age'].fillna(dataset['Age'].mean(), inplace=True)
dataset['T4U'].fillna(dataset['T4U'].mean(), inplace=True)
dataset['TSH'].fillna(dataset['TSH'].mean(), inplace=True)
# Replacing null values with the median
dataset['T3'].fillna(dataset['T3'].median(), inplace=True)
dataset['TT4'].fillna(dataset['TT4'].median(), inplace=True)
dataset['FTI'].fillna(dataset['FTI'].median(), inplace=True)
# The gender data looks imbalanced, with 0 less frequent than 1
# Replacing null values with 0
dataset['Gender'].fillna(0, inplace=True)

Step 3. Load the saved pre-trained model.

We load our pre-trained model using the load_model() function from Keras.

from keras.models import load_model
model = load_model('models/saved_model.h5')
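The explainer in the next step expects training features (X_train_df) and labels (Y_train_df), obtained by splitting the preprocessed dataset. A minimal sketch with pure pandas; the tiny synthetic frame below is an illustrative assumption standing in for the real dataset from Step 2:

```python
import pandas as pd

# Tiny synthetic stand-in for the preprocessed hypothyroid DataFrame
# (assumption: in the tutorial this would be `dataset` from Step 2)
dataset = pd.DataFrame({
    "target": [0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
    "TSH":    [1.2, 9.8, 0.9, 12.1, 1.5, 8.7, 2.0, 1.1, 11.3, 1.8],
    "TT4":    [110, 60, 105, 55, 120, 62, 98, 101, 58, 115],
})

# 80/20 train/test split using pandas only
train = dataset.sample(frac=0.8, random_state=42)
test = dataset.drop(train.index)

X_train_df = train.drop(columns="target")
Y_train_df = train["target"]
```

In the real project, the split is done on the full preprocessed dataset; the variable names here simply match the ones used with the explainer below.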

Step 4. Prepare the explainer for the model used for AI in medical diagnosis.

We now prepare the explainer, passing the model, the training features, the training labels, and a label for the plots.

import dalex as dx

explainer = dx.Explainer(model, X_train_df, Y_train_df, label='Hypothyroidism')

Step 5. Call model_parts() and plot the findings.

model_parts() perturbs each explanatory variable in turn to check its impact on the result. To clarify, the method uses perturbations such as permuting the variable's values or resampling from an empirical distribution.

By default, it calculates variable importance over B = 10 permutations on N = 1000 observations; we can set custom values for both B and N.
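In code, this is a single call: explainer.model_parts() returns the importances and .plot() draws them. Conceptually, permutation importance works as in this self-contained numpy sketch, where the toy model and data are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))       # three toy features
y = (X[:, 0] > 0).astype(float)      # only feature 0 carries signal

def model_predict(X):
    # Toy "model": a sigmoid over feature 0 only
    return 1 / (1 + np.exp(-4 * X[:, 0]))

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

base_loss = mse(y, model_predict(X))
drops = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    # Permuting a column destroys its relationship with the target
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    drops.append(mse(y, model_predict(X_perm)) - base_loss)

for j, d in enumerate(drops):
    print(f"feature {j}: loss increase {d:.3f}")
```

The loss increases sharply when the informative feature is permuted, while permuting the irrelevant features changes nothing; DALEX repeats this B times and averages.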

In summary, the features TT4, TSH, and FTI were considered important in the decision making.


Step 6. Check model performance with different loss functions.


We can pass different loss functions to model_parts() via its loss_function parameter and compare the resulting importances.
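For illustration, here are two common loss functions computed by hand on toy predicted probabilities; the numbers are assumptions, not the model's actual output:

```python
import numpy as np

y_true = np.array([0, 0, 1, 1, 0, 1])
y_pred = np.array([0.1, 0.3, 0.8, 0.6, 0.2, 0.9])  # toy predicted probabilities

def rmse(y_true, y_pred):
    # Root mean squared error
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def log_loss(y_true, y_pred):
    # Binary cross-entropy
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

print(f"RMSE:     {rmse(y_true, y_pred):.3f}")
print(f"log loss: {log_loss(y_true, y_pred):.3f}")
```

Any such callable taking (y_true, y_pred) can serve as a loss when recomputing variable importance.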


In summary, the model performs well under all the loss functions we tried.

Step 7. Explaining feature importance using SHAPLEY values.


This time we use SHAP calculations instead. Again, TT4, FTI, and TSH come out as the important features.
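A Shapley value is a feature's average marginal contribution over all orderings of the features. A minimal exact computation for a toy two-feature model; the value function's numbers are made-up assumptions for illustration:

```python
from itertools import permutations

# Toy value function: the model's output when only the features in the
# coalition are "known"; the numbers are made up for illustration.
value = {
    frozenset(): 0.10,                 # baseline prediction
    frozenset({"TSH"}): 0.60,
    frozenset({"TT4"}): 0.30,
    frozenset({"TSH", "TT4"}): 0.90,   # full-model prediction
}

features = ["TSH", "TT4"]

def shapley(feature):
    # Average the feature's marginal contribution over every ordering
    total = 0.0
    orders = list(permutations(features))
    for order in orders:
        before = frozenset(order[: order.index(feature)])
        total += value[before | {feature}] - value[before]
    return total / len(orders)

for f in features:
    print(f, round(shapley(f), 3))
```

Note that the two Shapley values sum to the gap between the full-model prediction and the baseline, which is the additivity property SHAP relies on. In DALEX, SHAP-style explanations for a single observation are available via predict_parts(..., type='shap').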


Step 8. Check PD (partial dependence).

We now check partial dependence scores by calling model_profile().

A partial dependence plot varies the value of one explanatory variable while keeping the other variables fixed, and averages the model's predictions. As a result, it isolates the impact of that one variable.
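The mechanics can be sketched in a few lines of numpy; the toy model and data are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))  # toy data: two features

def model_predict(X):
    # Toy "model" depending on both features
    return X[:, 0] ** 2 + 0.5 * X[:, 1]

# Partial dependence of feature 0: overwrite it with each grid value
# in every row, keep feature 1 as observed, and average the predictions.
grid = np.linspace(-2, 2, 5)
pd_values = []
for v in grid:
    X_mod = X.copy()
    X_mod[:, 0] = v
    pd_values.append(float(model_predict(X_mod).mean()))

for v, p in zip(grid, pd_values):
    print(f"x0 = {v:+.1f} -> mean prediction {p:.3f}")
```

The averaged curve recovers the quadratic dependence on feature 0, regardless of what feature 1 is doing.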

pd_rf = explainer.model_profile()


Subsequently, we plot PD for a group of variables.

pd_rf.plot(variables=['TT4', 'FTI', 'TSH', 'T3'])

Step 9. Use the model_surrogate() method.

This is an interesting concept: we approximate the black-box model with a simpler, interpretable surrogate model. Here, the library fits a DecisionTreeRegressor.

surrogate_model = explainer.model_surrogate(max_vars=6, max_depth=3)
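The idea behind the call above can be sketched directly with scikit-learn: fit a shallow, human-readable tree to the black-box model's predictions. The toy black box below is an assumption for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # toy data standing in for the thyroid features

def black_box_predict(X):
    # Stand-in for the neural network's predicted probability
    return 1 / (1 + np.exp(-(2 * X[:, 0] - X[:, 1])))

# Fit a shallow, readable tree to the black box's outputs
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box_predict(X))

# How faithfully does the surrogate mimic the black box?
print("fidelity (R^2):", round(surrogate.score(X, black_box_predict(X)), 3))
```

A low fidelity score means the tree is too simple to mimic the model; increasing max_depth trades readability for fidelity.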

Learning Tools and Strategies:

  • Firstly, we can learn additive modelling.
  • Secondly, we need to understand game theory.
  • Thirdly, and most importantly, game theory helps us understand SHapley Additive exPlanations (SHAP).
  • Moreover, we use DALEX because it produces better plots and its documentation is very detailed.
  • The SHAP package can be used as well, although some of its versions have compatibility issues.

Following the order above makes these concepts easier to understand. You can find the references in the Citations section below.

Reflective Analysis:

Black-box AI models are a big challenge. If we want reliable AI in medical diagnosis, we need to explain the models using XAI: without it, a disease diagnosis is difficult to accept, so XAI is a pressing need in AI-based diagnosis.

The same holds for financial transactions, which are critical in nature. For instance, the decision to approve a loan is a critical AI use case; if we cannot explain a loan rejection, customers become unhappy.

In short, we can apply XAI wherever the need arises.

Conclusions and Future Directions for AI in medical diagnosis using XAI:

In conclusion, we have shown the importance of XAI in AI-based diagnosis. XAI can explain neural networks, and therefore it can explain non-linear relationships.

We also use AI in financial decision making, for example loan approvals, credit ranking, and price prediction, and we can apply XAI there as well. In other words, XAI is useful almost everywhere: it helps us understand a difficult hypothyroid diagnosis just as it helps us understand financial decisions.

Lastly, we discussed only a few of the available methods in this project; there are more XAI methods to learn. There are also other tools, for instance Facets, the What-If Tool, and so on.