Model Visualization and Explainability

Model explainability remains a hurdle to the widespread adoption and understanding of machine learning. In this notebook, we train and visualize a neural network that predicts credit card defaults from credit usage, payment history, and some demographic information. The goal is to explore how VIP can be used to visualize the output of complex machine learning models, and then to explain those results in terms of the input features.

In the final portion of the notebook, we also visualize the results of a grid search over hyperparameters to determine which combinations yield the best-performing gradient boosting machine (GBM).

Import Data and Preprocess

The data comes from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients

This dataset contains information on default payments, demographic factors, credit data, payment history, and bill statements of credit card clients in Taiwan from April 2005 to September 2005. All amounts are given in New Taiwan dollars (NT$).

See Kaggle for further explanation of the dataset: https://www.kaggle.com/uciml/default-of-credit-card-clients-dataset/
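As a rough illustration of the import-and-preprocess step, here is a minimal sketch using pandas. The column names (`ID`, `LIMIT_BAL`, `PAY_0`, `default.payment.next.month`, and so on) are assumed to follow the Kaggle CSV, and the file name in the comment is hypothetical. The `preprocess` helper and the three sample rows are made up for illustration and are not real records from the dataset.

```python
import pandas as pd

# Assumed name of the target column in the Kaggle CSV; adjust if your copy differs.
TARGET = "default.payment.next.month"


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: drop the row identifier and shorten the target name."""
    out = df.copy()
    # The ID column is only a row identifier and carries no predictive signal.
    out = out.drop(columns=["ID"], errors="ignore")
    # A shorter target name is easier to reference in later cells.
    out = out.rename(columns={TARGET: "default"})
    return out


# With the real data you would load the file instead, for example:
#   raw = pd.read_csv("UCI_Credit_Card.csv")  # hypothetical local path
# Here we use a tiny made-up sample so the sketch runs on its own.
raw = pd.DataFrame(
    {
        "ID": [1, 2, 3],
        "LIMIT_BAL": [20000, 120000, 90000],
        "AGE": [24, 26, 34],
        "PAY_0": [2, -1, 0],
        "BILL_AMT1": [3913, 2682, 29239],
        "PAY_AMT1": [0, 0, 1518],
        TARGET: [1, 1, 0],
    }
)

clean = preprocess(raw)
features = clean.drop(columns=["default"])  # model inputs
labels = clean["default"]                   # 1 = default, 0 = no default
print(clean.head())
```

Keeping the features and labels separate at this stage makes it straightforward to pass them to a model later, while the combined `clean` frame remains available for visualization.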

Load and Visualize Data in VIP