One of the big issues I’ve encountered in my teaching is explaining how to evaluate the performance of machine learning models. Simply put, it is relatively trivial to generate the various performance metrics (accuracy, precision, recall, etc.), but if you wanted to visualize any of these metrics, there wasn’t really an easy way to do that. Until now.
Recently, I learned of a new Python library called YellowBrick, developed by Ben Bengfort at District Data Labs, that implements many different visualizations that are useful for building machine learning models and assessing their performance. Many visualization libraries require you to write a lot of “boilerplate” code, i.e., generic and repetitive code. What impressed me about YellowBrick is that it largely follows the scikit-learn API, so if you are a regular user of scikit-learn, you’ll have no problem incorporating YellowBrick into your workflow. YellowBrick appears to be relatively new, so there are definitely still some kinks to be worked out, but overall, this is a really impressive library.
Visualizing a Model with YellowBrick
The code example below demonstrates how to create a classification report using YellowBrick. After importing the relevant modules, the next step is to create the visualizer object and pass it the model. Once that has been done, just like in scikit-learn, you fit your visualizer using the .fit() method. For some visualizations you also call the .score() method, and finally you call the .poof() method, which actually renders the visualization.
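from yellowbrick.classifier import ClassificationReport
from sklearn.naive_bayes import GaussianNB
visualizer = ClassificationReport(GaussianNB())  # wrap the model in the visualizer
visualizer.fit(X_train, y_train)    # X_train, y_train: your existing training split
visualizer.score(X_test, y_test)    # X_test, y_test: your existing held-out split
visualizer.poof()                   # render the figure (newer YellowBrick releases call this .show())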
This code, reproduced from the YellowBrick documentation, produces the classification report below.
YellowBrick is based on Matplotlib, so you can use common Matplotlib configuration variables to adjust the appearance of your visualizations.
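As a rough sketch of what that can look like, the snippet below sets a few standard Matplotlib rcParams before any visualizer is created. The specific values are only illustrative picks of mine, not recommendations from the YellowBrick documentation.

```python
import matplotlib as mpl

# Illustrative values only; pick whatever suits your figures.
mpl.rcParams["figure.figsize"] = (15, 7)  # width and height in inches
mpl.rcParams["font.size"] = 14            # base font size in points
mpl.rcParams["axes.titlesize"] = 18       # font size of the plot title
```

Settings like these apply to every figure drawn afterwards in the same session, so it is usually enough to set them once at the top of a notebook.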
What can you do with YellowBrick?
YellowBrick (YB) has many visualizers and is useful at all phases of the machine learning process. My intent here is not to reproduce the YB Documentation, but I’m going to highlight some of the more useful visualizations that exist in YB.
In the beginning… there was Feature Analysis…
One of the first challenges you will face when building a machine learning model is determining whether your features are predictive. A lot can go wrong in this step, but visualizing your features can help you quickly determine which features will be useful and which will not. For instance, the RadViz visualizer lets you map out the features in a circular space.
This visualization will quickly let you determine whether there is too much noise in your feature set and whether there is a clear differentiation between the classes. It also works when you have a large number of features. Seaborn’s pairplot serves the same purpose, but it is difficult to use if you have a lot of features. In addition to RadViz, YB has a heatmap ranking for the covariance of your features, as seen below.
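For the curious, the idea behind a RadViz-style projection can be sketched in a few lines of plain Python. This is a simplified illustration of the general technique, not YellowBrick's actual implementation, and radviz_project is a hypothetical helper name of my own. Each feature gets an anchor on the unit circle, each feature is rescaled to the range 0 to 1, and every sample lands at the weighted average of the anchors.

```python
from math import cos, sin, pi

def radviz_project(rows):
    """Project rows of numeric features to 2-D points inside the unit circle."""
    n_features = len(rows[0])
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # Guard against constant columns, which would otherwise divide by zero.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    # One anchor per feature, evenly spaced around the circle.
    anchors = [
        (cos(2 * pi * j / n_features), sin(2 * pi * j / n_features))
        for j in range(n_features)
    ]
    points = []
    for row in rows:
        weights = [(v - lo) / span for v, lo, span in zip(row, lows, spans)]
        total = sum(weights)
        if total == 0:
            points.append((0.0, 0.0))  # all features at their minimum
            continue
        x = sum(w * ax for w, (ax, _) in zip(weights, anchors)) / total
        y = sum(w * ay for w, (_, ay) in zip(weights, anchors)) / total
        points.append((x, y))
    return points

# Hypothetical toy rows: three "pure" samples and one perfectly balanced one.
rows = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
points = radviz_project(rows)
```

A sample dominated by one feature is pulled toward that feature's anchor, while a sample with balanced features sits near the center, which is why well-separated classes show up as distinct clusters.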
Ideally, your features should not have strong relationships with each other, so you’re looking for features that are lightly colored on the heatmap. Dark colors, either red or blue, indicate strong correlations between two features.
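If you want the numbers behind such a heatmap, here is a minimal sketch that computes Pearson correlations by hand and flags strongly related feature pairs. The helper names, the 0.8 threshold and the toy data are all assumptions of mine for illustration; YB does this work for you.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def strongly_correlated_pairs(features, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs

# Hypothetical toy data: the two height columns carry the same information.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "shoe_color": [3.0, 1.0, 4.0, 1.0, 5.0],
}
flagged = strongly_correlated_pairs(features)
```

Here only the two height columns are flagged, which corresponds to a dark cell on the heatmap and suggests that one of them could be dropped.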
Assessing Model Performance
YB has a series of visualizations that are useful for evaluating the performance of your model. At the moment there are visualizations for regression and classification models, with more on the way. Again, I really like this because it makes it very easy and very quick to assess model performance. For classifiers, YB has the following visualizations:
- Confusion Matrices
- Classification Reports
For regression, YB has:
- Residuals Plot
- Prediction Error Plot
To show some examples, here are two confusion matrices I worked up for a class:
For these matrices, darker colors are better on the top-left to bottom-right diagonal, where the correct predictions are counted, and lighter colors are better everywhere off that diagonal. You can clearly see the model on the right performs much better than the model on the left. The code for these is below:
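from yellowbrick.classifier import ConfusionMatrix
svm_conf_matrix = ConfusionMatrix(svm_classifier)  # svm_classifier: a scikit-learn classifier
svm_conf_matrix.fit(X_train, y_train)              # train on the training split
svm_conf_matrix.score(X_test, y_test)              # tally predictions on the test split
svm_conf_matrix.poof()                             # render the matrix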
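To make the diagonal reading concrete, here is a minimal sketch that tallies a confusion matrix by hand in plain Python. The labels and predictions are made up for illustration, and the confusion_matrix function here is my own small helper, not the YB visualizer.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

# Hypothetical labels and predictions, for illustration only.
labels = ["benign", "malicious"]
y_true = ["benign", "benign", "benign", "malicious", "malicious", "malicious"]
y_pred = ["benign", "benign", "malicious", "malicious", "malicious", "benign"]

matrix = confusion_matrix(y_true, y_pred, labels)

# Everything on the diagonal is a correct prediction.
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(y_true)
```

The diagonal entries count the cases where the true and predicted class agree, so a good model concentrates its counts (and its dark cells) there.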
The classification report is also very useful when comparing models:
For these classification reports, darker colors represent higher scores, so darker is better. It is very evident that the model on the right performs better than the model on the left.
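The numbers that a classification report colors in are per-class precision, recall and F1. As a minimal sketch, assuming made-up labels and a hypothetical helper named per_class_report, they can be computed like this:

```python
def per_class_report(y_true, y_pred, labels):
    """Return precision, recall and F1 for each class label."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report

# Hypothetical labels and predictions, for illustration only.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "ham", "ham", "spam"]

report = per_class_report(y_true, y_pred, ["spam", "ham"])
```

Each class gets its own row of scores, which is what makes the report handy for spotting a model that does well on one class and poorly on another.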
One visualization that I’d really like to see in YB is a model’s learning curve. I wrote code to add this, but then I saw that someone had already submitted a pull request, so hopefully this will be integrated soon. Below are two examples of learning curve visualizations.
This visualization can help you decide how much data you really need to train your model and also whether your model is overfit or underfit. The chart on the left represents a well-fit model: the cross-validation score increases and approaches the training score as more data is added. The visualization on the right represents a possibly overfit model. We know this because there is a large gap between the cross-validation score and the training score. This model probably needs more data, as the lines seem to be converging.
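The scores behind a learning curve can be computed with scikit-learn's own learning_curve function. The sketch below is not the code from the pull request mentioned above; the digits dataset, the GaussianNB model and the five training sizes are all my assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.naive_bayes import GaussianNB

X, y = load_digits(return_X_y=True)

# Train on 10% to 100% of the available training data, with 5-fold CV.
train_sizes, train_scores, cv_scores = learning_curve(
    GaussianNB(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    shuffle=True,
    random_state=0,
)

# Average over the folds; a large, persistent gap hints at overfitting.
train_mean = train_scores.mean(axis=1)
cv_mean = cv_scores.mean(axis=1)
gap = train_mean - cv_mean

for n, tr, cv, g in zip(train_sizes, train_mean, cv_mean, gap):
    print(f"{n:5d} samples  train={tr:.3f}  cv={cv:.3f}  gap={g:.3f}")
```

Plotting train_mean and cv_mean against train_sizes gives the two lines discussed above.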
So this is one of my longer posts, but I hope that you will give YellowBrick a try. If you are new to machine learning, these visualizations can really help you understand what is going into your models and, more importantly, what is coming out.