Month: July 2017

Tutorial: Visualizing Machine Learning Models

One of the big issues I’ve encountered in my teaching is explaining how to evaluate the performance of machine learning models. Simply put, it is relatively trivial to generate the various performance metrics (accuracy, precision, recall, etc.), but if you wanted to visualize any of these metrics, there wasn’t really an easy way to do that. Until now…
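To make the "trivial" part concrete, here is a minimal sketch of computing accuracy, precision and recall for a binary classifier in plain Python. The function names and the sample labels are made up for illustration; in practice you would more likely reach for the equivalent functions in scikit-learn's metrics module.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall (0.0 when undefined)."""
    c = confusion_counts(y_true, y_pred, positive)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


# Made-up labels, purely for illustration
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Getting the numbers really is only a few lines; turning them into a picture you can show a class is the harder part.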

Decision Boundary

Recently, I learned of a new Python library called YellowBrick, developed by Ben Bengfort at District Data Labs, that implements many different visualizations that are useful for building machine learning models and assessing their performance. Many visualization libraries require you to write a lot of “boilerplate” code, i.e. generic and repetitive code. What impressed me about YellowBrick is that it largely follows the scikit-learn API, so if you are a regular user of scikit-learn, you’ll have no problem incorporating YellowBrick into your workflow. YellowBrick appears to be relatively new, so there are still some kinks to be worked out, but overall, this is a really impressive library.

Tip of the Day: How I reclaimed 10GB of Hard Disk Space on my MacBook Pro

I love my MacBook Pro. Quite honestly, it’s the best laptop I’ve ever owned. However, my one regret is not buying the larger hard drive. Over the last few months, I noticed that my free disk space kept shrinking. I did all the usual stuff: deleted unneeded applications, ran various disk cleaning tools, etc. Finally, I hit the motherlode: I discovered that brew, everyone’s favorite package manager, was keeping old versions of packages around every time they were upgraded!

To fix this problem, simply run: brew cleanup. I did this and voilà! 10 GB of hard disk space reclaimed!

Announcing the First Release of Griffon: A Virtual Environment for Data Science

My colleagues Austin Taylor and Melissa Kilby are proud to announce the first stable release of Griffon: A Virtual Machine for Data Science. Griffon is a virtual machine that comes with many data science tools already installed, configured and linked together, so you don’t have to be a Linux expert to try them out. If you are teaching a class, or if you simply want to learn more about a particular tool, then Griffon is perfect for you.

You can download Griffon here: https://github.com/gtkcyber/griffon-vm.
