I now spend most of my time teaching others about data science, so I do a lot of research into what is going on in data science education. As part of that, I decided to take an online machine learning course, and it led me to a serious question: why don’t we use pseudo-code to teach math concepts?
Consider the following:
[Formula image: Residual Sum of Squares]
This is the formula for the Residual Sum of Squares (RSS), which, if you aren’t familiar with it, is a metric used to measure how closely a regression model’s predictions match the actual values.
Now consider the following pseudo-code:
residuals_squared = (actual_values - predictions) ^ 2
RSS = sum( residuals_squared )
This example expresses exactly the same concept, and while it does take up more space on the page, it is, in my mind at least, much easier to understand. I don’t have any empirical data to back this up, but I suspect that many of you would agree.
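For what it’s worth, the pseudo-code above maps almost line for line onto real code. Here is a minimal Python sketch of the same calculation. The function name and the sample numbers are made up for illustration and are not taken from any particular library or course.

```python
def residual_sum_of_squares(actual_values, predictions):
    """Return the sum of squared differences between actual and predicted values."""
    if len(actual_values) != len(predictions):
        raise ValueError("actual_values and predictions must have the same length")
    # Python spells exponentiation as **; the ^ operator is bitwise XOR.
    residuals_squared = [(a - p) ** 2 for a, p in zip(actual_values, predictions)]
    return sum(residuals_squared)


# Made-up numbers, purely for illustration.
actual_values = [3.0, 5.0, 7.0]
predictions = [2.0, 5.0, 9.0]

rss = residual_sum_of_squares(actual_values, predictions)
print(rss)  # residuals are 1, 0, -2, so this prints 5.0
```

The only real difference from the pseudo-code is that plain Python lists have no element-wise subtraction, so the sketch pairs the values up explicitly with zip.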
Greek Letters are Jargon
Another thing I’ve realized is that part of the reason math becomes so difficult for people is that it is taught entirely in jargon, shorthand, and shorthand for shorthand. The Greek letter sigma represents a sum, but if you don’t know that, it represents confusion. If you aren’t familiar with this formula, the other Greek letters could be just as meaningless. With pseudo-code, any part of the formula could be rewritten using English words (or words in any other language) and thus be understood far more easily.
I’m developing a short course called Crash Course in Machine Learning, which I will be teaching at the BlackHat conference in August. I’m curious what people think about presenting algorithms in pseudo-code instead of math jargon. I suspect it will make the material easier to understand without diluting the rigor.
[…] a compelling post about some challenges in machine learning education. You can read it for yourself here, but I’d like to build upon what he posted; it’s bothered me for a long time that the […]
You make a good point. I’ve written a small posting related to this, on my own Web log, “Data Mining in MATLAB” ( http://matlabdatamining.blogspot.com/2016/03/matlab-as-near-pseudocode.html ), revolving around my MATLAB implementation of your pseudo-code:
residuals_squared = (actual_values - predictions) .^ 2;
RSS = sum( residuals_squared );
This would be wonderful! Explaining data science concepts in this manner would hit the core of the concepts head on. This would be similar to explanations of math concepts on http://www.betterexplained.com
I took a course in data science with General Assembly in DC and it was helpful.
I urge you to check out this book (http://www.amazon.com/Fundamentals-Machine-Learning-Predictive-Analytics/dp/0262029448/ref=sr_1_4?s=books&ie=UTF8&qid=1457751712&sr=1-4&keywords=machine+learning)
Great blog! bookmarked
[…] “Teaching Data Science in English (Not in Math)”, the Feb-08-2016 entry of his Web log, “The Datatist”, Charles Givre criticizes the use […]
[…] via Teaching Data Science in English (not in Math) — The Dataist […]