There are just two weeks left in my semester in Stanford’s machine learning class. Avert your eyes, Mom, because I have a confession to make: I’m not entirely certain I’m going to pass. I purposely signed up for the advanced track, in which you have to write your own code in addition to taking the weekly quizzes. But I’ve missed a few assignments, and those I’ve completed are almost always late, which means they get docked 20 percent for tardiness.
This doesn’t really concern me, given that there’s very little on the line other than my own edification—no class credit or a potential employer waiting to see a report card. There won’t even be a report card, just a certificate of completion. I’ve learned a tremendous amount in this class, and it has reignited my interest in how even an amateur command of computer science can illuminate almost any topic that can be quantified. (Last week I proposed a way to group baseball teams with “clustering” algorithms.)
A safer approach, imaginary-GPA-wise, would have been to take this class on the basic track. This easier route would have saved me from many hours of frustrating coding. But I don’t think I would have emerged from the basic track with very much new knowledge. Even if I never write another line of Octave—the programming language we use for machine learning—it’s the coding assignments that have trained my new intuition for the subject of the class. (That said, I sincerely hope to write many more lines of Octave code in the service of predictive political journalism.)
The beauty of this experiment in distributed education is that the grading is automatic. Every week, we are presented with about a dozen programming scripts, several of which we have to complete in order to make that week’s assignment work. We’re given a certain set of data to work with while testing our code, but when we submit it for grading, it is tested against a different set of data. That way, the class’s automated T.A.s can verify that our code works in general, not just on the specific examples supplied with the homework. Within seconds of submitting our code, we learn whether we passed that part of the assignment.
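For the curious, here is a minimal sketch of how that kind of grading could work. It is not the class's actual grader, which I have never seen; the function names and the sample numbers are invented for illustration, and the sketch is in Python rather than Octave. The idea is simply that a submission is compared against a reference answer on inputs the student never got to look at.

```python
def grade(student_fn, reference_fn, hidden_inputs, tol=1e-6):
    """Return True only if student_fn matches reference_fn on every input."""
    for values in hidden_inputs:
        try:
            got = student_fn(values)
        except Exception:
            # A submission that crashes on any input fails outright.
            return False
        if abs(got - reference_fn(values)) > tol:
            return False
    return True


def reference_mean(values):
    """Hypothetical reference solution: the arithmetic mean."""
    return sum(values) / len(values)


def hardcoded_mean(values):
    """A 'solution' that only memorized the answer to the visible example."""
    return 2.0


# Made-up data: one example handed out with the homework, two held back.
visible_inputs = [[1, 2, 3]]
hidden_inputs = [[4, 8], [10, 20, 30, 40]]

passes_visible = grade(hardcoded_mean, reference_mean, visible_inputs)
passes_hidden = grade(hardcoded_mean, reference_mean, hidden_inputs)
```

In this sketch the memorized answer passes on the handed-out example and fails on the held-back ones, which is the whole point of testing against data the student has not seen.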
Everyone learns differently, but I find it difficult to imagine that anyone could really absorb new concepts in advanced computer science without the hallowed trial and error of writing code, receiving oblique error messages, and tweaking it until it works. The odds that you’ll stumble on a workable solution in your code, without really internalizing the concepts you need to pass the assignment, approach zero very quickly.
In this sense, the class is both a lecture and a lab. And while I don’t recommend this particular course for those new to programming or uninterested in math, it is a wonderful demonstration of how learning occurs as much during the coursework as it does through the lectures.
Many Internet illuminati have suggested that the future of journalism rests in reporters learning how to write their own code. I stumbled onto this path of the reporter-developer somewhat by accident, but I’m unconvinced that a newsroom needs more than two or three reporters with a strong command of the blood and guts of Web mechanics going forward. (Granted, this figure is generally zero at the moment.)
Stanford plans to expand its free online offerings in the new year, one of which is reportedly titled “Computer Science 101.” This strikes me as an opportunity for anyone looking for ways to innovate in a nontechnical field. For example, one can write some extremely simple code to perform advanced analysis of a piece of literature. People have used such code to scan the entire Bible and see which characters are mentioned in the same sentences or chapters. Another simple function could take two drafts of a manuscript and identify exactly what changed between versions. Many social scientists are gathering priceless data from simple interactive websites and smartphone apps.
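To give a sense of how little code the two text examples require, here is a rough sketch in Python using only its standard library. The function names and the sample sentences are my own inventions, the sentence splitting is deliberately crude, and a real project would need more care with names and punctuation. The first function counts how often pairs of names turn up in the same sentence; the second lists the lines that differ between two drafts.

```python
import difflib
import re
from collections import Counter
from itertools import combinations


def cooccurrences(text, names):
    """Count how often each pair of names appears in the same sentence."""
    counts = Counter()
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    for sentence in sentences:
        present = sorted(
            name for name in names
            if re.search(r"\b" + re.escape(name) + r"\b", sentence)
        )
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def changed_lines(old_draft, new_draft):
    """Return removed ('- ') and added ('+ ') lines between two drafts."""
    diff = difflib.ndiff(old_draft.splitlines(), new_draft.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Made-up sample text, not drawn from any real book.
sample = "Ann met Bob. Bob left. Ann and Bob and Cy talked."
pairs = cooccurrences(sample, ["Ann", "Bob", "Cy"])

draft_one = "It was a dark night.\nThe rain fell."
draft_two = "It was a stormy night.\nThe rain fell."
changes = changed_lines(draft_one, draft_two)
```

Here `pairs` records that Ann and Bob share two sentences, and `changes` holds only the first line of each draft, since the second line is identical in both.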
Nothing I’m describing is revolutionary—computer scientists have teamed up with the humanities for decades. But computer scientists are busy people in high demand. It’s a lot easier if you learn to do it yourself.