This weekend, many of the NBA’s sharpest minds will gather at the Sloan Sports Analytics Conference, an increasingly splashy affair held each year by MIT, and now sponsored by ESPN. They’ve come a long way. Just over a decade ago, several of the most prominent among them first gathered in a much humbler spot: a Yahoo Groups message board called “APBR Analysis,” for the Association of Professional Basketball Researchers. In the very first post, on Feb. 10, 2001, at 10:32 p.m., a former Caltech hoopster turned statistics Ph.D. named Dean Oliver laid out an ambitious agenda of 12 issues. “To start off the group, I think that it is most appropriate to identify some of the outstanding questions in basketball,” he wrote. Some questions were practical. “Does Hack-a-Shaq work?” “Why has Charlotte had such a good record without Derrick Coleman in the lineup and a mediocre one with him in?” Others were more theoretical. “What additional statistics could be taken to improve individual defensive evaluation?”
Oliver and his cohorts on the message board wrestled with these questions and countless others, logging on at all hours to debate the relative merits of Allen Iverson or how best to calculate a new metric called usage rate. Long before Moneyball author Michael Lewis wrote a New York Times Magazine cover story on the topic, the board wondered why Shane Battier had such a positive impact on his teams despite not appearing to be all that good at basketball. The message board was a veritable think tank. “You could tell that this was a place where there was going to be a serious level of discussion about NBA statistics,” says Kevin Pelton, who would become one of the original writers for Basketball Prospectus. “It was literally the only place in the world it was happening.”
The NBA establishment quickly took notice. Oliver, who published the seminal Basketball on Paper in 2003, seven months after Moneyball hit stores, was hired full time by the Seattle SuperSonics in 2004. Another frequenter of the board, John Hollinger, was hired the following year by ESPN—and recently became a vice president of basketball operations for the Memphis Grizzlies. Hollinger’s ESPN gig was filled by Pelton, who, after making his name at Basketball Prospectus, did a consulting stint with the Indiana Pacers’ front office. Roland Beech, who created the popular website 82games, was hired by the Dallas Mavericks in 2009 as director of basketball analytics. (His boss, Mark Cuban, is regularly one of the biggest names at the Sloan conference.)
As soon as each statistician joined an NBA squad, sharing in public became off-limits—and so, gradually, the think tank closed shop. What were the teams paying for, after all, if their new stat gurus were just posting their ideas online for the other 29 franchises to read? This has had a paradoxical result: Because NBA teams embraced advanced stats so quickly, progress on basketball analytics has actually slowed down. The top minds are now all working in silos, not only unable to collaborate but actually competing against each other.
Major League Baseball teams were hidebound enough to ignore Bill James and sabermetrics for a full quarter century—as a result, he and others hashed out ideas in open, public forums. By the time MLB executives finally embraced advanced baseball statistics, the movement was fully formed. But advanced basketball stats were just getting started when NBA teams tuned in. And though many of those teams are now collecting the kind of data that outsiders can only dream of, they lack the manpower to fully harness it. Certainly there have been advances: Teams’ internal stats generally blow away what’s available publicly. But they haven’t come as fast as they otherwise might have. And we, as fans, don’t understand the game as well as we could.
Dean Oliver estimates that between 22 and 24 NBA teams currently employ some form of analytics, with about half that number seriously incorporating their findings into the team’s approach to the game. Most analytics departments are small, which makes it hard to tell when your research is headed down the wrong path, says Aaron Barzilai, a former MIT player who started the site BasketballValue before joining the Philadelphia 76ers in November as their director of basketball analytics. “You often just don’t have a ton of feedback on how you’re doing, especially if you’re one person on a team by yourself,” he says. And asking for help isn’t an option. Desperate for any competitive advantage, NBA teams guard their data—and whatever conclusions they draw from it—with about the same paranoia as a government official sitting on bomb codes. When asked how many analysts he employs, Houston Rockets General Manager Daryl Morey, the first stats acolyte to be hired to run a franchise, replies, “It’s not something we talk about.”
Fans have lost out in the bargain, too, with the newest ideas mostly staying locked up inside team offices. Just glancing at the homepages of leading advanced stat sites makes it clear they’re not getting enough love—BasketballValue and 82games both look like they were designed by a 14-year-old sometime in 1998. Barzilai hasn’t updated the numbers on his site since the Sixers hired him, and while the stats on 82games remain current, Beech stopped posting articles there when he joined the Mavericks. Hollinger’s analysis, too, has now disappeared behind the league’s veil. NBA.com just launched a much prettier stats portal, complete with advanced metrics—but what’s on the site pales in comparison to what’s available behind closed doors.
Oliver, now back out of the league and working for ESPN, says that he’s particularly frustrated by the lack of headway that’s been made on one of the first problems he posed on the Yahoo message board: What new metrics could be created to quantify individual defensive performance? The last decade has seen tremendous progress in understanding the offensive side of the floor, but defense—where players must constantly rotate and cover for each other—presents a much knottier problem. Oliver believes that technology is providing the raw data to solve it, but all those NBA stat gurus working in isolation against each other aren’t close to cracking the code.
Where is that raw data coming from? Cameras that weigh about a pound and can fit in the palm of your hand. They’re provided by STATS, the global information behemoth, as part of its SportVU program, and they currently hang in the rafters belonging to 15 different NBA franchises, six per arena. They record everything: How far and how fast a player runs during the game, how many dribbles he takes when he has the ball, where he shoots from, the arc of his shot, whom he’s passing to, whom he’s not passing to, the spots where he gets his rebounds, the spots where others get his rebounds. It’s endless. For each second of game play, the SportVU cameras capture the location on the court of the ball and each player 25 times, according to Brian Kopp, a VP at STATS. “You have 1 million data records per game.”
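To get a feel for where a figure like Kopp’s comes from, here is a rough back-of-envelope sketch. Only the 25 captures per second comes from the reporting above. The other inputs are assumptions made for illustration: a 48-minute regulation game with no overtime, 10 players plus the ball on the court, and one record per tracked object per capture. How STATS actually counts a “record” is not stated.

```python
# Back-of-envelope estimate of SportVU position records in one game.
CAPTURES_PER_SECOND = 25        # from the article
TRACKED_OBJECTS = 11            # assumption: 10 players on the court plus the ball
REGULATION_SECONDS = 48 * 60    # assumption: 48 minutes of game clock, no overtime


def records_per_game(seconds=REGULATION_SECONDS,
                     captures_per_second=CAPTURES_PER_SECOND,
                     tracked_objects=TRACKED_OBJECTS):
    """Count one record per tracked object per capture (an assumed definition)."""
    return seconds * captures_per_second * tracked_objects


print(records_per_game())  # 792000
```

Under those assumptions the total lands at 792,000, the same order of magnitude as the quoted 1 million. Overtime, dead-ball time, or extra fields per capture would push the count higher.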
STATS acquired SportVU in 2008 from an Israeli company that had originally designed it for soccer. This is the system’s third year in the NBA since being recalibrated for basketball. STATS charges teams from $75,000 to $100,000 per season for SportVU, and the program has grown in that time from four initial teams to now half the league. The result is one of the largest and richest data sets not just in sports, but in the world.
Kirk Goldsberry, a visiting scholar at the Harvard Center for Geographic Analysis who also uses spatial mapping to analyze the NBA for Grantland and on his blog, Court Vision, is one of the few civilians who’s been granted access to any of the SportVU data. He’s working with another Harvard professor, statistician Luke Bornn, and four Harvard and MIT Ph.D. students in a semester-long project to break some of it down. “We look at that data and we say this isn’t just good data, this is the best space-time data,” Goldsberry says. “It’s just an incredible amount of information, regardless of whether it’s about NBA or anything else … There’s very few people who have ever seen any data like this.”
If six people from Harvard and MIT have their hands full with SportVU, you can only imagine how teams in the NBA are dealing with it. STATS provides standard reports to help teams understand the information, but those only scratch at the surface of what’s possible. “I’d like to think we’re ahead,” Morey says, “but it is a whole new overwhelming amount of data. You need to take a different approach to it and I don’t think anyone has the killer app there—the thing that comes out of that data that gives someone a very significant edge.”
Many, including Oliver, believe the killer app is hiding in there somewhere. The challenge is that there’s so much information, it’s easy to get lost. “It’s like saying you’re going to Wal-Mart or Ikea to get something,” offers Tommy Sheppard, the Washington Wizards vice president of basketball administration. “You better know what you want, or you’re going to walk out with a ton of shit.” That each franchise is working alone—and against each other—compounds the problem. Goldsberry describes it as 30 “micro-CIAs,” all racing against each other to “procure actionable intelligence out of these haystacks of vast data.”
Which brings us back to that lingering question from Oliver’s first post on the Yahoo message board: How to measure defense? Traditional measures—like blocks, steals, and rebounds—fail to account for the full context of each play, but SportVU can provide a more complete picture. “We can say, OK, when Roy Hibbert is near an offensive player, A) they don’t even tend to challenge him very much, and B) when they do, their field goal percentage is really low if he’s within three feet of the shot,” Goldsberry says. “And then you can look at somebody like David Lee—when he’s within three feet of a shot, those numbers are much higher.” The paper Goldsberry submitted (along with co-author Eric Weiss) to this year’s Sloan conference expands on that idea, using SportVU to quantify which NBA big men are best not only at defending shots close to the basket, but deterring those shots from being taken in the first place. When it comes to analyzing SportVU data, though, the authors note that their “paper’s methods only represent a small first step.”
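The kind of measurement Goldsberry describes can be sketched in a few lines. This is not the method from his paper, only a minimal illustration of the idea: filter to shots taken with a given defender within three feet, then compute the opponents’ field goal percentage on those shots. The `Shot` record, its field names, and the sample coordinates are all hypothetical and invented for this example. They are not real SportVU fields or real game data.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Shot:
    # Hypothetical record layout; court coordinates in feet.
    shot_x: float
    shot_y: float
    defender_x: float
    defender_y: float
    made: bool


def contested_fg_pct(shots, radius_ft=3.0):
    """Return (attempts, fg_pct) for shots with the defender within radius_ft.

    fg_pct is None when no shot qualifies.
    """
    close = [
        s for s in shots
        if hypot(s.shot_x - s.defender_x, s.shot_y - s.defender_y) <= radius_ft
    ]
    if not close:
        return 0, None
    makes = sum(1 for s in close if s.made)
    return len(close), makes / len(close)


# Invented sample shots, for illustration only.
sample = [
    Shot(5.0, 5.0, 6.0, 5.0, False),    # defender 1 ft away
    Shot(5.0, 5.0, 5.0, 7.0, False),    # defender 2 ft away
    Shot(4.0, 4.0, 4.0, 6.5, True),     # defender 2.5 ft away
    Shot(10.0, 10.0, 20.0, 10.0, True), # defender 10 ft away, excluded
]
attempts, pct = contested_fg_pct(sample)
print(attempts, pct)
```

The attempt count matters as much as the percentage. A low number of close-range attempts against a defender, relative to his time near the basket, hints at the deterrence effect the paper sets out to quantify.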
In theory, the Sloan conference is where all these analysts now gather to learn from each other. But they’re no longer working together, as they once did on that Yahoo message board. Daryl Morey admits that, from an academic perspective, it would be fun to drop the iron curtain dividing all of the franchises so that everybody could work in unison to hash out what’s probably the greatest data challenge in the history of sports. “Maybe someday when we all get fired we could get together, but right now our jobs are to win for our teams, so we focus on that,” he says. “Our businesses aren’t for the public domain. Knowledge in general will slow down, but hopefully knowledge that gives us an edge will not.”