Reading Report Cards

The benefits of releasing New York City’s data on teacher performance.


When is the release of faulty information defensible? In a much-anticipated ruling in State Supreme Court in Manhattan last week, New York Justice Cynthia Kern cited state sunshine laws and ruled in favor of 12 media outlets that want to unveil four years of data about the performance of New York City’s 12,000 teachers. The data ranks fourth- through eighth-grade math and English teachers, purportedly based on how much progress their students have made on standardized tests from year to year, using a statistical model known as value-added analysis.

The analysis uses previous test scores to predict what students should be getting on tests, while controlling for factors beyond a teacher’s control, like poverty, attendance and class size. Teachers are graded on how high they’ve lifted their students’ scores above the predicted ones; then they are ranked accordingly. In New York, the ranking is based on a curve.
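To make the mechanics concrete, here is a minimal sketch of the idea in Python. It is not the city’s actual model: it predicts this year’s score from last year’s score alone, leaving out the controls for poverty, attendance and class size, and the function names and sample scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def value_added(records):
    """records: list of (teacher, prior_score, current_score) per student.

    Returns each teacher's average gap between actual and predicted scores.
    Assumption: prior score is the only predictor (no demographic controls).
    """
    slope, intercept = fit_line([r[1] for r in records],
                                [r[2] for r in records])
    gaps = defaultdict(list)
    for teacher, prior, current in records:
        predicted = intercept + slope * prior
        gaps[teacher].append(current - predicted)
    return {teacher: mean(g) for teacher, g in gaps.items()}


def percentile_ranks(scores):
    """Rank on a curve: percent of teachers with a strictly lower score."""
    n = len(scores)
    return {t: 100.0 * sum(other < s for other in scores.values()) / n
            for t, s in scores.items()}


# Hypothetical data: two students per teacher.
records = [
    ("A", 60, 70), ("A", 70, 80),
    ("B", 60, 60), ("B", 70, 70),
    ("C", 60, 65), ("C", 70, 75),
]
va = value_added(records)
ranks = percentile_ranks(va)
```

In this toy data, teacher A’s students beat their predicted scores, teacher B’s fall short, and teacher C’s land on the prediction, so the curve places A at the top and B at the bottom. With so few students per teacher, real estimates would be far noisier than this example suggests.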

The release of the value-added data, essentially a report card for teachers, is controversial because studies (including one financed by the U.S. Department of Education) show that this model, like all statistical models, produces estimates rather than precise measures. In other words, every teacher’s ranking is really an approximation, with the actual ranking falling somewhere within a specified range that is sometimes quite broad.

The city defends the report cards because they are more accurate at the extremes than in the middle. Over just a few years, the value-added model can quite reliably identify the city’s very best and very worst teachers, even though a more average teacher (say, one who should fall in the 63rd percentile based on three years of performance) might actually show up anywhere between the 46th and the 80th percentile, as one study points out.

This is problematic, but still better than the alternative. Right now, 99 percent of all teachers in America are rated “satisfactory,” according to a 2009 report by the New Teacher Project. Meanwhile, one year with a very bad teacher can result in student test scores that are 10 percent below those of similar students with a high-performing teacher, according to a study Harvard’s Thomas Kane conducted with researchers Douglas Staiger and Robert Gordon. If a school system got rid of the bottom quarter of its teachers, Kane says, students could be expected to improve their test scores by 14 percentile points before graduation.

Justice Kern didn’t debate the pros and cons of any of this. Citing her state’s highest court, she wrote, “The Court of Appeals has clearly held that there is no requirement that data be reliable for it to be disclosed.” In other words, it’s not for the judiciary to decide that New York’s teacher data reports are too unreliable to be credible. It’s for the public.

This will undoubtedly impose a cost on some individual teachers. Their union is appealing. At a PTA meeting at my children’s New York City public elementary school in November, I defended the public’s right to see these report cards, but most of my fellow parents thought releasing them would be a betrayal of the teaching staff. They pointed to the 200 or so bureaucratic errors the city has been accused of making when handing out scores to teachers. (The city has promised to fix those before releasing the reports.) A few parents argued that statewide tests are bad indicators of a teacher’s real prowess. “Our teachers are much more than numbers,” a position paper they handed out declared. They also made the point that the scores are “too imprecise” to be meaningful.

I’m more concerned with who decides what’s imprecise. As Kern’s decision shows, it’s not a court’s job to hide expensive government data because of accusations that it’s erroneous or unreliable. (The city has budgeted $3.6 million so far to put together the reports, according to the New York Times.) The teacher report cards are here to stay: By 2013, they will make up 25 percent of New York City’s teacher evaluation system. School districts in North Carolina, Texas, Tennessee, Wisconsin, and Washington, D.C., have all signed on for value-added analyses. To get the grant money promised by the Obama administration’s Race to the Top initiative, districts have to figure out how to evaluate teachers based on how much they contribute to students’ progress on standardized tests. The value-added model is currently the most popular way to do that.

Last summer, the Los Angeles Times got seven years of raw data on teacher performance from the city and published its own value-added rankings of 6,000 teachers online. The results were mixed. Some of the rankings were inevitably inaccurate, and the teachers union organized a boycott of the paper. The union also blamed the LAT for the suicide of a teacher who had received a poor ranking, though it’s hard to imagine that the facts surrounding the death weren’t more complicated. The teacher rankings also got people talking. The paper says 1.8 million people have viewed the series online.

And the rankings fueled some important journalism: One LAT story focused on a popular elementary school and showed that its students actually made unimpressive improvements on test scores after they’d gotten in. This suggested that evaluating schools based on test scores alone can tell you more about the students a school attracts than about how good a job it’s doing. Another story showed that, over the seven years studied, 8,000 Los Angeles kids had gotten stuck with a low-performing teacher for at least two years in a row. This is exactly what Kane’s research warns against, especially for at-risk kids. Late last year, the Los Angeles school district signed on to create its own value-added report cards for teachers.

If Kern’s ruling withstands appeal, the media outlets with access to New York City’s value-added data will include the New York Post, the New York Times, the Wall Street Journal, the Daily News, and NY1. With luck, they’ll crunch the numbers and we’ll find out which schools have a concentration of the best and worst teachers and how likely schools are to place students with poorly performing teachers.

Inevitably, some teachers will be unfairly tagged along the way. I’m sorry about that. I also agree with Merryl Tisch, the head of the New York State Board of Regents, that New York must improve its student exams, which are too easy and predictable, if the value-added data is to be truly useful. The new tests should be ready for the 2011-12 school year. That should mean better teacher report cards are on the way. In the meantime, let’s look hard at the data we’ve got.
