What Bernie Madoff Can Teach Us about Accountability in Education
Mindful of H.L. Mencken's observation that "for every complex problem there is an answer that is clear, simple and wrong," the new Obama administration should avoid the mistake of previous administrations in equating accountability in education with high-stakes test scores. There is increasing evidence that flaws in current test design should all but disqualify the continued use of these tests as metrics of accountability, especially in science and mathematics education.
To help us head off a potential collapse of trust in public education comparable in scale to the collapse of trust in our financial system, we might draw parallels with what we are learning from the economy. In particular, the closure of Bernie Madoff's fraudulent investment firm stands to teach us at least four basic lessons we might use in reflecting on the role high-stakes testing has in driving current education reform.
A first lesson is that the most compelling evidence for something being wrong is often hidden in plain view. Consistent investment returns of ten percent or more can't be real, and they weren't. Similarly in education, there is mounting evidence in plain view that our current approach to high-stakes test design can't tell us what we need to know in order to drive education reform.
Separate from whether any one test can give a complete picture of what a student knows or has learned in a given year - where the answer is obviously "no" - there is the more precise question of whether, empirically, the tests work as good measures of what a teacher has done during a given school year. The answer to the latter question is also "no."
Using student scores from the Texas Assessment of Knowledge and Skills (TAKS), our university-based research group has analyzed both the effectiveness of some specific reform projects in mathematics and year-to-year scores from the entire state in science, mathematics, social studies and English. For the most part, we found the TAKS tests to be what W. James Popham of UCLA calls "insensitive to instruction."
This means that even in situations where sensitivity to instruction is most implicated - e.g., situations where there is a sustained, aggressive, high-quality, and content-focused intervention - most of a student's score (more than seventy percent of the variance) on the high-stakes TAKS test is predicted by the previous year's math scores (with, at most, only 7-8% of the variance related to the intervention). We have checked with colleagues involved in mathematics interventions from around the country, and their results with similar tests are comparable. We also found that the predictive power of previous math scores holds up over a number of years of math testing, not just for the year prior.
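For readers who want to see what "percent of the variance" means here, the following is a minimal sketch on synthetic data, not our TAKS analysis. The variable names, the sample size, and the weights 1.87 and 1.2 are all assumptions, chosen only so that the simulated shares land near the figures quoted above. It fits an ordinary least-squares model on prior scores alone, then adds an intervention indicator and reports how much additional variance that indicator explains.

```python
import numpy as np

def r_squared(X, y):
    """Share of the variance in y explained by a least-squares fit on X."""
    X = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares
    residuals = y - X @ beta
    return 1.0 - residuals.var() / y.var()

# Synthetic, hypothetical data: none of these numbers come from real test scores.
rng = np.random.default_rng(0)
n = 5000
prior = rng.normal(0.0, 1.0, n)                 # standardized prior-year math score
treated = rng.integers(0, 2, n).astype(float)   # 1 if the student received the intervention
noise = rng.normal(0.0, 1.0, n)

# Illustrative weights only, picked so that prior scores dominate the outcome.
current = 1.87 * prior + 1.2 * treated + noise

r2_prior = r_squared(prior, current)
r2_both = r_squared(np.column_stack([prior, treated]), current)
incremental = r2_both - r2_prior                # variance added by the intervention

print(f"variance explained by prior score alone: {r2_prior:.2f}")
print(f"additional variance explained by the intervention: {incremental:.2f}")
```

The point of the sketch is the comparison, not the particular numbers: when last year's score already accounts for most of the variance, even a strong intervention has little room left to show up in the score.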
We then did a series of cross-disciplinary comparisons where the results might be expected to be the most distinct: math scores versus English, science, or social studies scores. What we again found were similarly high levels of test scores predicting other test scores, in ways that are very likely to overwhelm the effects that any teacher could be expected to have in any given year.
For reform-oriented accountability to work, test scores need to be highly sensitive to what educators do. Instead we have tests made up of items selected for their ability to consistently sort students, year-in and year-out, in the same order relative to an increasingly cross-test, cross-year, and even cross-domain psychometric "profile" (i.e., the location of students, in terms of an ability construct, on a logistic curve) developed by the testing organizations.
Needless to say, these results are highly problematic for reform.
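As a point of reference for the "logistic curve" mentioned above, the simplest standard model of this kind is the one-parameter (Rasch) item response model. Which model any particular testing organization actually uses is not something we specify here, so the formula below is an assumed illustration, with the symbols introduced only for this purpose: θ_i is the ability of student i and b_j is the difficulty of item j.

```latex
% Rasch (one-parameter logistic) item response model, shown as an assumed illustration.
% \theta_i : ability of student i;  b_j : difficulty of item j.
P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{1}{1 + e^{-(\theta_i - b_j)}}
```

Under a model of this form, an item is retained when student responses to it fit the curve, which is to say when it sorts students consistently with the ability estimate already built from the other items.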
A second important lesson Madoff teaches us is that for misrepresentation to work at a large scale, our desires and, even more so, our fears need to be played to, often by appeals to highly specialized forms of expertise or insider knowledge.
Perhaps no single piece of recent domestic legislation speaks more directly to our hopes and fears as a nation than the goals of the No Child Left Behind legislation to improve both equity and the levels of excellence in education.
The fact, then, that these largely self-referential and self-confirming testing profiles align so consistently with existing inequities related to socio-economic status, race, or first language only serves to underscore how problematic our findings are. That the math tests in Texas are now being validated, in the name of predicting "college readiness," against what historically have been tests of "aptitude" (e.g., the SAT), with comparably problematic outcomes along these same dimensions, makes it even more likely that our high-stakes tests in mathematics and science will re-inscribe precisely the sorts of inequities the No Child Left Behind legislation was ostensibly meant to address.
Making matters worse, in an era when accountability hinges on improving scores, changing a student's placement on this self-referential profile - by teaching test-taking or test-breaking skills - is likely to be at least as effective as teaching the actual content better. Minimum exposure to content plus heavy test preparation, especially in schools that are underperforming, might very well turn out to be an "optimal" gaming strategy for improving scores. Anyone who has recently spent time in schools feeling pressure to improve test scores can attest to a dramatically heightened attention to test-taking skills, at a level that might make even the employees of test-preparation companies, like Stanley Kaplan, blush. The consequences of teaching "test taking," as opposed to substantive math or science, are likely to be profound in their long-term implications, especially for children attending schools currently deemed underperforming.
A third lesson Madoff teaches us is that if you want to forestall the day of reckoning, make sure you are in charge of both generating and then interpreting your own metrics.
Currently only a handful of private organizations and companies operating in the United States have the large banks of proprietary items developed, and calibrated, in terms of fit with their own internal statistical profiles. Consequently, only these organizations have the ability to produce tests that can be used to evaluate our movement toward the psychometrically defined goals of the No Child Left Behind legislation. Test publishers are essential both to ongoing test construction and to the interpretation of the results for nearly all of the high-stakes tests developed in the country.
With affiliates of these same publishers also controlling the lion's share of the textbook market here in Texas and around the country, one might legitimately begin to wonder how, when it comes to the academic side of schooling (as opposed to school financing), anyone would continue to describe the US education system as locally, or even publicly, controlled.
The fourth lesson Madoff teaches us is to surround oneself with true believers. Reputations have to be on the line and this will make coming to grips with what is really going on that much harder. Some have speculated that even Bernie Madoff, at some early point, might have believed in his own seeming successes.
Those of us deeply involved in reforming science and mathematics education, and who might have once wanted to believe in the potential of testing as a blunt but perhaps necessary instrument of reform, are now forced to come to grips with the full implications of the tests being "insensitive to instruction" in a way that vastly diminishes the role they can hope to have as instruments of reform. We were wrong to help sell the idea of placing so much trust in institutions that, in retrospect, stood to benefit the most monetarily from our continued willingness to suspend disbelief.
Our professional reputations are indeed on the line, making this the toughest lesson the collapse of the Bernie Madoff empire may have to teach. We hope the new administration can learn from our mistakes well before belief in public education's ability to serve the purposes of a just, economically robust and democratic society is lost.