Reading Roger Farr’s Reading: What can be measured? brought to mind two past epiphanies I’d had when administering standardized tests to children with dyslexia. First, there was the time that Nick, a 4th grader, correctly completed the following analogy, which no other kid had ever gotten right in any of the scores of times I’d administered that test:
Shakespeare : playwright :: Gershwin : ________________
“Composer,” he said guilelessly. Breaking every code in the How to Administer Standardized Tests Handbook, I gasped and asked him, “How’d you know that?” He explained to me that in the hallway of the music and art wing of his elementary school hung framed photos of famous composers throughout history: Bach, Beethoven, Mozart, Foster, Sousa, and Gershwin, among others. It struck me right then that this child — who empirically did not have superior word knowledge to the dozens of children who had erred on that analogy — simply had an experiential advantage on that particular item. How often, I wondered, did such random biases affect testing outcomes?
The second epiphany came while administering a phonological processing test for what was probably the hundredth time. One subtest, whose purpose is to assess how quickly one can name randomized digits, consists of two pages of rows of numbers that the examinee must read aloud while being timed. One child, a sweet boy repeatedly deemed a “slow processor” by psychologists, teachers, tutors and administrators, read laboriously through the two pages, tearing up in frustration as he did so. When he got to the end of the second page, the boy looked at me and said unassumingly, “The second page is the same as the first, only upside down and backwards.” I assured him that that wasn’t the case and flipped back to show him that it was randomized, only to discover that he was right. This dear boy, who had taken more than three times as long as most of my reading-disabled students, had seen something in the test that I had not, that no other child in my care had, that none of my clinicians had, and that no trainer or instructor had ever indicated knowing. Was it possible, I wondered, that this child’s “slowness” carried with it the significant advantage of deliberativeness, of discovery, and of that singular mark of high intelligence: the ability to find patterns? I later learned from this boy’s grandfather that he had an uncanny gift for discovering patterns in nature, like Fibonacci numbers and von Kármán vortices. But I knew from the test that he was terrible at reading digits fast, by gum!
In addition to my inner musings as I read Farr, the external world supplied three separate events that made this 40-year-old monograph feel newly relevant:
(1) The hype machine for the new film Waiting for Superman got all geared up and running;
(2) Several friends posted links to a New York Times op-ed piece called “Scientifically Tested Tests”; and
(3) A friend emailed me for advice, frantic over her very bright, well-behaved 3rd grader’s D’s and F’s on reading comprehension tests.
Never an apologist for standardized tests, I’ve had to accept them as something of an expected evil, if not a necessary one, in the colliding worlds of education and accountability. In the graduate courses I teach, part of my job is to explain to other educators the pertinent structures and functions of cognitive and achievement assessments, including progress monitoring, screening, program evaluation, and diagnosis. In my role as an administrator, part of my job is to explain to parents what standardized tests do and don’t indicate, and what that may mean for their own child’s future and performance. I thank my lucky stars that in no realm of my work life am I expected to “teach to the test,” but I am keenly aware that most educators don’t share that professional freedom.
Like the other books I’ve read thus far in this study, Farr’s illuminates the longitudinal nature of educational debates. The present-day polemics about skills assessment are not new: Farr’s monograph raises issues that may still not be resolved today. Because psychometrics is not my particular area of expertise (nor of enjoyment), I can’t claim to be up to date on the current research regarding reading assessment. I can, however, speak to the ongoing presence of several of the same issues Farr spoke to decades ago:
1. Teachers still don’t know what they don’t know. Farr cites studies from the 1950s and 1960s suggesting that “many teachers are unable to identify those areas crucial to the teaching of reading in which they lack knowledge” (23). Boy, howdy, does that sound like a modern-day lament. The current research base in the field of ‘science-based’ reading indicates that teachers still lack the explicit understanding of language structure that is critical to effective teaching and positive student outcomes in reading and spelling (Moats 2010: 1; 2009a: 379; 2009b: 387; Moats & Foorman 2003: 36; Snow, Griffin & Burns 2005: 9-10; Lyon & Weiser 2009: 475). Moreover, teachers tend to overestimate their own linguistic knowledge (Cunningham, Zibulsky & Callahan 2009: 498; Cunningham et al. 2004: 157; Moats 2009a: 388).
2. Teachers and administrators still blame each other for reading failure. Farr cites a 1963 study by Austin and Morrison indicating that while administrators found teachers to lack creativity, knowledge and skills in teaching reading, teachers found that support, time, and resources for teaching reading were lacking (63-64). I suspect that they were both right then, and they’re still both right when they issue similar complaints today. Teaching literacy is pretty challenging, and there are innumerable variables that aren’t likely to be measured in a test, like whether or not the kid had breakfast that day. The blame game seems to have more players today, though. Not only do educators point the finger at each other; they also implicate parents (and vice-versa), and television and movies (and vice-versa — see Superman and the Oprah juggernaut, for example).
3. We still measure reading and . . . . Farr stipulates that, depending on the test in question, we may not actually be measuring the target skill. For example, are we measuring, as Farr says, the “power” of comprehension or the speed of comprehension? Verbal intelligence or vocabulary knowledge? Knowledge or experience? My frantic friend was relieved to realize that the tests at school measuring her 3rd grader’s ability to do a cold read of a passage and answer questions in a particular time frame really had very little to do with how well he could read (or be read to) and understand stories, concepts and explanations in texts over time.
4. We still don’t know how to account for socioeconomic, familial, or neurological non-normativeness in normative situations. Nowhere is this more evident today than in the aggregation demands and constraints for high-stakes testing as outlined in No Child Left Behind, Reading First, and other government-driven reading improvement efforts.
I offer here no profound solution to these persistent dilemmas in the teaching and assessment of literacy, other than, ironically, a continuation of what’s already being done, and what has been done for more than a century: a dialogue. The dialogue will inevitably be one in search of whom to blame. It will inevitably leave questions unanswered. But it will also have victories along the way. I’m a little encouraged by the hope I see in such sweeping statements as “It’s possible. Together we can fix education,” splashed across the top of the Waiting for Superman website. I am also pleased to see a psychologist like the one who wrote the Times op-ed piece suggest ways in which “testing could be returned to its rightful place as one tool among many for improving schools, rather than serving as a weapon that degrades the experience for teachers and students alike” (Engel 2010). But mostly, I am encouraged by the relief I see in parents when professionals take the time to explain that failing a test isn’t the same as failing as a person, and that sometimes, tests just aren’t measuring what we think they are.
It’s been a long dialogue, and it isn’t over yet. I guess the more things stay the same, the more they change.
Cunningham, Anne, Zibulsky, Jamie & Callahan, Mia D. 2009. Starting small: Building preschool teacher knowledge that supports early literacy development. Reading and Writing: An Interdisciplinary Journal 22(4): 487-510. Dordrecht, Netherlands: Springer Science & Business Media.
Cunningham, Anne, Perry, Katheryn E., Stanovich, Keith E. & Stanovich, Paula J. 2004. Disciplinary knowledge of K-3 teachers and their knowledge calibration in the domain of early literacy. Annals of Dyslexia 54(1): 139-167. Baltimore, MD: International Dyslexia Association.
Engel, Susan. 2010. Scientifically tested tests. New York Times. Retrieved September 27, 2010, from http://www.nytimes.com/2010/09/20/opinion/20engel.html.
Farr, Roger. 1969. Reading: What can be measured? Newark, DE: International Reading Association.
Lyon, G. Reid & Weiser, Beverly. 2009. Teacher knowledge, instructional expertise, and the development of reading proficiency. Journal of Learning Disabilities 42(5): 475-480. Austin, TX: Hammill Institute on Disabilities.
Moats, Louisa C. 2010. Speech to print: Language essentials for teachers. Baltimore, MD: Paul H. Brookes.
Moats, Louisa C. 2009a. Knowledge foundations for teaching reading and spelling. Reading and Writing: An Interdisciplinary Journal 22(4): 379-399. Dordrecht, Netherlands: Springer Science & Business Media.
Moats, Louisa C. 2009b. Still wanted: Teachers with knowledge of language. Journal of Learning Disabilities 42(5): 387-391. Austin, TX: Hammill Institute on Disabilities.
Moats, Louisa C. & Foorman, Barbara R. 2003. Measuring teachers’ content knowledge of language and reading. Annals of Dyslexia 53(1): 23-45. Baltimore, MD: International Dyslexia Association.
Snow, Catherine E., Griffin, Peg & Burns, M. Susan (eds.). 2005. Knowledge to support the teaching of reading: Preparing teachers for a changing world. San Francisco, CA: Jossey-Bass.