Posted on October 23, 2009 at 4:36 pm by Amber Winkler

Arne Duncan and the E.D. Hirsch imperative

Whew, I just finished reading Secretary Duncan’s meaty address to the faculty and students at Teachers College at Columbia University. It’s a fairly brave speech by Education Secretary standards (despite the fact that he defines great teaching as “a daily fight for social justice”).

Duncan calls out weak teacher prep programs and pushes them to better measure their success based on teacher outcomes:

… Simply put, incoming [college] freshmen don’t know the content because too often they have been taught by teachers who don’t know the content well. … Now the fact is that states, districts, and the federal government are also culpable for the persistence of weak teacher preparation programs. Most states routinely approve teacher education programs, and licensing exams typically measure basic skills and subject matter knowledge with paper-and-pencil tests without any real-world assessment of classroom readiness. Local mentoring programs for new teachers are poorly funded and often poorly organized at the district level.

Less than a handful of states and districts carefully track the performance of teachers to their teacher preparation programs to identify which programs are producing well-prepared teachers-and which programs are not turning out effective teachers. We should be studying and copying the practices of effective teacher preparation programs-and encouraging the lowest-performers to shape up or shut down.

… Right now, Louisiana is the only state in the nation that tracks the effectiveness of its teacher preparation programs. Every state in the nation should be doing the same-and, as I said, we are going to provide incentives for states to do so in the $4.3 billion Race to the Top competition. It’s a simple but obvious idea-colleges of education and district officials ought to know which teacher preparation programs are effective and which need fixing. Transparency, longitudinal data, and competition can be powerful tonics for programs stuck in the past.

All well and good. But there’s even more for those of us who value teaching would-be teachers a strong body of content knowledge. Duncan gives a well-deserved pat on the back to E.D. Hirsch, the founder of Core Knowledge, and then links Hirsch’s unwelcome reception at his former education-school employer to a finding by Arthur Levine:

Ed school deans and faculty interviewed for Levine’s study painted an unflattering picture of teacher education, which they complained was “subjective, obscure, faddish . . . out-of-touch, politically correct . . . and failed to address the burning problems in the nation’s schools.” English professor E.D. Hirsch, the father of the acclaimed, content-rich Core Knowledge Program, got his own taste of the ideological blinders at colleges of education when he chose to teach an ed school course on the causes and cure of the achievement gap. Having authored the 1987 bestseller, Cultural Literacy, Hirsch anticipated that his course would be oversubscribed. But three years in a row, only 10 or so students enrolled. Finally, one of Hirsch’s students informed him that other professors in the ed school were encouraging students to shun the course because it ran counter to their pedagogical beliefs.

This excerpt of the speech was of particular interest to me because I was a doctoral student at UVA when Dr. Hirsch taught there. I remember seeing his undergraduate course listed in the course directory and wondering how long the waiting list was to take it. I walked by his classroom door one afternoon, fully expecting to see a room bursting at the seams, and counted maybe 6 students in there. I learned later that his courses rarely had more than 10 students, even on a good day. Why? In short, he didn’t have too many friends there. Most of the professors (including ones for whom I served as a teaching assistant) weren’t too fond of the Core Knowledge professor (he was “out-of-touch” with what teachers really needed to learn…) and they didn’t hesitate to tell their own students that.

Sol Stern also wrote a fine new piece in the City Journal highlighting Hirsch’s work and impact. In it, he quotes Hirsch: “That children from poor and illiterate homes tend to remain poor and illiterate is an unacceptable failure of our schools, one which has occurred not because our teachers are inept but chiefly because they are compelled to teach a fragmented curriculum based on faulty educational theories.” Though I do indeed believe that some teachers are “inept,” there is nonetheless a lot of truth to that statement.

I’m happy to see Secretary Duncan calling attention to a man who has done so much to shine a spotlight on the need for a content-rich curriculum in all of our schools.

If only the ed professors were listening.

Photograph from the New York Public Library photography collection

Comments

  1. Karl Wheatley:

    We (ed school professors) ARE listening, but unfortunately, Secretary Duncan’s proposals so far have shown a weak grasp of the issues and of what works best in the long run. (Superintendents like Duncan or Rod Paige are usually smart, tough individuals with great people skills, but often don’t understand motivation, learning, curriculum, or assessment very well.) For example, a historic weakness of American curriculum (I teach curriculum year round) is that it is overcrowded, and thus American students never master the critical facts and key skills, and so we re-teach them the same things year after year. In some other countries, the textbooks are a fraction of the size, they go more slowly, and they make sure kids master the material the first time.

    I begin every course by analyzing the Declaration of Independence with students and discussing its possible implications for students. Then I ask the class how many remember studying the Declaration of Independence in school and how many remember memorizing state capitals. Usually, 70-100% will say they remember memorizing state capitals (which is trivial knowledge) but only 0-25% remember studying the Declaration of Independence.

    Similarly, we waste time teaching all sorts of subskills that children don’t need to be taught. Much of our system of factory schooling is still based on behaviorism, a theory that has been largely displaced in psychology by more powerful theories, but that still has our schools under its thumb. If you follow behaviorism, it may seem that you need 600 separate lessons to teach kids to read, while if you follow more powerful and recent theories, you can help virtually all kids learn to read by following a half dozen guidelines and owning a library card (and kids will like reading and writing more at the end). Guess which approach costs taxpayers more money and crowds out time for more important content?

    No, our curriculum is far too crowded with trivia. Doing less, and more in-depth, is far more powerful.

    As a teacher educator, I know Duncan is right that teacher ed needs to do better (as does every profession). However, content experts are often some of the worst teachers (teaching is a lot more than talking through the content), and many students have a good grasp of the content but fail terribly when they get to student teaching because they lack all sorts of personal and interpersonal skills and habits. Pretending a teacher who has content knowledge is “highly qualified” is like pretending a plumber who owns a wrench is a good plumber. Through the ETS Praxis series, we have actually been testing prospective teachers for “content knowledge” for years. Judging from a recent study in AERJ, it has had little effect.

    If you can force all districts to align their curriculum tightly to the state test and align that state test to NAEP, it is easy to raise state and NAEP test scores, but so what? Kaplan also easily boosts scores on all sorts of tests, but that doesn’t mean the kids are really any more capable, just more capable test takers. Thus, such gains on tests generally don’t translate very well into real-world competence or success. As a general rule, like a fad diet, the approaches that raise test scores fastest in the short run are the approaches most likely to be counterproductive in the long run. Whether Massachusetts has achieved any meaningful and lasting improvement in actual, real world student competence is impossible to tell, based on the data I’ve seen.

    The root problem is that “testing knowledge” and “more content” come from an outdated and less effective paradigm of education: authoritarian, factory style education. After decades of research have shown that alternative approaches work much better, it’s time for policymakers to stop making the same mistakes over and over again, albeit in intensified fashion. Educational approaches with integrated, interest-based, real-life curriculum, substantial student choice, local control, and authentic assessment simply work better in the long run for the range of goals that parents and employers value–and they are also more consistent with core American values.

    To get to what is more effective, we have to give up approaches based on outdated ideas of learning, motivation, and curriculum, including NCLB, a curriculum filled with teacher-dominated separate subject instruction, merit pay for test scores, and yes, we need to outgrow ideas like Race to the Top.

    Respectfully submitted, -Karl Wheatley

  2. tim-10-ber:

    To Karl Wheatley, thank you for your comments. My question is how do you eliminate the teaching degree, test prospective teachers for the appropriate personality traits to be an exceptional teacher, tighten up curriculum (strengthen and make it stronger), implement the new procedures, etc. It has been mentioned in many blogs that the traditional ed schools are money makers for colleges/universities. If true…how in the world do you shut down a cash cow and make the changes needed for today’s students, who seem to be poorer, have more challenges at home and need better teachers in the classroom?

    Thanks –

  3. Ze'ev:

    To Karl Wheatly,

    1. “If you follow behaviorism, it may seem that you need 600 separate lessons to teach kids to read, while if you follow more powerful and recent theories, you can help virtually all kids learn to read by following a half dozen guidelines and owning a library card (and kids will like reading and writing more at the end).”

    Can you please elaborate on this a bit? I am not an expert on reading so I am unsure what you refer to here. The recommendations of the National Reading Panel do not imply (to me) 600 separate lessons, but I also don’t interpret them as a simple half a dozen guidelines (implying that they can be taught in a half dozen lessons).

    2. “many students have a good grasp of the content but fail terribly when they get to student teaching because they lack all sorts of personal and interpersonal skills and habits.”

    If indeed many teachers had a good grasp of content, we wouldn’t see such low passing rates on Praxis II content knowledge tests, which just seem to get lower over time. For example, 29% of teachers failed in math content in 2006, with passing scores varying from a ridiculously low 115 to a mediocre high of 150 (out of 200), up from only 18% failing in 1999. In English Language, Literature and Composition the failing numbers are a bit lower but still rose from 9.5% in 1999 to 14% in 2006. This seems to cast doubt on your assertion that the issue is definitely not content. Perhaps there are other issues, but content is certainly among them.

    3. “If you can force all districts to align their curriculum tightly to the state test and align that state test to NAEP, it is easy to raise state and NAEP test scores, but so what? … Thus, such gains on tests generally don’t translate very well into real-world competence or success.”

    If this were true, no state would have any difficulty meeting the 2014 “100% proficient” NCLB criterion. After all, all states by now have their standards, curriculum, and assessment aligned due to NCLB. As to translation of test scores to real-world competence, SAT scores, which are very much like any other of “those” tests, do correlate well with first-year college GPA. Are we to understand that this doesn’t count as real-world success? The evidence for both your claims shows otherwise.

    4. “The root problem is that “testing knowledge” and “more content” come from an outdated and less effective paradigm of education: authoritarian, factory style education. After decades of research have shown …”

    Heated words, but rather groundless. Decades of research have shown just the opposite, that content knowledge is crucial to development of critical thinking and deep understanding. Check, for example, http://www.aft.org/pubs-reports/american_educator/issues/summer07/Crit_Thinking.pdf .

  4. Karl Wheatley:

    For tim-10-ber,

    Teacher ed programs are often cash cows for some universities, and yes, we need to do better on quality control (as do all professions). I disagree with the assumption the goal is to shut them down–you need some kind of teacher ed/teacher certification programs, and graduates of the better ones currently perform very well. Rod Paige liked to talk about “Eliminating barriers to teaching,” which sounded to my ear like “lowering standards for the people who teach our children.” Clever spin, but wrong idea.

    The question is how to routinely ensure better quality preparation. First, some colleges limit the size of a cohort, which allows for some quality control on the front end, and Cleveland State University, where I teach, has allowed us to do that. Second, if faculty are highly involved in designing and running the program (rather than implementing someone else’s cookie-cutter ideas) they seem to take quality assurance more seriously. For example, I’m the coordinator of the ECE program (PK-3), and we’ve been working for years on designing key criteria that you must meet, or no teaching degree. So, if you can’t do “kidwatching” (a set of 5 skills, including designing appropriate curriculum based on observations of children), or active listening (crucial both in teaching and conflict resolution), no license for you, no matter what your grades. Faculty can and often do enforce specific criteria–I have one assessment skill that students must show mastery of or else they get an incomplete in my curriculum class until they do–even if they have 100% on everything else. Then, we have a set of professional dispositions that students must demonstrate (e.g., ability to work professionally with others, completes work on time), and if faculty report problems in those areas (apart from grades) you can be called in to discuss the problem, and if you don’t fix it, you get removed from the program. We’re just getting that underway, but it’s proven quite promising already, and our middle school program has done it for years.

    You cannot “test” for many of the things that matter most in teaching, if you mean standardized bubble-in tests. A March 2009 meta-analysis by D’Agostino and Powers in American Educational Research Journal found that “test scores [on teacher licensure tests like Praxis] likely do not provide additional information beyond preservice performance to safeguard the public from incompetent teaching” (p. 146). I’ve taken the Praxis tests for the licensure program I teach in, and I found myself looking at some of the questions and thinking “No one needs to know that in order to be a good teacher.”

    What we need to move to is more analysis of actual videotapes of students teaching, solving classroom conflicts, etc., and tighten up quality control in teacher ed.

    The last key thing may be to have community panels make a final determination about licensure, maybe two faculty, a master teacher, a parent, etc. Teacher educators often have a hard time failing people by themselves. Maybe we’re too nice to boot out even people who should find another career, but that does children and families no favors, so these community panels can work very well for quality assurance. It also reflects American values nicely.

  5. Karl Wheatley:

    To Ze’ev,

    Thanks for your thoughtful remarks.

    Regarding your point 1, test-based education has led to much greater use of direct instruction, because direct instruction seems to people like a logical way to teach to the test (even though it doesn’t seem to work better in the long run for test scores, and often works worse; witness the decreased growth in NAEP math scores during NCLB), and because countless people have been running around saying the Follow Through Study shows direct instruction works best (not knowing perhaps that analyses shortly after Follow Through was completed concluded that the study was too poorly done to allow any clear conclusions; I believe it was in a 1975 Harvard Ed Review plus a GAO report). If you go to the Association for Direct Instruction website, you find the hundreds of lessons they believe are necessary for teaching reading, writing, spelling, etc.

    The National Reading Panel Report, perhaps through no fault of the people on the panel, is simply not a usable document for making policy or curriculum decisions. If you do a beautifully executed scientific study of 1000 Clevelanders about what is the correct color to wear on football Sundays, you’ll learn that orange and brown are the “scientifically proven” correct colors to wear. The composition of the NRP panel was deeply problematic (having more people who actually teach young kids would have been a good start, and people who have a systems understanding of education), the way their charge was framed was problematic, how they chose studies was problematic, and then how they used the studies was problematic too, for example, missing where whole language kids did just as well as kids in skills-based teaching, or where the studies they cited undermined the conclusions they drew from them. Many books have already been written on the problems (Allington, etc.). If you study what’s the best way to teach reading by assuming you can ignore culture and motivation, and assuming a factory-style instructional approach, and assuming separate subject instruction, well, there’s no reason for us to care what you find, because the best approaches for achieving a combination of skills and love of reading take culture and motivation very seriously, use integrated curriculum much of the time, and don’t use factory-style instruction. We spent tens of billions on Reading First and didn’t get any real improvement for our money because the government guidelines are based on flawed assumptions about learning, motivation, education, and research. Learning to read appears difficult because schools follow the wrong paradigm, while lots of homeschoolers report it being a pretty easy process (if you have the right paradigm). We need paradigm change in education, not tinkering with broken models.

    Regarding point 2, my fault for not being clearer. I agree content knowledge matters for teachers, and I agree some teachers and prospective teachers should be more knowledgeable, and in some cases, the issue will be content knowledge. However, there are other things that matter as much, and I was just whining about the tendency to define “highly qualified teachers” based on subject matter knowledge alone. If you saw my earlier response, I noted that researchers in a recent AERJ article have found these tests don’t seem to have very good predictive validity for teaching performance, so there’s reason to doubt their utility. Whether they are good assessments of teachers’ subject matter knowledge in those domains, I don’t know. It’s important not to assume the tests are good predictors of what matters. As a university teacher, I was asked to teach a lot of new courses my first year. I wasn’t so expert in them then, but I learned the material quickly because I was teaching it. So there’s room to be skeptical about whether a test prior to teaching predicts content knowledge a few years later, and the lack of added predictive validity for the teacher tests suggests we shouldn’t be taking them too seriously in policy discussions.

    Regarding your point 3 about SAT scores, I meant raise average scores, not raise every kid’s score to a reasonable “proficient” level. Proficiency for all is fine for specific low-level skills, but proficiency for all in a whole subject domain is an oxymoron as we usually define proficiency, which means you are better than maybe 2/3 of the others. If everyone gets better, we move the bar upwards, and kids who had just gotten above the proficiency bar are now below it again. Ditto for “on grade level”: it’s a norm-referenced idea, and you can’t get all kids on grade level any more than you can make all kids average height by feeding them better. As all boats rise, the “on grade level” and “proficiency” bars rise too, remaining forever above many kids’ heads, unless you re-define “proficiency”–which this strange NCLB target forced states to do. When people who understand education saw those targets (100% proficiency, all on grade level), they instantly understood that the NCLB architects didn’t understand education very well.

    The r-squared for SAT scores and college completion and ending GPA was in the low single digits the last time I checked, and SAT scores don’t predict life success, which is perhaps why over 800 colleges and universities, including many of the best, are now test-score optional (see Fairtest website). In some professions, GRE scores have been negatively correlated with professional success. Forty years after the First International Math Study (FIMS; those students are now in their late 50s), test scores from that test were negatively correlated with national wealth, rate of economic growth, creativity (patents), degree of democracy, and happiness in the mid-2000s. If you toss out the so-called Asian Tigers, TIMSS scores also show very weak to non-existent correlations with economic outcomes for the U.S. and many other countries. Peter Sacks’ book Standardized Minds is a good primer on the limitations of test scores. We want great assessments, but bubbling in answers doesn’t fit the bill. Ironically, when I began in teacher education, it was assumed that the weak teachers were the ones who had to give a lot of tests to assess, and use a lot of rewards to motivate. Now, doing that is the centerpiece of official federal policy. No wonder teachers and researchers are not amused!

    Regarding point 4, I’m totally in favor of critical thinking and totally agree content knowledge matters. I’ve known of that research for 2 decades, so it must be at least 3 decades old. The question is how we get those outcomes AND all the other ones we want. I’ve been working on a book about NCLB/educational models for about 5 years, and whether or not this is true of you, people who start by saying we need more content, basics first, longer school year, rigorous core curriculum, more testing of knowledge, more homework, usually end up advocating for traditional factory-style, test-driven schooling. The best longitudinal research I know of suggests that approaches with substantial integrated curriculum, student choice, interest-based learning, and authentic assessment yield similar test scores to traditional teaching, but consistently yield deeper understanding, better application to real-world problems, better creativity and critical thinking, greater acceptance of diversity, and greater love of learning. You can look to the Eight-Year Study (1930s, 40s), Herb Walberg’s 1986 meta-analysis of open vs. traditional education, and various studies on constructivist, interest-based teaching. In motivation research, autonomy-supportive approaches, not controlling others, are consistently associated with better long-term outcomes. Germany did a study of 50 play-based kindergartens versus 50 academic ones, and at the end of elementary school, the kids from the play-based programs were doing better on every single academic indicator. Developmental psychologists have been shouting from the rooftops that kids need more play, more recess, and fewer worksheets, if you really want what works best for them (see the book Einstein Never Used Flash Cards, or Pellegrini’s work). Vars also did a good meta-analysis finding that integrated curriculum yields just as good basic skills (and then you see it has all these other side benefits).

    These approaches to education aren’t what most of us grew up with, and they are as messy as democracy itself, but they work better in the long run for achieving the goals we value, and they fit better our core American values.

    – Karl Wheatley

  6. Ze'ev:

    To Karl Wheatley,

    First, sorry about mangling your name in my previous post.

    1. Your dismissal of Project Follow Through results is cavalier. The critique you quote as showing the study to be “too poorly done to allow any clear conclusion” is irrelevant to the question of what works. What the critique said is that the study design did not allow for variation of the failing educational models to evaluate if they could perhaps be somehow “saved.” What it did not say was that Direct Instruction was ineffective. In simple words, in a race of a dozen pedagogical “competitors” one decisively won, and the critique complained that the losers were not given an opportunity to keep trying to modify their behavior to achieve a less decisive loss. As to “hundreds of [Direct Instruction] lessons”, isn’t this like saying that a coach has no clear strategy because he gives the team a slightly different, even if consistent, pep talk before every game?

    As to the NRP, you are not the first, and unfortunately probably not the last to slight and slander the NRP through innuendo. The panel members are so accomplished that they really don’t need my defense. I understand your unhappiness that the National Reading Panel did not include Whole Language supporters. It also did not include patent medicine providers or UFO observers, and for the same reasons.

    Using Allington as an authority on reading is not very promising. His major beef with the NRP, in his own words, is “I thought their choice to focus only on experimental research was too narrow.” In other words, he would like a national panel to make nationwide recommendations based on non-experimental “research.” I have no doubts that under such definition of “research” even Whole Language would qualify–wasn’t that the whole purpose of Allington’s critique anyway?

    Finally, I do agree that we often follow the wrong teaching paradigm in the classroom. However, we disagree on the nature of the wrongheadedness. The broken model is not fostering decoding skills or phonemic awareness, but rather prematurely immersing kids in irregular text, insisting on heterogeneous grouping for reading, and expecting teachers to work miracles in them.

    2. Regarding the importance of content knowledge, if you read that AERJ paper you must have noticed that there were very few studies that actually evaluated content knowledge—only 30 out of over 700 measured effects were based on content. Had you read further, you would also notice that only three studies potentially used objective assessment (tests) to evaluate content knowledge rather than use subjective observations. Then, going to these three studies it turns out that only two of them had actually measured content knowledge with a test—the Ayers(79) and Ayers(88) studies. Finally it also happens that Ayers used the high school ACT results as the measure of content knowledge of college graduate teachers four or more years later. As if four years in college never contributed anything to their knowledge.

    In other words, the results of the AERJ study as regarding the predictive value of content knowledge for teacher effectiveness are quite meaningless. We don’t have good studies that show the effect, but there are no robust studies that show that there is no effect. Based on the results of Praxis II that I mentioned, and on accreditation examination results in California, I can confidently say that content is definitely a problem, even if quite possibly not the only one. When it comes to academic content no meta-analysis will convince me that one can teach what one doesn’t understand.

    3. As to the value of testing, your argument about proficiency shows a fundamental misunderstanding. “Proficiency” has little meaning in the context of norming, which is what you claim. Proficiency has meaning only in the context of a criterion-referenced test, and there is nothing strange or unique about having a large fraction fail a criterion-referenced test (how many pass the bar exam the first time?) or having a large fraction pass (think DMV driving test).

    We have had grade-level standards in every state since NCLB, and there is no longer any reason to define proficiency in terms of norms. In fact, no state does it anymore. Again, you seem to be confusing the two.

    Your argument about poor SAT prediction of college completion is irrelevant. I don’t believe the College Board makes any claims with regard to any correlation beyond first-year college GPA, and there the correlation is about 0.5. Please don’t shift the discussion. If the SAT were claimed to represent innate ability, it would make sense to try and explore long-term correlations. As the SAT claims to represent current achievement rather than ability, it makes no sense to search for correlation with college completion four to six years down the line.

    Similarly, your reference to FIMS seems immaterial. I am unaware of such a study (providing better references in general would help) but in any case it is quite meaningless. There were only a handful of nations represented at FIMS, its sampling methods were problematic, and, in any case, the ranks of states have radically changed over the decades due to major demographic shifts. The clearest example is Israel, which took the first place on FIMS and fell deep into the bottom half of achievers on all TIMSS programs since 1995. Further, why toss out the Asian Tigers for economic effects? Because they do show the correlation that destroys your theory? Second, if you don’t mind, I’d rather use reputable economists like Woessmann and Hanushek to evaluate such correlations rather than a journalist (http://www.atypon-link.com/doi/abs/10.1257/jel.46.3.607).

    4. Finally, I am not going to start an argument about open education vs. traditional. One approach seems better on academics, another on non-academics. I agree we need aspects of both, but we should not believe that we can create creative thinkers without content knowledge. That’s a fool’s errand.

  7. Karl Wheatley:

    To Ze’ev,

    Wow, thanks again for another thoughtful reply.

    Sorry, I don’t read the research as giving direct instruction any clear victory. I see the research as showing that direct instruction is better in the short run for the low-level facts and skills, it’s a wash between the two models in the long run on low-level knowledge and skills for the kids still around in the two models, and the democratic constructivist models are better on a wide range of social, emotional, motivational, and behavioral outcomes, including creativity, complex social skills, initiative, problem-solving, critical thinking, and love of learning. Direct instruction runs on control, control undermines intrinsic motivation and creates resistance and behavioral problems, and the problems just ripple out systematically from there (decades of self-determination theory, Deci & Ryan, are a good start for seeing the bad ripples).

    If kids learn material just for a test, they forget it faster than if they learn it for an authentic reason, and simply put, passing a test is the wrong reason to learn to read or learn science—it’s a distortion of the learning process. Kids are hard-wired by evolution to be great at learning if their basic needs are met, so we created schools a century ago that don’t meet their basic needs and don’t make use of intrinsic motivation, so motivation ramps down through the elementary years and behavior problems ramp up. Then we yell at the teachers, blame the kids, administer drugs, and tell everyone to do the same model harder. (Learning to read is relatively easy if you have the right paradigm, so all this over-the-top effort in the accountability movement and endless language arts instruction is a giant canary in the coal mine that they’ve got the wrong model).

    Also, it’s pretty clear we have different ideas about the nature of science and the degree to which education is a science like low-energy physics, and the degree to which it is an art. You think Hanushek and Linnea Ehri are great sources and Allington is disreputable, I think Allington is a better source, and Hanushek has done some questionable work, although I haven’t read him in the last year. Economists suffer from the problem of “When the only tool you have is a hammer, every problem looks like a nail.” Ehri has done some technically excellent work, but when you ask the wrong questions and have the wrong assumptions in research, you get the wrong answers.

    We seem to disagree about the degree of the problems with the Follow Through Study, and you seem to be taking test scores far more seriously than I do. I assume maybe 10-20% of what matters is on the tests we’re giving kids now, and we can’t trust the scores as much anymore even for that 10-20% because high stakes make the scores less reliable. Also, if you’re in favor of standards-based education, high-stakes testing is incompatible with real standards-based education, because it violates assessment standards, leads teachers to teach in ways that violate standards for teaching and curriculum, and violates ethical standards for teaching because high-stakes testing creates real and totally unnecessary harm. If there were official motivation standards, it would violate most of those.

    Regarding point 2, there is no such thing as objective assessment and never will be. It’s one of those linguistic short cuts we took years ago (people used to say tests of objective facts, which is far more defensible). Humans sit in rooms and decide what matters, and how much it matters, and whether or not to give partial credit and whether to use the version of the item that slightly favors Black students or the version of the same content item that slightly favors White students. They decide whether to use multiple choice or constructed response, often on totally human reasons such as what the budget is, even though constructed response may favor one group and multiple choice will favor a different group. The subjectivity is built in at the factory, so it’s harder to see. All assessment, including standardized testing, is heavily culture laden and subjective.

    Furthermore, if test scores don’t predict our economic success very well or other outcomes we care about, I don’t want to pursue test scores just because they correlate with the economy of Korea (perhaps as much because their economy is pushing test scores as better education is pushing their economy and test scores).

    I also disagree about proficiency, which at root is a norm-referenced construct in assessing a domain even if we use criterion-referenced assessments. How do we decide which assessments to use and what criteria to hold fourth graders to? We base that decision on moving norm-referenced judgments about what is pretty proficient for fourth graders, at this point in history. “Can all fourth graders tie their shoes in ten seconds?” is a criterion-referenced assessment, but assessing “proficiency” in a domain is always based on underlying norm-referenced judgments and goals. Grade-level indicators were developed based on norms for each grade level: I teach about standards, benchmarks, and grade-level indicators all year round and have students analyze the connections between their activities, lessons, and projects and those indicators. Thanks, but I’m not at all confused about this.

    We stopped using Praxis I because it was so unhelpful, we’ve poured millions into Praxis II and the best defense of it is that we don’t have conclusive evidence it’s a failure (even if one of the best recent studies finds no reason to keep using it), we’ve given up on Praxis III because it wasn’t helpful, great colleges are finding they can do without the SAT and even get better students, China is moving away from testing, Singapore realized their great test scores weren’t creating students who set the world on fire, so they tried to figure out how to pivot and develop creativity, Finland can’t figure out why on earth we want a national test or want to teach to the test, and meanwhile, test-based education yields worse test score growth, following the most test-based reading policies ever seems to be undermining students’ reading comprehension, while tossing billions behind the implementation of the NRP ideas falls flat. With high-stakes testing, we’re on the wrong side of history and psychology, and it ruins nice family evenings at home with homework that is usually unnecessary for real-world competence (at least in K-6 or so).

    I’m not shifting the discussion about the SAT—the point is what is effective in the long run, and the reason the College Board constrains its remarks to the first year of college is that the SAT is not useful or helpful in the long run. At the core of educational/developmental/parenting research is the fact that what’s effective in the short run for boosting low-level knowledge and skills or gaining immediate compliance is generally counterproductive in the long run. Fad diets are effective in the short run, using the high interest VISA card is effective in the short run for boosting one’s living standards, yelling at the kids to get them to shut up is effective for short-term-compliance—but all of these shortcuts … and skipping your workout, and going on too little sleep, and direct instruction as the overall instructional model—are counterproductive in the long run.

    Stanford’s Ramirez and colleagues weren’t all that impressed with the more recent correlations between test scores and the economy.

    The non-traditional approaches yield similar test scores (and much of what is on tests is trivial, and is only chosen because it’s a quick correlate, not because it matters), and the non-traditional approaches consistently yield better critical thinking. Hmm. How could non-traditional education yield better critical thinking if you claim it’s worse on content knowledge and also claim content knowledge is key to critical thinking? Well, because much of the low-level knowledge assessed on standardized tests is not crucial for solving real-world problems (we’re assessing reading skills by having kids decode nonsense syllables, for heaven’s sake), and alternative approaches (constructivist, democratic, student centered) tend to yield deeper understanding—meaning these non-traditional educational approaches are better for helping kids learn the content that matters most.

    Now, when you assume the factory teaching model, it seems sensible to claim that teachers can’t teach what they don’t know, and you assume teaching is teacher-directed lessons. Child-initiated learning is often superior, and if you study homeschooling, or graduate students, or anything outside of formal school lessons, it’s obvious that learning often requires no teaching and often proceeds better without anything that resembles a formal lesson (and motivation to learn is more likely to remain intact, and behavior problems often disappear). My students, having been briefly brainwashed by the Reid Lyon line that you have to teach reading directly because reading is not “natural” in the way that oral language is, admitted when I questioned them that they had all figured out how to use their cell phones without any lessons, despite the fact that cell phones are not “natural” to humans. Hmm.

    Humans are wired to learn, and as long as their basic needs are met, are far better at learning than what we see in formal schooling. In fact, kids’ drive to learn is probably the second most important natural resource on earth, and if we want great education, we build it around that realization (and a few others), not around behaviorist ideas that a half century ago we knew didn’t even work so well with rats and pigeons.

    - Karl Wheatley

    P.S. See Hull and Spence if you’re curious about the rats and pigeons, but the reality is that much of curriculum content and testing content is still informed by behaviorist ways of slicing up content, and a lot of it never needs to be learned by anyone!

  8. Ze'ev:

    I am sorry, but it seems to me that the long-term results of Direct Instruction are just the opposite of what you describe. Not only was DI the only pedagogical approach (of the 9 tested) that strongly and consistently achieved the desired outcomes of Follow Through, but the research on the long-term impacts of DI shows strong positive and lasting effects. For example, Linda Meyer (1984) found that of the three DI NYC cohorts that she tracked, the DI students graduated high school at almost double the rates of the control cohorts (62:38, 64:38, 62:32), dropped out of school at much lower rates, and were accepted to college at almost double the rates. Gersten, Keating & Becker (1988) observed large educational effect sizes in reading and math in multiple DI cohorts in 9th grade. Please, we are all for creativity, social skills and critical thinking, but all the non-DI approaches couldn’t demonstrate even making kids read better by third grade, let alone developing higher-level skills years later. How would they acquire them anyway, by osmosis? Certainly not through reading, as all the non-DI pedagogies left kids with reading ability around the 20th percentile level.

    You say that “Kids are hard-wired by evolution to be great at learning if their basic needs are met.” Well, kids are naturally curious, and we are “hard-wired,” as you say, but only for spoken language and not for reading or for math above basic single-digit arithmetic. Don’t believe me – believe leading cognitive psychologists like David Geary, Bob Siegler, or David Klahr. Reading, writing, and any math above very basic arithmetic are not innate and need explicit teaching and learning. If reading and writing are hard-wired, how do you explain societies that never developed reading and writing? But we don’t know of any mute society.

    When it comes to relationship between education and economics, who would you rather believe—journalists, or leading economists? You argue that Hanushek “has done some questionable work.” Perhaps he did, although I certainly am not qualified to make such a statement. But do you know of any leading academician that was not “questioned” by someone at some time? Let’s stop these insinuations. If you disagree with Hanushek’s work on correlation between education and economic growth, attack it directly and specifically, but don’t bring some journalist’s kvetching as “evidence.”

    “We seem to disagree about the degree of the problems with the Follow Through Study, and you seem to be taking test scores far more seriously than I do.”

    Indeed we are, and indeed I do.

    “I assume maybe 10-20% of what matters is on the tests we’re giving kids now”

    Can you support your assumption as it relates to academic subject matter? We may agree that school tests don’t assess ethics or honesty, but when it comes to academic subject matter like math, professional mathematicians and psychometricians quite unanimously agree that well-done math tests (multiple choice or not) assess knowledge and understanding of math quite well. It is the teachers that often question this, not the professionals.

    “[and that] we can’t trust the scores as much anymore even for that 10-20% because high stakes make the scores less reliable.”

    I cannot speak for you, but I do trust the scores, when it comes to content, quite a lot. As to high-stakes making tests “less” reliable, that is quite true. But high-stakes don’t necessarily make unreliable test scores, just slightly less reliable. Unless teachers cheat, that is.

    “I also disagree about proficiency, which at root is a norm-referenced construct in assessing a domain even if we use criterion-referenced assessments.”

    Seems to me that we may be using the word “norm” in different senses. Clearly the definition of what is “proficient” is normative. “Proficiency” is defined by a panel of experts (another normative adjective) deciding—through a rather elaborate process—what goes into being proficient. In the context of testing, however, “norming” is used in the sense of equating distributions and defining rankings. This is what you implied when you wrote earlier:

    “Proficiency for all is fine for specific low-level skills, but proficiency for all in a whole subject domain is an oxymoron as we usually define proficiency—which means you are better than maybe 2/3 of the others. If everyone gets better, we move the bar upwards, and kids who had just gotten above the proficiency bar are now below it again. Ditto for “on grade level” it’s a norm-referenced idea, and you can’t get all kids on grade level any more than you can make all kids average height by feeding them better.”

    There you were writing about the ranking aspect of proficiency. Please don’t hide now behind the definition of proficiency that is tautologically normative.

    “when you assume the factory teaching model, it seems sensible to claim that teachers can’t teach what they don’t know, and you assume teaching is teacher-directed lessons.”

    Perhaps it is time to address this supposed pejorative “factory teaching model” that you so frequently throw in. What would you like to have instead? An “artisan teaching model”? Would you also like to have an artisan-made car? Or perhaps an artisan-made vaccine or antibiotic? Are you sure you will be able to afford them? Or rely on their quality? Alexander was lucky enough to have Aristotle as a tutor. But he was the sole son of a king. We have insufficient resources to educate 3 or more billion people in a one-on-one setting, and not enough Aristotles anyway. There is nothing wrong with the “factory model” for basic education, as long as it is well done and has decent quality control. We are lucky that we don’t need an artisan to manufacture a penicillin pill anymore, and we shouldn’t need an artisan to teach reading. One may get the artisan model when one gets to Harvard. At least if one believes Harvard’s PR (smile).

    “My students, having been briefly brainwashed by the Reid Lyon line that you have to teach reading directly because reading is not “natural” in the way that oral language, when I questioned them, admitted that they had all figured out how to use their cell phones without any lessons, despite the fact that cell phones are not “natural” to humans. Hmm.”

    You almost convinced me. Let me send you a new gizmo (a fancy answering machine, perhaps?) that has all the menus and instructions in Japanese, and after you provide me with the equivalent instruction manual in English (w/o asking for a translator’s help) I shall be convinced even more.

  9. Karl Wheatley:

    To Ze’ev,

    I can’t stop you from trusting in the Follow Through Study, but my recollection is that the variance within models was greater than between the models, some of the folks teaching in non-DI sites didn’t know the basis on which they’d be assessed, the DI folks did—of course, DI works better for teaching to the test in the short run, but then you see faster forgetting after the test. I worked at High/Scope for 9 years in the 1980s, and worked with two of the original Perry Preschool teachers who also worked in the Follow Through sites, and what they told me suggested that one of the models (High/Scope) was partly being created as the study progressed. I also recall that the other assessments (not the MAT) weren’t very good. I analyzed one of the Met Achievement Tests in the early 80s and recall some real problems … again, you’re welcome to take Follow Through seriously, but I don’t see it as providing a basis for any policy decisions. A longitudinal study based on a deeply flawed initial study and comparing DI to other models that largely no longer exist in those forms is no help whatsoever.

    Then, when you see more recently … TIMSS researchers complaining that the relatively poor understanding in math in the U.S. is because of our skills focus and back-to-basics stance, and recent findings in reading that following direct instruction approaches seems to undermine reading comprehension, well, it’s time to put away the scripts and flashcards, and do some real teaching for understanding. That means flexible curriculum, open-ended discussions, not spoon-feeding kids right answers (which interferes with understanding), and following kids’ lead a lot.

    My doctorate is actually in ed psych, with learning and development as my specialty, so I don’t need to “consult” these folks as I teach these topics at the master’s and doctoral levels. The miscommunication we’re having is I am saying kids are hard-wired to learn (it’s a motivational point), and you apparently think I’m saying kids are hard-wired to spontaneously spout forth spelling words or reading, or quadratic equations. The point is that kids explore, observe, imitate, ask questions, and for example, can learn to read without any lessons on short “e” sounds or syllables or prefixes or many of the things schools (and direct instruction) assume are necessary to teach. I happen to also know quite a bit about the interest-based approach to homeschooling called unschooling, and if you don’t fill kids’ days with lessons and homework, it’s rather amazing what they learn and do. When you DO fill kids’ days with lessons and homework, you understandably attribute all learning to formal teaching, and don’t know what they are really capable of (except in response to adult prompts). Kids immersed in a society, if the adults develop positive relations with them, want to understand and master the ways of that society. But the process gets turned around, perhaps as reflected in a George Bernard Shaw comment “What we want to see is the child in pursuit of knowledge, not knowledge in pursuit of the child.”

    Behavioral control tends to lead to the latter, as kids lose a sense of ownership of their learning, and school becomes a game of just trying to please the teacher, with all the associated problems. If you watch how kids learn to read in interest-based homeschooling, you come to realize that much of the “essential academic content” taught in reading is only a by-product of how the subject matter of reading got sliced into hundreds of bits characteristic of behavioral task analysis. So we have a lesson on each letter, one on prefixes, one or more on syllables, one on short “e” words, long “e” words, etc. A sampling of all these things then goes onto reading tests, because it is assumed kids need to learn these things to read, because that is how you have defined reading and assumed instruction proceeds. Under direct instruction conditions, you may find a reasonable correlation between progress on these subskills and actual reading, but if kids learn to read in an interest-based way, you realize they can bite off much bigger chunks than direct instruction (DI) suggests, and in fact, you learn that much of that academic content is actually “academic” in the less flattering sense—not necessary for real-world performance.

    What happens is “academics” like me put the categories they use to study reading development into the tests even though they may be superfluous for the learner—or we do so because they fit the DI sequence. So, after a colleague of mine bemoaned a terribly boring lesson on short “e” words, I tried quizzing two homeschooled children who had learned to read in the interest-based way. Both are fluent readers, understand what they read, and read a LOT. One of the two knew what syllables were, neither knew what prefixes were, nor which words had short e vs long e sounds, although they could read all the words on the list and knew what they meant. Similarly, I have sat with classes and analyzed one of the third grade texts from one of the top reading series, and found myself flipping through the gorgeously illustrated (and expensive) pages saying—“you don’t need to learn that to read, you don’t need to learn that …” I’ve published on the positive uses of academic content standards, but a lot of academic content is simply academic, and shouldn’t be taught, and needs to be purged from the curriculum.

    Sorry if you thought I was attacking Hanushek personally. I wasn’t, and perhaps he has done some great work recently. When I was chairing a dissertation committee, there were some problems with something he did on class size, and again a few years later I noticed something else in a later piece that made me lose some faith. I was just trying to establish that some of the people you’re citing don’t necessarily sway me, just as I stopped citing Allington with you. I don’t recall your original Hanushek point, so I’m missing your “kvetching” point.

    Why would you constrain the discussion of student outcomes to low-level knowledge and skills in a few academic subjects? Schools should focus on developing and teaching what matters most to society, not sticking with how we defined our mission long ago. In 2009, let’s get out of the box we built for ourselves in the 1850s. Subject knowledge doesn’t make the top 5 in many CEOs’ or parents’ lists of desirable outcomes, and subject matter is easier and more fun to learn when you aim for those other targets simultaneously. So, the tests assess maybe 10-20% of the student outcomes schools should be promoting, and analyses of state tests indicate they overemphasize lower-level content and underemphasize higher-level content standards, so they’re skewed and under-representative even in standard subjects.

    Professional mathematicians and psychometricians all trust the tests? Not a chance. A recent Ed Researcher had an analysis indicating the mathematicians thought maybe 1/4-1/3 of the NAEP math items had validity problems. Let people who study student learning and a few other academic specialties look at the remaining items, and we may be at 50+% of the NAEP items having problems with validity ... in MATH, the subject that folks think is so straightforward. Teachers and parents, who work with kids daily, will spot problems the math folks will never think of, and many psychometricians do not know learning and kids’ responses well, so they have their own blind spots (in fairness, I’m no good at HLM). Lorrie Shepard did a study some time back interviewing or surveying the lead psychometricians in each state department of ed, I believe. She came up with a bi-modal distribution with two quite different camps in terms of beliefs about learning and uses of assessment.

    Hmm, leading psychometricians have also pointed out that using multiple indicators is the professional standard, and what we are doing with tests violates professional standards for test use, and is, in a sense, malpractice. We are ignoring the warnings the testmakers themselves have put on their tests, and no, the scores can be quite unreliable if low-level knowledge is at hand. I once memorized the whole deck of Trivial Pursuit cards—the kind of feat that gives a wildly inaccurate assessment of actual mastery of the domain—and precisely the kind of stunt that teaching has too often turned into—especially in poor schools. Some teachers in Cleveland just practice old test items—that’s the curriculum much of the time. A test ceases to be an assessment of a domain of knowledge when people are able to teach just selected things highly likely to be on the tests, and can stop teaching the whole domain, and all the interrelationships that make a domain a real network of understanding.

    I think factory schooling is an apt metaphor, and no, I don’t think the choice is between teaching as an art or as a science. It has elements of each, and elements of coaching, manager, point guard, counselor, etc. You beat up that artisan metaphor quite handily, but teaching is a unique profession that has elements of many others in it, but it is not a scientific process like low energy physics, and if we take away from teachers the ability to make professional judgments on the spot, it’s like telling the point guard the exact route they must run down the floor, no matter what the other guy does. The best systems teach teachers well, and give them substantial authority for running the system—I’d like to see more power in parents’ hands and in students’ too, but we need to end the micromanagement of teachers by bureaucrats who have no understanding of the complexities of teaching. We trust people to vote for president, we have to trust them to design teaching and adjust teaching as needed, and have reasonable processes for review and quality assurance.

    I didn’t get the Japanese analogy, but kids can teach themselves much of reading, writing, math, science, U.S. presidents, history, etc. with an enriched environment and decent scaffolding now and then. Kids in homes where two languages are spoken pick up both easily, not because Spanish and English are wired into their brains, but because the learning capacity and motivation are. I’ve watched the above happen, I’ve run camps where kids learned these things through their own initiative, and lots of homeschoolers report the same. And if you don’t know, you can take initiative and Google it! However, the more kids are MADE to learn, the longer it often takes for that self-motivated learning to get back to full speed.

  10. Ze'ev:

    Karl,

    “I can’t stop you from trusting in the Follow Through Study, but my recollection is that the variance within models was greater than between the models”

    This is an incorrect statement erroneously made by the final Abt report. Subsequent analysis of the data showed that the variability of data for DI schools was not “greater than between models.” (Gersten, R. (1984). Follow Through revisited: Reflections on the site variability issue. Educational Evaluation and Policy Analysis, 6, 411-423)

    “what they told me suggested that one of the models (High/Scope) was partly being created as the study progressed. I also recall that the other assessments (not the MAT) weren’t very good.”

    How would one know if the newer versions of the programs are actually any better? Isn’t it like the publishers changing editions every other year by 2%, so whatever one establishes on any particular edition, it is immediately rebutted by “The study was done on an obsolete edition that is no longer in use. We already have a new edition that is much improved”?

    You clearly don’t like Project Follow Through findings. You are not the only one, as they shattered the romantic view of educators and ed-schools that neither rigor nor practice is necessary for meaningful education. So the education milieu did a successful hatchet job on the results. Enough said on this topic.

    “Then, when you see more recently … TIMSS researchers complaining that the relatively poor understanding in math in the U.S. is because of our skills focus and back-to-basics stance, and recent findings in reading that following direct instruction approaches seems to undermine reading comprehension”

    I am unsure where exactly you are taking your information from. When it comes to elementary math curriculum, U.S. curriculum is actually unfocused, broad, and repetitive, and hence the “mile wide, inch deep” moniker. It is not focused on anything—neither skills, nor understanding. When it comes to actual instruction, you are correct that observation of US classrooms indicated heavy reliance on recall and lower-level problems in contrast to a country like Japan. Funnily enough, however, a large fraction of teachers (around 95% if I recall) said that they were familiar with math (NCTM) reform ideas and believed they were actually implementing them in the classroom when they were teaching those low-level lessons. It goes back to our previous discussion about the pervasive lack of content knowledge among our elementary teachers, so they confuse form with content.

    I am unaware of any TIMSS researcher that attributed anything about our TIMSS results to our “back-to-basics stance.” In fact, the US has significantly improved on TIMSS since 1995, by 11 and 16 points in grades 4 & 8 respectively. Possibly that is not enough—we would like to be number 1—but certainly the standards-based reform of the last two decades seems to have helped us, perhaps by insisting on more content knowledge from teachers.

    “The miscommunication we’re having is I am saying kids are hard-wired to learn (it’s a motivational point), and you apparently think I’m saying kids are hard-wired to spontaneously spout forth spelling words or reading, or quadratic equations.”

    I quite agree that curiosity is hard wired (BTW also in animals), and that we should take advantage of it. I also never discussed anything beyond elementary skills with you, so I suggest we stick to that.

    “The point is that kids explore, observe, imitate, ask questions, and for example, can learn to read without any lessons on short “e” sounds or syllables or prefixes or many of the things schools (and direct instruction) assume are necessary to teach.”

    I worry that you may be making a classic mistake of a teacher of teachers—you assume that what is boring to you and to your adult students is also boring to young children. Children love not only to explore, but also to be explicitly taught the “system” behind reading or arithmetic, as it gives them an enormous feeling of power. Young kids love repetition and practice that adults find boring, because it gives them a sense of control. In fact, many kids actually hate lessons where the teacher does not bring topics to explicit closure, and they feel “cheated” in a way. Teachers often take pride that some kids “figure it out on their own” the next day, when in reality those kids were, more often than not, explicitly taught at home after they asked their parents the question that the teacher refused to answer in class.

    “Under direct instruction conditions, you may find a reasonable correlation between progress on these subskills and actual reading, but if kids learn to read in and interest based way, you realize they can bite off much bigger chunks that direct instruction (DI) suggests, and in fact, you learn that much of that academic content is actually “academic” in the less flattering sense—not necessary for real-world performance.”

    You keep painting DI, and implicitly any explicit instruction, as shallow, useless and boring. I am sure that it can be boring in the hands of a boring teacher, as much as inquiry-based class can be boring and frustrating in the hands of a boring or unskilled teacher. And then you keep ignoring the documented fact that DI was the only Follow Through pedagogy that actually increased both reading and math skills, and that its effects lasted robustly until high school and college. So much for its shallowness.

    “Why would you constrain the discussion of student outcomes to low-level knowledge and skills in a few academic subjects? Schools should focus on developing and teaching what matters most to society, not sticking with how we defined our mission long ago.”

    The reason I tend to limit the discussion to fundamental knowledge (there is nothing “low level” in learning to read) is that this is where I believe our biggest weakness lies. If a kid doesn’t break the code for reading, and without explicit instruction many will not, no amount of “high-level” talking by the teacher will solve the kid’s problem. In California, where I live, we have more than tripled the number of Algebra takers in 8th grade since 1998, and the average scores have increased despite this huge increase in the pool of test takers. Nevertheless, 45% of kids still don’t take Algebra because they are unprepared for it, and they take a grade 6-7 test instead. Our State Superintendent argues that he needs to “double the number of Algebra teachers in the state” to be able to teach all those kids Algebra in grade 8. Rubbish! Those kids are failing miserably on the grade 6-7 test, not on the Algebra test! Those kids can’t add, subtract, or multiply, and they were failed by their second and third grade teachers, not by the absence of Algebra teachers in 8th grade. That is why I’d rather focus on what you call “low-level knowledge.”

    “In 2009, let’s get out of the box we built for ourselves in the 1850s. Subject knowledge doesn’t make the top 5 in many CEOs’ or parents’ lists of desirable outcomes”

    Really?
    - Finding #1 (98%) in business leaders’ expectations from teachers: content knowledge. (U.S. Chamber of Commerce, 2006)
    - Finding #1 of business leaders: “High School Graduates are ‘Deficient’ in the basic knowledge and skills of Writing in English, Mathematics, and Reading Comprehension” (“Are They Really Ready To Work?” The Conference Board, 2006)
    Content is still king in the workplace. Only in academia and among teachers is there this strange belief that soft skills have replaced content. Sure, we do want soft skills, but for students who have already mastered the content.

    “So, the tests assess maybe 10-20% of the student outcomes schools should be promoting”

    The 10-20%, yet again, is your belief. Unfortunately, you still have not provided any substantiation beyond that personal belief of yours. Because of NCLB, all states by now assess most of their content standards on the tests. This is by no means only 10-20% of school outcomes, unless you believe that the majority of school outcomes ought to be non-academic. I don’t.

    “analyses of state tests indicate they overemphasize lower level content and underemphasize higher-level content standards, so they’re skewed and under-representative even in standard subjects.”

    This, unfortunately, is quite true. We should strengthen the depth of test items. If you know anything of psychometrics, I am sure you also know that this does not necessarily mean more constructed-response items or portfolios.

    “Professional mathematicians and psychometricians all trust the tests? Not a chance. A recent Ed Researcher had an analysis indicating the mathematicians thought maybe 1/4-1/3 of the NAEP math items had validity problems.”

    I did not write that they all trust (all) the tests. I wrote that “almost all of them agree that well done (multiple choice, or not) math tests assess knowledge and understanding of math quite well.” You are right, though, that NAEP does have many badly done items, and it behooves NAGB to have a professional team review NAEP items rather than trust the contractor, as they do today. But there are many states that have well-done tests, and their findings are quite solid. TIMSS is another well-done test, while PISA is a rather poorly done one.

    “Teachers and parents—who work with kids daily, will spot problems the math folks will never think of, and many psychometricians do not know learning and kid’s responses well, so they have their own blind spots.”

    That is exactly what many states already do. They do include teachers and parents in their review teams. Unfortunately, they rarely include professional mathematicians.

    “Hmm, leading psychometricians have also pointed out that multiple indicators are the professional standard, and what we are doing with tests violates professional standards for test use, and is, in a sense, malpractice.”

    Excuse me, but this is rubbish. The APA/AERA/NCME Standards suggest that “a decision or characterization that will have a major impact on a student should not be based on a single test score.” All the current NCLB testing has zero impact on students; the only impact is on schools. And anyway, those are multiple scores in multiple subjects. The only case where there might be an impact on students is with high school exit examinations, and there the test is made up of multiple parts on multiple subjects, and can be taken multiple times. So no single score carries a high impact. Please stop these ugly insinuations about topics you are unfamiliar with. Please don’t take all your assessment information from Fairtest: there has never been a real test that Fairtest liked, and I am quite sure there will never be one. If they were ever to find one, they would need to commit collective suicide on the spot.

    Kurt, I suggest we stop here. We have circled a few times around the same topics and we seem to be going nowhere. You deeply believe that explicit teaching is harmful to kids and kills their joy of learning. You are distrustful of tests, and presumably want to rely on teacher assessment. Unfortunately, you couldn’t provide much support for your beliefs beyond pretty rhetoric. Let’s leave it there.

  11. Ze'ev:

    Karl,

    This is the second time I need to apologize. The first time, I mangled your name through a typo. This time it is much worse: I am in contact with another person with a name similar to yours, and I simply switched your names. This is worse than a typo; if you wish, you may call it a thinko. My sincere apology.

    Ze’ev
