Back in December Michael Feldstein wrote a terrific post about Pearson’s new initiative around “efficacy.” There has been a great thread of comments attached to his (as always) excellent piece of writing. I’ve been wanting to add my thoughts on the topic for a while. I’m finally getting around to it.
The Conversation Can’t Be About Efficacy (Only)
Many of you know I am hugely inspired by Bloom’s work on the “2 sigma problem.” In many ways, Bloom’s work is the last word in instructional efficacy – for three decades now there has been no mystery whatsoever about the most effective way to teach. Bloom and his group showed conclusively that the average student who:
- received their instruction via individual (or very small group) tutoring, and
- whose tutors took a mastery-based approach to instruction
performed two standard deviations better than the average student who received traditional classroom-based instruction (hence the “2 sigma” name). Another way of saying this is that the average student in the tutored group outperformed 98% of students in the traditional classroom group. So if Pearson or anyone else wanted to evangelize and propagate truly effective education, they would just need to train tutors on mastery-based approaches and provide one for every student in the world.
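The 98% figure can be checked with a few lines of Python. This is a minimal sketch that assumes scores in the classroom group are roughly normally distributed (my assumption for the illustration); `normal_cdf` is a helper name introduced here.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Share of the classroom group scoring below a student who sits
# two standard deviations above the classroom mean.
share_below = normal_cdf(2.0)
print(f"{share_below:.1%}")  # 97.7%
```

Under that assumption the share is about 97.7%, which rounds to the 98% quoted above.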
But there’s a problem with this approach, and here is one of the rare moments when the heady realm of educational research actually reaches down and touches the lowly earth. Bloom recognizes that his discovery of this incredibly effective way to support student learning is of only academic interest because, even though it shows the amazing potential of the average student to learn significantly more than s/he does in a typical classroom environment, there is no way to make it work in the real world:
“The tutoring process demonstrates that most of the students do have the potential to reach this high level of learning. An important task of research and instruction is to seek ways of accomplishing this under more practical and realistic conditions than one-to-one tutoring, which is too costly for most societies to bear on a large scale. This, then, is the ’2 sigma’ problem.” (p. 6; emphasis in original)
In today’s jargon, this kind of tutoring doesn’t scale. And this is the reason we refer to Bloom’s work as the “2 sigma problem.” The problem isn’t that we don’t know how to drastically increase learning. The two-part problem is that we don’t know how to drastically increase learning while holding cost constant. Many people have sought to create and publish “grand challenges” in education, but to my mind none will ever be more elegant than Bloom’s from 30 years ago:
“If the research on the 2 sigma problem yields practical methods – which the average teacher or school faculty can learn in a brief period of time and use with little more cost or time than conventional instruction – it would be an educational contribution of the greatest magnitude.” (p. 6; emphasis in original)
So the conversation can’t focus on efficacy only – if there were no other constraints, we actually know how to do “effective.” But there are other constraints to consider, and to limit our discussions to efficacy is to remain in the ethereal imaginary realm where cost doesn’t matter. And cost matters greatly.
Analogous Problems in Big Pharma and Big Publishing
In this sense, the problem isn’t unique to pedagogies (e.g., tutoring). Imagine, for example, a Big Pharma company that discovers a cure for cancer. They announce to the world with much fanfare that “the three most common cancers are cured!” and that the medication is available worldwide immediately. However, the pills cost $1000 every six months and must be taken for four years to effect the cure. What percentage of people in the world with these three cancers will actually be cured? Far fewer than 1%. There’s a significant difference between the “effectiveness” of a product in a clinical study, where you guarantee that every participant will take your pill, and the real world where insane (immoral?) pricing means that only the extremely wealthy are able to take your pill. Big Pharma’s claims about having “cured” anything seem tone deaf at best when the cure is priced out of the reach of normal people.
Textbooks are to faculty as medicines are to doctors – that is, faculty prescribe the textbooks but students are the ones who have to pay for them. And when a Big Publisher releases an exciting report about the efficacy of a new textbook or product like MyMathLab, that study will have been conducted in a controlled lab setting where Pearson guarantees that every student is using the product. “Great!” a faculty member reading the report might think, “we’ve cured math cancer! I’ll adopt this product!” But if MyMathLab is so expensive that the majority of students in a course can’t afford to buy it, we’re back to not having cured anything. Big Publishers’ claims about “highly effective materials” seem completely out of touch to students (and their parents) who can’t afford them.
Again, we can’t talk exclusively about efficacy. When we try to, we lapse back into the tree falling in a forest – if a huge proportion of the population can’t get access to some company’s textbook / MyLab / whatever, can we honestly claim that it’s highly effective? Access is critical. There can be no efficacy without access. And when access conditions in the research lab do not mirror access conditions in the real world, efficacy studies tell us nothing about the actual efficacy of a product. We have to add a consideration of students’ ability to actually access and use (and as I have argued elsewhere, own a copy of) the product to discussions about efficacy.
Efficiency and the Golden Ratio
If we want to actually change the experience of students in the real world, rather than talking about efficacy we need to talk about the relationship between efficacy and cost – efficiency. And to me, the best way to talk about efficiency is using a measure I’ve been calling the “golden ratio.” This is a measurement that puts efficacy in the numerator and cost in the denominator, and though I’ve been talking about “standard deviations per dollar” for five or so years now, my own conception of the measure continues to evolve, and I want to develop it further in this article. But suffice it to say that any conversation of efficacy that ignores cost is purely academic in the worst sense of the word (see definition 2).
Let’s begin with something obvious but important: by definition, an average investment typically results in average learning. In other words, by definition, paying the costs typically associated with putting a teacher at the front of a room of students generally results in the amount of learning we typically see. Now, the golden ratio doesn’t account for these baseline costs or this baseline amount of learning – the ratio compares changes in costs to changes in learning. Let’s return to the Bloom 2 sigma work as an example.
Bloom has already provided us with a numerator – when students each receive their instruction from a tutor (instead of as part of a larger class), there is a two standard deviation gain in learning over students who receive traditional classroom-based instruction. However, Bloom doesn’t give us a denominator – he doesn’t tell us what that tutoring would really cost. If we assume that a reasonable full-time tutor costs about the same as the average full-time teacher gets paid, and subtract the cost of the classroom teacher pro-rated across all students, then this Bloom-style tutoring adds about $50,000 per student to the annual cost of instruction. In the golden ratio (rg) way of thinking, we might ask “is an additional 2 standard deviations of learning worth an additional $50,000 per student? Does rg = 0.00004 seem like a good deal?” Of course, whether it’s worth it or not is completely irrelevant because there is no way the overwhelming majority of organizations could afford it.
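As a quick check on the arithmetic, here is a minimal sketch of the “standard deviations per dollar” version of the ratio. The function name `golden_ratio` and its parameter names are mine, introduced only for illustration; the inputs are the 2 standard deviation gain and the roughly $50,000 per student estimate from the paragraph above.

```python
def golden_ratio(effect_gain_sd: float, added_cost_usd: float) -> float:
    """Standard deviations of learning gained per additional dollar spent."""
    if added_cost_usd <= 0:
        raise ValueError("added cost must be positive")
    return effect_gain_sd / added_cost_usd


# Bloom-style tutoring: a 2 standard deviation gain for roughly
# $50,000 of additional cost per student per year.
bloom_rg = golden_ratio(2.0, 50_000)
print(f"{bloom_rg:.5f}")  # 0.00004
```

Put differently, each additional dollar buys about four hundred-thousandths of a standard deviation.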
The Golden Ratio and Publisher Textbooks
I’m persuaded that one of the reasons we see lower rates of student success in higher education, especially among at-risk students, is that they cannot afford access to the core instructional materials intended to support learning in their courses. In a recent survey of over 15,000 students, 23% of students report that they frequently do not buy required textbooks due to cost, and 64% of students report skipping required textbooks at some point due to cost. Textbooks and related services are, quite simply, immorally expensive (US $1.5B adjusted operating profit in 2012 for a single publisher?), and the disappearing ink strategies publishers and schools recommend aren’t legitimate responses to the problem.
Now, I’m not saying that I believe that a better designed textbook can make a 2 sigma difference in student learning – because I don’t – but textbooks are in the same two-part problem category as Bloom’s 2 sigma problem. There is a potential benefit that textbooks, online services, and other products can provide (a numerator), and there are associated costs that we ask students to bear (the denominator). How do we maximize the numerator while holding the denominator steady? Unfortunately, we almost NEVER talk about these educational products in these terms. In fact, in higher education we almost never talk about changes in learning beyond changes to the metric “percent completing with a C or better.” But I think we can still use this more crude numerator to calculate some golden ratios that are both interesting and useful. For the remainder of this post let’s define rg as pass rate (in percent) divided by the cost of required textbooks per student (in dollars), and I’ll use “textbook” as a shorthand for all kinds of materials, online services, etc.
As an example, what is the pass rate for college algebra courses where faculty assign only a Pearson textbook, and what is the pass rate for other sections of the same course where they assign the Pearson textbook plus MyMathLab? And how much extra does MyMathLab cost? Even if we could force every student in a course to purchase the product (so the real world would mirror the lab), would a 7% increase in pass rate be worth an additional $80 per student? Would a 3% increase? Would a 35% increase?
But more interestingly, in the real world where we can’t replicate controlled lab settings and increased costs will lead to decreased student access, will the additional number of students who might pass because of their use of MyMathLab exceed the number of students who will skip the textbook + MyMathLab bundle altogether due to cost and suffer academically as a consequence? What will the actual net impact on learning and pass rates be when a faculty member adopts a $170 bundle like this? As amazing as it seems, the field appears to have neither the vocabulary nor the conceptual frame to even carry on this conversation.
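One way to start framing that question is a simple blended pass rate: students who can afford the bundle pass at one rate, students who skip it pass at another, and the course-wide rate is the weighted average. The sketch below is only that, a sketch. Every number in it is a hypothetical placeholder rather than data, and the function names are mine.

```python
def expected_pass_rate(access_share: float,
                       pass_with_access: float,
                       pass_without_access: float) -> float:
    """Blend the pass rates of students with and without the materials."""
    if not 0.0 <= access_share <= 1.0:
        raise ValueError("access_share must be between 0 and 1")
    return (access_share * pass_with_access
            + (1.0 - access_share) * pass_without_access)


def break_even_access(comparison_pass_rate: float,
                      pass_with_access: float,
                      pass_without_access: float) -> float:
    """Access share at which the bundle only matches the comparison rate."""
    return ((comparison_pass_rate - pass_without_access)
            / (pass_with_access - pass_without_access))


# Hypothetical placeholder values, chosen only to show the shape of the question.
PASS_WITH = 0.55      # pass rate among students who buy the bundle
PASS_WITHOUT = 0.40   # pass rate among students who skip it due to cost
ACCESS = 0.70         # share of students who can afford the bundle
COMPARISON = 0.50     # pass rate of a cheaper option every student can get

lab = expected_pass_rate(1.0, PASS_WITH, PASS_WITHOUT)
real = expected_pass_rate(ACCESS, PASS_WITH, PASS_WITHOUT)
needed = break_even_access(COMPARISON, PASS_WITH, PASS_WITHOUT)

print(f"lab setting: {lab:.1%}, real world: {real:.1%}")
print(f"access share needed to match the comparison: {needed:.0%}")
```

With real values for these four inputs, a faculty member could see whether the gain reported in a lab study survives once some students go without the materials.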
The Golden Ratio and OER
One of the reasons I continue to be so interested in Open Educational Resources generally, and open textbooks specifically, is that they can be made to meet and even exceed Bloom’s practical impact requirements – they’re things “the average teacher or school faculty can learn in a brief period of time and use with little more cost or time than conventional instruction.” Not only can faculty learn to use an open textbook instead of a commercial textbook in a very brief period of time, open textbooks are significantly less expensive for students. So open textbooks can actually attack both parts of Bloom’s challenge – they can improve outcomes (by increasing access to all students, even if the quality of the instructional design is similar) and decrease cost. While open textbooks may not be able to improve learning by two standard deviations for the same cost today, they absolutely can result in better than typical learning at significantly lower than typical cost.
For example, beginning in 2011 we helped a college in the northeast move their College Algebra course away from a $170 MyMathLab bundle to an open textbook, open videos, and a hosted and supported version of MyOpenMath – an open source platform for providing online, interactive homework practice. In Spring Semester 2011, when every section of the course used the $170 bundle, 48% of students completed the course with a C or better. In Spring Semester 2013, after all sections of the course had transitioned to the OER and open source practice system (which Lumen Learning hosts and supports for $5 per student, paid by institutions and not students, for institutions who don’t want to host it themselves), the percentage of students completing with a C or better grew to 60%.
So for a scenario like this one, the two ratios would be:
- Old model: rg = (48% pass rate) / ($170 required textbook cost) = 0.28 percent passing per required textbook dollar
- New model: rg = (60% pass rate) / ($5 required textbook cost) = 12 percent passing per required textbook dollar
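The same two calculations as a short Python sketch, using the pass rates and per-student costs quoted above. The function and parameter names are mine, for illustration only.

```python
def golden_ratio(pass_rate_percent: float, textbook_cost_usd: float) -> float:
    """Percent of students passing per required textbook dollar."""
    if textbook_cost_usd <= 0:
        raise ValueError("cost must be positive; the ratio is undefined at zero")
    return pass_rate_percent / textbook_cost_usd


old_model = golden_ratio(48, 170)  # $170 bundle, Spring 2011
new_model = golden_ratio(60, 5)    # $5 hosted OER platform, Spring 2013

print(f"old: {old_model:.2f}, new: {new_model:.2f}")  # old: 0.28, new: 12.00
```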
The golden ratio provides a simple, intuitive way to talk about the overall impact of an educational product. It also provides a similarly straightforward way to compare the overall impact of two products. Given all the hype around learning analytics and the sophisticated analysis of big data in education, it seems amazing to me that we can’t get this basic, basic, basic level of data from vendors – the same vendors who are all megaphones and mailing lists about the advanced capabilities of their innovative new data-driven systems. (I wonder why they don’t provide it to us?)
We can also calculate an “OER impact factor” which I’ll designate w (omega for open) – the overall effect of switching from publisher materials to OER – by dividing the golden ratio for OER by the golden ratio for the previously used publisher materials:
- w = 12 / 0.28 ≈ 42.86
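A short sketch of the same division in Python. It computes the factor two ways: from the rounded ratios in the list above, and from the raw pass rates and costs. The two results differ slightly because 0.28 is itself a rounded value. The function name `oer_impact_factor` is mine.

```python
def oer_impact_factor(rg_oer: float, rg_publisher: float) -> float:
    """Golden ratio for OER divided by the golden ratio for publisher materials."""
    if rg_publisher <= 0:
        raise ValueError("publisher golden ratio must be positive")
    return rg_oer / rg_publisher


# From the rounded ratios quoted in the list above.
w_rounded = oer_impact_factor(12, 0.28)

# From the raw figures: 60% at $5 versus 48% at $170.
w_exact = oer_impact_factor(60 / 5, 48 / 170)

print(f"rounded inputs: {w_rounded:.2f}")  # rounded inputs: 42.86
print(f"raw inputs: {w_exact:.2f}")        # raw inputs: 42.50
```

Either way, the switch to OER improved percent passing per textbook dollar by a factor of more than 40 in this example.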
I think this would be an extremely interesting metric for open initiatives to explore and report.
rg, w, and the Future of the Efficacy Conversation
Obviously these models could be more nuanced. It’s not clear to me whether the additional clarity we might achieve through increasing their sophistication would be worth the potential drop in understandability. There’s an appealing simplicity to comparing percentage pass rate to required textbook cost. I’ll continue thinking about how the models might be improved (you may have some ideas as well!). And while it will take some time for me to wrap my head around the implications of these models, I think we can already draw out some interesting initial implications. Here are two examples:
- You can never have an rg larger than 1 when the required textbook cost exceeds $100. There’s something beautiful about that.
- Placing cost in the denominator – which by definition can never be zero – of rg acknowledges that there is always a cost (even if it is very, very tiny) associated with using OER. It also accounts for those costs whether students pay for their materials or institutions do.
It also occurs to me that in K-12 settings, where institutions pay for textbooks and other materials instead of students, these measures might contribute to ongoing conversations about the efficient use of public funds.
I’m looking forward to thinking more about these measures and employing them in my own research (there’s no better way to understand their utility!). I think w, the OER impact factor, will be particularly useful in talking about the true impact of OER adoptions. There are some issues around the interpretability of specific values for rg and w, and I’ll need to explore the space of possible values and provide some guidance about how to interpret those, too. But overall, even if these simple models don’t end up being used by anyone other than me, I hope they inspire a more realistic discussion of efficacy across the field. There’s no efficacy without access, and access is largely a function of cost. You can’t talk (responsibly) about efficacy without simultaneously addressing cost. Bloom pointed this out 30 years ago, and rg provides one simple way of doing it. Maybe you’ll devise a better one!
Postscript. The power of open goes far beyond cost savings, a point I continue making with my work and advocacy around the 5Rs (more). However, raising that issue during this discussion of rg and w felt like it would just muddy the primary argument about the need to consider cost in addition to efficacy.