e-Literate recently ran a story about the emergence of a genuine science of learning. Keith Devlin follows many who came before him in making an analogy to medicine. Generally speaking, I don’t like comparisons of education to medicine; I think they’re problematic for a range of reasons I’ve written about in the past. But in the context of this article, the biggest problem with the comparison has to do with the role of data.
Educational technologies have worked their way into the everyday lives of millions of students, resulting in an explosion of learning-related data that researchers can analyze. The logic of the argument is that the ease and scale of capturing these data will produce an unprecedented corpus of “big data” about learning, whose analysis will rapidly drive new discoveries in learning science. But medicine offers no parallel here: while medical instrumentation has made incredible advances in the past century, those instruments haven’t worked their way into the everyday lives of millions of people, and so there has been no comparable explosion of medical data available for analysis using “big data” techniques. (Data-obsessed athletes and quantified-self advocates are the closest thing we have to an exception.)
If we truly believe that the coming revolution in learning science will be driven by the unprecedented levels of data collection made possible by pervasive educational technologies, then perhaps the analogy to medicine is mistaken, since medicine has never undergone a similar transformation. Perhaps we should look instead for another field where a technology-facilitated explosion of data drove progress forward. How about meteorology?
You can read the Meteorology article on Wikipedia for more detail, but I’ll summarize here. Weather is happening everywhere, all the time – the amount of potential data to capture is “big.” The development of reliable instruments and common classification scales led to a standardized way of collecting and talking about weather data. Data collection networks were then established to use these new instruments to systematically collect data in a wide range of geographical contexts. The invention of the telegraph made it possible to bring massive amounts of these data together in a single location quickly. As the scale of these data grew, people began talking about numerical weather prediction, beginning with Vilhelm Bjerknes’s 1904 paper “Weather Forecasting as a Problem in Mechanics and Physics.”
Any of this sounding familiar yet, educational technologists?
Maybe there would be some value in a closer evaluation of this analogy. Weather is a dynamical system, and so is learning. Both are highly sensitive to initial conditions (like a student’s prior knowledge). The study of each will always result in probabilistic forecasts rather than deterministic predictions.
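To make the sensitivity point concrete, here’s a minimal sketch (not from the original post; the logistic map is just a textbook stand-in for any chaotic dynamical system). Two trajectories that start one part in a million apart end up nowhere near each other within a few dozen steps, which is exactly why forecasts of such systems have to be probabilistic.

```python
# Sensitivity to initial conditions in a simple dynamical system:
# the logistic map x_{n+1} = r * x_n * (1 - x_n) in its chaotic regime (r = 4).
def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.200000)
b = logistic_trajectory(0.200001)  # differs only in the sixth decimal place

for n in (0, 10, 20, 30, 40):
    print(f"step {n:2d}: {a[n]:.6f} vs {b[n]:.6f}  (gap {abs(a[n] - b[n]):.6f})")
```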
Given the efforts pundits are making to persuade us that education is poised to enter a new golden age thanks to all the data that are now available, we’d do well to familiarize ourselves with relevant history. Who knows – maybe one day learning scientists will be as accurate at predicting learning as weathermen are at predicting the weather. While that may sound like a dig against both professions, I actually mean it simply as an acknowledgment of how incredibly complex and dynamic both phenomena are.
On reflection, the impact of learner agency on the learning process makes me doubt that we can ever be as good at predicting learning as we are at predicting weather. Still, I think there might be value in thinking about this analogy a little more.
What do you think? What other fields have seen rapid progress occur through a technology-facilitated explosion of data? What could we learn from them?
Hello David. Use of data in sports has grown rapidly in the past decade. Baseball strategy, in particular, has changed dramatically from the game I played as a kid. Consider wins above replacement (WAR), on-base percentage (OBP), walks plus hits per inning pitched (WHIP), ultimate zone rating (UZR, which indicates defensive range and efficiency), and my new favorite, Pythagorean expectation, an estimate of a team’s expected winning percentage based on runs scored and runs allowed. As you would guess, the 2016 Chicago Cubs had the highest winning percentage in MLB (.640) and the highest Pythagorean record (.699). Numerous one-run games caused the Cubs to fall ten games shy of their predicted record of 113 wins. At a minimum, there are hundreds of statistical measures for baseball. For me, they make the game more interesting even though it remains highly unpredictable.
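For readers who haven’t run across it, Pythagorean expectation is simple enough to compute in a few lines. Here’s a minimal sketch; the run totals below are invented for illustration, and the exponent of 2 is Bill James’s classic version (sabermetricians often use exponents closer to 1.83, which is one reason published figures vary slightly).

```python
# Pythagorean expectation: a team's expected winning percentage
# estimated from runs scored (RS) and runs allowed (RA).
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Invented season totals, not any particular team's:
expected = pythagorean_expectation(750, 650)
print(f"Expected winning percentage: {expected:.3f}")         # ~0.571
print(f"Expected wins over 162 games: {expected * 162:.0f}")  # ~93
```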
How about consumer marketing? The promise of consumer tracking is that companies can provide us with more relevant product recommendations by analyzing all the individual data points related to our interest (or lack of interest) in other products. Not only is every product purchase analyzed for clues about our interests, but so is adding something to my cart (or taking it back out), adding an item to my wishlist, looking at a product page, and so on.
The lesson? Despite the theoretical appeal of such a promise, the actual algorithms are complex enough that most companies are still recommending products I’ve already bought from them. Amazon uses the data better than pretty much anyone else, but even when I look at two or three products as a gift for someone else, that’s enough to foul up their future recommendations for at least a couple of weeks. For some reason those couple of outliers are strong enough to overrule the large number of products I look at that I’m genuinely interested in.
And even the unmanageable number of variables in a good product-recommendation algorithm is small compared to the number of variables involved in the “optimal” strategy for teaching math to poor third-graders.
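To make the outlier problem concrete, here’s a toy sketch of one plausible mechanism: recency-weighted interest scoring. Everything here is invented (the categories, the counts, the half-life), and real recommenders are far more sophisticated, but it shows how heavily discounting older activity can let two days of gift shopping outrank months of consistent browsing.

```python
# Toy model: score each product category by recency-weighted view counts.
# Recent views count almost fully; older views decay exponentially.
from collections import defaultdict

# (category, days_ago) for each product view -- invented browsing history
views = [("photography", 60)] * 20 + [("photography", 30)] * 15 \
      + [("toddler_toys", 1), ("toddler_toys", 2)]  # two gift-shopping views

def interest_scores(views, half_life_days=7):
    scores = defaultdict(float)
    for category, days_ago in views:
        # each view contributes 0.5 ** (age / half-life) to its category
        scores[category] += 0.5 ** (days_ago / half_life_days)
    return scores

for category, score in sorted(interest_scores(views).items(),
                              key=lambda kv: -kv[1]):
    print(f"{category}: {score:.2f}")
# toddler_toys: 1.73  -- two recent outlier views now dominate
# photography:  0.82  -- 35 views, but they're older
```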
Maybe another example is the national security industry. They track everything. And while I don’t know a lot about their systems, from what I understand they’re actually drowning in the amount of data they’re collecting. In other words, they have so much information that the actual security threats such data is supposed to reveal just can’t be found in all the noise. Lesson: more data may not solve your problem; it may actually make it worse.
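A quick back-of-the-envelope calculation shows why more data can make this worse. Here’s a sketch of the base-rate arithmetic; every number below is invented for illustration, but the shape of the result holds: run even a very accurate detector over enough records and nearly every flag it raises is a false alarm.

```python
# Base-rate arithmetic: rare threats + huge record volumes = mostly noise.
records        = 100_000_000  # records scanned (invented)
real_threats   = 100          # actual threats hidden among them (invented)
true_pos_rate  = 0.99         # detector catches 99% of real threats
false_pos_rate = 0.001        # and mis-flags 0.1% of innocent records

true_hits  = real_threats * true_pos_rate
false_hits = (records - real_threats) * false_pos_rate
precision  = true_hits / (true_hits + false_hits)

print(f"Flags raised: {true_hits + false_hits:,.0f}")      # ~100,099
print(f"Chance a flag is a real threat: {precision:.2%}")  # ~0.10%
```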
I see this applying to education too. Students make an impression and… confirmation bias rules. (“I can’t do math!”)
Maps & directions.