The LHC and Education

I’ve always been impressed by the idea of the Large Hadron Collider. It’s an unthinkably expensive, large-scale experimental apparatus designed for the sole purpose of generating and collecting data. Why would countries spend so much money on data? Why would so many people dedicate the better part of their lives to a project like the LHC? Because the so-called “hard” sciences – fields like physics and astronomy – owe their remarkable progress in understanding the structure of matter and the nature of the universe to the fact that they really care about data. They care about data in a way that educators have a difficult time even comprehending.

The data that we educators gather and utilize are all but garbage. What passes for data for practicing educators? An aggregate score in a column in a gradebook. A massive, coarse-grained rolling up of dozens or hundreds of items into a single, collapsed, almost meaningless score. “Test 2: 87.” What teacher maintains item-level data for the exams they give? What teacher keeps this data semester to semester, year to year? What teacher ever goes back and reviews this historical data? After a recent tweet on this topic, a number of colleagues accused me of having physics envy. Believe me, you don’t have to wish you were a physicist to be disappointed by the quality of data educators have access to.
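
To make the contrast concrete, here is a minimal sketch (in Python, with entirely hypothetical field names) of what item-level exam data might look like, and of the roll-up step that reduces it to the single number a gradebook keeps:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ItemResponse:
    """One hypothetical row per student, per exam item."""
    student_id: str
    exam: str            # e.g. "Test 2"
    item_id: str         # e.g. "q14"
    correct: bool
    seconds_spent: float
    administered: date

def rolled_up_score(responses: list[ItemResponse]) -> float:
    """Collapse the item-level records into the familiar gradebook number.

    This single figure ("Test 2: 87") is usually all that survives;
    everything the item-level rows could have told us is discarded.
    """
    if not responses:
        return 0.0
    correct = sum(1 for r in responses if r.correct)
    return 100 * correct / len(responses)
```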

I’m beginning to believe that we’ve got it completely backwards. For decades we’ve been trying to use technology to improve the effectiveness of education. How, specifically, have we tried to use technology? At a high level, we’ve tried to use it to deliver content to learners. The goal has been to “find something that works,” and then deliver that something (interactive content, etc.) to learners at high fidelity and low cost. In our attempts to deliver effective content at scale, I believe we have had a nationwide (if not worldwide) encounter with the reusability paradox, which I first wrote about at length in 2001. Briefly stated, the reusability paradox says that, due to context effects, the pedagogical effectiveness of content and its potential for reuse are orthogonal to one another. This finding is too inconvenient to accept, as it would destroy or severely maim the prevailing paradigm of educational technology research, and so it has been roundly ignored by the educational research community.

While using technology to deliver content seems to have had no noticeable impact (or even a slightly negative one) on the effectiveness of education, using technology to deliver content has had a huge impact on the accessibility of education. Think of distance learning… Think of opencourseware and open educational resources… Think of the millions of people who now have access that never would have had access otherwise. The impact of using technology to deliver content on increasing access to education is completely unassailable and totally undeniable.

So, if using technology to deliver content is not improving the effectiveness of education, is there another way we might use technology that can? I believe there is. I believe it so strongly that for the first time in several years I am opening a new line of research. I believe (and I fully admit that it is only a belief at this point) that using technology to capture, manage, and visualize educational data in support of teacher decision making has the potential to vastly improve the effectiveness of education. Think of it as “educational data mining” or “educational analytics.” For example, think of all the data, algorithms, and resources that go into selecting ads to show in search engine results and other places around the web, and then think of using all that horsepower to make suggestions to teachers about appropriate opportunities to intervene with students.

The Open High School of Utah is the first context in which I’m studying this use of technology. Because it is an online high school, every interaction students have with content (the order in which they view resources, the time they spend viewing them, the things they skip, etc.) and every interaction they have with assessments (the time they spend answering them, their success in answering them, etc.) can all be captured and leveraged to support teachers. The OHSU teaching model, which we call “strategic tutoring,” involves using these data to prioritize which students need the most help and enabling brief tutoring sessions. A teacher’s typical day involves visiting the dashboard, viewing the first student in a prioritized list of students, seeing what s/he needs help on, and engaging him/her by Skype, phone, IM, or other means, for a very brief, very targeted individual tutoring session. Then the next student, then the next student, etc. Students who are on track or working ahead in the online curriculum don’t have to wait for an interaction with the teacher (they’re succeeding, after all), and those who need help get it – individualized, just in time, and sometimes before they even know they need it. From a caring human being – not a supposedly intelligent tutoring system.
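
As a rough illustration of the kind of prioritization such a dashboard might perform – the fields, weights, and caps below are my own assumptions, not the actual OHSU logic – a few lines of Python are enough to sketch the idea:

```python
from dataclasses import dataclass

@dataclass
class StudentActivity:
    """Hypothetical per-student summary derived from captured interactions."""
    student_id: str
    recent_success_rate: float      # fraction of recent assessment items correct
    minutes_behind_schedule: float  # how far behind the course pacing
    days_since_last_contact: int    # days since the last tutoring session

def priority(s: StudentActivity) -> float:
    """Higher score = contact this student sooner. Weights are illustrative."""
    struggle = 1.0 - s.recent_success_rate
    pacing = min(s.minutes_behind_schedule / 120.0, 1.0)   # cap at two hours
    neglect = min(s.days_since_last_contact / 7.0, 1.0)    # cap at one week
    return 0.5 * struggle + 0.3 * pacing + 0.2 * neglect

def tutoring_queue(students: list[StudentActivity]) -> list[StudentActivity]:
    """The prioritized list a teacher would work through, top to bottom."""
    return sorted(students, key=priority, reverse=True)
```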

Now, if the OHSU weren’t delivering content online, we couldn’t capture all this data. So in one sense, it’s key to deliver content online – if only to get the types of data we need to support teachers supporting students. But currently, we’re stopping short, mistaking the means for the end.

Another realization that comes part way down this path is that our instructional design programs may teach people how to design instruction that is motivating and engaging, but we don’t even begin to teach people how to design materials and systems that capture the right kinds of data. We don’t even discuss what the “right” kinds of data might be.
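
As one hedged guess at what the “right” kinds of data might look like, the sketch below captures the sorts of fine-grained interactions described earlier (resources viewed or skipped, time spent, assessment attempts) as individual, append-only records rather than rolled-up scores; the names and fields are hypothetical:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class LearningEvent:
    """One record per interaction, at the grain the prose above describes."""
    student_id: str
    resource_id: str
    event_type: str                 # e.g. "viewed", "skipped", "attempted_item"
    started_at: datetime
    seconds_spent: float
    correct: Optional[bool] = None  # only meaningful for assessment items

def record(store: list[LearningEvent], event: LearningEvent) -> None:
    """Append-only capture; interpretation and visualization happen downstream."""
    store.append(event)
```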

Coming back to the LHC, I think meaningful progress in education will depend on educators becoming infected with a passion for data like the one the LHC embodies. Not rolled-up percentage scores – coarse-grained data that obscure all the meaningful details we might care about. We need access to real-time data on every individual student every day of the year; we need tools and techniques for supporting teachers in interpreting those data; we need new teaching models that leverage the existence of these data and tools; and so on. This is what I think technology-enhanced education is supposed to be.

The investment it would take to deploy such an infrastructure would rival the cost of the LHC, but would be almost impossible to make – because educators either don’t care about data or have a vision of data that is limited by their own experience recording things in a gradebook or spreadsheet. Using technology in creative ways could provide us with so much more data it would boggle the imagination… It could transform the teacher’s work from something based on hunches and intuitions to something actually based on data. And lo and behold, we might actually move the needle a bit when we combine the best of hardcore empiricism with the best of caring, nurturing people.

We’ll certainly never meet Bloom’s 2 sigma challenge if we think the proper role of technology in education is simply delivering content (whether interactive, intelligent, or otherwise). However, if we get serious about capturing and using data to support teacher decision-making and improve student learning, we may have something.