
Learning Engineering and Reese’s Cups

Reposting this message I sent to the Learning Analytics mailing list earlier this morning.

When I hear people say “learning engineering” I hear them talking about Reese’s cups.

I hear them talking about delicious chocolate (instructional design, or applied learning science or whatever you like to call it) and yummy peanut butter (learning analytics, or educational data mining, or whatever you like to call it). Chocolate and peanut butter are two things that, individually, taste great. And they taste even better together. In fact, they taste so much better together that people gave the combination its own name! They didn’t give this heaven-sent sweetie its own name in order to exercise dominance over either the chocolate or peanut butter industries. It was just really convenient to have a specific name to talk about this utterly fantastic combination of things. “I want a Reese’s cup!”

As I understand it, learning engineering is nothing more or less than a specific way of combining ID/ALS and LA/EDM techniques in order to engage in the iterative, data-driven continuous improvement of products designed to support learning (there’s a code sketch of the loop after this list):

  • You design something intended to support student learning (could be content, software, courseware, whatever).
  • You put it in the field and get students using it.
  • You measure its success at supporting student learning using a variety of analysis techniques.
  • You zero in on the parts that aren’t supporting student learning as successfully as you had hoped they would.
  • You re-design them.
  • You re-deploy them.
  • You re-analyze the degree to which they successfully support student learning.
  • You rinse and repeat.
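A minimal, purely illustrative Python sketch of that loop appears below. Every name in it (measure, redesign, improve) is a hypothetical stand-in for real LA/EDM analysis and ID/ALS work, not an actual API:

```python
import random

def measure(parts):
    """Stand-in for learning analytics: score how well each part
    of a deployed product supports learning (0 to 1)."""
    return {name: random.uniform(0.5, 1.0) for name in parts}

def redesign(name):
    """Stand-in for instructional (re)design of one underperforming part."""
    print(f"redesigning {name}")

def improve(parts, threshold=0.8, max_cycles=5):
    """Iterative, data-driven continuous improvement:
    measure, zero in on weak parts, redesign, re-deploy, repeat."""
    for _ in range(max_cycles):
        scores = measure(parts)                                # analyze field data
        weak = [n for n, s in scores.items() if s < threshold]
        if not weak:                                           # everything works well enough
            break
        for name in weak:
            redesign(name)                                     # re-design; re-deploy next cycle

improve(["chapter 1", "chapter 2", "practice quiz"])
```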

That’s how I understand “learning engineering.” I could just as easily say, “the combination of specific instructional design and learning analytics techniques in support of iterative, data-driven continuous improvement.” Well, actually, no I couldn’t say that just as easily. 🙂


S3: A Holistic Framework for Evaluating the Impact of Educational Innovations (Including OER)

This fall I’m once again teaching IPT 531: Introduction to Open Education at BYU (check it out – it’s designed so anyone can participate) and today I’m beginning a pilot run-through of the course redesign with a small number of students. I wanted to include a reading summarizing my current thinking on ‘evaluating the impact of OER’ in the course, so I’m letting some thoughts spill out below. This framework will be continuously improved over time.

In the past I’ve written frequently about how we evaluate the impact of OER use. These writings included ideas like the golden ratio (2009), the OER impact factor (2014), and thinking more broadly about impact (2018). Today I want to pull many of these thoughts together into a holistic, unified framework for measuring the impact of OER use (and other educational innovations). This new framework has three components: success, scale, and savings – hence the name S3. Below I’ll define each component, describe how to calculate its value, and describe how to aggregate the individual values into an overall score. I’ll then describe the applicability of the framework to evaluating educational innovations beyond OER.

Defining and Measuring Success

Student success is the first and most important component of the framework. In the S3 framework, “success” means “completing a course with a final grade that allows the course to count toward graduation.”

Research studies evaluating the impact of OER frequently report whether or not the grades received by students changed after their faculty began using OER. In the majority of cases the answer to this question is “no,” for reasons described by Grimaldi and his colleagues at OpenStax. In those cases where changes in final grade have been reported (by myself and others), they have often been reported as change in final grade percentage (86% versus 84%) or in GPA units (2.4 versus 2.6). While these changes are occasionally shown to be statistically significant, it is difficult to interpret their practical significance – because in almost every case (that I’m aware of) students who received the “higher” final grade actually appear to have received the same letter grade. In other words, a typical finding among the few papers that show a grade benefit associated with OER use is that while control students earned Cs on average, treatment students earned slightly higher Cs on average. This is what I mean when I say it’s difficult to determine the practical impact of such an “improvement” – GPAs likely don’t change in this scenario, meaning that students won’t qualify for scholarships at higher rates, improve their chances of getting into graduate school, or notice any other practical benefit.

The place where impact occurs most clearly is around the boundary between C and D final grades. Here a small change in final grade makes the difference between having to retake the course (i.e., paying tuition again, delaying graduation by another semester, possibly paying for the course materials again, etc.) or being able to count the course toward graduation. While the practical impact of the difference between a C and a slightly higher C is difficult to interpret, the difference between a D and a C is as wide as the ocean. Some research in OER has already begun using final grades of “C or better” (or the inverse, the DFW rate) as the measure of interest (e.g., Hilton et al.) and more OER impact research should follow that lead.

In the S3 framework, success is measured using the C or better rate from before OER were used (an average of multiple previous terms is a more stable measure than the single prior term) and the C or better rate after OER began being used (again, an average of multiple terms is a more desirable measure than a single term) as follows:

$$success\;=\;\frac{C\;or\;better_{OER}\;-\;C\;or\;better_{Control}}{1\;-\;C\;or\;better_{Control}}$$

The maximum value for success is 1, and this occurs only when the C or better rate is 1 for OER users. This encodes both the idea and the goal that each and every student should succeed in the course. The value is undefined when the C or better rate is 1 for control students, as it isn’t possible to improve success in this case. You can explore the full range of values here.
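Here is a minimal Python sketch of the calculation. The function names are mine, and the set of letter grades counted as “C or better” is an assumption that should be adjusted to match a given institution’s grading scheme:

```python
def c_or_better_rate(grades):
    """Share of students finishing with a C or better.

    Assumption: these letter grades allow the course to count toward
    graduation; adjust the set for your institution.
    """
    passing = {"A", "A-", "B+", "B", "B-", "C+", "C"}
    return sum(g in passing for g in grades) / len(grades)

def success(rate_oer, rate_control):
    """S3 success: the share of the remaining gap to a 100% C-or-better
    rate that was closed after the switch to OER."""
    if rate_control == 1:
        raise ValueError("undefined: every control student already succeeds")
    return (rate_oer - rate_control) / (1 - rate_control)

# Example: the C-or-better rate rises from 0.75 to 0.80.
print(success(0.80, 0.75))  # ≈ 0.2 -- one fifth of the remaining gap closed
```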

Defining and Measuring Scale

Scale is the next most important component in the S3 framework. Scale means “the proportion of students being reached.” If work that is highly impactful in a single setting cannot be scaled to multiple classrooms, it does little to advance the cause of improving success for each and every student. For example, supplementing a classroom instructor with an instructional designer, a researcher, and a pair of graduate students may allow incredible things to happen in that classroom, dramatically increasing the success of the students in the class. However, if there’s no practical way to adapt this model to other classrooms, we can’t use this model to help improve success for all students.

In the S3 framework, scale is measured using the number of students in sections of courses using OER (e.g., the number of students in sections of Intro to Psychology using OER) and the total number of students in all sections of those same courses (e.g., the total number of students in all sections of Intro to Psychology), as follows:

$$scale\;=\;\frac{Number\;of\;students_{OER\;Sections}}{Number\;of\;students_{All\;Sections}}$$

The maximum value for scale is also 1, and this occurs only when all relevant course sections are using OER. This encodes both the idea and the goal that each and every student should be included. Scale can include a single course (e.g., Intro to Psychology) or multiple courses (e.g., all general education courses), depending on where OER is being used at an institution.
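A corresponding sketch, again with hypothetical names:

```python
def scale(students_oer_sections, students_all_sections):
    """S3 scale: the proportion of enrolled students reached by OER."""
    return students_oer_sections / students_all_sections

# Example: 600 of 2,000 Intro to Psychology students are in OER sections.
print(scale(600, 2000))  # 0.3
```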

Defining and Measuring Savings

Savings is the final component in the S3 framework. Savings means “the amount of money spent on course materials by the average student using OER compared to the amount of money spent on course materials by the average control student.” When calculated accurately, the savings measure takes into account several factors:

  • The materials assigned to control students are available at many price points
  • Some control students don’t spend any money on course materials
  • Some OER students spend money on printed copies of OER (or on printing OER)
  • Some OER students spend money on courseware or homework systems
  • Printed copies of OER frequently cost as much or more than courseware or homework systems (e.g., see the prices of printed OpenStax books on Amazon)

Contrary to popular belief, savings is very difficult to measure accurately and some estimation and guesswork is almost always involved.

In the S3 framework, savings is measured using the average amount of money spent by OER users and the average amount of money spent by control students, as follows:

$$savings\;=\;\frac{Average\;amount\;spent_{Control}\;-\;Average\;amount\;spent_{OER}}{Average\;amount\;spent_{Control}}$$

The maximum value for savings is also 1, and this occurs only when no student who was assigned OER spends any money on course materials. This value is undefined when no students in the control group spend any money on course materials, as it isn’t possible for OER users to save money in this case.
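And a corresponding sketch, including the undefined case:

```python
def savings(avg_spent_control, avg_spent_oer):
    """S3 savings: the proportional reduction in average spending
    on course materials."""
    if avg_spent_control == 0:
        raise ValueError("undefined: control students spend nothing")
    return (avg_spent_control - avg_spent_oer) / avg_spent_control

# Example: average spending falls from $100 to $20.
print(savings(100, 20))  # 0.8
```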

Calculating an Overall Score

In aggregating the individual scores into an overall score, we must consider the amount each individual component will contribute to the overall score.

I’ve argued above that success is the most important component of the three in this framework. As I wrote about at length in Taking Our Eye Off the Ball, I believe it is a huge mistake for us to look at 30% graduation rates from US community colleges and say, “the most important thing we can do is make that abysmal outcome less expensive.” (Likewise for the 60% graduation rate from US universities.) Making a 70% failure rate (or a 40% failure rate) more affordable is not the most important work we can do. We have to begin making meaningful progress on student success, and it should be weighted most heavily of the three components in the framework.

I’ve argued above that scale is the next most important component of the framework. Affordability (or savings) is one of the characteristics of a scalable innovation, as students can’t benefit from something they can’t afford. But it takes a lot more to make an educational innovation scale successfully than simply being affordable (or even free) – like how attractive it is to faculty or how easy or hard it is to implement successfully. Inasmuch as scale subsumes savings and a range of other factors, it should be weighted more heavily than savings alone.

That said, I think it is important to continue to include savings as a standalone measure, if for no other reason than the historical importance of cost savings in the research on the impact of OER. But more importantly, including savings as a standalone measure of impact helps us remember the problems and mistakes of the past (and present) with regards to the pricing of course materials. Hopefully, keeping these errors front and center will decrease the chances that new innovations will travel down that path again.

As I have pondered relative weights for the components, it seems that success is at least twice as important as scale, and that scale is at least twice as important as savings. If we use:

$$impact\;=\;4\times success\;+\;2\times scale\;+\;1\times savings$$

we get a measure of impact with a maximum value of 7. I like that.
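As a sketch, the overall score simply applies the 4:2:1 weights to the three component scores (the function name is mine):

```python
def impact(success_score, scale_score, savings_score):
    """Overall S3 impact on a 0-7 scale, weighting success over
    scale over savings (4:2:1)."""
    return 4 * success_score + 2 * scale_score + 1 * savings_score

# Example: modest success gains, moderate scale, large savings.
print(impact(0.2, 0.3, 0.8))  # ≈ 2.2 out of a possible 7
```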

Extending the Framework to Educational Innovations Beyond OER

Viewed from a high level, OER is just one of hundreds of innovations that proponents say have the potential to improve education. To the degree that is true, there is no reason to develop an OER-specific measure of impact. Indeed, it would be eminently useful to have a common measure of impact we could use to compare a wide range of would-be innovations.

There is nothing OER-specific about S3. You could use it to measure the impact of “things” like learning analytics or iPads or augmented reality. You could use it to measure the impact of “approaches” like problem-based pedagogies or collaborative problem solving pedagogies or active learning pedagogies. The questions S3 keeps us focused on are:

  1. How much does this innovation improve student success?
  2. How many of our students are benefiting from this innovation?
  3. How much money does this innovation save students?

These are all good questions to ask.


The Revisability Paradox

Long-time readers will be familiar with “learning objects” and the “reusability paradox.” If you’ve been working in educational technology since the 1990s, you might want to skip the first section below. Or you may find it a sentimental walk down memory lane.

Learning objects and the reusability paradox

A learning object is “any digital resource that can be reused to support learning” (Wiley, 2000), and the goal of the learning objects movement was to design learning materials that were sufficiently small and self-contained as to be easily reused across many different learning contexts. Remember the joy of digging into a bin of Legos, pulling out random pieces and assembling them into whatever your heart fancied? This was the promise of learning objects, which were compared to Legos in almost every conference presentation and journal article on the topic.

The reusability paradox describes a difficulty at the heart of the learning objects idea. Here’s how I described it in the late 1990s:

1. The “bigger” a learning object is, the more (and the more easily) a learner can learn from it. For example, there’s only so much you can learn by studying a single photograph of a mountain (an image is a “small” learning object). On the other hand, you can learn quite a lot from a chapter on mountain formation, with multiple images, animations, and explanatory text (a chapter is a “large” learning object).

2. The “bigger” a learning object is, the fewer places it can be reused. For example, a single image of a mountain can be placed into a wide range of learning materials (e.g., it could be embedded in chapters about geography, history, photography, etc.). On the other hand, there are only so many places you can reuse an entire chapter on mountain formation.

To state it briefly, there is an inverse relationship between reusability and pedagogical effectiveness.

Because pedagogical effectiveness and potential for reuse are completely at odds with one another, the designer of learning objects faces a difficult choice. They can either (1) design smaller objects that are easier to reuse but require significantly more effort and assembly on the part of the instructor before they are useful for learning, or (2) design larger objects that more effectively support learning but have limited potential for reuse.

(If you’ve ever thought my writing was insufferably pedantic before, I promise you ain’t seen nothin’ yet. Try this detailed elucidation of the reusability paradox from 2002.)

The modern reader will no doubt scratch their head and ask, “why not just start with a larger, more useful learning object and adapt it to meet your needs?” The answer, of course, is that in the late 90s and early 00s the open content movement was only just beginning. We didn’t have the 5Rs back then. The book on learning objects I linked to above was published under the Open Publication License because Creative Commons didn’t even exist yet. The universal (and universally true) assumption was that learning objects were traditionally copyrighted, meaning you had to reuse them exactly as you found them (just like Legos).

OER and the Revisability Paradox

That bit of history prepares us to discuss open educational resources (OER) and the revisability paradox.

Open educational resources are teaching, learning, and research materials that are either (1) in the public domain or (2) licensed in a manner that provides everyone with free and perpetual permission to engage in the 5R activities. I don’t believe readers of this blog need much additional context about OER.

The revisability paradox describes a difficulty at the heart of the OER idea. Here’s my current best attempt at explaining it:

1. The more research-based instructional design is embedded within an open educational resource, the more (and the more easily) a learner can learn from it. For example, there’s only so much you can learn by reading an explanation of what a conifer is (a “simple design” OER). On the other hand, you can learn quite a lot from an activity that (1) isolates and explicitly describes the critical attributes that separate instances of conifers from non-instances and (2) provides you with the opportunity to practice classifying trees as instances or non-instances, coupled with immediate, targeted feedback (a “research-based design” OER).

2. The more research-based the instructional design of an OER is, the harder it is to revise and remix without hurting its effectiveness (that is, the more instructional design expertise is necessary to revise and remix effectively). For example, many different kinds of changes could be made to a simple explanation of conifers without changing its effectiveness in supporting student learning. On the other hand, there are many ways of changing the research-based design that would cause it to be no more effective than the simple explanation.

In essence, without instructional design expertise, looking at well designed learning resources is like watching a sporting event whose rules and nuances you don’t understand. Remember watching hockey/baseball/soccer/cricket for the first time? Remember the first time you watched with someone who deeply understood the game, and you started to realize how much you were missing – even though you were both watching the same game? It’s like that, but with the research on supporting learning effectively. Without instructional design expertise it’s easy to look at something like the explicit isolation of critical attributes and think “Boring! I have a way more interesting way of explaining that!” …and we’re right back to explaining.

In other words, there is an inverse relationship between revisability and pedagogical effectiveness.

Implications

One of the amazing things about the OER movement is how the “OER way of thinking” has democratized access to the creation of learning materials. Anyone with domain expertise and a word processor or WordPress instance can write definitions, descriptions, and explanations. This won’t result in particularly effective learning materials, but it will result in OER that look a lot like their traditionally copyrighted counterparts, that are less expensive than their traditionally copyrighted counterparts, and that anyone can revise or remix without doing any damage.

Image: “Jenga” by Ed Garcia (https://flic.kr/p/4GW2c2) is licensed CC BY.

Revising or remixing OER with a research-based instructional design is much more like playing Jenga blindfolded. When you don’t fully understand the instructional functions of the different elements of the learning materials, there’s no way to know whether pulling one out or swapping it for something else or changing it in some other way will cause the whole efficacy tower to collapse. And the biggest problem, of course, is that when you destroy the efficacy tower you don’t know you did – because you don’t know the rules of the game and can’t really see what’s happening.

Choices… and a Question

The designer of learning objects can either (1) design smaller objects that are easier to reuse but require significantly more effort and assembly on the part of the instructor before they are useful for learning, or (2) design larger objects that more effectively support learning but have limited potential for reuse.

Likewise, the designer of open educational resources can either (1) create “simple OER” – resources with rudimentary instructional designs that aren’t particularly effective at supporting student learning but are easy to revise and remix without decreasing their effectiveness, or (2) create “complex OER” – resources using research-based instructional designs that are far more difficult to revise and remix without decreasing their effectiveness (i.e., they’re “easy to break”).

Which leads me to wonder… What is the role of instructional design / learning science / learning engineering / related forms of expertise in the creation – or revising and remixing – of learning materials? Insisting that this expertise is important feels like it pulls against the democratizing power of modern conceptions of openness in education. But denying that this expertise matters feels like it joins the broader anti-expertise chorus currently eroding public policy.

So… now what?