Do We Need a National Open Education Strategy?

tl;dr – In order to be relevant today and in the future, a national open education strategy must (1) know exactly what it is trying to accomplish and (2) deeply integrate generative AI.

WICHE is convening a series of conversations this week and next titled, “Do We Need a National Open Education Strategy?” This essay is my (very) personal contribution to that conversation.

How We Got Here

In 1998, when I launched the OpenContent project and the first open license for educational materials and other creative works (that weren’t software), I encouraged anyone and everyone to openly license anything they were willing to openly license. I was inspired by the transformational potential of the internet – only available to the broader public for a few years at that point – and the open source software movement. Combining open licenses with the internet’s capacity to share instantaneously around the world seemed to have the potential to revolutionize education. I had no strategy in terms of making open content easy for educators and learners to understand, adopt, or use – I was just trying to convince people the world wouldn’t end if they shared their work under open licenses (because most were convinced it would). The materials shared during those first years were totally random – essays, photos, technical documentation, etc. Similarly, when Connexions launched at Rice University in 1999, it promoted sharing individual bits of content as well.

In the early 2000s, MIT OpenCourseWare put forth a more coherent strategy of openly licensing the entire collection of materials faculty had developed for a specific course. (This is also when the Creative Commons licenses were launched.) Most faculty still seemed convinced the world would end if they shared their course materials, but the more coherent presentation of open content as “the collection of materials used to support a course,” together with the power of MIT’s brand, helped people begin to catch the vision of what was possible with open content. After the launches of MIT OCW and Creative Commons were announced, a UNESCO convening decided that we should all start calling open content used in education “open educational resources.”

In the late 2000s and early teens, something like a movement-wide strategy began to coalesce around the idea of packaging open content as a textbook in order to make it easier for teachers to understand and adopt. (The Free High School Science Textbooks project had been one of the pioneers of this model in the early 2000s.) Flat World Knowledge began publishing “open textbooks” in 2007, and Connexions at Rice changed their name to OpenStax and started publishing open textbooks in 2012.

And that’s essentially where innovation stopped. To hear some OER advocates describe it today in 2024, the same format that was being used in the late 2000s – traditional-looking textbooks published under open licenses – is the state of the art when it comes to open educational resources. From this perspective, perhaps the most important innovation of the last twenty years was the idea of formatting open textbooks as course cartridges so they’re easier to upload into a learning management system. 🙁

Of course, innovation with OER didn’t actually stop with openly licensed traditional textbooks. OER have also been used as part of personalized, interactive courseware systems. For example, both the Open Learning Initiative at Carnegie Mellon University and Lumen Learning use OER this way. But courseware and related efforts have been shunned by many in the OER community because during the 2010s much of the community changed their focus from ‘improving access to educational opportunity’ to ‘ensuring that course materials are free for students.’ And while this is a related goal, it is a different goal in critically important ways.

Free, No Matter the Cost

There’s nothing wrong with changing your focus – I’ve changed my focus over the decades, too. When I initially had the idea to openly license educational materials in the late 1990s, it was exciting to me because of its power to increase access to educational opportunity. And I became a vocal advocate for using open content (and later OER) to increase access to educational opportunity. But over decades of doing the work, I’ve seen first-hand that increasing access to opportunity isn’t enough by itself. Consequently, I changed my focus and became an advocate for using OER to improve student success. I don’t just want things to be possible; I want them to actually be better. I don’t want to stop at opportunities, I want to get all the way to results.

From my perspective, the pivot of many in the OER movement to an insistence that educational materials be free has trapped them in that 2010 world of traditional-looking, openly licensed textbooks. This is because when you demand that there can be no costs whatsoever associated with learning materials, the most sustainable path forward is creating a nicely formatted PDF for students to download. And when you enshrine that zeal for “free” in policy – by making ‘zero-cost’ materials mandatory (or very strongly preferred) in certain courses or across campus – not only do you trample on faculty’s academic freedom to choose the best course materials for their students, but you place a priori limits on the kinds of innovation that are possible. And if you look at the national data on student outcomes in higher education – especially for our BIPOC and low-income students – you can’t say, “This is working great! We just need to make it cheaper!”

Research is crystal clear that when students engage in interactive practice with immediate feedback they learn dramatically more than when they read or when they watch video (see, for example, Koedinger, et al., 2016; Koedinger, et al., 2018; Van Campenhout, et al., 2022; Van Campenhout, et al., 2023). To be frank, the active opposition of many in the OER community to interactive courseware has likely saved students money by harming students’ learning. I think of this choice to prioritize lower costs over better learning as a “free, no matter the cost” mentality. (Bear in mind that many interactive OER courseware offerings are well within the national consensus “low cost” range of $40 and below.)

The Role of Generative AI in Open Education

The resistance of many in the OER community to courseware, because it has a cost, does not bode well for the community’s future relationship with generative artificial intelligence. “Inference,” what we call it when a generative AI system like ChatGPT generates a response to a user prompt, costs money. Consequently, even very well-funded nonprofits like Khan Academy charge a fee to use their generative AI tool (Khanmigo). If a meaningful portion of the OER community continues its insistence that educational materials be zero-cost, and continues advocating for institution-wide, system-wide, or state-wide zero-cost policies, that zero-cost advocacy will exclude students from having access to the most powerful educational technology since the creation of the internet.

There are many ways to think about the role generative AI could play in a nationwide open education policy in the US. I discuss the three most obvious of these roles below. But I fully acknowledge that we still don’t know what the most powerful roles for generative AI will be.

Part 1: GenAI and Traditional OER

The first, most obvious role is that generative AI can be used to create what I will call “traditional OER” – the traditional-looking openly licensed textbooks, chapters, essays, images, etc., that have been the focus of the mainstream OER movement for the last fifteen years. The US Copyright Office has been consistent in asserting that products generated by AI tools aren’t eligible for copyright protection. And if you look to the Constitution’s Copyright Clause, specifically if you look at the rationale it lays out for giving Congress the power to grant copyrights and patents, there’s no way that works created by AI should be eligible for protection. The purpose of copyright, as described in the Constitution, is to provide an incentive for creators to create. AI doesn’t need an incentive. Consequently, the US Copyright Office has consistently said that works generated by AI are not eligible for protection.

This means that everything created by ChatGPT, or Claude, or Bard, or DALL-E, or any other generative AI tool is OER. Because these creations are in the public domain (i.e., not copyrighted), you can legally engage in any of the 5R activities with them (retain, revise, remix, reuse, redistribute). That makes them OER.

By decreasing the amount of time it takes to create a complete first draft from months to days, generative AI can dramatically reduce the time and cost of producing traditional OER. Using generative AI can also dramatically reduce the time and cost of updating and maintaining traditional OER, helping with (but not fully solving) what has been known as “the sustainability problem” with OER.

Part 2: Open Prompts

Generative AI creates its outputs in response to prompts provided by a user. These prompts are composed of instructions written as text. Some prompts are too simplistic to be eligible for copyright protection, but as prompts become more sophisticated and powerful, the likelihood that they would be eligible for copyright protection increases. These more sophisticated prompts might be able to direct a large language model to act as a highly skilled tutor in an extended teaching interaction with a student, for example. Prompts which are copyrightable can be shared under open licenses, making “open prompts” a kind of OER.

Prompts are recipes for automatically creating personalized OER in real-time. A single, well-written prompt could be used by millions of people to generate billions of written lines of personalized explanations, examples, practice questions, feedback, and other educational materials. For this reason, they are significantly more powerful in terms of their ability to increase access to educational opportunity and improve student outcomes than traditional OER.
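To make the “recipe” idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the wording of the template, the field names (subject, topic, level, interest), and the build_prompt helper are not taken from any real open-prompt collection or any vendor’s API. The sketch only shows how one shared, openly licensed prompt could be filled in differently for each learner; sending the result to a generative AI model is deliberately left out.

```python
from string import Template

# Hypothetical openly licensed tutoring prompt (illustrative wording only).
OPEN_TUTOR_PROMPT = Template(
    "You are a patient tutor in $subject. "
    "Explain $topic to a $level student using examples about $interest. "
    "Then ask one practice question and wait for the student's answer "
    "before giving feedback."
)


def build_prompt(subject: str, topic: str, level: str, interest: str) -> str:
    """Fill the shared template with one learner's details."""
    return OPEN_TUTOR_PROMPT.substitute(
        subject=subject, topic=topic, level=level, interest=interest
    )


# Two learners, one shared prompt, two different personalized requests.
prompt_a = build_prompt("biology", "osmosis", "first-year", "cooking")
prompt_b = build_prompt("biology", "osmosis", "first-year", "soccer")
print(prompt_a)
print(prompt_b)
```

The design point is that the effort goes into the template once, while the personalization happens each time it is used.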

Consider the amount of time and money invested in the creation of a typical open textbook. Think of all the effort that goes into creating a static, generic resource which cannot respond dynamically to the needs and interests of individual learners. Now imagine what might result if that same amount of time and effort were instead invested in creating and refining openly licensed prompts crafted to cause generative AI systems to engage in evidence-based teaching practices as they interact with learners.

The potential to combine open prompts with systems that already use OER interactively, like personalized courseware, suggests a best-of-both-worlds scenario.

(I’ve ignored multi-modality and the dynamic generation of prompts here for the sake of simplicity. But their inclusion would not change the conclusion that time and effort invested in creating open prompts will provide dramatically more educational impact than the same amount of time and effort invested in creating an open textbook. Including multi-modality, etc., would just make the explanation more complicated.)

Part 3: Open Models

Openness in generative AI can go beyond just the prompts. The 5Rs framework can also be applied to model weights, which are like the source code of generative AI models. If you’re allowed to download model weights (retain), fine-tune them using RLHF or DPO or some other technique (revise and remix), use the updated model weights for any purpose (reuse), and share your updated weights with others (redistribute), then we can talk about a generative AI model being “open” in the same sense that OER are open. And in fact we already see a lot of this kind of activity on HuggingFace, which is a community where people share open models, fine-tune them, share those refined models, and compare them to each other to see how well they perform on various tasks.
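As a rough illustration of applying the 5Rs to a model, the following Python sketch checks whether a set of granted permissions covers all five activities. The function names and the example permission sets are assumptions made up for this sketch; they do not describe the terms of any actual model license, and a real license review would need to read the license text itself.

```python
# The five activities named in the 5Rs framework.
FIVE_RS = {"retain", "revise", "remix", "reuse", "redistribute"}


def missing_permissions(granted):
    """Return the 5R activities that the granted permissions do not cover."""
    return FIVE_RS - set(granted)


def is_open(granted):
    """A model counts as open in the 5R sense only if nothing is missing."""
    return not missing_permissions(granted)


# Hypothetical example: weights can be downloaded and used, but not
# fine-tuned or shared onward.
download_only = {"retain", "reuse"}
print(is_open(download_only))
print(sorted(missing_permissions(download_only)))

# Hypothetical example: all five activities are permitted.
print(is_open(FIVE_RS))
```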

Models can be fine-tuned for a wide range of purposes, including giving them greater domain knowledge. For example, a model fine-tuned on biology content will be more likely to provide accurate answers to biology questions. Models can also be fine-tuned using research articles about evidence-based teaching practices and examples of those practices being enacted. This will allow highly effective teaching practices to scale broadly. As new generations of foundation models are created, and new forms of fine-tuning are invented (DPO being the most recent example), the degree to which we are able to successfully steer model behavior will continue to improve.

There are two points we should stress about open models, however. First, openly licensed models still have to run somewhere. That is, generative AI models with open weights still need large amounts of computing power to be used at any kind of scale, and so there will still be costs associated with using open models. If generative AI is to be a meaningful part of the future of OER, OER advocates are going to have to get comfortable with the idea that there will be some cost associated with learning materials.

Second, it is true that there are some smaller generative AI models that can be run on higher-end laptops or desktops. But the accuracy and power of these models, as measured by their Elo ratings and other metrics like those on the HuggingFace leaderboard, will likely always be lower than the performance of larger proprietary models like GPT-4. If OER advocates’ zero-cost generative AI strategy is to argue for the use of models that can be run locally on a student’s laptop (i.e., with no additional compute costs), those advocates will (again) create rather than solve equity problems. In the same way that zero-cost advocacy and policies prevented many students from being able to use interactive courseware that is demonstrably more effective than traditional textbooks, this strategy will prevent students from using more effective generative AI models. Ironically, the zero-cost strategy intended to improve equity will instead primarily function to widen the academic success gap between our most at-risk students (assigned to use zero-cost, less powerful models) and their peers (using state-of-the-art models). “But David,” you protest, “perhaps one day open models will be as powerful as proprietary models.” Never say never. But that day isn’t today. And it’s likely not within the time horizon of a five-year plan for a national open education strategy.

Conclusion

Imagine a group of education advocates working back in 1998. Imagine them campaigning against faculty assigning students to read webpages for class because there’s a cost associated with connecting to the internet from home. Now cast your mind forward to 2024 and imagine those same people still lobbying against using the internet in education – campaigning against online courses and online degree programs, emailing announcements to students, providing access to recorded lectures, or distributing a syllabus via the LMS. Lobbying against online class registration, online degree planning tools, filling out the FAFSA online, etc. All because connecting to the internet costs money. And making this argument in the name of equity. How relevant would a “no internet” higher education strategy be in today’s world? More importantly, how equitable would it be?

Generative AI will revolutionize the world to at least the same degree as the internet, probably much more so. In the same way that many of us looked at the internet in the 1990s and tried to imagine the incredible possibilities it presented for education (including open content and OER), we need the same kind of imagining now in the context of generative AI. A national strategy for anything – and particularly a national strategy for education – that fails to anticipate the possibilities of generative AI dooms itself to irrelevance. There are incredible opportunities for us to expand access to educational opportunity and actually improve student success if we leverage these possibilities proactively.

Surely, surely, the national open education policy conversation is not just an attempt to extend existing zero-cost course materials policies across more of the US. Surely it will be something more forward-looking – and more concerned about improving student success – than that.

I believe the most important questions that could be answered during a conversation about a national open education strategy for the US are these:

What is the goal you’re trying to accomplish by using “open education”? Is it just saving students money, or is it something more? If there’s not a clearly articulated, agreed-upon goal, the strategy won’t matter.
More than 25 years later, is it possible there’s a more effective way to accomplish whatever that goal is than using open education as it was originally imagined in the US decades ago (i.e., via traditional OER)?
What could open education look like if it were reimagined from the ground up, acknowledging the advent of generative AI and trying to leverage its unique affordances?