RAG and Fine-tuning ARE Instructional Design

All of the analysis, design, and development that happens in conjunction with retrieval augmented generation (RAG) and the fine-tuning of large language models is instructional design.

The instructional design process often begins with an analysis of the learner(s). What do they already know? What skills do they already have? What prior knowledge can we assume will be there, so we can design our instruction in a way that successfully builds on their understanding? In the case of LLMs, “prior knowledge” is the set of capabilities models get from their pre-training. When we begin the process of prompting, fine-tuning, or setting up RAG, we’re not starting from scratch. The first step is understanding the set of capabilities in your base model the same way you would want to understand the prior knowledge of your target learners.
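One way to make this “learner analysis” concrete is to probe the base model with a small set of questions whose answers you already know, before deciding what to put into prompts, retrieval, or training data. The sketch below is illustrative only: `ask_model` is a stand-in stub for whatever real inference call you would use, and the probe questions and expected answers are invented for the example.

```python
# A sketch of auditing a base model's "prior knowledge" with a small
# probe set before designing prompts, RAG, or fine-tuning.
# `ask_model` is a hypothetical stand-in for a real model call.

def ask_model(question):
    # Stub: replace with an actual inference call to your base model.
    canned = {"What is 2 + 2?": "4"}
    return canned.get(question, "I don't know")

# (question, expected answer) pairs covering knowledge your design
# would otherwise have to supply.
probes = [
    ("What is 2 + 2?", "4"),
    ("Name the aircraft's hydraulic fluid spec.", "MIL-PRF-5606"),
]

known = [q for q, expected in probes if ask_model(q).strip() == expected]
coverage = len(known) / len(probes)
print(f"Base model already covers {coverage:.0%} of probed knowledge")
```

Whatever the base model already “knows” can be left out of your instructional materials; the gaps are what your prompting, RAG, or fine-tuning design has to cover.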

One of the primary concerns for instructional designers is managing working memory. Learners have a limited amount of working memory, and effective instructional designs minimize or eliminate extraneous cognitive load while promoting germane cognitive load. Large language models have context windows, which are the equivalent of working memory in a learner. Prompting and RAG strategies have to effectively manage this constraint so the model does not become overloaded and lose access to critical information during inference.
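Managing the context window looks a lot like managing working memory in practice: given more candidate material than will fit, you keep the most relevant pieces and drop the rest. The sketch below assumes pre-scored chunks from some upstream retriever and uses a crude word-count proxy in place of a real tokenizer; both are simplifications for illustration.

```python
# A minimal sketch of budgeting a context window: keep the most
# relevant chunks that fit. Relevance scores are assumed to come from
# an upstream retriever; word count is a crude stand-in for tokens.

def fit_to_budget(chunks, budget_tokens):
    """Greedily select high-relevance chunks within the token budget.

    `chunks` is a list of (relevance_score, text) pairs.
    """
    selected = []
    used = 0
    # Consider the most relevant chunks first.
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = len(text.split())  # crude token estimate
        if used + cost <= budget_tokens:
            selected.append(text)
            used += cost
    return selected

chunks = [
    (0.9, "The hydraulic pump is located behind access panel 12."),
    (0.4, "General safety notes for ground crews."),
    (0.7, "Torque the retaining bolts to the specified value."),
]
print(fit_to_budget(chunks, budget_tokens=20))
```

Everything that makes it into the selection consumes “working memory” the model could have spent elsewhere, which is exactly the extraneous-versus-germane load tradeoff instructional designers already manage.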

Sometimes the goal of instructional design is not to help a learner learn something permanently. Say, for example, a technician is responsible for maintenance of a complex aircraft. It may not be realistic to expect them to memorize the names, locations, and functions of every system and sub-system in the plane. So instead of designing instructional materials with a goal of promoting memorization, instructional designers often create “job aids” that provide the learner with reference and other information they can use as needed and then immediately forget without consequence. (This approach is sometimes called “performance support”.) In the case of aircraft maintenance, perhaps the instructional designer creates a laminated card with a series of pictures, steps, and explanations that provide the technician with all the information they need to perform a task. The goal here is successful performance of the maintenance task, not committing the specific maintenance process to long-term memory.

RAG is the equivalent of a job aid. RAG is not designed to change a model’s behavior over the long term. Instead, RAG is intended to provide the model with short-term access to the information necessary to answer a specific question or complete a specific task. (And one of the critical questions at the heart of RAG is equivalent to, “how do I help the technician find the right card for the job they’re about to do among the thousands of cards I’ve created for them?”) Once the question is answered or the task is completed, the model can forget the information provided via RAG without consequence. And the choices you make about the way this just-in-time information is structured, stored, and accessed are instructional design choices.
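The “find the right card” step can be sketched as a nearest-neighbor search over the cards. Real RAG systems use learned embedding models and vector databases; the toy version below substitutes bag-of-words cosine similarity so the whole mechanism fits in a few lines, and the maintenance “cards” are invented for the example.

```python
# A toy sketch of the retrieval step in RAG: find the most relevant
# "card" for a query. Bag-of-words cosine similarity stands in for a
# real embedding model; the cards are invented examples.
import math
from collections import Counter

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

cards = [
    "How to replace the landing gear strut seal.",
    "Inspecting the hydraulic pump for leaks.",
    "Calibrating the cabin pressure sensor.",
]

def retrieve(query, cards):
    # Return the single card most similar to the query.
    return max(cards, key=lambda c: cosine(vectorize(query), vectorize(c)))

print(retrieve("the hydraulic pump is leaking", cards))
```

How you chunk, describe, and index the cards determines whether this lookup surfaces the right one, and those are design choices about reference materials, not model training.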

On the other hand, fine-tuning a model is more akin to traditional teaching. The goal of traditional teaching is generally helping learners remember new information and skills over the long term. Similarly, the goal of fine-tuning is permanently changing a model’s behavior in a specific way.

Imagine you’ve been challenged to help a student learn a new skill or a specific bit of information, and the only instructional technique you’re allowed to employ is “showing examples.” How would you design those examples? Pause a moment and actually think about that question. If all you could do as a teacher was design examples and show them to your student*, think of the amount of creativity that would need to go into the design of those examples! What would be the critical elements of their design? Of course the answer to these questions depends entirely on the nature of the skills or information you’re trying to help the student learn. In some cases, you’re trying to help the student learn how to use a specific tool, like a scientific calculator (both when to use it, and how to use it). Other times you’re trying to help them learn how to classify different types of cells as either prokaryotic or eukaryotic. Other times you’re trying to help them learn how to critique a piece of writing. The important issue for us is that no matter what the specific skill or information is, you can only teach it by showing examples.

The instructional design process for fine-tuning models is constrained in exactly this way. You fine-tune a model by showing it examples of the new skill or behavior you want it to learn – tool use, classification, critique, and anything else. Hundreds, thousands, tens of thousands, hundreds of thousands, millions of examples or more. How do you design those examples? What are their characteristics? (Also: How many examples can you afford to create? Who (or what) will create them? Can you find suitable examples that already exist? How many examples can you actually afford to show the model?) These are the kinds of questions at the heart of fine-tuning. And the choices that you make about the nature and number of the examples in your fine-tuning data are instructional design choices.
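Concretely, those designed examples usually end up as structured records in a training file. The sketch below shows what a handful of examples for the cell-classification task mentioned earlier might look like in a generic prompt/completion JSONL format; field names vary by fine-tuning API, so treat these as illustrative rather than any vendor’s schema.

```python
# A minimal sketch of fine-tuning examples for a classification task,
# written as generic prompt/completion JSONL records. Field names are
# illustrative; real fine-tuning APIs define their own schemas.
import json

examples = [
    {"prompt": "Classify this cell: has a membrane-bound nucleus.",
     "completion": "eukaryotic"},
    {"prompt": "Classify this cell: no nucleus, circular DNA in the cytoplasm.",
     "completion": "prokaryotic"},
    {"prompt": "Classify this cell: contains mitochondria and a nuclear envelope.",
     "completion": "eukaryotic"},
]

# One JSON object per line -- the common format for training files.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

Every decision embedded in records like these, including how prompts are phrased, which edge cases appear, and how the label space is balanced, is an instructional design decision about the examples you will show your “student.”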

Instructional designers should be at the very forefront of working with generative AI. This is literally what we do.

 

* Ok, ok. In addition to showing the student examples, you’re also going to grade their work and give them feedback to let them know how they’re doing. In fact, you’re going to let them practice and receive feedback after every example you show them. There are also important questions of how we grade and provide feedback (e.g., what the reward function looks like). And, in fact, since we get to design the architecture of the model, we also get to make choices about how the student will use the feedback we provide them to improve their performance (e.g., backpropagation). But for now let’s keep the discussion simple.