Closing the Equity Gap with ChatGPT

Our work at Lumen is focused on eliminating race and income as predictors of student success in the US postsecondary setting. One thing we’ve learned as we’ve worked to erase this persistent gap in academic performance is that it is far easier to “slide the gap to the right” than it is to close it. In other words, interventions intended to benefit the lowest-performing students often benefit all students, so that everyone’s academic performance improves. That’s great from one perspective – everyone learned more! But rather than decreasing the size of the gap, these interventions leave the gap intact and nudge it up the scale to the right. Interventions whose effect is accurately targeted at the students who need it most can be hard to find.
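To make the difference between sliding the gap and closing it concrete, here is a minimal sketch in Python. The scores, group sizes, and point gains are made-up numbers chosen only for illustration; they are not drawn from any study or from Lumen's data, and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical scores (0-100) for two groups of students.
# Illustrative numbers only, not real data.
lower = [55, 60, 65]
higher = [75, 80, 85]


def gap(group_a, group_b):
    """Difference between the mean score of group_b and that of group_a."""
    return mean(group_b) - mean(group_a)


def boost(scores, points):
    """Return new scores in which every student gains the same number of points."""
    return [score + points for score in scores]


baseline_gap = gap(lower, higher)

# Intervention A helps everyone equally: both means rise by 10 points,
# so the gap slides to the right but keeps its size.
shifted_gap = gap(boost(lower, 10), boost(higher, 10))

# Intervention B helps the lower-scoring group more (15 points vs. 5),
# so the gap itself narrows.
targeted_gap = gap(boost(lower, 15), boost(higher, 5))

print("baseline gap:", baseline_gap)
print("gap after uniform gain:", shifted_gap)
print("gap after targeted gain:", targeted_gap)
```

With these assumed numbers, the uniform gain leaves the gap unchanged while the targeted gain cuts it in half, which is the distinction the paragraph above is drawing.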

For this reason, I was particularly excited to see “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence,” a new study from researchers at MIT that finds that access to ChatGPT dramatically reduces productivity inequality on writing tasks. The abstract reads:

We examine the productivity effects of a generative artificial intelligence technology—the assistive chatbot ChatGPT—in the context of mid-level professional writing tasks. In a preregistered online experiment, we assign occupation-specific, incentivized writing tasks to 444 college-educated professionals, and randomly expose half of them to ChatGPT. Our results show that ChatGPT substantially raises average productivity: time taken decreases by 0.8 SDs [37%, or from 27 minutes to 17 minutes] and output quality rises by 0.4 SDs [a 0.75 point increase in grade on a 7 point scale]. Inequality between workers decreases, as ChatGPT compresses the productivity distribution by benefiting low-ability workers more. ChatGPT mostly substitutes for worker effort rather than complementing worker skills, and restructures tasks towards idea-generation and editing and away from rough-drafting. Exposure to ChatGPT increases job satisfaction and self-efficacy and heightens both concern and excitement about automation technologies.

That kind of result gets me excited! Simultaneously decreasing inequity while increasing self-efficacy and satisfaction? Yes, please!

Section 2.3 explicitly discusses productivity inequality, describing how access to ChatGPT helped close that gap:

The control group exhibits persistent productivity inequality: participants who score well on the first task also tend to score well on the second task. As Figure 2 Panel (a) shows, there is a correlation of 0.49 between a control participant’s average grade on the first task and their average grade on the second task. In the treatment group, initial inequalities are half-erased by the treatment [access to ChatGPT]: the correlation between first-task and second-task grades is only 0.25 (p-value on difference in slopes = 0.004). This reduction in inequality is driven by the fact that participants who scored lower on the first round benefit more from ChatGPT access, as the figure shows: the gap between the treatment and control lines is much larger at the left-hand end of the x-axis. (p. 5)

There seems to be real promise here for making progress toward closing the equity gap in education. However, what we see positively as “productivity gains” in the world of work is often seen negatively as “cheating” in the world of school. And while there are certainly challenges to navigate here, results like those in this paper from MIT make our efforts to navigate them effectively all the more critical as we work to close the equity gap.