Last week I wrote up some notes on Sora, OpenAI’s new service, as yet unreleased to the public. There I looked into the technology, as best as I could see from the traces available, and speculated about possible directions.
Today I’d like to continue by thinking through Sora with a focus on its possible implications for higher education. Let’s look ahead a year or two.
For now I’m going to assume Sora will appear in the world as a consumer-grade application relatively soon, as in calendar 2024, maybe early 2025. It might not be just OpenAI’s service, as other providers - Google, Microsoft, big Chinese AI companies, various startups - may launch their own versions, either as web services or open source apps. In today’s post I’ll just lump them all together as “Sora.”
I’m also going to posit Sora works as promised, more or less. Users can input a text description and Sora will emit a short video accordingly. “Short” will probably become longer over time as the software improves, which seems likely, based on recent history. Additionally, I’m assuming faculty and students can use Sora in class. Lots of assumptions here, and I’ll try to get to them at the end.
1: Some academic uses
In a class setting, in-person or virtual, the leading Sora use I can think of is content production. That’s one of the major established uses of generative AI in general: making stuff. We’ve seen that with textbots (ChatGPT, Gemini), images (DALL-E, Midjourney), and slideshows (Slides.GPT), primarily. If Sora works at the general user level, we have democratized, prompt-based video production.
Start with the instructor, who can ask Sora to make videos to visualize all kinds of curricular content. A planetary disk coalescing, the assassination of Julius Caesar, a factory producing cars: a faculty member can load such videos into a syllabus or learning management system before class, or generate them on demand during meetings, in response to student discussion. Here the pedagogy ties into decades of work on teaching with video, both licensed by professors and created by them. (My first book is partly about the latter.)
Naturally, students can use Sora as well. Such video creation could be in the service of many assignments and purposes, including visualizing their own ideas and developing their creativity. Students could generate videos in response to instructor clips, a video-to-video sequence. Students could also work individually or in groups. At a meta level this could be a good way to learn prompt engineering.
I’m not sure to what extent research uses of Sora would differ from today’s digital video production. We could see Sora speeding such cases up, like videos representing certain functions or attempts to communicate findings to a broader audience. As with the video we now experience, we should expect media studies faculty to research this kind of creative work, along with computer scientists.
We should expect a third academic use of Sora for non-pedagogical, non-research purposes, based on what we’ve seen with generative AI so far. Academic faculty, staff, and students can use it for various operations, from visualizing changes to a campus’ physical plant to creating visualizations for communication purposes.
Here’s a recent example, sort of, from Sora’s lead researcher. The prompt was “fly through tour of a museum with many paintings and sculptures and beautiful works of art in all styles”:
Across all three of these domains I think we can anticipate one shared use in particular. Sora might be a handy storyboard application. We can use it to visualize ideas we’re thinking through and want to share, such as a classroom configuration or a graduation ceremony’s layout.
I want to reserve a space here for new, emergent academic uses of on-demand video production. Maybe some will use Sora to produce a version of animated gifs and populate discussion threads with them. Perhaps researchers will add clips to scholarly articles. Or we might stop thinking of Sora as a video tool and instead consider it a spatial creator, especially for game or extended reality content. I find it’s not useful to underestimate human creativity.
2: Problems and limits
Let me return to the caveats I mentioned earlier. Even though we don’t have access to Sora, we can still anticipate some problems.
One of the most evident challenges facing large language models is the quality problem. LLMs keep producing errors and misfires, such as the recent Gemini “woke AI” debacle. I don’t want to get into the issue deeply here (should we call the errors hallucinations or not?) but want to place a marker for this problem when Sora appears for general users. Think about faculty asking the app to produce, say, a French history video, but the people involved have the wrong clothing, skin, tools, etc. We can correct a textbot’s errors easily, through a text editor. Few will be able to edit a video clip, and the complexity might be beyond anyone other than high-end professionals. Iterating text prompts is the established response, but I can imagine this not yielding satisfactory results, either due to dataset limitations, user skill or fatigue, or Sora’s own limits.
A second problem concerns the huge amount of video content we already have. Would Sora be redundant to what academics could easily find through a YouTube or Giphy search? Put another way, why would students, faculty, or staff take the time to learn how to make Sora work when they have been immersed in TikTok, Netflix, an institution’s licensed video content, and more, for years? Moreover, based on our experience of LLMs to date, we should expect training datasets and AI code to generate content unevenly - i.e., plenty of Hollywood-ish videos, but fewer sources from, say, Egyptian cinema.
Another issue arises when we consider access. Already we have seen subscription fees blocking some academics, notably students, from using top-end AI applications. There are also challenges of infrastructure and technical knowledge. We might also see internet service providers or local hosts reducing or blocking access to Sora for a range of reasons (security, copyright, privacy, politics, etc.). In other words, we shouldn’t assume full access to this hypothetical tool.
Some academics might resist Sora. Faculty, staff, and students might oppose its operation on campus for the reasons we’ve heard from critics for years, amplified by the huge popularity of video. Think of faculty asking campus IT to block access to Sora, or a teaching and learning center’s staff refusing to teach people how to use the app. Moreover, we’ve already seen at least one academic project aimed at degrading LLM content. More such resistance could greet Sora at a given college or university.
One more issue concerns preservation. Where will copies of Sora video live? How might, for example, a user find a Sora video they made a year or a decade ago? Perhaps they will have to trust that OpenAI retains such copies, and in a searchable way. Maybe this is where campus libraries and/or IT departments play a role - especially for providing access to academics’ work in response to legal queries. Or we could outsource Sora preservation to other parties, such as governments or the Internet Archive.
Again I return to a theme I’ve sounded since I first started researching AI. The technology faces a series of threats beyond the academy, some verging on the existential. A judge could order OpenAI to shut down Sora for copyright infringement, or block its launch before it begins. Governments could regulate it strongly for security reasons. Popular culture could turn against AI in general. LLM output could degrade, especially if errors feed into training datasets. In other words, Sora might not make it into academics’ hands at all, or perhaps only in a diminished form. The many uses I’ve described above represent a path we might not be able to walk.
Dear reader, how do you think academics might respond to Sora in the wild?
The increased environmental cost as more and more energy-intensive AI apps come online is a huge concern that needs to be mentioned in every story about AI at this point.
On the issue of availability to students, higher ed institutions will have to make decisions on institutional subscriptions soon (some already have, I know, but not many). But because the market is so fluid, this will be a very difficult decision. With most other software there is often an obvious product that most people use, or a fairly well-established set of choices, but that just isn’t the case with AI products at the moment. Yet the longer institutions wait for a clear winner to emerge, the longer the problems of uneven access persist.