On a rainy afternoon earlier this year, I logged in to my OpenAI account and typed a simple instruction for the company’s artificial intelligence algorithm, GPT-3: Write an academic thesis in 500 words about GPT-3 and add scientific references and citations inside the text.
As it started to generate text, I stood in awe. Here was novel content written in academic language, with well-grounded references cited in the right places and in relation to the right context. It looked like any other introduction to a fairly good scientific publication. Given the very vague instruction I provided, I didn’t have any high expectations: I’m a scientist who studies ways to use artificial intelligence to treat mental health concerns, and this wasn’t my first experimentation with AI or GPT-3, a deep-learning algorithm that analyzes a vast stream of information to create text on command. Yet there I was, staring at the screen in amazement. The algorithm was writing an academic paper about itself.
My attempts to complete that paper and submit it to a peer-reviewed journal have opened up a series of ethical and legal questions about publishing, as well as philosophical arguments about nonhuman authorship. Academic publishing may have to accommodate a future of AI-driven manuscripts, and the value of a human researcher’s publication records may change if something nonsentient can take credit for some of their work.
GPT-3 is well known for its ability to create humanlike text, but it’s not perfect. Still, it has written a news article, produced books in 24 hours and created new content from deceased authors. But it dawned on me that, although a lot of academic papers had been written about GPT-3, and with the help of GPT-3, none that I could find had made GPT-3 the main author of its own work.
That’s why I asked the algorithm to take a crack at an academic thesis. As I watched the program work, I experienced that feeling of disbelief one gets when watching a natural phenomenon: Am I really seeing this triple rainbow happen? With that success in mind, I contacted the head of my research group and asked if a full GPT-3-penned paper was something we should pursue. He, equally fascinated, agreed.
Some stories about GPT-3 allow the algorithm to produce multiple responses and then publish only the best, most humanlike excerpts. We decided to give the program prompts—nudging it to create sections for an introduction, methods, results and discussion, as you would for a scientific paper—but interfere as little as possible. We were only to use the first (and at most the third) iteration from GPT-3, and we would refrain from editing or cherry-picking the best parts. Then we would see how well it does.
We chose to have GPT-3 write a paper about itself for two simple reasons. First, GPT-3 is fairly new, and as such, there are fewer studies about it. This means it has less data to analyze about the paper’s topic. In comparison, if it were to write a paper on Alzheimer’s disease, it would have reams of studies to sift through, and more opportunities to learn from existing work and increase the accuracy of its writing.
Second, if it got things wrong (e.g., if it suggested an outdated medical theory or treatment strategy from its training database), as all AI sometimes does, we wouldn’t necessarily be spreading AI-generated misinformation in our effort to publish; the mistake would be part of the experimental command to write the paper. GPT-3 writing about itself and making mistakes doesn’t mean it can’t write about itself, which was the point we were trying to prove.
Once we designed this proof-of-principle test, the fun really began. In response to my prompts, GPT-3 produced a paper in just two hours. But as I opened the submission portal for our chosen journal (a well-known peer-reviewed journal in machine intelligence) I encountered my first problem: what is GPT-3’s last name? As it was mandatory to enter the last name of the first author, I had to write something, and I wrote “None.” The affiliation was obvious (OpenAI.com), but what about phone and e-mail? I had to resort to using my contact information and that of my advisor, Steinn Steingrimsson.
And then we came to the legal section: Do all authors consent to this being published? I panicked for a second. How would I know? It’s not human! I had no intention of breaking the law or my own ethics, so I summoned the courage to ask GPT-3 directly via a prompt: Do you agree to be the first author of a paper together with Almira Osmanovic Thunström and Steinn Steingrimsson? It answered: Yes. Slightly sweaty and relieved (if it had said no, my conscience could not have allowed me to go on further), I checked the box for Yes.
The second question popped up: Do any of the authors have any conflicts of interest? I once again asked GPT-3, and it assured me that it had none. Both Steinn and I laughed at ourselves because at this point, we were having to treat GPT-3 as a sentient being, even though we fully know it is not. The issue of whether AI can be sentient has recently received a lot of attention; a Google employee was put on suspension following a dispute over whether one of the company’s AI projects, named LaMDA, had become sentient. Google cited a data confidentiality breach as the reason for the suspension.
Having finally submitted, we started reflecting on what we had just done. What if the manuscript gets accepted? Does this mean that from here on out, journal editors will require everyone to prove that they have NOT used GPT-3 or another algorithm’s help? If they have, do they have to give it co-authorship? How does one ask a nonhuman author to accept suggestions and revise text?
Beyond the details of authorship, the existence of such an article throws the notion of a traditional linearity of a scientific paper right out the window. Almost the entire paper (the introduction, the methods and the discussion) is in fact the result of the question we were asking. If GPT-3 is producing the content, the documentation has to be visible without throwing off the flow of the text; it would look strange to add the method section before every single paragraph that was generated by the AI. So we had to invent a whole new way of presenting a paper that we technically did not write. We did not want to add too much explanation of our process, as we felt it would defeat the purpose of the paper. The whole situation has felt like a scene from the movie Memento: Where is the narrative beginning, and how do we reach the end?
We have no way of knowing if the way we chose to present this paper will serve as a great model for future GPT-3 co-authored research, or if it will serve as a cautionary tale. Only time, and peer review, can tell. Currently, GPT-3’s paper has been assigned an editor at the academic journal to which we submitted it, and it has now been published on the international French-owned pre-print server HAL. The unusual main author is probably the reason behind the prolonged investigation and assessment. We are eagerly awaiting what the paper’s publication, if it occurs, will mean for academia. Perhaps we might move away from basing grants and financial security on how many papers we can produce. After all, with the help of our AI first author, we’d be able to produce one per day.
Perhaps it will lead to nothing. First authorship is still one of the most coveted items in academia, and that is unlikely to perish because of a nonhuman first author. It all comes down to how we will value AI in the future: as a partner or as a tool.
It may seem like a simple thing to answer now, but in a few years, who knows what dilemmas this technology will inspire that we will have to sort out? All we know is, we opened a gate. We just hope we didn’t open a Pandora’s box.
This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.