At a glance
- Individual parts of scientific publications, such as introductions, typically follow the same narrative pattern. AI tools can therefore write such texts.
- Scientists already publish more than anyone can read. With AI, there could be even more papers.
- Today's publication practice in science therefore seems outdated and is no longer a criterion for scientific quality.
- Updatable FAIR data sets should form the backbone of scientific work in the future, instead of repetitive narratives and static prints and PDFs.
How text-generating artificial intelligence will challenge narrative scientific publishing.
As the moon stood high in the sky, casting a soft glow over the rolling countryside, the heroine stood alone on the balcony of her family’s magnificent estate. She stared into the darkness, lost in thought, until a figure emerged from the shadows. It was the hero, tall and strong, his piercing eyes fixed on hers. They had met before, but always in secret, for their families were bitter rivals. But their love was too strong to be denied. If this plot line sounds familiar, you may have come into contact with dime novels or soap operas before. Don’t be embarrassed: all of pop culture is built on a small set of specific elements that characterize works of fiction, such as sensationalized plots, stereotyped characters, and simplified language. Now compare the introduction of a dime novel to the following:
In recent years, the field of materials chemistry has emerged as a critical area of research driving advances in areas such as energy production and medicine. A key challenge in this field is the development of new materials with optimized properties, including improved strength, durability, and functionality. In this article, we present a novel approach to the design and synthesis of advanced materials based on modern techniques in materials chemistry and nanotechnology. Through a combination of computational modeling and experimental synthesis, we demonstrate the successful creation of a new class of materials with improved properties, including superior mechanical strength and chemical stability. Our results represent a significant advance in the field of materials chemistry and offer great opportunities for a wide range of applications, from aerospace engineering to medical devices. Now this reads like the beginning of a scientific article. Although there is no obvious connection between the two introductory paragraphs, they both follow an interchangeable narrative pattern. The pattern is so obvious that the ChatGPT language model from US-based OpenAI has no problem recognizing it. In fact, ChatGPT wrote both introductory paragraphs from the prompts “Write a typical introductory paragraph for a romantic dime novel” and “Write a typical introductory paragraph for a scientific article for a materials chemistry journal.” After nearly ten years as an independent researcher in materials chemistry, I must confess that I have read – and probably written – worse prose than what ChatGPT conjures up in seconds. The pattern can easily be applied to the general introductions of many research articles.
Recurring narrative patterns
In response to the question, “What are the similarities between the distinctive literary elements that make up the opening paragraphs of a romantic dime novel and the opening paragraph of a scientific article?”, ChatGPT provides insight into narrative writing:
While the theme and purpose of a romantic pulp novel and a scholarly publication may be very different, there are some similarities between the characteristic literary elements that make up the introductory section of each paper: (i) setting/background information, (ii) conflict/research question or hypothesis, (iii) mood and motivation/goals and scope, (iv) hook/significance and implications.
Overall, although the topic and writing style may vary, both introductory sections serve to engage the reader, provide context for the work, and outline the central question or hypothesis that the work seeks to address.
The pattern for text structure is easily understood by anyone who has attended a workshop on “How to publish in Nature/Science/renowned journal,” and it follows almost verbatim the template of “an effective introduction for a paper.” It is also a lesson in Aristotelian storytelling in science that I would like to impart to each new generation of graduate students. After all, they should not only experiment, but also communicate their results to the public that funds their research.
We should not be surprised that ChatGPT knows the basics of storytelling: the models behind ChatGPT were trained on a massive text corpus – roughly 570 GB of data in the case of GPT-3 – including web pages, books, and other sources, presumably among them Nature’s author guidelines.
In April 2023, my group published a paper in which we used ChatGPT to generate both the abstract and the introduction, as disclosed in the acknowledgments. The editorial board of Chemical Science, a Royal Society of Chemistry journal, subsequently updated its author guidelines to address the disclosure of AI-generated content in academic papers.
What should make us all wonder: If the template for an academic introduction is so obvious and trivial, so easily adaptable to different research topics, and so easily mimicked by ChatGPT, why are we still writing, publishing, and reviewing such narratives? And who is reading all this writing?
Here I don’t want to discuss our ability to invent new narratives – like Star Trek’s utopian societies and visionary technological gadgets. I want to challenge the academic equivalent of the nth iteration of a Hollywood blockbuster – all that iterative science paid for by taxpayers and turned into static, dead, and often unread text files by profit-driven publishers.
Figures in academic publishing
Most top-quartile chemistry journals limit their articles to between 2500 words for short communications and 5500 words for full articles. The narrative and summary sections of the main manuscript – abstract, introduction, and conclusion – take up an average of 20 percent of the text.
It takes an experienced writer about three to five days to write a short communication from scratch. After submission to a journal, the editor spends up to 30 minutes deciding whether the text is worthy of publication. He or she then notifies up to five academic reviewers by email, and two or three of them will each spend between two and five hours reviewing the article and writing a recommendation. According to an estimate in the 2018 STM report, 2.5 million scientific articles are submitted each year. That adds up to a lot of text and working hours.
Big numbers mean big business. In 2020, science publishers generated $20 billion in revenue – approaching the scale of Netflix ($30 billion). The academic publishing sector achieves profit margins of over 40 percent, while the entire Hollywood film industry manages only about 14 percent. Deutsche Bank examined this phenomenon back in 2005, describing the for-profit publishing industry as a “bizarre” system of “triple pay”, in which “the government funds most research, pays the salaries of most reviewers of research quality, and then buys most published products.”
Scientists publish because they need written evidence that demonstrates their expertise and is necessary to receive funding. We have allowed the resulting numbers game of citations, journal impact factors, and h-indexes to dominate our personal and institutional evaluations. “The higher the numbers, the better the science” has been the credo for decades.
In business, numbers have a real countervalue: money, products, people. In science, however, there is no correlation between large numbers and excellence. Harry Kroto, winner of the 1996 Nobel Prize in Chemistry, did not have a high h-index; a single publication on fullerenes earned him the prize. The numbers game in science has instead inspired the emergence of paper mills, which sell authorship on publications, and of arrangements in which authors agree to cite each other regardless of thematic relevance.
One example of misaligned incentives in academic publishing I witnessed firsthand during my time in Prague between 2016 and 2018 is that scientists in the Czech Republic receive a tiered lump sum for each publication, regardless of its scientific merit. While the reward is greater for publications in journals with higher impact factors, there is a minimum amount for each publication. The result is a flood of irrelevant papers in obscure journals that no one reads. As a 2007 study from Indiana University Bloomington found, half of scientific articles are read by no one other than authors, reviewers and journal editors.
Funding agencies such as the Dutch NWO and the European Research Council (ERC) have recognized this. In line with the San Francisco Declaration on Research Assessment (DORA), the impact factor of a publication no longer serves as a basis for evaluation at the NWO and the ERC as of 2021.
Who writes and who reads?
Returning to our publication, which ChatGPT co-wrote: the initial reaction of one of my co-authors was dismay – would the AI's work diminish the human contribution? No. It was human scientists who designed the study, collected and analyzed the data, and drew the conclusions. The AI merely provided the summarizing, repetitive text on the state of the art – citation- and number-based credit to colleagues working in the field.
As scientists, we are too enamored with the notion that there is something special about the way we communicate our scientific discoveries, that it is our genius that permeates a publication. None of this is true: we are seeing a flood of iterative publications that are becoming less groundbreaking and less reproducible.
The gold standard in academic publishing – peer review for quality – is increasingly failing to separate the good from the bad. The number of tertiary education staff – the number of potential reviewers – increased by only about five percent globally between 1993 and 2014, while the number of academic journals increased by seven percent over the same period. Scientists are finding less and less time to read and review an ever-increasing volume of research articles – even those from their own groups. Some scientists, on the other hand, are achieving hyper-prolific publication rates of one publication every five days. Sometimes I wonder if any of the editors or reviewers have found the time to read our Chemical Science paper in its entirety and recognize the GPT reference we inserted.
AI as a tool
Given the amount of research publications, it is a challenge to find all significant scientific publications on a given topic and to keep track of the latest research. To solve this problem, today there are several specialized online services besides ChatGPT that can help in this context:
- Semantic Scholar is an online search service with an up-to-date database of over 211 million indexed articles, allowing users to filter and sort results by research field, publication period, author, and journal while being able to save selected papers in a library for future reference.
- Elicit is an online service by Ought, a non-profit machine-learning research lab, that uses Semantic Scholar’s database to answer users’ questions with relevant papers; it offers three filters and five sorting parameters to organize results.
- ResearchRabbit is a distinctive search engine that lets users upload or search for papers, including lookup by DOI. It lacks filtering and sorting options and does not allow users to load additional search results. Users can organize selected papers into collections, add comments, and explore related papers through clickable clusters, but the interface can quickly become overwhelming.
- Explainpaper helps users understand research papers by summarizing and explaining text passages highlighted by the user using GPT-3. The tool is free to use in its Hobbyist version and can generate longer explanations of multiple pages in the paid Reader version, although this is prone to errors. The upcoming Researcher version will summarize complete articles and multiple papers in one go.
- ChatPDF is a tool that allows users to upload a PDF and ask questions in a chat window, with the tool providing short answers and page references, helping to improve access to information.
- Scispace combines the functions of ResearchRabbit and Explainpaper by allowing users to create a collection of papers and providing explanations for difficult sections. The service searches more than 200 million articles and 50 million Open-Access PDFs, and results can be filtered by various criteria and sorted by relevance, citations, age, or alphabetically. However, the service does not disclose how it generates its answers, and users should exercise caution when relying on its explanations.
The services listed still have teething problems in terms of accuracy and truthfulness. Users are well advised to critically examine the AI-generated results. For most authors, however, it is easier to correct and add to a text than to edit a blank page.
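For readers who want to script such searches themselves, the Semantic Scholar database that powers several of these tools is also reachable through a public web API. The following is a minimal sketch in Python: the endpoint and field names follow Semantic Scholar's public Graph API, while the query term and the small parsing helper are illustrative assumptions, not a definitive client.

```python
import json
from urllib.parse import urlencode

# Public paper-search endpoint of the Semantic Scholar Graph API.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def build_search_url(query, fields=("title", "year", "citationCount"), limit=20):
    """Construct a paper-search URL requesting the given result fields."""
    params = {"query": query, "fields": ",".join(fields), "limit": limit}
    return f"{SEARCH_URL}?{urlencode(params)}"

def summarize_results(payload):
    """Reduce a search response to (title, year) tuples, newest first."""
    papers = payload.get("data", [])
    return sorted(
        ((p["title"], p.get("year")) for p in papers),
        key=lambda item: item[1] or 0,
        reverse=True,
    )

if __name__ == "__main__":
    # The query term is an arbitrary example; fetch the URL with any
    # HTTP client (e.g. urllib.request.urlopen) to get live results.
    url = build_search_url("covalent organic frameworks")
    print(url)
    # Parsing a mock response of the documented shape:
    sample = json.loads('{"data": [{"title": "A", "year": 2021},'
                        ' {"title": "B", "year": 2023}]}')
    print(summarize_results(sample))
```

The same pattern – one request, a list of structured records back – is what the services above wrap in their user interfaces, which is why their filtering and sorting options map so directly onto API parameters.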
Given the steady and rapid improvements in AI-assisted contextualization and writing technology, it is worth considering whether profit-driven, numbers-based publishing is still relevant and worth our time and money. Publishing the results of publicly funded research in static, uncorrectable print and PDFs seems anachronistic – consider how quickly venerable reference documents like the Encyclopædia Britannica have been surpassed in quality by living and editable wikis. Errors happen in research publications – corrections often do not.
The intellectual helplessness of academic institutions that fail to recognize the opportunities of new AI-based tools is a testament to our unfamiliarity with new techniques. Bans on the use of ChatGPT – such as the one recently imposed and later lifted in Italy – are frantic, paranoid reactions reminiscent of the historical mistake of banning the printing press.
Initiatives to build an open and FAIR (findable, accessible, interoperable, reusable) infrastructure for research data management, such as NFDI4Chem and the European Open Science Cloud, are leading the way. Data repositories under Open Data and Open Science principles will be the foundation for more useful, reliable, and reproducible academic work, replacing the current practice of authority-based citations.
There is more science to consider than the top 10 results on Google Scholar, and the statement “to the best of the authors’ knowledge” seems lazy and tired in the face of powerful tools for data retrieval and AI-based summaries.
However, this kind of institutional change needs to be top-down and tied to research funding – ministries of science and education should implement FAIR-data-deposition requirements and enforce their own rules. This will require time allocation for key roles in academic data curation and control.
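What a FAIR-compliant deposit might look like in practice can be sketched as a minimal metadata record plus a completeness check. The field names loosely follow common repository schemas such as DataCite, and every value here is a hypothetical placeholder, not a real dataset:

```python
# A minimal, hypothetical FAIR-style dataset record. Field names loosely
# follow common metadata schemas (e.g. DataCite); all values are placeholders.
record = {
    "identifier": "doi:10.xxxx/example-dataset",   # findable: persistent ID
    "title": "Synthesis conditions and yields for material X",
    "creators": ["Jane Doe"],
    "license": "CC-BY-4.0",                        # reusable: explicit terms
    "format": "text/csv",                          # interoperable: open format
    "access_url": "https://repository.example.org/datasets/42",  # accessible
    "versions": [                                  # updatable, not static
        {"version": "1.0", "note": "initial deposit"},
        {"version": "1.1", "note": "corrected two mislabeled samples"},
    ],
}

# Minimum metadata a curator would enforce before accepting a deposit.
REQUIRED = {"identifier", "title", "creators", "license", "format", "access_url"}

def is_fair_complete(rec):
    """Check that a record carries the minimum metadata for FAIR reuse."""
    return REQUIRED <= rec.keys()
```

The `versions` list is the point of the sketch: unlike a static PDF, such a record can absorb corrections as new entries, which is exactly the curation work the key roles mentioned above would be responsible for.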
A different way of communicating knowledge
A narrative-based rather than a numbers-based evaluation of proposals, individuals, and institutions, as suggested by DORA, foregrounds the value and novelty of academic discoveries. This means that evaluators will spend more time reading about ideas and people rather than looking at numbers.
Academic publications should be decoupled from citation metrics and become living, self-correcting, and evolving data collections. The academic workflow should resemble that of wikis, with the deposition of FAIR-research data and its contextualization, correction, and growth online.
The majority of summary review articles in academic journals are phone book-like lists of “who-has-worked-on-what” rather than meaningful assessments of a field. Contemporary reviews serve only to increase citations from friends, colleagues, and network partners. Only in a few cases are they helpful as references for authors of research proposals and theses. As a literary form, reviews will be rendered obsolete by summative AI tools.
These proposals will upend publishers’ business models and academics’ usual workflows. Both stakeholders claim excellence – so both should unite behind the opportunities for more reliable, reproducible, and excellent pursuit of scholarship that new tools put within our reach.