My response in Nature – the director's cut

Feb 2, 2021 4 min read Machine learning and AI, Writing, Paper

I have a “reader response” published in Nature. Let’s discuss.

A few months ago, Nature ran a news piece about SciTLDR, a project that uses machine learning to write summaries of scientific articles. In my opinion, that piece didn’t sufficiently cover the potential downsides of such a tool: how it takes away authorial intent, how it may be unreliable, and how it may be abused. Here’s a post I wrote at the time: Let’s let the AIs summarize our research. What could possibly go wrong?

Well, the saga continues.

Out now in Nature is a comment from me about the SciTLDR article. Nature correspondences are short, so I encourage you to check it out. Yes, I got worked up enough to convert my original post into a formal reader correspondence!

But my submitted correspondence was quite a bit longer than what made it into Nature (350 words vs. about 200), so I thought this post would be a good space for the “director’s cut” of my comment. The full comment is down below.

As an aside, it’s interesting to see how length constraints really boil an argument down to its essence. Going from an 800-word blog post to a 350-word correspondence to a 200-word revised correspondence is not easy, and it forces some hard choices about what points you can make (talk about killing your darlings). Nothing in school prepares you for this task (if anything, you learn how to pad out writing to hit a minimum word count).

The point I most wish had made the cut

Sadly, I was not able to include my concluding remark about the risks of SciTLDR for communicating with non-experts. I am particularly worried about this because the creators of SciTLDR explicitly mention this as future work. From the Nature news piece:

[SciTLDR’s] summaries tend to be built from key phrases in the article’s text, so are aimed squarely at experts who already understand a paper’s jargon. But [Semantic Scholar group manager Daniel] Weld says the team is working on generating summaries for non-expert audiences.

My (cut) response:

Further, while SciTLDR is currently intended for expert readers, I worry about how such tools may be used to promulgate misinformation among non-experts. Rather than relying upon automatic and potentially unreliable tools, giving support to projects such as the Alan Alda Center for Communicating Science is likely to be more effective and more reliable at engaging non-experts.

Troubling.

And remember, the tagline of the Allen Institute for AI (which runs Semantic Scholar) is “AI for the Common Good”…

TL;DR - How well do machines summarize our work? (director’s cut)

The SciTLDR software tool (scitldr.apps.allenai.org) uses machine learning to summarize scientific texts [1]. I found using their online demo to be quite instructive.

In many ways, SciTLDR produced clear summaries—it is impressive how far Natural Language Processing has come. Often the method will extract one or two key statements from the original text and edit them into a cohesive sentence, sometimes removing parentheticals and swapping out common words or phrases with synonyms. While such changes may be innocuous, there remain risks that important information is lost. When SciTLDR removes a parenthetical, it is stripping out qualifiers the authors deemed relevant. When it replaces “we investigated” with “we identified,” for instance, it has rendered a significant change in meaning away from setting context and toward enumerating results.

I become further troubled when I consider the potential broader impacts of such a tool. What happens when these tools are applied to antivaccination research or papers denying climate change? I submitted to the demo abstracts from fraudulent, retracted works and it produced summaries that were often stronger statements of the results than the original fraudulent work and lacking extenuating context. Indeed, it did not seem to treat these texts appropriately, acknowledging retractions as a human writer might. Given the critical subject matter of science and medicine, and the long-running threats posed by anti-science movements [2], care should be taken when developing and deploying tools such as SciTLDR.

As an author, I do not find it particularly burdensome to provide a single sentence summary of a manuscript, as journals often request. Indeed, crafting such summaries can help sharpen one’s thinking on a subject. Yet ceding authorial control and intent to machine learning carries risks: stripping away important extenuating context and over-amplifying results can harm scientific discourse. Further, while SciTLDR is currently intended for expert readers, I worry about how such tools may be used to promulgate misinformation among non-experts. Rather than relying upon automatic and potentially unreliable tools, giving support to projects such as the Alan Alda Center for Communicating Science [3] is likely to be more effective and more reliable at engaging non-experts.

[1] Perkel, J. M. & Van Noorden, R. tl;dr: this AI sums up research papers in a sentence. Nature News (2020). URL https://www.nature.com/articles/d41586-020-03277-2.

[2] Holton, G. J. Science and anti-science (Harvard University Press, 1993).

[3] Eise, J. What institutions can do to improve science communication. Nature Career Column (2019). URL https://www.nature.com/articles/d41586-019-03869-7.

Jim Bagrow

Associate Professor of Mathematics & Statistics

My research interests include complex networks, computational social science, and data science.
