Text recycling in STEM: A text-analytic study of recently published research articles.
Text recycling, sometimes called "self-plagiarism," is the reuse of material from one's own existing documents in a newly created work. Over the past decade, text recycling has become an increasingly debated practice in research ethics, especially in science and technology fields. Little is known, however, about researchers' actual text recycling practices. We report here on a computational analysis of text recycling in published research articles in STEM disciplines. Using a tool we created in R, we analyze a corpus of 400 published articles from 80 federally funded research projects across eight disciplinary clusters. According to our analysis, STEM research groups frequently recycle some material from their previously published articles. On average, papers in our corpus contained about three recycled sentences per article, though a minority of research teams (around 15%) recycled substantially more content. These findings were generally consistent across STEM disciplines. We also find evidence that researchers superficially alter recycled prose much more often than they recycle it verbatim. Our findings suggest that recycling some amount of material is normative in STEM research writing. Researchers and editors would therefore benefit from more appropriate and explicit guidance about what constitutes legitimate practice and how authors should report the presence of recycled material.