34

General background

Some time ago, I was reading a blog post that discussed how many people read journal articles. I think such an estimate is important when trying to assess the impact of research on society. However, whereas internet sites readily track usage, such information seems more difficult to come by for the readership of a particular journal article.

Initial Ideas

  • Articles vary: Obviously journal articles vary in many ways and just as with citation counts, readership is likely to be highly skewed, perhaps something like a power function. In addition to academic impact, presumably articles that are available for free on the internet are read more.
  • Time since publication: The number of reads increases over time, but the rate of readership presumably varies over time (perhaps a spike on initial release, and then gradual decline as relevance dissipates).
  • Definitions of reading vary: Read counts would also increase or decrease based on how reading is defined. At the low end is a glance at an abstract. At the high end is carefully reading the entire article. I'd be happy with a working definition that involved reading at least two pages.

Initial Data

  • PlosOne article statistics: As a very rough guide, they suggest that the mean number of views per article is around 800 per year.
  • Journal of Vision: this article reports some download statistics: "In the most recent accounting in July, 2008, the top five articles were each downloaded between 1,993 and 3,478 times."
  • Some journals list subscription counts

Initial Guess

I find it useful to have a ballpark estimate of these things. My own initial guess, based on minimal data, is that readership is between 50 and 1000 times the citation count for the article. Linking the estimate to citation count makes it easier to estimate for a given article and should incorporate effects like time and journal prestige.
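As a rough illustration of this guess, the range can be written as a one-line calculation. This is only a sketch: the function name and the example citation count are hypothetical, and the multipliers 50 and 1000 are simply the guessed bounds above, not measured values.

```python
def readership_range(citations, low_multiplier=50, high_multiplier=1000):
    """Return (low, high) readership guesses for a given citation count.

    The default multipliers are the guessed bounds from the text,
    not empirical estimates.
    """
    return citations * low_multiplier, citations * high_multiplier


# Hypothetical example: an article with 10 citations.
low, high = readership_range(10)
print(f"Guessed readership: {low} to {high}")
```

Writing it this way makes it easy to swap in different multipliers once better data are available.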

Question

  • What is a good estimate of how many people read a given journal article?
  • What data and sources of information justify this estimate?
  • Is there any established literature that can inform such an estimate?
10
  • 3
    I'd love to know a good answer to this (with variations based on different definitions of reading, etc.). I think the estimate of 50 to 1000 times the citation count is way off, certainly at least once you get more than a few citations. For example, there are plenty of 200-citation papers in fields that just don't have 10,000 plausible readers. For PLoS journals it seems to be a decent approximation if you count each page view as a separate reader, but that overcounts readers and is an exceptionally permissive way of counting reading. Commented Apr 20, 2012 at 0:50
  • 5
    Page views should overestimate readership: many people bring up a page without doing more than glancing at it while web searching, and each actual reader probably views it several times. The PLoS data may be the best we have, and it's certainly interesting, but I'm having trouble reconciling it with my own papers. At 500 readers per citation, there just wouldn't be enough possible readers in the world to account for the citations. For example, the total number of research-active mathematicians worldwide is probably in the low tens of thousands, and only a small fraction have read my papers. Commented Apr 20, 2012 at 1:15
  • 4
    One key question is what the ratio is between the number of papers you read and the number you cite. It's clearly not 500 (unless you read constantly and hardly cite at all), and even 50 seems implausibly high: if you cite just 20 papers total per year, which is not very many, you would be reading 1000 papers per year. So the only realistic way to achieve average ratios of 50+ is if they include huge numbers of readers who aren't doing any citing, and those don't seem likely to be legitimate readers who understand the article. Commented Apr 20, 2012 at 1:34
  • 18
    The standard joke is that most journal papers are read at most twice: at most once by the author, and at most once by the three referees.
    – JeffE
    Commented Apr 20, 2012 at 5:40
  • 9
    No, just twice. (That's part of the joke.)
    – JeffE
    Commented Apr 20, 2012 at 9:24

2 Answers

17

Sounds like a Fermi Problem :)

A question I asked myself recently, prompted by the many plagiarism cases in the humanities involving top politicians in Germany, is whether more articles and texts are published in the humanities than scholars can actually read completely. The amount of copied text in a single PhD thesis, as shown by German plagiarism-detection communities like Vroniplag or Guttenplag, is shocking to me. Often 50% of the text is not correctly marked as citation. It even looks as if the supervisors at the local universities never read some of these theses completely. I really hope this is not representative, but I fear it might be the tip of the iceberg in the humanities (in Germany).

Personally, coming from and working in a STEM field, I did a very specialized thesis. There are often fewer than a dozen groups worldwide working on such a narrowly specialized topic (a matter of scientific competition and finding a niche, time, expertise and lab hardware in such fields). So there will be articles in peer-reviewed journals that are not really interesting to more than 20-50 researchers, and probably a similar number of industry researchers, worldwide in STEM (competition between companies and between research groups is not that different, due to economic constraints). Without modern search engines, most non-scholars and private individuals would have a hard time finding such articles. This is another point for your estimate. The reader count for Nature/Science vs. very specialized journals varies a lot, so I don't think any average number really helps you much or is that interesting. If you know your specialized field, you should notice pretty quickly, by studying some journals, how many scholars really have an interest in that field.

Your PlosOne link is interesting. I can back this up a bit to give you at least a rough order of magnitude for the reader count of top journals, specialized journals, and so on. I think it's quite normal not to read articles completely (even if you cite them), but I do take a close look at the articles I download, largely because I use many keywords and Google operators to really filter for the material I'm looking for. This also varies a lot between different scholars and students. I'm often shocked by how students use search engines, whether out of laziness or ignorance of search operators; using them well can save you so much reading time. Therefore, I think a reader count extrapolated from citations might be more representative and reliable than one based on site views or downloads, because scholars, private individuals and laypeople often download articles containing information they weren't looking for as a result of poor search engine use. Growing redundancy/plagiarism is a further factor here.

Some possible heuristics:

  • comparison of published articles per month and website/interface visitors per month on download platforms like PlosOne, arXiv, Nature.

arXiv has around 6,000 published articles per month, 100,000 unique visits, 12.4 million downloads by academic institutions and 50 million overall. Set against 12 x 6,000 = 72,000 articles in 2011, that means around 170 downloads/abstract views per article (I used the 12.4 million here). Of course, that doesn't account for downloads of articles not published in that year, so the average read count of a single arXiv article is probably lower than 170 and closer to the 20-50 mark I explained above. But here you have, IMO, a reasonable and quite objective lower and upper limit for a scientific article that other scholars are really interested in: 50-170.

Nature has 900,000 unique visits per month and around 200 articles per month, so you can see why having one article published in Nature is probably worth more than 10 articles on arXiv, PlosOne or many other specialized journals in a particular branch, even if they are peer-reviewed ;)

  • looking up the bibliographies of some PhD theses in your field at your local university. In STEM, the number of cited articles is often in the range of 50-200 (you see that even here, what a single PhD student will or has to read varies a lot). Of course you do not cite all the articles you read, but the ratio of read to cited articles shouldn't be higher than 2 (or your search engine use is IMHO suboptimal). Considering that the PhD student will publish 3-5 articles during their PhD work (a reasonable number in STEM, or 1 Nature article :) ) and multiplying 3-5 by 20-50 (the average read count by institutional scholars), you also get roughly the number of articles in a PhD thesis bibliography, 50-200. Pure chance?! It looks like a strange calculation, but there is a link between how much article input an average scholar needs and how much output they create (that's why I multiply both values), and it strengthens my experience/analysis above that 10-100 readers is a reasonable order of magnitude for the people really interested in a single average article. To me it doesn't look like pure chance, but that's the main problem with Fermi questions and answers :)
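The back-of-the-envelope arithmetic in the heuristics above can be sketched as follows. This is only an illustration: the input figures are the ones quoted in this answer (taken as given, not verified), the download totals are assumed to be annual figures for 2011, and all variable and function names are made up for the example.

```python
# Figures quoted in the answer; taken as given, not independently verified.
ARXIV_ARTICLES_PER_MONTH = 6_000
ARXIV_INSTITUTIONAL_DOWNLOADS = 12_400_000  # assumed to be an annual total
ARXIV_OVERALL_DOWNLOADS = 50_000_000        # assumed to be an annual total
NATURE_VISITS_PER_MONTH = 900_000
NATURE_ARTICLES_PER_MONTH = 200


def per_article(total, articles):
    """Crude mean per article; real readership is likely highly skewed."""
    return total / articles


def product_range(low_a, high_a, low_b, high_b):
    """Multiply two ranges endpoint by endpoint: (low*low, high*high)."""
    return low_a * low_b, high_a * high_b


arxiv_articles_per_year = 12 * ARXIV_ARTICLES_PER_MONTH  # 72,000

# Downloads per article, using institutional and overall download totals.
institutional = per_article(ARXIV_INSTITUTIONAL_DOWNLOADS, arxiv_articles_per_year)
overall = per_article(ARXIV_OVERALL_DOWNLOADS, arxiv_articles_per_year)

# Monthly unique visits per newly published Nature article. Visits are
# site-wide, so this is a ratio, not a per-article reader count.
nature_ratio = per_article(NATURE_VISITS_PER_MONTH, NATURE_ARTICLES_PER_MONTH)

# Thesis cross-check: 3-5 papers published x 20-50 readers per paper.
bibliography_guess = product_range(3, 5, 20, 50)

print(f"arXiv downloads per article (institutional): {institutional:.0f}")
print(f"arXiv downloads per article (overall): {overall:.0f}")
print(f"Nature monthly visits per new article: {nature_ratio:.0f}")
print(f"Papers x readers range: {bibliography_guess}")
```

The product of the two ranges comes out at 60-250, which is in the same ballpark as the 50-200 bibliography size rather than an exact match, as one would expect from a Fermi-style estimate.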

PS: note that this analysis is focused on STEM. I believe the average read count is much lower in the humanities, where side effects like different languages and plagiarism seem to play a bigger role, making a really objective guesstimate harder.

4

Smithsonian.com recently noted that there are about 1.8 million scholarly articles and scientific papers published each year in 28,000 journals. About half of these are not read by anyone other than the author, a journal editor, and a couple of reviewers. They report that 90 percent are never cited by other papers.

4
  • 1
    @RGardner Is there a reference to this article? I'd be curious to know what methods were employed to estimate readership. I also think that citations are mostly a "zero-sum game". Finally, I imagine that readership is related to the reputability of the journal. For example, in general, when estimating average readership, I'd be more interested in filtering, for example, by journals above a certain threshold (e.g., impact factor above 0.5 or 1.0). Commented Apr 11, 2014 at 3:57
  • 1
    @JeromyAnglim I added the link. Please let me know if you have a problem reading that page.
    – Nobody
    Commented Apr 11, 2014 at 4:28
  • 4
    Thanks for the link. The claim seems to come from Meho, "The rise and rise of citation analysis", which states: "Indeed, as many as 50% of papers are never read by anyone other than their authors, referees and journal editors. We know this thanks to citation analysis, a branch of information science in which researchers study the way articles in a scholarly field are accessed and referenced by others." However, I fail to see how the lack of citations shows that an article has not been read by others. Commented Apr 11, 2014 at 4:33
  • 2
    As mentioned by @JeromyAnglim, the claim was made in an article by Lokman Meho, although Meho himself did not really write this; it was added by an editor without, so it seems, a lot of evidence: blogs.lse.ac.uk/impactofsocialsciences/2014/04/23/…
    – RafG
    Commented Apr 25, 2016 at 8:31
