32

I am working on an ML-based problem for my undergrad thesis, based on an experimental dataset and prediction results published in a paper two years ago by an associate prof. in a very high ranking journal - the paper has 200+ citations. For months, I'd been trying to replicate the paper's results using the methods provided as well as those of my own but all in vain. The paper states that the code used would be made available on request, so out of curiosity I mailed the corresponding author for it.

I received a quick response, and for another month I was trying to decode how the author obtained their prediction results. One day when I was discussing implementation issues in my code with my supervisor, I spotted a grave error (or manipulation trick, as it seems to me) that completely invalidated all results in both their main paper and supplementary information. I was dumbstruck.

I found a very similar question on Academia SE - but here's a slight difference. I asked my supervisor to permit me to write an independent comment on the article as I tried to reach out to the authors regarding the mistake (or intended manipulation, as the case might have been), which they did not respond to. The authors deliberately avoided producing graphs that could demonstrate the error (as most papers in the field do contain those graphs) and did not define the exact performance evaluation metrics.

The problem is that I'm unable to convince my supervisor, as he thinks that (1) the paper might be retracted or significantly altered, as it provides both the code and the experimental dataset, and (2) it would be malpractice on ethical/humane grounds to request the authors' code and then use it to bring down their published, widely cited article, which also helped them gain grants.

I believe I should at least publish a comment or request the authors to collaborate to work on this together as answered in the linked question above, but I think the latter is unlikely given that a likely retraction would affect their grants.

What would be the best course of action in this case? How do I convince my supervisor to help preserve academic integrity and not let my months of efforts go to waste (as unfortunately, academia and thesis committees still undervalue reporting poor/negative results in a thesis)?

EDIT: I found it all out too late as I have less than a week for my thesis defence.

EDIT 2: I think I should clarify a few more things. The paper was the first to present a dataset and results of this kind. The authors provide the data preprocessing code publicly, but the model development code only on request. Also, I think I can illustrate the exact error in the paper with a simple analogy. Imagine you want to compute estimates of a quantity, and the quantity needs to be scaled down in proportion to a reference. Instead of scaling the quantity down, you scale down the root mean square of the estimation errors between the true unscaled quantity and your unscaled estimate, and then exclaim that you get spectacular estimation results!
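
To make the analogy concrete, here is a minimal Python sketch of one possible reading of it. It is not the paper's code: the numbers, the per-sample `reference` list, and the choice of `max(reference)` as the single scaling value are all made up for illustration.

```python
import math

def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical data (assumption): true values, estimates, per-sample reference.
true_vals = [1.0, 2.0, 50.0, 100.0]
estimates = [2.0, 3.0, 51.0, 101.0]   # every estimate is off by exactly 1.0
reference = [1.0, 2.0, 50.0, 100.0]   # here the reference is the true value itself

errors = [est - tru for est, tru in zip(estimates, true_vals)]

# Scale each error by its own reference first, then take the RMS.
rms_of_scaled_errors = rms([err / ref for err, ref in zip(errors, reference)])

# Take the RMS of the unscaled errors first, then scale that single number
# by one summary of the reference (its maximum, an assumption for illustration).
scaled_rms_of_errors = rms(errors) / max(reference)

print(rms_of_scaled_errors, scaled_rms_of_errors)
```

With these made-up numbers the first figure is roughly 0.56 and the second is 0.01, even though the underlying estimates are identical. The order in which scaling and RMS are applied is what makes the second figure look so much better.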

8
  • 3
    Two things are unclear in your question. What did you say when you contacted the authors about the mistake? And what does (1) mean - why does the possibility of the paper being retracted or significantly altered make your supervisor unwilling to act?
    – toby544
    Commented Apr 30 at 16:26
  • @Ian, probably, but I first-authored a journal paper published in the same field last year so at least I thoroughly understand the results.
    – Kartik
    Commented Apr 30 at 16:29
  • 3
    This sounds similar to the academic paper that persuaded many countries around the world to adopt austerity. Once the flaw was found, it made very little difference despite many people pointing it out. bbc.co.uk/news/magazine-22223190
    – Neil
    Commented May 1 at 10:41
  • 30
    The idea that it would be unethical or malpractice to use the content of a publication to refute said publication is so incredibly backwards. That's almost as concerning to me as the idea that the author intentionally fudged their results.
    – Layne B
    Commented May 1 at 22:01
  • 1
    @DanielR.Collins No, the author is from a different country and hasn't worked before with any of us.
    – Kartik
    Commented May 2 at 4:05

13 Answers

24

Your advisor is wary of the harm that could come from making any comment, and my inclinations would run the same way, but there is also harm in leaving things as they are. You say you spent months trying to replicate the paper's results: since the paper is highly cited, it is likely that others will also waste months of their time on this method if it goes uncorrected. Wasted months, especially early in a career, can have serious negative consequences: a PhD student might run out of time or funding and never complete their PhD, a postdoc who wastes months might fail to become competitive for a faculty position, and an early career academic who wastes months might not make tenure.

You did the right thing by reaching out to the authors first. Since they have not responded, I think writing a comment of your own is the right thing to do. If possible, frame it as a "clarification" of the method and its results rather than as a "correction". E.g., state that the results can be "misinterpreted", leading to over-confidence in the approach. Perhaps you could ask the original authors if they would like to be co-authors on this clarification, to give them maximum opportunity to save face. If it can't be framed as a clarification, then it is appropriate to frame it as a "correction", but you will minimise damage to the authors and your relationship with your advisor if you avoid implying that the original paper was deliberately misleading. Perhaps instead, you could present it as a case study for why there needs to be consistency in the types of figures and metrics presented in this type of work?

Alternatively, you could reach out to the editor privately to discuss your concerns. If this was really deliberate, the authors have probably taken a similar approach in their other work -- or might do so in future if they are never called to account for it. Unfortunately, editors vary in how receptive they are to this kind of discussion. Some would rather not know and would rather not get involved.

4
  • 19
    This is the way to go. I once requested the R code underlying a published paper. In this code I found a statement of the form x = a/b+c, which looked weird to me. Sure enough, when I disinterred the original FORTRAN code (of which the R code was a re-implementation), I found the correct x = a/(b+c). Now this was a pretty central equation. I wrote the first author pointing out the mistake only noting, 'I hope that your results do not change when you have fixed this typo', well knowing that sure they would. I never received an answer...
    Commented May 1 at 8:06
  • 11
    "Some would rather not know and would rather not get involved." I am saddened by this. :(
    – justhalf
    Commented May 1 at 15:31
  • 1
    Thanks for the answer. I'll try reaching out to another coauthor regarding this first and then discuss with my supervisor how to proceed as you say. I received the code from the first author when I mailed the corresponding author for the request, and I had been communicating with the first author until he stopped (or missed) responding.
    – Kartik
    Commented May 2 at 4:08
  • 1
    You are right, but being the whistleblower can also nix your career.
    – Deipatrous
    Commented May 2 at 15:33
58

I think both you and your supervisor are confused.

On one hand, you do not retract papers as a third party. The journal does, or the authors do. You can criticize existing papers. This does not retract them. You may want to be strategic about how you criticize as a junior researcher because not everyone takes criticism well and because unfortunately not everyone will take the necessary time to understand and come to agreement with you.

On the other hand, one of the main reasons for providing code, and the reason many journals now require authors to provide code, is to allow results to be verified by third parties. You've used it exactly as is appropriate.

Steps to take would involve contacting the authors about the mistake you've discovered and/or getting some input from others to make sure you correctly understand the mistake and aren't making one yourself. I think you're better off assuming a mistake rather than intentional fraud at this point; it's possible, but accusations of fraud cause people to behave defensively. I think if the goal is to correct the record that's the most effective path. It sounds like you've already done much of this; if nothing comes of it you might be stuck there.

As for your own work, you do not need to rely on a paper you've found to be flawed. If you need to benchmark your own work against something, don't use this example, or use your corrected version.

It might be worth publishing a Comment on the paper if it's sufficiently influential, but I think you need guidance and advice of a senior academic to help you through that process. I wouldn't advise continuing on your own.

As far as your undergraduate thesis, the expectations for an undergraduate thesis are generally low among academic work and typically you would not necessarily be expected to produce publishable work, though it may help your future career (i.e., graduate school applications). You will need to work with your advisor directly to understand how best to present what you have done. My personal opinion would be that taking some result in the literature, attempting to replicate it, failing to replicate, obtaining the original code, and demonstrating that there is a mistake in the original code sounds like a fantastic undergraduate thesis. I would not necessarily put this in the realm of "negative results"; usually a "negative result" means that you haven't changed the state of the art at all, whereas here you've found an accepted result is wrong. All the most important results are the ones that find accepted results are wrong, those that just confirm what's already known are boring. I'm not the one who grades your thesis, though, so check with the person who does.

I found it all out too late as I have less than a week for my thesis defence.

This by itself seems like a poor excuse, though; you also say:

For months, I'd been trying to replicate the paper's results using the methods provided as well as those of my own but all in vain

So overall, this isn't something that's come up just in the last week, and you've had months to work with your advisor on contingencies, alternative approaches, etc. You have more information just coming up recently if you've only recently received and processed the original code, but that's not your entire undergraduate thesis.

3
  • 7
    Thank you for the answer. Yes, I didn't mean we can retract somebody else's paper - I meant publishing a comment or mailing the journal's editors could lead to this - although we won't do the latter before discussing the mistake with the authors. Edited the question title to reflect that.
    – Kartik
    Commented Apr 30 at 17:18
  • 2
    I had a similar experience very early on in my PhD studies, and I received some very sound advice. First: "the guys who wrote that paper are far busier than you; you have spent more time reading it than they spent writing it". Second: "if they didn't know how to value critical input, they probably would never have reached such dizzy heights". So go for it. Also there's a big middle ground between making a simple error and committing fraud: namely taking a short-cut when you're busy and deadlines are looming.
    Commented May 3 at 14:04
  • @MichaelKay Many thanks for the insights! Indeed, I've seen this shortcutting happen in my field very often and going unnoticed as data/code is not provided.
    – Kartik
    Commented May 4 at 15:51
16

It's normal to make mistakes. I make them, you make them, everybody does. Pointing out others' possible mistakes must be polite, friendly, and kind, because that's the way you would like to be treated if the positions were reversed, and they may well be, sooner or later.

In my experience, I have found several flawed papers. I am happy about the outcome in one case in particular, which I am going to quickly share with you. To give some context, please note that my field is condensed matter experimental physics.

I had several issues with one publication. I wrote to the corresponding author and received some answers. I didn't find them compelling and asked again. The corresponding author never replied to me again. I then wrote to the journal editor and said that I had these unanswered questions, that the corresponding author was not responding, that I thought the paper was seriously wrong, and that I wanted to remain anonymous in the process.

The editor took this seriously, even though it took them about 2 years to complete their process. They contacted external referees, who checked my comments and agreed with them. They asked the paper's authors to reply. The authors' reply was reviewed and found lacking by the referees. In the end, the editor published an expression of concern that lists all this information, saying they can't verify the validity of that paper and advising caution with it.

This way, I think I did my part, yet remained safely anonymous. I have several other examples, but they didn't work out quite as well. So, there's hope. Good luck to you.

3
  • That's an excellent outcome and suggested process. Thanks for sharing it here!
    Commented May 2 at 14:05
  • It's an interesting story, and well done protecting your anonymity, but the outcome is still weak sauce. Also, the OP makes it rather clear that his case does not concern "honest" mistake.
    – Deipatrous
    Commented May 2 at 15:35
  • Thanks for the insights. 2 years are enough for too many papers to cite erroneous papers and I'm not sure if authors care to remove references to a retracted paper unless it gains enough limelight. I concur that I should remain anonymous as long as direct communication with the authors doesn't work out.
    – Kartik
    Commented May 2 at 16:40
12

You can comment on the paper at PubPeer.

2
  • 2
    (+1) I absolutely agree with this. PubPeer was made for exactly these kinds of scenarios, as you can comment anonymously.
    – mhdadk
    Commented May 2 at 15:03
  • Will the authors be informed of my comment on their article at PubPeer even if they don't have an associated account on PubPeer? Edit: I noticed the website manually requires entering current email addresses for some authors to be notified; for some it extracts their emails, shown masked by *.
    – Kartik
    Commented May 10 at 8:49
8

I'm a senior professor in physics. I disagree with those who say to ignore your advisor's advice. Not because they are the boss or always right, but because you and they are collaborators and partners and should do things by consensus and do nothing to harm the other. But to the question - a PhD student of mine recently tried to replicate an experimental result published by a well established researcher elsewhere and got a negative result. We reached out to him and he and his student were supportive and shared all their experimental details, which we repeated again and still got a negative result. My student wrote a paper explaining our non-result and shared it with the well established researcher and he was fine with us submitting it to a refereed journal. But because we cannot explain how the others got the "erroneous" result, they are not sure if they will retract. The point is - research should be open and transparent. Anything less is unprofessional and hurts the field. I think your advisor should send the original authors a certified hard copy letter through the postal mail and explain the situation and politely challenge them to cooperate.

1
  • 1
    Thanks for indicating this as an option. I would definitely appreciate if the concerned authors would explore co-authoring a correction paper. I once found a minor error in a paper in an experimental field, mailed the author regarding it, and they were more than glad to correct it!
    – Kartik
    Commented May 2 at 16:34
6

First, a bit of banter.

This is the sad truth about academic research. The peer review system is flawed and will never be able to sort adulterated results from the real ones. First of all, validating the results would take almost as much time as writing the paper. No one would agree to review if that were the case. In computer science the problem is simple to solve: submitting the source code to the journal in confidence would do it. Simply by reviewing the code, many of the problems would be found. But it is not that easy in other fields.

Now here is something that happened in my field:

A very influential paper has its results wrong. I am not sure about the reason; maybe they got the metric wrong, or maybe it is completely fabricated. Everyone who compared against that paper published different results, and those results were consistent with one another. The original paper has never been retracted or corrected. We always write that the measurement criteria were probably different. In reality, the measurement method was given in the paper and was the same as everyone else's. It is nigh impossible to prove any wrongdoing, even if there is some. Sadly, even in this digital age, publications are treated as static data. There is very little incentive to change that. Any disruption to this system might cause a great deal of harm to innocent researchers who have made future plans.

6

Let’s start by assuming the authors are well intentioned and have made an honest mistake, and that you are completely correct that it is a mistake in their analysis which invalidates the large majority of the conclusions in their publication.

As an honest, well-meaning scientist, would you feel comfortable continuing to mislead many other scientists with incorrect results, potentially wasting months of their lives trying to reproduce results that are not reproducible? I don’t think so. Even if no one wants to be in that situation, if not reporting means that potentially many other people will be wasting their time, then obviously the morally responsible thing to do is to report it. As an honest scientist I would also come to this conclusion myself.

Now the other part of the question: should you report it? I would try to get the authors to admit it first (also to ensure you didn’t get anything wrong), and to give them the opportunity to correct it themselves and “do the right thing” on their own. From the point of view of the journal, it is much easier to judge if the authors themselves admit something was incorrect than to involve a set of independent experts to judge and decide. If the authors don’t want to admit it, things become more complicated, not morally (because morally it is clear), but socially and politically. Here, you will be more limited by the support you get from a supervisor. You can always act on your own, but just be aware that this kind of "politics" exists at all levels of science, from funding to journal editors and reviewers.

3

What would be the best course of action in this case?

In terms of your thesis defense I am not sure any action is needed. It does make an interesting story, and you might relate it during your defense presentation in the "challenges" section.

In terms of the published paper and its conclusions and how they are affected by

the thing I believe to be a mistake (which was a simple numeric scaling on reported errors)

I agree you should proceed cautiously; either with your advisor or another individual established in the field.

But you could also, after considering the risks, just go for it and write something yourself and accept any possible s*** storm that may ensue, either from your advisor, the authors, or elsewhere. Some folks in the academe can be vindictive and paranoid when their own reputation is at stake. This is completely AYOR! (1, 2)

When someone says you can do something AYOR (at your own risk), it means you can do it, but you're responsible if you screw it up or hurt yourself. And there's a good chance you will.

In terms of the angst and frustration that something published seems to be wrong, the injustice of a high citation count from a seemingly wrong result, and the perception that this was achieved by possibly intentionally incorrect scaling of errors, all I can say is take a deep breath and get used to it.

We must coexist with (who we feel to be) unethical colleagues and peers. They can become our paper reviewers, our funding approvers, or our own Ellsworth Toohey (hidden antagonist) to our own (idealist) Howard Roark.


Obligatory XKCD (386) Duty Calls


3

Sounds like a complicated and annoying situation. Thankfully, the answer to your question is really easy:

You contact the editor and make a case for why you think the conclusions of the paper are incorrect or not supported by the data provided. I would suggest full transparency with the editor, so include the full story (editors are fast readers, so don't be afraid to be verbose).

Whatever happens next, happens next.

And for what it's worth, your supervisor's opinion on this is completely irrelevant. You can take their advice and they are more than welcome to provide input, but the decision to contact the editor or do whatever else is yours and yours alone.

3

Be careful how you think of the original authors. In your post you suggest a manipulation trick, which is a serious accusation. That they readily provided you with the materials suggests an honest mistake. If they had consciously manipulated the outcomes, they would probably have delayed or found excuses not to give you the code. Also keep in mind that your mail notifying them of the problem may have prompted them to review their own work, which may take a lot of time (especially since they probably no longer have time allocated to the work). So all in all I would suggest good faith in your future actions. But yes, it will benefit science if this mistake (if shown to be as substantial as you claim) is corrected.

3

This is Science. We publish results to share knowledge, and as a check.

Put aside speculation as to motive.

Ask a polite question. You got an unexpected result using the data and code kindly provided by the authors. Perhaps you or they made a mistake. Perhaps both. Explain as clearly as you can what you see. Do not speculate. Ask for input from others.

Do this publicly.

For the sake of everyone, put aside ego and emotion. We are all Human and fallible.

2

Write to the author(s) whose code you spotted an error in. Inform them of your finding. Importantly, CC the editor of the journal the paper was published in, and let the author(s) know in the body of the email that you're doing so.

If the authors fail to respond to your inquiry, write a follow-up email and request support from the editor.

If both the author(s) and editor fail to respond, write a third email informing them you plan to publish a paper reporting the issue.

2

As long as you don't act with the intention of getting their paper retracted, but rather of having the result corrected by whatever legal means necessary, you are morally in the clear.

A note on Edit 2: when scaling a quantity, the uncertainty is also scaled by the same factor.

For example, if you have a measurement of 2 cm give or take 0.5 cm, and you multiply this measurement by 0.1, the new values are 0.2 cm give or take 0.05 cm. In this example it's not important if the uncertainty is from measurement, standard deviation, or standard error, they are all treated the same for this kind of scaling, unless there are multiple variables.
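
The arithmetic above can be sketched in Python. The helper name `scale_measurement` is hypothetical, and the sketch assumes a single measured variable multiplied by an exact constant, as in the example.

```python
import math

def scale_measurement(value, uncertainty, factor):
    """Multiply a measurement by an exact constant.

    The absolute uncertainty is scaled by the magnitude of the same factor.
    """
    return value * factor, abs(factor) * uncertainty

# 2 cm give or take 0.5 cm, multiplied by 0.1
scaled_value, scaled_uncertainty = scale_measurement(2.0, 0.5, 0.1)

# The relative uncertainty is the same before and after scaling.
relative_before = 0.5 / 2.0
relative_after = scaled_uncertainty / scaled_value

print(scaled_value, scaled_uncertainty)
print(math.isclose(relative_before, relative_after))
```

Because the relative uncertainty does not change, scaling by a constant cannot by itself make a result more or less accurate.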

2
  • 1
    Thanks, I forgot to mention that they did not scale the exact error values but the RMS of those errors - updated EDIT 2 to reflect that.
    – Kartik
    Commented May 2 at 7:26
  • @Kartik this is not necessarily an incorrect thing to do. It sounds like they scaled the standard error, which is the error of the mean, and this is likely fine and correct; it really depends on a few things. If this is the first time they swapped to standard error, that's shifty. If they took the RMS of standard errors, that's potentially wrong. Do some reading up on error propagation. Remember that when doing error propagation there are 3 kinds of errors you can work with: exact error, standard deviation, and standard deviation of the mean (standard error). The last 2 use the same rules.
    – Spodeian
    Commented May 13 at 3:23
