542

I went to a Late Answers review queue that had four items in it and started reviewing the first one by editing it into shape rather heavily.

By the time I was done, someone else had reviewed the same post simply by upvoting it, so I only got a "Next" button, not an "I'm done" button. My edit still counted as an edit, but no longer as a review.

Worse still, the other three items were gone from the review queue as well. Curious, I checked the review tab in the profile of the user who had snatched the first review out from under me. Sure enough, he also "reviewed" the other three items, all within a minute, without even fixing obvious typos, simply by upvoting. I had to go through his review list and review every single post again.

Except, of course, my reviews didn't count as reviews anymore. My review count went up by zero. His went up by four — for fixing exactly nothing in posts that absolutely could use fixing. To add insult to injury, this was on a site where collecting four review points is incredibly hard because the review queues there are pretty much empty 24/7. (Edit: and now that that is also true of SO, we are seeing the same behavior here.)

So what we get are:

  1. Subpar answers that do not get fixed and get upvoted instead and disappear from the radar of the people who are actually capable of and willing to fix them.
  2. Fastest-gun-in-the-West, single-click, faux reviews that count towards rare Gold badges, while elaborate, actual reviews don't.

This is severely broken.

And I can't think of an easy solution that wouldn't break other things.

  1. Simply not letting upvotes count as reviews is not an option (it discourages upvotes not just on crap, but across the board; and some posts really are fine as is and deserve nothing but an upvote).
  2. Simply always counting a review as a review, even if someone else is faster to review the same post, introduces review-point inflation and only solves the problem for one faux review, but not for several in a row, as in the scenario above.
  3. Simply letting several people review the same item, like with suggested edits, is not really "simply" anymore, and introduces more review-point inflation still.
  4. Likewise, allowing people to reject or override other people's reviews as not substantial enough opens all kinds of cans of worms.

And so on and so forth. I really can't think of anything Solomonic. But perhaps someone else can.

47
  • 33
    Sometimes it feels like pushing down a number outweighs keeping up the quality and standards
    – random
    Commented Oct 7, 2012 at 1:19
  • 11
    @BenBrocka: How did you get from what Dwight said to "let's get rid of reviews altogether?" The problem he cited isn't reviews; it's people reviewing badly and for the wrong reasons. Furthermore, he gave evidence of the problem (though admittedly without links): his specific run-in with someone clearly grinding out reviews rather than doing any actual reviewing. Commented Oct 7, 2012 at 4:32
  • 11
    Possible workaround? Upvote, open post in new tab. Mark reviewed, move to tab to perform actual review, remove upvote. You'll get the point and be able to do the real work that needs to be done, although I guess your review history will seem poor.
    – jscs
    Commented Oct 7, 2012 at 9:03
  • 7
    @Anna: just happened again, this time on SO: stackoverflow.com/review-beta/first-posts/766806. Someone upvoted this... thing (which is not just poorly worded and mistyped, but arguably a too localized NARQ to boot), and that counted as a review. While my fixing the title, the body, commenting, and voting to close no longer did.
    – ЯegDwight
    Commented Oct 7, 2012 at 13:40
  • 17
    That would certainly explain why the first-post review queue seemed to vanish so quickly.
    – pjmorse
    Commented Oct 8, 2012 at 17:56
  • 3
    Related: Is there an actual “accept ALL the suggested edits” problem? and What can we do to stop bad edits getting accepted? (And yes, I understand that the review system encompasses more than just suggested edits, but still.)
    – Pops
    Commented Oct 8, 2012 at 20:49
  • 12
Totally agree: I just stumbled over this answer, which got an upvote despite the simple fact that the author copied it word for word from another answer on the question. If one saw the answer alongside the other answers, rather than via the review system, it would have gotten a barrage of downvotes instead of an upvote. Commented Oct 9, 2012 at 8:08
  • 11
    "severily" broken - that might be a typo, but I really really want it to be a neologism.
    – pjmorse
    Commented Oct 11, 2012 at 2:59
  • 5
    Just happened again. I click in to "review", get this post, set to work editing, and when I'm done it's been reviewed and all I have is the Next button and the queue is empty. I don't know why I keep trying.
    – pjmorse
    Commented Oct 12, 2012 at 0:53
  • 10
Just to add another real-world example of how broken this is, see the voting on this question. It's the 6,001st incarnation of the "how do I rewrite URLs plz send code" question, and a crappy, unintelligible incarnation at that. It has three fracking upvotes.
    – Pekka
    Commented Oct 17, 2012 at 14:02
  • 17
    How about people upvoting non-answers: stackoverflow.com/a/12936258/19679 , stackoverflow.com/a/12926953/19679 , link-only answers: stackoverflow.com/a/12877379/19679 , or crappy error-dump questions: stackoverflow.com/questions/12913358/… ? I also need to find the few instances of outright spam that I've seen upvoted in the last few days. I'm starting to mod-message people spamming upvotes in the review queues, because this is pushing garbage ahead of better quality content. This needs to stop. Commented Oct 17, 2012 at 15:59
  • 4
    I totally haven't followed this whole review tool and the discussion around it, but dumb question: why does upvoting have to be part of review in the first place? Although I'm sure there is a Meta discussion about that. Searching....
    – Pekka
    Commented Oct 17, 2012 at 16:58
  • 63
    Obvious vandalism approved by multiple users without any regard. What's the policy on simply naming and shaming users who do pointless s**t like that?
    – Mac
    Commented Oct 18, 2012 at 0:49
  • 19
    @Mac ahahahaha that edit is hilarious. But the approve votes are sad.
    – Pekka
    Commented Oct 18, 2012 at 8:38
  • 14
    @Pekka: what's really interesting is that the same post was defaced again shortly after, but this time of the two that approved the previous edit, one approved the new change and one didn't. I guess one had a change of heart... BTW, if you're still after a laugh, make sure you read the explanation for the second edit.
    – Mac
    Commented Oct 18, 2012 at 9:55

27 Answers

282
+500

I think the foundation of the problem is incentivizing reviewing in the first place.

We incentivize asking questions. We incentivize answering them. We incentivize making questions better by editing them (for people with lower rep). However, all of these incentivized processes have oversight built in.

You only get rep from questions when people upvote them. You only get rep from answers when people upvote them. You only get rep from edits when people approve them. In every case, some human being has to look at what you did and say, "Good".

Reviews don't work that way. Nobody approves a review. You get something that needs review, you review it, and you take an action. You get incentive points regardless of what action you take (as long as you take some action).

If you take away upvotes as a "reviewing" action, people will just make inconsequential edits. The system is too easy to game because there is zero oversight; as long as you do something, anything, to the post, you get a point.

This is why the only people who get rep for editing questions are those who have to have their edits approved.

As long as there's a shiny gold badge in it, someone will grind out reviews just to get it. Take away the incentive, and many people may stop reviewing altogether. Perhaps the latter might be a better option; at least then, you're getting people who actually care about reviewing involved.

31
  • 15
    So you think if you take away all badges, reputation, and reviews that people will still contribute? I highly doubt it. Commented Oct 7, 2012 at 4:17
  • 85
@AustinHenley: Did you even read what I wrote? Because I'm trying to figure out how you got the impression that I thought all incentive was bad, even when I specifically called out the difference with how reviews work compared to sources of reputation. The very first sentence states my conclusion, "I think the foundation of the problem is incentivizing reviewing in the first place." Did you miss the emphasized word? Commented Oct 7, 2012 at 4:27
  • 5
    If you want people to do something, you have to give them an incentive. Commented Oct 7, 2012 at 4:33
  • 108
    @Austin: Nonsense. Giving them an incentive means that you'll get more people to do it. But you will get people reviewing without incentives. The reason why the other incentives work is because they're doled out by the community. Review incentives are pretty much on the honor system: we trust that you actually reviewed the material instead of just doing the bare minimum to get your candy. Commented Oct 7, 2012 at 4:43
  • 62
    @Austin - Right now, some people who used to review have stopped doing that, because we never get the chance to reject or improve an edit. It is almost always immediately approved by two other people anyway.
    – Bo Persson
    Commented Oct 7, 2012 at 10:25
  • 33
    @tchrist: Yes, and when I care about what some guys on the internet think, I'll let you know. There is nothing "secret" or "annoying" about "incentivize." Commented Oct 7, 2012 at 17:04
  • 11
    “Nothing annoying”? To the contrary, I assure you that it is an excruciatingly annoying non-word. It’s like listening to fingernails on a chalkboard.
    – tchrist
    Commented Oct 7, 2012 at 17:06
  • 5
    @msPeachy: if you're editing and someone else reviews the post (e.g. with an upvote) and clicks "I'm Done", when you complete the edit, the only button available will be "Next".
    – pjmorse
    Commented Oct 10, 2012 at 1:09
  • 26
    @tchrist, you may find "incentivize" to be an annoying word that sounds like fingernails on a chalkboard, but how can you say it's a non-word? It has a clear and immediately obvious meaning ("to add incentives to") and is certainly well-attested in speech, text, and dictionaries. And for the record, I'm not even the slightest bit bothered by that word; your opinion is not shared by everyone.
    – Ben Lee
    Commented Oct 11, 2012 at 19:07
  • 9
    What if we reviewed reviews? (Joking ... I think) Commented Nov 1, 2012 at 2:01
  • 44
    I've seen Late Answer spam posts upvoted +3 or +4, which is just absurd. The majority of my Late Answer reviews are "Skip", as I mostly look for spam or non-answers, but I feel like I'm in the minority. I'd be all for removing the review badges; I'd actually be more inclined to review if there was no incentive, since I actually enjoy keeping SO nice and clean. Commented Nov 13, 2012 at 21:20
  • 4
    @ChrisGerken: That doesn't explain the high upvoting of unquestionably bad posts in the review queue. As LittleBobbyTables points out, even spam posts are being upvoted. That's evidence of people doing "reviews" by just upvoting anything. MSO not communicating a good edit has nothing to do with that behavior. Commented Nov 22, 2012 at 16:52
  • 6
    @tchrist For what it's worth: adding "-ize" to a noun to make it into a verb actually comes from ancient Greek, so as icky as I find "incentivize" at least we come by it honestly. Commented Dec 8, 2012 at 15:01
  • 12
    Why do people fix mistakes on Wikipedia? There are no gold badges there.
    – BlueRaja
    Commented Jun 15, 2015 at 17:08
83

Simply letting several people review the same item, like with suggested edits, is not really "simply" anymore, and introduces more review-point inflation still.

I think this is a tangible solution to at least some of the problems you have described. At least 25–40% of the time tonight, I have gotten into "foot races" where, like you, my edit was accepted, but the post was already marked as "reviewed" by the time I was done. My differences with the "upvote or no review" system aside, this method of refereeing multiple reviews just encourages rushed or skipped edits on posts that really need them, if for nothing else than the instructional value to the new user.

Failing multi-user review, the fairest approach would be to assign each review item exclusively to the first user who opens it, until it is relinquished by a "Not Sure" click, assuming this would be feasible.

11
  • 12
    I like this idea. They also might be able to cache reviews in cases of foot races, and award the review point to the most substantial review. Commented Oct 7, 2012 at 13:07
@BilltheLizard By “most substantial review”, would you mean most substantial edit? That is, the one whose diff is longest?
    – tchrist
    Commented Oct 7, 2012 at 16:42
  • 10
    @tchrist Yes, in the case of two edits. That might help encourage people to edit everything that needs improvement. (Also, I'd consider an edit to be a more substantial review than a vote.) Commented Oct 7, 2012 at 17:30
  • 4
    @BilltheLizard I worry what the way to cheat around that would be. I suspect it'll be fairly easy to game the metric. For example, by editing in HTML comments, zero-width spaces, etc. Or worse, adding visible text (though visible text is much more likely to be noticed).
    – derobert
    Commented Oct 12, 2012 at 15:20
  • 3
    @derobert If adding in HTML comments or zero-width spaces is less work than actually editing the post, I guess those things would have to be discounted. I think mostly this would encourage making substantive changes, rather than cheating. (Although any system that uses rewards to encourage participation will have some small amount of cheating.) Commented Oct 12, 2012 at 15:37
  • 4
    @BilltheLizard well, either of those is a copy&paste thing. I'm a bit worried about moving cheating from the upvoting (which seems relatively harmless) to editing (which strikes me as potentially much more harmful).
    – derobert
    Commented Oct 12, 2012 at 15:43
  • 4
    meta.stackexchange.com/questions/147060/… I really do feel the lack of a "No vote, but OK" or "Fit for Purpose" review button.
    – itsbruce
    Commented Oct 18, 2012 at 4:58
  • Can this be submitted as a feature-request? Commented Oct 20, 2013 at 16:35
  • @BlacklightShining I haven't reviewed edits in a while, but it seems to me it was resolved shortly thereafter. Have you still run into this "footrace" issue?
    – jonsca
    Commented Oct 20, 2013 at 16:38
  • Not sure…I do know that the review queues (the two I can access, anyway—Late Answers and First Posts) are usually empty on Super User though. I have lost a couple of footraces, but I didn't think to check if the other user's review was any good. Commented Oct 20, 2013 at 16:42
  • @BlacklightShining I think that may be a different issue. Sometimes the system over counts the number of reviews that are available. If you keep refreshing the review page, there are usually additional posts coming in. When we were experiencing the issue outlined in my answer with the suggested edits, someone else would be able to approve a suggestion out from under the person trying to simultaneously improve the post.
    – jonsca
    Commented Oct 20, 2013 at 16:47
76
+500

Q&A works so well because posts can be voted, commented, rolled back, deleted, etc by the community. /review fails because reviews cannot be voted, commented, rolled back, deleted, etc by the community, let alone the moderators. So, there has got to be some system for that, or we have to remove any incentive for /review such as badges and reputation.

To start, I'd stop giving every monkey with 2K instant access to /review and start with a moderator-controlled system. The most straightforward example would be some kind of invite-only system wherein each user — starting with moderators — can invite/propose ~10 other users for access to the review system. Moderators should be able to grant/deny access based on the activity of the invited/proposed person. With this we can start with better quality reviews and thus end up getting better reviewers. (struck out 6 March 2015; I imagined this would be too much manual work)

To start, I'd stop with a reputation-based system and switch to a flag-ratio-based system for access to /review. Say, only allow access to /review for users with a minimum of X flags (perhaps relative to the total number of flags on the platform), of which at least Y have been approved. With this we can start with better quality reviews and thus end up getting better reviewers.

The alternative is to remove the incentive for /review. Those who are really willing to clean up the site the right way generally would not care about badges and reputation for that anyway (as on Wikipedia). Stats should, however, be kept visible in user profiles; that would be tremendously helpful for, among other things, future moderator elections.

9
  • 19
    lol "monkey with 2k access" ... but limiting access sounds like a good idea, IMO
    – Pekka
    Commented Oct 17, 2012 at 21:24
  • 1
    Awarding the bounty to this answer because I personally and very subjectively agree the most with it. No disrespect is meant to the other answers, some of which are great and contain valuable ideas!
    – Pekka
    Commented Oct 21, 2012 at 19:10
  • 5
    For monkeys that do more obscure languages, 2K is hard to get and an effective barrier. Hence -1. Those monkeys will also not be noticed much by moderators as they work on obscure questions.
    – tomdemuyt
    Commented Nov 19, 2012 at 23:45
@tomdemuyt: I don't foresee problems if their suggested edits have a relatively high acceptance rate (by the real reviewers, once the review system is revised, of course).
    – user138231
    Commented Nov 20, 2012 at 13:02
  • 7
The alternative is to remove the incentive for /review. Those who are really willing to clean up the site the right way generally would not care about badges and reputation for that anyway (as on Wikipedia). Stats should, however, be kept visible in user profiles; that would be tremendously helpful for, among other things, future moderator elections. I'd do that; personally, I think incentives make people want incentives. Remove the incentives and you've got the people who care remaining.
    – Wes
    Commented Nov 27, 2012 at 14:22
  • 3
    Monkey speaking: You can spend three years on SE without surpassing 3k and yet gain a feeling of how the sites work. But I agree with removing the badges Commented Jul 4, 2013 at 11:28
  • 7
    ...anyway, -1 for basically suggesting a social network of elitist reviewers Commented Jul 4, 2013 at 11:29
  • 2
@TobiasKienzler: just do some good suggested edits? Surely you'll be caught by the eye of the reviewers and finally be appointed as another reviewer. Or we can even expand on that, creating a list of all top suggesters based on the number of approved suggested edits and their ratio to rejected ones. Saying that this is an elitist system goes far overboard. -1 for your comments. The /review is in its current form absolutely tearjerkingly broken. They should never never never have offered badges/reputation for that. Humans just don't work that way in quality control.
    – user138231
    Commented Apr 30, 2014 at 15:24
  • 1
    @Tobias Kienzler: which alternative do you like: annoying and glitchy “audit” (formerly “honeypot”)? Wikipedia-style anarchy, where every monkey without anything can participate in virtually everything? Note: Ī̲ was in Wikimedia projects for 8 years, but now you see me here, not there. IMHO a responsibility-based governance/management necessarily introduces some kind of elitism. Commented Sep 5, 2015 at 18:58
62
+150

Just a quick status update: we are taking this seriously, but don't want to jump to conclusions (or make rash changes unless they can be demonstrated to help).

  • As Geoff mentioned previously, we've fixed the bug whereby if two reviewers review the same item only one gets credit for it (realistically, this only affected some types of reviews, but editing was one of them so that's particularly bad).

  • Emmett has implemented some better analysis tooling for these queues, which should let us get a better idea of the scope and extent of the problem (once they're enabled).

update 2012-12-13

Manually suspending reviewing privileges has been introduced:

...for folks who fail multiple review audits in a short time...

update 2013-01-29

Automatic review suspension has been introduced:

...We're kicking blatant abusers out of the queues automatically now...

7
  • 5
    Oh my. Looking at the answer that is linked to in the new bounty description, this is an even worse problem than I thought.
    – Pekka
    Commented Oct 23, 2012 at 13:15
  • Interestingly, one of the votes on that answer didn't come from the review queue, and one of the reviewers did flag it (resulting in deletion). Although multiple reviews are somewhat unintentional in queues where voting is possible, it happens enough that this might be a useful metric...
    – Shog9
    Commented Oct 23, 2012 at 19:19
  • 2
    by "might be a useful metric", you mean the flagging and deletion?
    – Pekka
    Commented Oct 26, 2012 at 7:28
  • 5
    Question: There's a certain user in the Late Review queue (I'm sure you can all guess who it is) who robo-upvotes everything. I've even reported them to [email protected] for upvoting spam, and yet they're still robo-upvoting to this day - that's at least 1700 suspect upvotes. I know change takes time, but this is getting a little out of hand. Is there anything more we can do right now, or just sit back patiently and wait? Commented Dec 5, 2012 at 13:54
  • @Little now I'm dying to know who that user is.
    – Pekka
    Commented Dec 5, 2012 at 23:15
  • @Little: this user (and others) are on my list (three failed audits). I've started contacting some folks directly over this, but we have other tools in the works as well.
    – Shog9
    Commented Dec 5, 2012 at 23:34
  • 7
    Not wanting to rush changes is only good when the current system is good (or at least not bad). When the current system is bad (like the current review system), then waiting to implement those changes because we don't want to rush things is actively worsening the content of Stack Overflow. The best way would be to freeze the current review system (don't allow more reviews) until a solution is found.
    – Pablo
    Commented Dec 12, 2012 at 10:25
42

A radical solution:

Either add a "Looks Good" button that increments the review count or make the "Not Sure" button increment the review count, but not count to removing the post from the review queues for anyone else. There needs to be a way to let people say:

I've reviewed this item, but I don't think it requires any attention from me.

The fact that "Not Sure" doesn't increment the review count discourages people from clicking that perfectly valid option.

That way people won't be tempted to vote up just to get the increment in the review count.

We'd have to change the requirement for the badges though - a simple increase in the numbers (500 and 2000 for silver and gold) might be enough, or the requirement could be that more than 50% of your reviews must involve real actions (edits, etc.).
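As a rough illustration of that last idea, here is a minimal sketch of a badge check built on a real-action ratio; the names and parameters are hypothetical, not how Stack Exchange actually computes badges:

// Hypothetical badge check: reviews only earn the badge once the user has
// enough of them AND more than half involved a "real" action (an edit,
// flag, close vote, etc.) rather than a bare upvote.
static bool QualifiesForBadge(int totalReviews, int realActionReviews, int requiredReviews)
{
    if (totalReviews < requiredReviews)
        return false;
    return realActionReviews * 2 > totalReviews; // strictly more than 50%
}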

7
  • 9
    +1 for have the requirement be that you must have > 50% of real actions (edits, etc.)., not sure about increasing the numbers; nothing increases value like scarcity, so this could make those that are gaming more likely to sprint through reviews.
    – StuperUser
    Commented Oct 17, 2012 at 15:25
  • 4
    Also ` make the "Not Sure" button increment the review count`. Nothing would stop a gamer clicking that button 1000 times and "earn" a gold.
    – StuperUser
    Commented Oct 17, 2012 at 15:26
  • 4
@StuperUser - which is why I suggested requiring a percentage of real actions.
    – ChrisF Mod
    Commented Oct 17, 2012 at 15:35
  • Oh yeah, of course it would, my bad.
    – StuperUser
    Commented Oct 17, 2012 at 15:40
  • I say just add a "looks good" button and make it functionally identical to the "not sure" button.
    – Servy
    Commented Oct 17, 2012 at 15:55
  • 6
    @Servy - that might work in the short term, but when people see that their review count isn't incrementing they'll go back to spurious upvotes again.
    – ChrisF Mod
    Commented Oct 17, 2012 at 15:56
  • @ChrisF True. 15 char.
    – Servy
    Commented Oct 17, 2012 at 15:57
40

Reviews could be made flaggable, and moderators could remove the review capability from a user for a while. Surely not ideal but it could educate some users.

Addendum: We cannot see later what a reviewer actually did (at least, for now), so this will probably not work. See the comments below.

5
  • I suppose you could flag the post itself, and fill in the "other" section to indicate a problem with the review.
    – S. Albano
    Commented Oct 17, 2012 at 0:37
  • 2
    Unfortunately, this doesn't work in the "upvote and move on" case that I've been seeing a lot of lately (we're even getting spam being upvoted now), because we can't know for certain that someone voted for a post. That makes it difficult to track back who made a bad review in that case. Commented Oct 17, 2012 at 15:44
  • Can we not see if a review was made without an edit? That must be a vote. But you are right, we cannot know if the upvotes are from the reviewer.
    – fuxia
    Commented Oct 17, 2012 at 15:52
  • 1
@toscho It's also possible that the reviewer upvoted a comment (possibly a comment indicating what was wrong with the post, which would be a perfectly acceptable review), they could have posted a comment (which could have been deleted by the time you see the post, so you can't just look for their name in the comments), they could have flagged the post (which would be the right thing to do for a lot of really bad posts), they could have voted to close a question, etc. An edit or comment is really the only review task you can track as a non-developer.
    – Servy
    Commented Oct 17, 2012 at 16:05
  • 3
    @toscho: Very often I see a review that ignored significant issues with the post (like a question being entirely unintelligible due to bad grammar) and looking at the user's other "reviews" confirms that there is a system behind that. Then I don't really care what the "reviewer" did - I want to flag that review. And moderators shouldn't just temporarily remove the review capability (the game will simply continue once the ban expires), the user should also lose the review points he got by doing fake reviews. Commented Oct 17, 2012 at 19:33
26

I think one aspect of the problem that could be addressed is "reviews per hour".

Good reviews take time. Reviews that are gaming the system do not.

Your review that you took time and effort for was undercut by a review of someone who could have been gaming for a badge, not willing or able to put the required time in, or not competent enough to create a review (despite accruing 2k+ rep).

Limiting the rate at which people can perform reviews won't necessarily fix how good the reviews will be, but it will prevent the damage that can be done by gamers sprinting towards a badge rather than performing a proper review for the good of the community.

This could be fixed by letting users know they can only perform 5/10/n reviews per hour, potentially by review type, and informing them when they reach that limit, as they are going too fast to be likely to be making a positive impact.

Edit:

Based on S.Albano's comment below we could use a dynamic rate to protect review tasks when the queues become smaller than the allowed rate, e.g.

int ReviewsPerHour = Math.Min(10, reviewTaskCountInQueue / 10); // integer division rounds down
ReviewsPerHour = Math.Max(ReviewsPerHour, 1);                   // always allow at least one review

Which would hopefully protect 90% of the review tasks.

9
  • 3
    Re rescinding rights silently, it's been discussed under the name "hellbanning," and rejected.
    – Pops
    Commented Oct 17, 2012 at 15:33
  • @PopularDemand Thank you. That was the blog post I was searching for before. Will remove that as part of the suggestion.
    – StuperUser
    Commented Oct 17, 2012 at 15:35
  • Do you want me to delete my comment? I was only trying to add supplemental info, not criticize, but I see you've already edited.
    – Pops
    Commented Oct 17, 2012 at 15:48
  • 2
    No, don't delete it; a link to hell banning and why it's not acceptable is germane to this answer and the discussion as a whole.
    – StuperUser
    Commented Oct 17, 2012 at 15:51
  • You could suggest an edit to the question but then it would have to be reviewed... ;)
    – StuperUser
    Commented Oct 17, 2012 at 15:51
  • 1
    At first this seemed good, but the idea breaks when queue size approaches zero and the number of posts needing review divided by the +1 reviewers is <= the hourly rate. Then they continue to cause problems without triggering the throttling threshold.
    – S. Albano
    Commented Oct 18, 2012 at 3:45
@S.Albano VERY good point, in that case we could lower the rate dynamically based on the remaining number e.g. have int ReviewsPerHour = Math.Min(10, (reviewsInThisQueue/10)), which will protect 90% of the review tasks.
    – StuperUser
    Commented Oct 18, 2012 at 8:15
  • I was just going to suggest this as an another alternative.
    – ChrisF Mod
    Commented Oct 18, 2012 at 8:40
  • 1
    I agree with this as well! However, I do think it would not be sufficient on its own. This should be combined with the 'lock review' when it is picked up to be edited or commented on by somebody else, so a review cannot be completed quickly while you're working on it. Commented Oct 19, 2012 at 18:36
22

I have to agree with Nicol Bolas's answer that there is a fundamental problem here of incentivising users to review content without any oversight of the reviews. (I'll come back to this later.)

There is also a very interesting point raised in a comment by Ben Brocka:

I'd like some actual evidence that reviews are doing more harm than good if we're going to pretend they're a problem.

I think this is worth discussing. So, let's assume (just for the sake of argument) that a significant number of people are going through the review queues, upvoting an item without even reading it, and then clicking "I'm Done". How would that be any different than them doing nothing at all?

Well, there are only three changes that take place:

  1. The post now has an upvote on it. If it's a great post, this is correct; if it's a mediocre post, then it's not a problem; if it's a terrible post, then it's doing some harm, but not a huge amount because it can only add at most one vote (from the review queue) per post. Adding a single vote is somewhat unlikely to push a poor answer above better answers. It's possible, yes, but we're at the point where very few posts fall into this category. So, net effect: close to zero. I imagine the positives will be comparable to the negatives.
  2. The item is removed from the review queue. It is unlikely to be seen by someone who is capable of taking more substantial actions such as some combination of flagging, editing, voting to close, commenting, downvoting, etc. as appropriate. Here the issue is the opportunity cost. The poor review has prevented more positive actions from taking place, or made it harder for those looking to do proper reviews to find content worth their time. This of course only happens if the queue is emptied. As long as there are items in the queue for the legitimate reviewers, their time isn't wasted.
  3. A reviewer is getting credit for reviews while not actually reviewing content; this reward is potentially in the form of badges or a place on the (daily or total) leaderboard. Now, the value of those rewards is somewhat diminished because so many people know that it is often undeserved, but clearly there is enough "value" in the reward for people to spend time faking reviews.

I also want to address the fact that the discussions around suggested edits have been brought up as being related. There are a few key differences between suggested edits and these other review queues.

If an item never gets reviewed in the first/late posts queues, it's not stopping any action. Having an item sit in the queue for years without being looked at simply means a problem might go unfixed. There's nothing concrete lost there.

Suggested edits, on the other hand, cause problems when they sit in the queue. When an edit is in the queue, the post can't be edited by other users (who may want to make more substantial edits, fix things the suggested edit missed, or, in the case of the OP, add more info). It also means (if the edit adds value) that the improvements aren't being seen by everyone viewing the post. That's a problem.

Finally, to prevent that queue from getting excessively long, it is limited in size. If nobody is reviewing suggested edits, the capacity will be reached (the old system was frequently hitting capacity on SO several days a week, or at least sitting close to it). When the queue is at capacity, new suggested edits (presumably of value) can't be made. For all of these reasons there is a legitimate argument that the value of approving suggested edits quickly outweighs the cost of having some bad suggested edits approved (ideally we'd want fast and quality approvals, but we haven't figured that one out yet).

First/late posts are different though: because there is substantially less value in reviewing content early (most of it really can wait), the benefit doesn't offset the cost associated with bad reviews. As such, it has been positive to add more incentives for the suggested edit queue (although I still think it could be improved further), while adding incentives to the first/late posts queues is detrimental.

So, what exactly should we do?

Just to explicitly state it, I suggest that both the gold and silver badges for review queues be removed. The bronze badge is fine for introducing users to the queue and is set low enough to not encourage significant abuse. The leaderboards (both daily and total) for all queues should also be eliminated. There are users who will make being on those boards a goal, and let the quality of their reviews suffer so that they can get/maintain positions there. Removing both of those features should take away the most significant incentives for users to try to get review points without regard for the quality of the reviews themselves.

</WallOfText>

4
  • 10
    Yes. It does seem that the upvote-and-move-on reviewers are emptying the queue, so your point 2 - "as long as there are items in the queue...their time isn't wasted." Well, there aren't items in the queue.
    – pjmorse
    Commented Oct 11, 2012 at 2:57
  • @pjmorse Yep, but it has only been emptied in the past few days; it wasn't a real problem before then. Likewise, if the team adds new queues, or changes the algorithms to allow new items into the existing queues then the problem goes away (temporarily).
    – Servy
    Commented Oct 11, 2012 at 13:54
  • 10
    I'm sorry I keep banging on about this, but for some reason it's irking me and I can't let go of it. I guess the issue for the site as a whole is that as long as the queue is small, the quick, low-thought (and presumably low-quality) reviews have a competitive advantage over good reviews. It's certainly possible to have a post in the queue which is concise and well-written and deserves a twenty-second read-and-upvote review, so quick reviews aren't inherently bad.
    – pjmorse
    Commented Oct 11, 2012 at 14:18
  • 1
    Also, I only have the rep to access the queues which are pretty much constantly empty: first posts and late answers. :S
    – pjmorse
    Commented Oct 11, 2012 at 14:20
19

A partial solution (well, not "solution", that seems too strong; maybe "medicine"?) would be to remove the post from the queue as soon as someone starts editing it. That would at least prevent those moments of sheer frustration, where you finally finish injecting some sense into an atrocity committed against English grammar, only to find that someone has pulled the rug from under your feet and taken your credit without so much as making a token effort. (If the edit is abandoned, then naturally the post goes back in the queue.)

The other part might be to go ahead and count all review-type actions, even if they're on posts someone else has already reviewed; heck, even if they originated from someplace other than the queue. Naturally, the badge limits would have to be greatly increased. Also, I don't think votes should count as review-type actions: I'd limit it to edits, flags, and comments.

6
  • 2
    Locking a post while editing has been suggested in other contexts, such as preventing others from making edits while you're editing. They all have fundamental problems though; someone can start editing and then leave the browser up for hours/days on end. It needs a timeout, but some people can legitimately spend 10-20 minutes editing a large post, so it can't be all that short of a timeframe either.
    – Servy
    Commented Oct 17, 2012 at 20:35
  • For what it's worth, a dev posted something similar to this last week that reached a score of 39, but deleted it two days ago without giving a reason. EDIT: actually, the reason was probably that the first half of the answer I'm referring to talked about the first of these features in the future tense.
    – Pops
    Commented Oct 18, 2012 at 15:55
  • 1
    I mostly gave up reviewing because of this.
    – nalply
    Commented Oct 23, 2012 at 12:18
  • 3
    To be more exact: I gave up reviewing by edit. I just skip when I see the need for an edit.
    – nalply
    Commented Oct 23, 2012 at 12:42
@Servy: given that the engine (as of now) has a draft-saving feature, implementing a timeout wouldn’t be a challenge. Commented Sep 5, 2015 at 19:16
@IncnisMrsi The fact that there's a draft-saving feature isn't really relevant to that. And as I said, it's in the paradoxical position of needing the lock time to be long enough to perform a long edit, but short enough to not be disruptive to everyone else or to allow for abuse, and there is no real overlap between those two.
    – Servy
    Commented Sep 8, 2015 at 13:15
18

I'm starting to think that having badges (particularly gold!) for something that can be so easily gamed, might not be a good idea after all.

Whatever criteria a badge has, there will be people gaming the system and taking the path of least effort to get that badge. In this case, though, there's so much gaming of the system that any countermeasure that requires peer consensus seems doomed to fail. All the gamers will just do the same thing, and since they all "agree", bad reviews now look legit.

I'm about this close to recommending removal of the gold review badges entirely. I'd gladly give up mine, if it means we have fewer crap posts getting rubber-stamped by badge whores. (There will still be some gaming in the short term, but something tells me it'll taper off once people get their silver badges.) The bronze badge can stay, and probably the silver, but as long as there's a gold badge, people are going to farm it any way they can.

12

How about taking a "test" for each type of review queue? In order to get full privileges to each of the review queues you'd have to go through a vetting process. Here's what that could look like.

  1. A new user wanting to participate in the review queues would be presented with posts that have a clear "right" answer in terms of reviewing. The user would have to review 100% of these "test" reviews correctly in order to get access to the real queue. You would have to go through this for each queue.

  2. Just because a user gets past the initial test for a queue doesn't mean you stop throwing in "test" reviews. For example, when you open the floodgates to them, keep showing them "test" reviews. If a user gets enough of these wrong at any stage, temporarily revoke their access to that particular queue (or warn them or something). Alternatively bump them down a "level" (See below).

  3. As the user proves themselves by answering more test reviews correctly, decrement the number of test reviews they have to get correct.

  4. Provide other reviewers a way of flagging people that are not reviewing things correctly. Enough flags could increase the number of test reviews a person is shown (this could be an automated process or perhaps moderators would have to be involved).

You could break the process up into "levels." Obviously these are flexible (a rough sketch of the mixing logic follows the notes below):

  • Level 0 reviewer: 100% test reviews
  • Level 1 reviewer: 30% test reviews and 70% real reviews
  • Level 2 reviewer: 20% test reviews and 80% real reviews
  • Level 3 reviewer: 10% test reviews and 90% real reviews
  • Level 4 reviewer: 5% test reviews and 95% real reviews

You could display the "level" of the reviewer on their profile, thereby still gamifying the review process but making said gamification a little more useful. You also would not tell them which reviews were "test reviews." So they could, at any point, lose their "Level 4 reviewer" status and be bumped back down. This would prevent people doing the right thing up until they reach Level 4 and then just reverting back to their badge-grinding ways.

Other notes:

  • When a user gets a "test" review wrong, let them know why and point out exactly what's wrong with the post that they missed. That way we're hopefully educating users while we vet them.
  • I know I'm glossing over a few problems like "how to determine canonical reviews" for each queue.
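To make the mixing concrete, here is a minimal sketch of how test reviews could be interleaved per level; the names are hypothetical, and the fractions simply mirror the level table above:

using System;

// Hypothetical sketch: fraction of served items that are hidden
// "test" (audit) reviews, keyed by reviewer level.
static double TestReviewFraction(int level)
{
    switch (level)
    {
        case 0: return 1.00; // vetting phase: everything is a test
        case 1: return 0.30;
        case 2: return 0.20;
        case 3: return 0.10;
        default: return 0.05; // level 4 and up
    }
}

// Roll against that fraction each time a reviewer requests an item.
static bool ServeTestReview(int level, Random rng)
{
    return rng.NextDouble() < TestReviewFraction(level);
}

Because the reviewer is never told which items were tests, the same audit stream that vets new reviewers also keeps veterans honest.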
4
Andrew, I think the Late Answers queue can be used as a testing tool as is. It seems to already have the most important "ingredients" for that, such as items to detect verifiable abuse and a sort of protection against those who would (who will) try to trick the testing, just like they currently trick the review
    – gnat
    Commented Nov 22, 2012 at 15:52
  • 1
@gnat Maybe the Late Answers queue already has a good mix of good and bad posts. However, there is no such thing as telling the user on the spot why a post was good or bad when they have just done a bad review. At the moment, there are users who have almost as many reviews as upvotes and maybe fewer than ~10 downvotes. So it looks to me as if they don't learn anything and just keep spamming the upvote button.
    – ForceMagic
    Commented Dec 7, 2012 at 19:01
@ForceMagic I was thinking about a script similar to the one that currently is used to detect and revert serial voting (by the way, brainless upvotes to wrong posts in the LA queue can be considered a kind of vote fraud). The fraud-detection script doesn't catch fraud on the spot, right, but it still does a pretty good job of keeping the site clean, doesn't it?
    – gnat
    Commented Dec 8, 2012 at 7:28
@gnat I didn't actually know that some changes had been made to detect bad reviewing behaviour until yesterday, when I hit a fake review and actually got it right. I was pleased to see that, because a few weeks ago I stopped reviewing shortly after posting [this][meta.stackexchange.com/questions/155299], because I was annoyed by users like [this one][stackoverflow.com/users/1647597]. I don't usually want to point at anybody, since we all learn and improve ourselves, but from what I've seen, this one hasn't learned anything IMO.
    – ForceMagic
    Commented Dec 8, 2012 at 8:57
10

An additional option might be to only provide the badge incentive when there is a backlog, thereby not rewarding undesirable behavior.

Here is the use case:

  1. The review queue hits a certain threshold (say 500), triggering reviews to be counted towards the badge incentives.

  2. Some indicator in the user interface indicates that the review queue is in that mode.

  3. Additional reviewers are attracted by the reward, while there is no apparent race with other reviewers to reduce the quality of review.

  4. The review queue hits a certain low threshold (say 80), triggering reviews to not be counted towards the badge incentives.

This scheme has several benefits:

  1. It calls in backup for the review process when needed.

  2. It makes it less likely that upvote reviewers will trample on the well thought-out edits of another.

  3. It provides a cushion between the high and low thresholds, so that the time during which the queue rewards cheating is minimized.

  4. It should select for the more desirable edit actions during both phases, thanks to the reputation incentive and the diminished sense of urgency.

  5. It provides lull time for users whose only reward is to improve the site to do so in peace. (A minimal sketch of the threshold logic follows this list.)
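Here is a minimal sketch of the two-threshold scheme, assuming hypothetical names and the example thresholds above; badge credit switches on when the backlog passes the high mark and off once the queue drains below the low mark:

// Hypothetical sketch of the two-threshold (hysteresis) incentive toggle.
class ReviewIncentiveState
{
    const int HighThreshold = 500; // backlog: start counting reviews toward badges
    const int LowThreshold  = 80;  // drained: stop counting, reward window closes

    bool badgeCreditActive = false;

    public bool CountsTowardBadges(int queueSize)
    {
        if (queueSize >= HighThreshold)
            badgeCreditActive = true;
        else if (queueSize <= LowThreshold)
            badgeCreditActive = false;
        return badgeCreditActive; // between thresholds, keep the current mode
    }
}

Between the two thresholds the mode simply persists, which is what gives the scheme its cushion: the queue count oscillates between 80 and 500 rather than sitting at zero.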

6
  • 1
    So what happens between 80 and 500? Commented Oct 16, 2012 at 9:08
  • 3
    @RichardTheKiwi I intend this as a feedback loop , as seen in the regulation of metabolic pathways in cells, as well as various other complex systems. On the way up to the upper threshold, the queue would remain in the no-incentive state until it crosses the upper threshold. On the way back down, the reviews count towards the badges until levels cross the lower threshold. This should result in the queue count oscillating between the two thresholds, but spending very little time on lower threshold where this problem seems to be worst.
    – S. Albano
    Commented Oct 16, 2012 at 15:24
As I describe in my answer, queues other than suggested edit approvals don't cause problems when they have a large backlog. Keeping that backlog around would do less harm than marking them all reviewed without actually reviewing them. If we just don't like seeing big numbers for those queues, we can simply make the criteria for getting into the queue more restrictive (or prioritize them more effectively) so that the most needy posts get the most attention (this is already done to a certain extent).
    – Servy
    Commented Oct 17, 2012 at 15:52
  • @Servy I would argue that a large backlog in the new user and late answer queues is bad for the new users. We want to encourage timely reviews that improve their questions and answers and provide them constructive feedback. Good reviews make SO feel more useful, welcoming and responsive, helping to pull in new users and teach them how to be constructive members of the community.
    – S. Albano
    Commented Oct 17, 2012 at 16:58
  • @S.Albano Yes, I agree that performing an appropriate review of a new post within an hour or so of the post is of more value than a review a week later. I also feel that an appropriate review of a post a week after it was made is of more value than adding one upvote to every single first post without ever performing any other review action. Commenting on new posts to (politely) indicate things that they have done that aren't in line with SO guidelines, editing up a good answer, etc. can provide a lot of value. One upvote (whether it's valid or not) is pretty minimal feedback.
    – Servy
    Commented Oct 17, 2012 at 17:03
  • @Servy I agree that an upvote is not particularly useful, and my proposal operates under the assumption that these upvote reviews are selected for at the lower threshold, and attempts to minimize the time spent there, while still encouraging throughput with a periodic burndown phase. Your proposal avoids the low threshold completely under the assumption that the upvote selection is just as strong during the burndown phase. It would be interesting to know how prevalent lazy upvoting was during the burndown of the queue compared to the current levels at the low (0) threshold.
    – S. Albano
    Commented Oct 17, 2012 at 17:42
9
+500

EDIT #4:

Okay. This edit really does clinch a few of the things wrong with the system. Is there any progress on removing poor reviewers from the system? It doesn't feel like there is, yet.

Please. Start removing the bad reviewers from the system. I'm seriously questioning my participation in the review process as a whole, if edits can be blindly approved like this. This has got to stop.

EDIT #3:

This is getting painful. I'm starting to seriously rethink this whole review queue thing. The very fact that this edit was not immediately rejected by all participating reviewers indicates that the reviewer(s) who decided to accept the revision just don't care.

These sorts of poor editors have to be removed from the system. And I would hope that this culling is done at a much more accelerated rate than what's going on now. The rate we're going at now just doesn't seem to be cutting down on the low-quality reviews.

EDIT #2:

I'm sorry, but this edit made me about lose my salad here. It's not the spelling or anything like that, but it's the lack of completeness. What are we getting at when we want reviews to be peer-reviewed or editable? We must find a way to get quality from the full review process, not just the (oft empty) review queues.

EDIT:

Take a look at this string of edits. The first few don't address the critical shortcomings of formatting the question, only taking time to make simple changes while leaving the critical stuff alone. The need to remove poor editors from the system has never been greater.


So here's my beef with the current system. Here's my recent example; let's try and keep it clean/egoless as I go through this.

So the main thing that I do when approaching a question/answer to edit is gauge how much actually needs to be revised. Is it just the code formatting? Were there some grammar issues that came up? Did the question need a bit of spit and polish to come across cleaner? It's a process that means fewer edits from me, but (I believe) higher quality.

When I'm reviewing questions/answers, I apply the same rubric I do as if I were editing the post myself. Is there more that can be changed? Was there enough work done revising the question to be useful to the next person?

Let me stop myself right there. The interesting Catch-22 of the edit system is its ultimate goal: The revisions must be substantial enough to benefit someone else reading the question in the future. The current review queue obviously does not enable the community toward that goal, not unless they are disciplined and patient enough to go through each detail of the question and revise it.

But let's be honest - few of us are.


So, the example I have posted up - take a look at revision #3. There wasn't anything of actual value added to that revision - the grammar mistakes went largely unchecked, and code formatting was actually reverted. This was a poor edit, and it was correctly rejected - except for Community coming in and accepting the revision. This is likely due to someone else editing the file and claiming that the edit was "useful".

So, three part plan to fix the system.

  1. Enforce, to some degree, a small checklist of what should be fixed in the post (a rough sketch of how this could be wired up follows this list). This can include but isn't limited to:

    • Spell checker; catch words that are commonly misspelled and are missed in a review. If a user repeatedly misses these, after some given threshold, restrict their ability to participate in reviews for a while.
    • Code format checker; catch blocks of code that can be interpreted as a particular language, and see how it's formatted. This may be tricky to implement, since there's plenty of holy war on where braces go in Java, and the type of whitespace used in Python examples won't be clearly conveyed.
    • Miscellaneous checker; this can suggest problems such as, "Hey - this post has a signature at the end, that would be one thing to improve." May be tricky to implement if it had a broad range of miscellaneous items to check through.
  2. Penalize users that do not sufficiently review the question/answer with negative reputation. This can apply to anyone, so as to encourage those without the suggested edit restriction to actually sit down and review the question.

  3. Accounts with a history of poor reviews should not be allowed to participate in the system. I can't stress this enough - the only way we'd be able to stop poor reviews from affecting the rest of the site is to stop the poor reviewers from partaking.
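As a minimal sketch of how the checklist in step 1 might be wired up (the checker names and the signature heuristic are hypothetical, not an actual Stack Exchange feature):

using System.Collections.Generic;
using System.Linq;

// Hypothetical checklist: each checker flags one class of problem
// that a reviewer is expected to have caught.
interface IPostChecker
{
    string Name { get; }
    bool FindsProblem(string postBody);
}

// Stand-in for the "miscellaneous checker": detects a signature line.
class SignatureChecker : IPostChecker
{
    public string Name => "signature";
    public bool FindsProblem(string postBody) =>
        postBody.TrimEnd().EndsWith("Thanks in advance!");
}

static class ReviewChecklist
{
    // Issues the reviewer left unaddressed; repeatedly missing these past
    // some threshold could then restrict their review privileges (step 1).
    public static List<string> MissedIssues(
        string postBody, IEnumerable<IPostChecker> checkers) =>
        checkers.Where(c => c.FindsProblem(postBody))
                .Select(c => c.Name)
                .ToList();
}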

4
  • Regarding your latest update: am I missing something? What's so terrible about fixing a misspelled title?
    – Shog9
    Commented Aug 21, 2013 at 7:07
  • @Shog9: I suppose I was referring to the suggested edit portion of it; the misspellings I can live with.
    – Makoto
    Commented Aug 21, 2013 at 15:14
  • @Makoto: see: meta.stackexchange.com/questions/79342/…
    – Shog9
    Commented Aug 21, 2013 at 17:17
  • @Shog9: Fine, but he didn't just touch the title - he corrected something else in the post. I'd argue that correcting anything in the post counts towards the "too minor" edits, or we'll have a lot of users gaming the system with one or two changes to the title.
    – Makoto
    Commented Aug 22, 2013 at 5:55
8

A few solutions (mix and match), in three steps:

Filtering for a bad reviewer

Sticky question

Whenever users give opposite votes on the same question in a queue, make the question a bit stickier in the queue (it doesn't disappear immediately). If there are enough conflicting votes, the users who were upvoting were probably fly-by reviewers.

Consecutive upvotes

If a user has too many consecutive upvotes in too little time, it's quite likely he is a bad reviewer.

Remember that posts which take very little time to go through are often downvote-worthy posts. If the user is just zipping through the queue, something's amiss...

Confirming it

Monkey on my back

If a user rarely downvotes in a review queue, and goes through the posts way too fast, attach a "monkey" to their back. This triggers the honeypot posts (or whatever other traps you have) more often. If they answer the honeypots correctly, the monkey goes away.

Community flags

Have the community flag what it thinks are bad reviewers: those who have tripped up the filtering algorithms too many times.

Honeypot

Already implemented

Penalizing them

Show warnings for the first few offences, with no penalty. If they continue, gradually increase the severity of the punishment. Various (mix and match) ways to penalize:

  • Rep penalty
  • Block from review for X hours
  • Reverse all reviews by the user in the last X hours (this includes resetting their review stats)
  • Reverse the last X reviews by the user

Though the simplest one (and least annoying) is this: For every bad review, the user's review progress goes down by 2. (Maybe more, if they have too many bad reviews)
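Here is a minimal sketch of the "consecutive upvotes" filter described above, with hypothetical names and thresholds; a streak of fast, upvote-only reviews attaches the monkey:

using System;

// Hypothetical sketch: a long streak of fast upvote-only reviews
// marks the reviewer for extra honeypot audits (the "monkey").
class FlyByDetector
{
    const int StreakLimit = 15; // consecutive fast upvotes tolerated
    static readonly TimeSpan MinPace = TimeSpan.FromSeconds(20);

    int streak;
    DateTime lastReview = DateTime.MinValue;

    // Returns true when the reviewer should start seeing more audits.
    public bool RecordReview(bool wasUpvote, DateTime now)
    {
        bool tooFast = now - lastReview < MinPace;
        lastReview = now;
        streak = (wasUpvote && tooFast) ? streak + 1 : 0;
        return streak >= StreakLimit;
    }
}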

3
  • 1
    "If a user has too many consecutive upvotes in too little time, it's quite likely he is a bad reviewer." Possibly, but just keep in mind the key word there is "likely". Flagging such users for mod attention would be fine, but there are legitimate cases. Here is some discussion on the issue. (It's on suggested edits, while you're referring to other queues, so it's a bit less applicable here, but still relevant.)
    – Servy
    Commented Dec 4, 2012 at 21:08
  • 1
    @Servy: That's why this is just a "filter". You have more checks in place to check if the person really is a bad reviewer Commented Dec 5, 2012 at 0:36
  • I like the "monkey on my back" suggestion.
    – Luke_0
    Commented May 21, 2013 at 21:30
8

How about a reviewer can only upvote an answer if they have earned a minimum score in that subject's tags? That way all users unfamiliar with a particular language or subject can still screen for spam, Not an Answer/Not a Real Question, and other obviously undesirable postings, and still be able to leave a helpful comment or fix broken/missing code formatting. But if they haven't demonstrated at least a minimum competence in a topic, they haven't earned the credibility to upvote (they can still go back and upvote later on, but it won't count as a review).

Personally, I know subjects relating to LAMP webservers like PHP, MySQL, jQuery, Apache, etc., but I'm not a Ruby or C++ guy, so when I see questions in my area I upvote them if they are good answers and just skip if they aren't anything special. But when a Ruby or C++ question comes up I almost always skip unless it's obvious junk needing flagging or commenting, because I feel like I have no business upvoting posts on subjects I don't understand; for all I know the poster has no clue what they're doing either and their answer doesn't solve the problem. Upvoting should be done because it's a good answer, not because it isn't spam or obvious crap.

Example: My Current Tags

[screenshot of my current tag scores]

  • At a set minimum of 25 Posts, I could only upvote posts tagged with [tag] or [tag].

  • At a set minimum of 25 Upvotes, I could upvote [tag], [tag], [tag], [tag], [tag].

  • Either way, I couldn't upvote [tag] or [tag] or the other lower tags until I get my postings in those tags up a little bit and prove other users agree I know a bit about these subjects.
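
A sketch of such a gate, assuming the reviewer's per-tag post and upvote counts can be looked up (the field names are invented):

    # Hypothetical gate: a review upvote only counts if the reviewer has shown
    # minimal competence in at least one of the post's tags.
    MIN_POSTS_IN_TAG = 25
    MIN_UPVOTES_IN_TAG = 25

    def upvote_counts_as_review(reviewer_tag_stats, post_tags):
        """reviewer_tag_stats: dict mapping tag -> {'posts': int, 'upvotes': int}"""
        for tag in post_tags:
            stats = reviewer_tag_stats.get(tag, {'posts': 0, 'upvotes': 0})
            if (stats['posts'] >= MIN_POSTS_IN_TAG
                    or stats['upvotes'] >= MIN_UPVOTES_IN_TAG):
                return True   # earned credibility in this subject
        return False          # can still comment, edit, or flag; just no review upvote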


Example of blind upvoting this might address:
How can I get the CheckBoxList selected values, what I have doesn't seem to work C#.NET/VisualWebPart

2 users upvoted this (and another edited it) while I had it open for review, all without any of them catching that the original question was about CheckBoxList values and the answerer had just pasted some random, completely unrelated C# SQL connection script. If review upvoting privileges for this question had been restricted to the C# crowd, I would hope they'd recognize right away that the code is unrelated and filter crap like this out.


I realize that restricting review upvotes based on subject experience might make it more difficult to get some of the more obscure topics out of the review queue. Maybe it would have to apply only to the more popular subjects with a large user base. Maybe it would have to apply only to answers, because it's easier to tell whether a problem sounds legitimate than whether an answer actually solves it. But it would help prevent users upvoting simply because a post isn't obvious junk and it earns them a review credit, even though they probably have no clue how good the answer really is!

4
  • 3
    The problem with people who upvote everything is that if you stop them from upvoting they'll just do whatever else is easiest to enable "I'm done". Maybe they'll downvote (especially on questions which don't cost the voter rep) or perform an edit that changes nothing relevant, add a comment that's essentially spam/noise, etc. This would only really help for people who think they're taking the appropriate action by upvoting a post they know nothing about (which, granted, will still be some people).
    – Servy
    Commented Dec 7, 2012 at 21:16
    I considered that too, but at least useless edits from low-rep users could be rejected, and noise comments flagged, both of which could undo that review credit. And you could give downvoting the same credential restrictions as upvoting... It's a complex problem to keep incentivising reviewing without rewarding useless reviews, and I'm not sure there is a silver-bullet solution that won't result in a game of review-loophole whack-a-mole. At least curbing serial upvoting would help users differentiate between quality posts and posts that were rubber-stamped up for review credits.
    – WebChemist
    Commented Dec 7, 2012 at 22:27
  • An alternate idea would be to limit reviewers to 5 up/down votes out of their daily queue max of 20 reviews (so spend votes wisely) but again like you pointed out that would just result in more useless comments/edits.
    – WebChemist
    Commented Dec 7, 2012 at 22:38
  • Users need at least 2k rep to review posts, so none of the edits would be suggested edits; they'd all be real edits and couldn't be rejected. Yes, they could be reverted and comments flagged/deleted, but the point is that the net result is that the review queue would be causing more problems than it would be fixing, it would just be different problems than incorrect upvotes. It's not addressing any of the core problems, just trying to put a band aid on the symptoms.
    – Servy
    Commented Dec 7, 2012 at 22:48
7

Incentives should only be given for work. Answering a question takes time and effort; clicking an upvote button is not sufficient work to earn an incentive. Sure, everything needs to be reviewed. But only grant rewards when someone takes the time to edit and improve.

Once an item is shown to a user for review, it should not be shown to another until the first has finished or some substantial timeout has passed.

There is still an incentive to do reviewing. You will quickly skip past items that don't need help until you find one that does. You improve it, you get rewarded.

Possible downside: worthless, unneeded, or even damaging edits made solely for the purpose of getting a review credit.

6

It seems that there are possibly simple solutions to the two problems:

Problem 1: Flyby reviewers

I think the biggest part of the problem is that people are penalized if they pass: they lose their time investment for no reward. Award points for the pass button, and change the criteria for the badge to twice as many reviews with no more than 50% passes. This will at least train folks that pass is a good and valid option when you don't actually know. It's much better to have pass be the default action than approve, and I'm sure it's no harder to design a honeypot for fly-by use of the pass button than for fly-by use of the approve button.

I do think failed honeypots should explain what the problems were.

Problem 2: Theft of a review in progress

  1. Reserve the review. One does NOT need a lock to do this. Merely update the record with a timestamp and don't give the review to another user for 5 minutes. No need to come back and update it, and if the review takes more than 5 minutes, you probably went to make coffee in the middle of it anyway. (A sketch of this follows the list.) OR
  2. Allow multiple reviews of the same content. Poof, the problem goes away :). Possibly raise the badge counts to compensate.
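
A sketch of option 1, assuming each queue item carries a nullable reservation timestamp (the structure is invented for illustration):

    # Hypothetical reservation: stamp an item when it's handed out and skip it
    # for 5 minutes. No lock, no cleanup job; stale stamps simply expire.
    from datetime import datetime, timedelta

    RESERVATION = timedelta(minutes=5)

    def next_review_item(queue, now=None):
        """queue: list of dicts, each with an optional 'reserved_at' datetime."""
        now = now or datetime.utcnow()
        for item in queue:
            reserved_at = item.get('reserved_at')
            if reserved_at is None or now - reserved_at > RESERVATION:
                item['reserved_at'] = now   # reserve it for this reviewer
                return item
        return None                         # queue empty or fully reserved
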
4

This sort of follows what Mac suggests, but I've often wondered if there is a filter on the review system that checks to see if someone's reviews are generally in line with what other users are doing. Obviously there will be differences between different reviewers, but I see the opportunity to implement some automatic flags (that at least alert mods to the behavior).

  1. Detect if votes (upvotes and close votes) on specific posts are in line with other reviewers on that review. If a reviewer is out of line with the community more than x% of the time (and has more than y total reviews in that category) it triggers a flag.
  2. If the speed at which a user goes through reviews exceeds some % above the median speed, it triggers a flag.
  3. If a user reliably uses the same mechanism to handle a review (i.e. almost always upvotes, almost always votes to close, etc) in a statistically suggestive way, it triggers a flag.

Users with review flags could be noted for moderators, and/or:

A review of reviews review category could be created, which puts suspect reviews in a pool for review, with similar incentives/badges as currently exist for other review pools (that's right, I just used "review" 6 times in one sentence).
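
For concreteness, a sketch of the three triggers (the x% and y thresholds are invented placeholders, and the data shapes are assumed):

    # Hypothetical flag triggers for suspect reviewers.
    from collections import Counter

    def review_flags(actions, community_outcomes, site_median_seconds,
                     min_reviews=50, disagree_pct=0.40, same_action_pct=0.90):
        """actions: list of (post_id, action, seconds_spent) for one reviewer;
        community_outcomes: dict post_id -> action most other reviewers took."""
        flags = []
        if len(actions) < min_reviews:
            return flags                    # the 'more than y total reviews' guard
        # 1. votes out of line with other reviewers more than x% of the time
        disagreed = sum(1 for post_id, action, _ in actions
                        if community_outcomes.get(post_id, action) != action)
        if disagreed / len(actions) > disagree_pct:
            flags.append('out_of_line_with_community')
        # 2. far faster than the site-wide median review speed
        mean_seconds = sum(secs for _, _, secs in actions) / len(actions)
        if mean_seconds < 0.25 * site_median_seconds:
            flags.append('reviews_too_fast')
        # 3. statistically suggestive over-use of a single mechanism
        top_action, count = Counter(a for _, a, _ in actions).most_common(1)[0]
        if count / len(actions) > same_action_pct:
            flags.append('always_' + top_action)
        return flags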

12
  • 8
    ...and a review of reviews of reviews!
    – pjmorse
    Commented Oct 12, 2012 at 0:41
  • 11
    The problem with this is that I see more poor reviewers than quality reviewers. The system will likely think I'm the black sheep because I actually read the question, downvoted it, and voted to close, when 75% of the other reviewers just upvoted it without reading and moved on.
    – Servy
    Commented Oct 12, 2012 at 17:05
  • @Servy - do we really think that 75% of reviewers are bad actors? Perhaps a more scientific look at how deep the problem runs is in order. Regardless, this really only applies to the first of my three flag criteria... if people are reviewing too fast or coming to the same conclusion too often then it triggers a flag.
    – Ben D
    Commented Oct 12, 2012 at 17:18
  • 3
    This is rather similar to Is anyone monitoring people rejecting good edits or approving bad ones?. @pjmorse, see my answer to that question.
    – Pops
    Commented Oct 12, 2012 at 17:20
  • 1
    @BenD I requested a more scientific look a few weeks back, with fair results.
    – Pops
    Commented Oct 12, 2012 at 17:21
  • 1
    @PopularDemand - Yes! meta.stackexchange.com/questions/140017/… has an excellent response. There are a variety of other queries I'd be interested in, but this is exactly what I'm after. In this particular instance, both queries seem pretty damning for Book Of Zeus who did manage to get his steward badge burning through reviews (approval every 1.8 seconds for 40 reviews in one day?!)
    – Ben D
    Commented Oct 12, 2012 at 17:33
    Note that some people in that top 32 list are known good users (e.g. Cody Gray and Bill the Lizard).
    – Pops
    Commented Oct 12, 2012 at 17:35
  • @PopularDemand - I noticed that, but I didn't find the first chart as interesting as the second for that very reason - the query didn't control for overall participation... sometimes 5 seconds is all you need, and high volume users will have a fair number of them. What it needs is another column with total reviews (or a % of total reviews separated by < 5 seconds). The second chart made it obvious that Book Of Zeus was just abusing the system.
    – Ben D
    Commented Oct 12, 2012 at 17:55
  • You could limit the query to users who have the Steward badge already.
    – pjmorse
    Commented Oct 12, 2012 at 18:40
  • @pjmorse - I think that most of the damage has already been done by that point... (you can only get one steward badge per category, so people who do it for the badge will have done 1000 reviews of damage by that point).
    – Ben D
    Commented Oct 12, 2012 at 19:20
  • @BenD If you're looking to stop damage, yes, you're right. My suggestion was looking more at the question of whether damage has actually been or is actually being done. If we want to see if people are reviewing by just upvoting a lot of posts in a short time, it seems to make sense to look at people who've done a lot of reviews first. That's where the data is.
    – pjmorse
    Commented Oct 12, 2012 at 20:40
  • Review of reviews is a good idea, but offering any badges at all is a bad plan. Review of reviews would be done by people who just want to improve the site. You could program some heuristics about which reviews are likely to be of poor quality based on track record, whether there were any opposing reviews at all, how long it was between load and review or between subsequent reviews.
    – AndrewC
    Commented Nov 27, 2012 at 11:07
4

Review Allotment

Another option with precedent on the SE system would be to have a system similar to the flagging system, where a user has only a limited number of "reviews" to spend, but may be entrusted with more for appropriate reviews.

  • Each time a user completes a review task that is approved by members of the community, such as a helpful edit, or flag, or a comment that is later upvoted, they are entrusted with a few more reviews. (carrot)

  • If an action is rejected by the community, their review allotment is decremented by several reviews. (stick)

  • Review tasks that are not approved simply decrement the review allotment available to the user by one. (stick-lite)

  • Perhaps have a slow regeneration of allotted reviews to some minimum to give users a second chance.

This system provides both the carrot and the stick, while still keeping the badges. It provides incentive for work, a penalty for bad work, recognizes that an up or down vote is OK in some instances, and makes certain that voting is not the only thing that the user is doing.
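
A sketch of the bookkeeping described above, loosely modeled on flag quotas (all the numbers are placeholders):

    # Hypothetical review-allotment ledger: carrot, stick, stick-lite, and a
    # slow regeneration toward a floor, as described above.
    class ReviewAllotment:
        FLOOR = 10    # regeneration never lifts you above this minimum
        CARROT = 3    # reward for community-approved work
        STICK = 5     # cost of a community-rejected action

        def __init__(self, remaining=20):
            self.remaining = remaining

        def spend_one(self):              # stick-lite: every review task costs one
            self.remaining = max(0, self.remaining - 1)

        def community_approved(self):     # e.g. helpful edit, upvoted comment
            self.remaining += self.CARROT

        def community_rejected(self):     # e.g. rejected edit, declined flag
            self.remaining = max(0, self.remaining - self.STICK)

        def regenerate_daily(self):       # the slow second chance
            if self.remaining < self.FLOOR:
                self.remaining += 1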

1
  • 6
    This is an interesting variation on the "review of reviews" idea proposed a few times elsewhere, but ultimately fails for the same reasons. See this comment thread.
    – Pops
    Commented Oct 18, 2012 at 15:50
3

You could simply increase the rep required to do reviews when the queue is small.
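
A sketch of what that could mean in practice (the cut-offs are invented):

    # Hypothetical sliding rep requirement: the shorter the queue, the more
    # rep is needed before reviews count.
    def rep_required(queue_length):
        if queue_length > 500:
            return 2000     # plenty to do: today's baseline applies
        if queue_length > 100:
            return 5000
        return 10000        # near-empty queue: leave it to trusted users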

2
  • 8
    No, this is not really a solution; it has happened to me more than once to try to improve a bad post only to find it already wrongly reviewed by a 10k user...
    – Adinia
    Commented Oct 15, 2012 at 14:08
  • 1
    This would help mitigate the problem without actually solving it. If there are fewer reviewers, then items won't be processed as quickly; if they aren't processed as quickly, the queues will be longer. With longer queues there will be fewer posts being concurrently reviewed by the same people.
    – Servy
    Commented Oct 17, 2012 at 15:54
3

Lots of good solutions here. It seems this is likely a complex problem that needs more than one solution - it happens sometimes. So I'll throw a few related ideas into the "pot" for consideration:

  • Throttle the number of edits and reviews a user can do in a given period of time. It's obvious that quality edits take an average of X minutes per Y posts; I'm sure admins and super-users can fill in the best numbers there. So (for example) when someone does 50 edits in an hour, you know there is no way those edits could all be good, no matter how fast or smart the editor is. Doing that many, that fast, should be prevented, subjected to scrutiny, and/or flagged for low quality, and badges gained could be revoked if super-users vote for such action. (A sketch combining this throttle with the lock below follows the list.) PS: don't quote me on the number per hour; super-users and admins know the math and have great formulas :)
  • Throttle number of "review"-type badges/period a user can get.
  • 'Lock' the question or answer while it's being reviewed; all other parts of the question/answer stay 'open'. This prevents simultaneous and overlapping reviews, edits, and up-votes on the part being reviewed. Since each question, answer, comment, and vote is handled as a separate object on the server, locking each part currently under review shouldn't be a herculean effort. This 'lock' also helps throttle "hurried" users who do the system an injustice by skimming through content. It's only part of the solution, but it's important for the overall health of the system.

If people want to do quality reviews, they'll never be impacted by these limitations. The ones that fly through, simply up-voting for badges, are diminishing the experience for all, so their incentive and ability to do so need to be reduced. These 3 methods, along with some others on this question, should help restore these badges' value and disenfranchise those who wish to accumulate points at the expense of quality.
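
A sketch combining the throttle and the lock (the 50-per-hour figure echoes the example above; everything else is invented):

    # Hypothetical throttle plus per-post review lock.
    from datetime import datetime, timedelta

    MAX_REVIEWS_PER_HOUR = 50       # example figure from above, not a tuned value
    LOCK_DURATION = timedelta(minutes=5)

    review_times = {}               # user -> recent review datetimes
    post_locks = {}                 # post_id -> (user, locked_at)

    def may_review(user, post_id, now=None):
        now = now or datetime.utcnow()
        # throttle: refuse implausibly high review rates
        recent = [t for t in review_times.get(user, [])
                  if now - t < timedelta(hours=1)]
        review_times[user] = recent
        if len(recent) >= MAX_REVIEWS_PER_HOUR:
            return False
        # lock: one reviewer per post at a time; stale locks expire
        holder = post_locks.get(post_id)
        if holder and holder[0] != user and now - holder[1] < LOCK_DURATION:
            return False
        post_locks[post_id] = (user, now)
        recent.append(now)
        return True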

4
  • 1
    I think one problem we actually have is that multiple people can access a review. Thus, multiple edits happen at the same time. It always annoys me when I'm writing a comment and I see 2-3 people upvote the post, especially when it's not a good one. It seems to me that people feel "hurried" because they know someone else could review it before them, and that doesn't encourage taking your time, IMO. Also, I think that reviewing 50 edits in an hour could be fine; it depends on how deep the posts are. However, 20 edits in 3 minutes is certainly not enough.
    – ForceMagic
    Commented Nov 15, 2012 at 18:52
  • @ForceMagic - Good feedback, sounds good. I'll edit my post to include those parameters if ok w/you. Some problems are harder than others and need engineering. This definitely qualifies, in my book.
    – Lizz
    Commented Nov 15, 2012 at 19:28
  • @ForceMagic a possible way to remedy this issue could be to introduce "exclusive review period"
    – gnat
    Commented Nov 16, 2012 at 11:24
  • 1
    @gnat Indeed, I agreed with you on that post, I already upvoted you a while ago :)
    – ForceMagic
    Commented Nov 21, 2012 at 20:12
2

A solution might be to only award reviews if a person's review is consistent with other reviews, or the user is generally consistently accurate in their reviews.

For example, if I review a suggested edit, perhaps a review should only be awarded when I choose "approve" and the final outcome of the review process is "approve" (after all votes are cast), or vice versa (I select reject and it is ultimately rejected). I'll get penalised when I make the wrong choice (by not getting a review), but this will encourage people to make calculated, intelligent choices rather than blindly selecting an option. This, ultimately, can only improve the system, by ensuring that reviews are only incentivised when the reviewer is contributing to a consensus.

This can also be used to get an idea of my review accuracy: if I consistently choose to approve/reject suggested edits that are ultimately approved/rejected, then I'm a pretty accurate judge of what is and isn't a good edit. If the ratio of correctly judged reviews to incorrect ones is high, perhaps only then should reviews be awarded.
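
A sketch of that accuracy bookkeeping (the award cut-off below is a placeholder):

    # Hypothetical accuracy tracker: award review credit only while the
    # reviewer's choices keep agreeing with the eventual outcome.
    ACCURACY_THRESHOLD = 0.7   # placeholder cut-off

    class ReviewerRecord:
        def __init__(self):
            self.agreed = 0
            self.disagreed = 0

        def record(self, my_choice, final_outcome):
            if my_choice == final_outcome:
                self.agreed += 1       # e.g. I approved, edit was approved
            else:
                self.disagreed += 1    # wrong call: no review awarded

        def earns_credit(self):
            total = self.agreed + self.disagreed
            if total == 0:
                return True            # benefit of the doubt for new reviewers
            return self.agreed / total >= ACCURACY_THRESHOLD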

The particular circumstance described by the OP is probably a little harder to fix this way, but I'm sure it can be done.

10
  • 10
    The problem with this is that I see more poor reviewers than quality reviewers. The system will likely think I'm the black sheep because I actually read the question, downvoted it, and voted to close, when 75% of the other reviewers just upvoted it without reading and moved on. On suggested edits I'm frequently the only reject vote (because it's a bad suggestion) when there are two approve votes (because so many people just approve everything).
    – Servy
    Commented Oct 12, 2012 at 17:08
  • @Servy: fair enough, I can't disagree that many reviewers don't put any thought into their reviews. My point is that if a reviewer is less likely to be awarded with a review for making a random decision, they will be less inclined to do so. In the end, losing a review or two is not a big deal for someone who is trying to do the right thing, but for someone who's not putting thought into it and just trying to get as many reviews as possible, it might just be enough to get them to slow down and put in a bit of consideration.
    – Mac
    Commented Oct 15, 2012 at 22:13
  • 3
    The problem is that the "bad" reviewers don't make random calls; they just approve everything, and they all do this, so they all agree with each other. I almost never see a post rejected that I wouldn't reject; I frequently see things approved that shouldn't have been. This means that anyone who ever rejects anything needs to fear being blocked by the system for "anomalous reviewing" and is in fact encouraged to just approve everything to avoid being blocked. That's a net harmful effect.
    – Servy
    Commented Oct 16, 2012 at 13:47
  • @Servy: yep, true. Point conceded.
    – Mac
    Commented Oct 16, 2012 at 22:10
  • @Servy I wonder if the don't-look-just-click reviewers would start deciding in a more randomish fashion if accepting needed two clicks too, like rejecting. Commented Oct 17, 2012 at 14:08
  • @DanielFischer I doubt it. It's not so much the number of clicks, it's that they just don't want to think (because that takes time, which means less reviews per second). You are expected to justify rejecting an answer, but not for approving one.
    – Servy
    Commented Oct 17, 2012 at 14:12
  • @Servy That's what I meant. If they had to give a reason for accepting too, that would level the playing field there. So then they wouldn't be "forced" to accept because it's the faster and simpler way. Commented Oct 17, 2012 at 14:16
    @DanielFischer My worry there is that people will just pick one of the accept reasons, likely whichever one is most vague or widely applicable, and use that every time. It might help a bit, and it may accomplish something for those who honestly don't realize that certain posts should be rejected, not approved, but for those who just don't care what should be accepted you'll only slow them, not stop them.
    – Servy
    Commented Oct 17, 2012 at 14:17
  • 1
    @Servy Yes, that wouldn't be a solution in any way, I'm just wondering whether levelling the playing field would reduce the skew. Of course the bad reviewers would just pick a generic reason whether they'd approve or reject. I'm afraid (am I really?) the solution would be to scrap the bloody badges, so only people actually reviewing would review. Of course, that means the queues would be full again, but ... I've reduced my reviewing activities a lot since so much crap gets approved/upvoted before I finished reading the second sentence. Commented Oct 17, 2012 at 14:25
  • 3
    @DanielFischer Yeah, I used to spend a lot of time reviewing, but I've all but stopped entirely because my actions are continually overridden by people not actually trying to review.
    – Servy
    Commented Oct 17, 2012 at 14:37
2

We have two problems:

  1. People are getting free review 'points' for their shiny badge 'chievos.
  2. Real reviews are not being counted, and needed fixes are not being prompted (had you not been partway through a review, it may have gone unnoticed)

So I would propose that a subset of the review options (upvote, 'not sure', 'looks good', etc.) pass the baton on to someone else, with the post staying in the queue until it has had several reviews.

So here's the way I see it.

You start to edit, and you are flagged as 'in review' by the system. Person two swoops in and upvotes it, then has to press either a 'looks good' or a 'not sure' button to register their action, which is added to a 'decisions queue'. Partway through, you get an AJAX'd update saying '<insert UserName here> upvoted and said "Looks Good/Not Sure"', in the same way you are updated when a new answer or a new edit on a post comes in. When you finalise your edit, you can click 'not sure' (maybe you doubt your edit is enough) or 'looks good'. Perhaps even add a 'reason' field, such as "The last review did not fix the problem", that extends the number of necessary reviews?

We then pile up about 5 decisions on the 'decision queue', for instance: Upvote - LG, Edit - LG, No Action- NS, Upvote - LG, Upvote - LG. Where LG = looks good, and NS = not sure.

The system then looks at who said what. I don't know how best to do this, but I'd reckon that the first upvoter shouldn't get any points, as an edit was needed. The editor should get points related to the 'views' of the other three reviewers: there are more Looks Good votes than Not Sure votes (within our decision queue), so the editor gets points. As do the people who voted NS and LG, because they reviewed it to pass it on to the next reviewer.

Had it been: Upvote - LG, Edit - LG, No action - NS, Upvote - LG, No action - NS.

The first upvoter still gets nothing, the editor gets no or fewer points (as the majority of the other reviewers don't think the edit improved the post), and the other three get some points; the decision queue gets wiped and the post is back up for review.

In this system you wouldn't get points for each review, but you might get 0-3 points dependent on how well received your review is.

Or we have: Upvote - LG, Edit - LG, Edit - LG, No Action - NS, Upvote - LG.

Here the system requires a deciding 'vote' (were the two edits good or not sure?), and because that might incentivise speed reviewing, we hide the number of non-edit actions. Then a NS/LG either commits the review, or it pushes it back into the main list of reviews. We can then give the first editor fewer points, and rely on the last three actions to show us if the second vote was good.

Conclusion:

  1. People who speed review no longer get points if an edit is needed or a majority claim 'not sure' after them
  2. Edits are peer-reviewed, and points are then distributed among the review contributors according to effort (two points to the editor, 1 each to the last three reviewers?)
  3. We might catch people who 'fake' edit a post, by the subsequent voting.

Obviously we end up inflating the number of points, so we make the badge goal "earn X points" rather than "review Y posts", and make X > Y in some sensible fashion.

I hope that makes sense and is helpful.

tl;dr: Make each review 5 reviews and only commit when a consensus is met. Give points out according to contribution to consensus making.
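
A loose sketch of the commit-and-score step, simplifying the rules above (the point values are the ones guessed at in the conclusion):

    # Hypothetical decision queue: collect five (user, action, verdict) entries,
    # commit on a Looks Good majority, and split points by contribution.
    def settle(decisions):
        """decisions: list of (user, action, verdict) tuples, oldest first,
        with verdict 'LG' or 'NS' and action 'upvote', 'edit', or 'none'."""
        looks_good = sum(1 for _, _, v in decisions if v == 'LG')
        committed = looks_good > len(decisions) / 2
        edit_needed = any(action == 'edit' for _, action, _ in decisions)
        points = {}
        for i, (user, action, verdict) in enumerate(decisions):
            if i == 0 and action == 'upvote' and edit_needed:
                points[user] = 0                      # speed-reviewed past a needed edit
            elif action == 'edit':
                points[user] = 2 if committed else 1  # editor paid by reception
            else:
                points[user] = 1                      # passed the baton with a verdict
        return committed, points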

5
  • 6
    This sort of "review the reviewers" system has been proposed before in similar contexts, but the problem is that the second-level reviewers are just as apt to be bad as the first-level reviewers. When the overall number of bad reviewers is low, that's fine, but in our current situation, I'm not convinced that this would improve anything.
    – Pops
    Commented Oct 17, 2012 at 16:31
  • 2
    @PopularDemand but that then means we aren't trusting anyone to review properly doesn't it? Commented Oct 17, 2012 at 17:32
  • 4
    It means we don't trust enough people to review properly. Adding more levels of review doesn't help if the quality of the reviewers is constant. You're just shifting the issue from the probability that a bad reviewer will show up on the first level to the probability that multiple bad reviewers will show up on the second level. Now, if you could change the quality of the reviews on the second level, we'd be in business, but then why not just change the quality of reviews on the first level?
    – Pops
    Commented Oct 17, 2012 at 17:42
  • I see your point. But we need to somehow check that first review....how to do that without further reviewing? Commented Oct 17, 2012 at 22:00
  • 2
    This is going to sound kinda dumb, but we wouldn't have to check the first reviews if they weren't bad in the first place. By that, I mean that we need to teach people how to review properly and/or remove the review ability from people who know how to review and are doing it wrong anyways.
    – Pops
    Commented Oct 17, 2012 at 22:18
2

As a side/first thought, it seems to me that the 'serial lightning reviewers' are the problem reviewers. They go through the queue, up-vote every post, and then light a cigarette.

I think that when reviewing a post and downvoting it, the reviewer's reputation should not be hit with -1; that way the serial reviewers can at least rate bad posts as bad with impunity.

Update: apparently there is no penalty for down-voting a reviewed question; this should be much more obvious in the UI.

Update 2: There is a penalty for downvoting a reviewed answer though! There should be none when reviewing first answers! Most first answers are not good. I just lost 2 points ;)

Update 3: This is awesome: C# Test if user has write access to a folder. The late answer copies the solution with a few comments. Pretty pointless, but of course the reviewer +1's it, because he only checks the question and the late answer, not the existing answers.

5
    Even if they're willing to downvote bad posts, we also need them voting to close, flagging bad posts, making comments to indicate problematic or desirable behavior, or even editing posts with fixable problems. While this may not make things worse (until we start seeing people do nothing but downvote without reading and then move on), it doesn't help all that much either.
    – Servy
    Commented Oct 17, 2012 at 20:37
  • 1
    Hold on, I see posts that are patently bad (wrong site) being upvoted (!); compare this to getting the post downvoted. Isn't that a major improvement? (We could decide that you can only downvote with impunity if you leave a comment.) Other than that, please do not enforce editing; some questions are fine, and I feel that there is already too much busywork editing being rewarded.
    – tomdemuyt
    Commented Oct 17, 2012 at 22:15
  • 3
    Downvotes on questions are already "free," so I would think that if this worked, it would already be evident. These people want to put in as little effort as possible, and deciding between upvoting and downvoting takes non-zero effort.
    – Pops
    Commented Oct 18, 2012 at 15:52
  • 2
    It looks like people are doing the "cheapest thing" necessary to get the answer off their review queue. At the moment that's an up-vote. There's no cost to the giver of the vote, they don't have to spend longer than the time it takes to click the up arrow. Any other action takes more time. They're not going to down-vote because that costs them 1 reputation point. If down-votes were free then there'd be a choice of "cheapest" actions. Some would up-vote, others would down-vote. On average the score of the answer would remain at zero.
    – ChrisF Mod
    Commented Oct 18, 2012 at 15:57
  • 1
    "apparently there is no penalty for down-voting a reviewed question, this should be much more obvious in the UI." There's no penalty for the downvoter for downvoting any question from anywhere, regardless of whether or not you're in a review queue. It is only downvotes to answers that are penalized.
    – Servy
    Commented Oct 19, 2012 at 18:00
1

I would solve the problem with a penalty for bad reviews. A review turns out to be bad in any of the following cases:

  • upvote review: the question is closed
  • upvote review: the question has a negative vote total after a week [1]
  • downvote review: the question has a positive vote total after a week [1]
  • comment review: the comment was flagged and removed
  • edit review: the edit was rejected

edit: [1] votes consistent with the reviewer's could count double toward the total, to help in cases of controversial questions with lots of votes in both directions

Someone else will know better what kind of penalty fits best and whether it should be applied always or only on frequent review errors. I guess it would also improve question quality, which I find very poor.
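
The list above translates almost directly into a check; a sketch, with the outcome fields assumed:

    # Hypothetical classifier implementing the bullet list above.
    def review_turned_out_bad(review):
        """review: dict with a 'kind' plus the outcome fields checked below."""
        kind = review['kind']
        if kind == 'upvote':
            return (review.get('question_closed', False)
                    or review.get('vote_total_after_week', 0) < 0)
        if kind == 'downvote':
            return review.get('vote_total_after_week', 0) > 0
        if kind == 'comment':
            return review.get('comment_flagged_and_removed', False)
        if kind == 'edit':
            return review.get('edit_rejected', False)
        return False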

4
  • 9
    -1, only because positive vote totals don't mean a question or answer is good...only that it is popular. The top 50 questions at SO are largely trash.
    – user7116
    Commented Oct 15, 2012 at 17:18
    So you could probably show me one question from the top that you would downvote during review. And what you write (that a positive vote means popularity, not quality) goes against the basic rules of Stack Overflow, the rules of upvoting.
    – jarekczek
    Commented Oct 16, 2012 at 9:21
  • 1
    If you think there are consistent rules to voting that are followed, you'll be sorely disappointed. As for trash at the top, I looked into it last year. I downvote a lot of positive vote total questions (and a lot of negative vote total questions), this is perfectly legitimate regardless of what others have done.
    – user7116
    Commented Oct 16, 2012 at 13:22
    @sixlettervariables: If we assume that vote totals do not reflect question and answer quality, then the whole reputation concept is trash. So we must not make such an assumption; it would be like losing the sense of life. But we may differ here. Anyway, see my edit, which addresses the real issue IMO.
    – jarekczek
    Commented Oct 16, 2012 at 13:46
-3

What if you allow multiple "provisional" reviews of a question, and then the questioner picks the winning review within a certain time window after the question is asked? The time window should be short; an hour might be enough. If the questioner doesn't act within the given time window, fall back to the first-review-wins system. Since no questioner is going to select a downvote review as the winner, they might only be allowed to select the winner if the majority of reviews are upvotes; otherwise the first downvote reviewer wins credit when the review window expires.
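
A sketch of how that selection and fallback might resolve (the window length and data shapes are invented):

    # Hypothetical winner selection: the asker may pick within the window and
    # only when upvotes hold the majority; otherwise first (down)review wins.
    from datetime import datetime, timedelta

    WINDOW = timedelta(hours=1)

    def winning_review(reviews, asker_choice, asked_at, now=None):
        """reviews: list of dicts with 'reviewer', 'action', 'at' (datetime)."""
        now = now or datetime.utcnow()
        if not reviews:
            return None
        upvotes = [r for r in reviews if r['action'] == 'upvote']
        majority_upvotes = len(upvotes) > len(reviews) / 2
        if (asker_choice is not None and majority_upvotes
                and now - asked_at <= WINDOW):
            return asker_choice                        # the asker picked a winner
        if not majority_upvotes:
            downvotes = [r for r in reviews if r['action'] == 'downvote']
            if downvotes:                              # first downvoter gets credit
                return min(downvotes, key=lambda r: r['at'])
        return min(reviews, key=lambda r: r['at'])     # first-review-wins fallback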

I'm still trying to puzzle out where the incentives would point if this were the situation. A few thoughts (not all on the positive side)...

  • Each individual review is less valuable to the reviewer's reputation, since there's a good chance it won't be chosen as the winner. This might discourage gaming the system.
  • Reviews that substantially improve the question might be favored over simple "vote and run" reviews, since the questioner has an interest in having a quality question, however...
  • Pride might keep questioners from selecting quality reviews that alter the presentation of the question, instead preferring the affirmation of another reviewer who simply says "yeah, looks good"

Also, making the interface for this process grokable to a novice questioner would probably be a significant challenge.

2
    The entire point of most of the reviews is that the user is new and therefore doesn't have a strong understanding of how the site works: they don't know exactly how upvotes/downvotes are used, what should or shouldn't be closed, what should or shouldn't be posted in a question, etc. The OP is the least qualified to know whether the review was good or bad.
    – Servy
    Commented Oct 18, 2012 at 15:58
    +1. While I agree with Servy that this does not solve the problem at hand (and especially not if "review" includes upvotes alongside edits), I think that the OP is naturally motivated to care about the well-being of their post, and they can typically tell an honest edit from a pretended one. So I would like to see a system that can incorporate the OP's feedback on edit quality as well, even when the OP would not otherwise have the privilege level needed to provide any.
-8

My take on this is that someone should receive less bonus for consecutive upvotes, and also occasionally see popups like "Hey, you have upvoted 43 times in a row; you might consider picking out only the good answers".

It is a complex matter which should be analysed from a systematic point of view.
