36

There are many proposals for changing the rep incentives associated with closing vs answering duplicates. But they either (1) can be gamed, or (2) penalize users who put work into answers on interesting niche questions, for which the duplicate was hard to find.

I have two suggestions, which I'll post separately to allow separate voting. The other is:

First, when asking a question, there is a list of supposedly-similar questions shown after you type a title. After posting a question, links to related questions are provided in the right-hand bar. The quality and relevance of the latter are far better than those of the former list, probably because it considers both the title and the actual question content. Let's kill the former list completely.

Now I see a similar questions list as I write this suggestion, and it does seem to be updating as I add explanation. That's good. But it's not part of the main workflow like the first list; instead it's shuffled to the side, with colors that make it easy to ignore. In fact, as an experienced member of the site, I assumed¹ it was the Markdown syntax help until I took a closer look. Users who may be new here but familiar with wikis and other Q&A sites are probably thinking the same thing. And there's a reverse incentive to go through that list carefully, since it interrupts describing the problem and also increases the chances of losing my work. And it's still easily ignorable.

So here's what I propose:

  • Kill the list found from the title only. If the user hasn't bothered putting their title into either google or the stackoverflow.com search field, I don't care about saving them a little time on their writeup.

  • Take only the first few items from the content-driven similar questions search, and give them the space vacated by the other list and the UI that shows whether they've already been answered/accepted. For the 2 which score the highest for relevance, actually show an excerpt, as well as the name, rep, and matching tag badges of the author of the accepted answer.

  • Make dealing with the similar questions list a mandatory part of the question asking process. The current button changes from "Post Question" to "Submit". At this time the question is saved and becomes accessible from the user's profile. But it is not yet visible to the community via front page, RSS feeds, tags, searches, etc. The asker is then taken through each of the top relevant questions and can either type a short explanation of how their problem is different, or accept a duplicate. When accepting a duplicate, they are given the opportunity to add some search terms to the existing question. Also, the question they've laboriously typed in starts to influence the search system to help find the existing question, even though it's not shown as a redirect to the general public.

  • Viewers of the question then can see the one-line "why doesn't this apply" explanation for the linked questions when they appear in the auto-generated list, or in the close-as-duplicate interface.
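
To make the proposed flow concrete, here is a minimal sketch of it as a small state machine. Everything in it (the `Draft` class, the state names, the method names) is hypothetical and only illustrates the steps in the bullets above; it is not how the site is actually implemented.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Optional


class State(Enum):
    SAVED_HIDDEN = auto()           # saved under the asker's profile, not public
    PUBLISHED = auto()              # visible via front page, tags, searches
    SELF_CLOSED_DUPLICATE = auto()  # asker accepted an existing question


@dataclass
class Draft:
    """Hypothetical model of a question going through the proposed review."""
    title: str
    body: str
    candidates: List[str]  # ids of the top related questions (assumed input)
    state: State = State.SAVED_HIDDEN
    differences: Dict[str, str] = field(default_factory=dict)
    duplicate_of: Optional[str] = None

    def explain_difference(self, candidate_id: str, reason: str) -> None:
        # Record the one-line "why doesn't this apply" explanation.
        if candidate_id not in self.candidates:
            raise KeyError(candidate_id)
        if not reason.strip():
            raise ValueError("an explanation is required")
        self.differences[candidate_id] = reason.strip()

    def accept_duplicate(self, candidate_id: str) -> None:
        # The asker agrees the existing question covers theirs.
        if candidate_id not in self.candidates:
            raise KeyError(candidate_id)
        self.duplicate_of = candidate_id
        self.state = State.SELF_CLOSED_DUPLICATE

    def publish(self) -> None:
        # Only allowed once every candidate has an explanation attached.
        if self.state is not State.SAVED_HIDDEN:
            raise RuntimeError("draft is not awaiting review")
        missing = [c for c in self.candidates if c not in self.differences]
        if missing:
            raise RuntimeError(f"unreviewed candidates: {missing}")
        self.state = State.PUBLISHED
```

The point of the sketch is the ordering: the draft is stored first, so nothing is lost, and it only becomes public after each candidate has been either accepted or explained away.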

TL;DR version: Make looking at suggested duplicates/related questions a mandatory part of asking a question.


¹ Actually, I went back and started typing a satire of the usual "why are my question-asking privileges revoked"-type blather that's so common, in order to re-observe the actual user experience. Lo and behold, it WAS the Markdown syntax help before it was the related questions list. In fact, it went through "Types of Questions to Ask" (when adding a title), "How to Tag" (when putting support in the Tags list), and "How to Format Markdown" (typing the first few lines of the question). All in the same color scheme. I noticed this because I was really looking for the related questions list that I promised shows up. Well, it did eventually show up, after typing about four paragraphs of pseudo-question. Still in the same color scheme. It's no wonder people are tuned out.

Yes the Markdown list is there, at first

There needs to be ONE LIST. It needs to use whichever of (TITLE, TAGS, QUESTION BODY) have been entered so far.
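
A rough sketch of what "one list" could mean in practice: build a single query from whichever fields are non-empty and rank candidates against it. The `related` function, the `corpus` mapping, and the Jaccard token overlap used for scoring are all stand-ins chosen for illustration; the site's real relevance algorithm is not described here and is surely more sophisticated.

```python
import re
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    # Lowercased word-ish tokens; '+' and '#' kept so "c++" and "c#" survive.
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def related(title: str, tags: str, body: str,
            corpus: Dict[str, str], limit: int = 5) -> List[Tuple[str, float]]:
    """Rank existing questions using whichever fields have been entered so far.

    corpus maps a hypothetical question id to its text.
    """
    query: Set[str] = set()
    for part in (title, tags, body):
        if part.strip():
            query |= tokens(part)
    if not query:
        return []  # nothing typed yet, nothing to suggest

    scored = []
    for qid, text in corpus.items():
        doc = tokens(text)
        union = query | doc
        score = len(query & doc) / len(union) if union else 0.0  # Jaccard
        if score > 0:
            scored.append((qid, score))
    # Highest score first; ties broken by id so the order is stable.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:limit]
```

The same function serves the title-only stage and the full-question stage, so the list can refine itself in place as the asker types instead of being replaced by a different list somewhere else on the page.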

similar questions finally show up

13
  • 1
    +1, I've suggested something similar as well. meta.stackoverflow.com/questions/253305/… Commented May 2, 2014 at 19:49
  • @Carrie: Yes, but mine is different because you require the viewers to merely view the related questions, I require them to summarize what the difference is. Showing that explanation to 3k+ users should minimize false-duplicate closures. Also, yours is kinda buried in a flood of related suggestions, I chose to use a pair of mutually-linked meta questions for that.
    – Ben Voigt
    Commented May 2, 2014 at 19:52
  • 2
    ^^And I should have had to enter that summary of differences before my question became visible, under my proposal.
    – Ben Voigt
    Commented May 2, 2014 at 19:53
  • Just stating the similarity ;] Commented May 2, 2014 at 19:55
  • 1
    @CarrieKendall: Oh also, I really don't like your idea to have a review queue for these. That's the single thing that turns your suggestion into the creation of a human-assisted search engine. I tried as much as possible to make the workflow what I, as a near-guru-level C++ developer, would actually find useful when posting a question. Your suggestion showed up in the lists... but the UI basically forced me into evaluating it based on the title alone. And the question-linked-in-comments is about the worst user experience possible, it doesn't even show the title in the comment.
    – Ben Voigt
    Commented May 2, 2014 at 20:00
  • 7
    You seem argumentative and your ideas will likely suffer from that :) either way, I think your tl;dr is a good idea. Commented May 2, 2014 at 20:02
  • I've found that the "related questions" shown when posting a new question are pretty much useless unless you add a language keyword to the title. If SE did try something like this, tags should be taken into account when trying to find potential duplicates.
    – Rachel
    Commented May 3, 2014 at 0:30
  • 4
    "..can either type an short explanation of how their problem is different..." And those explanations will run from "Not exactly my question" to "asdfkjh asdkjfha lsfa".
    – A. Webb
    Commented May 3, 2014 at 0:46
  • This seems like a horribly inconvenient series of hoops to jump through. Especially since it's ultimately to force a human to explain something to other humans just because a machine didn't understand it. Commented May 3, 2014 at 1:13
  • It's information that is going to have to be provided in response to comments anyway, unless the question is extremely well crafted. And if it's a question that 1 hour or more was put into writing, the extra 5 minutes for differentiation is almost negligible. If the differentiation is gobbledygook, it gets flagged as garbage just like putting that in the actual question.
    – Ben Voigt
    Commented May 3, 2014 at 2:34
  • Is this SO specific? Might be a meta.se question.. Commented May 3, 2014 at 4:36
  • My solution for helping askers find duplicates: leave a link to the duplicate by either a comment or a close vote. The best algorithm there is for finding accurate duplicates is to let another willing human being review the question and point the asker in the right direction. But you know what? It's faster for me to answer than it is to look up the duplicate question/answer I'm thinking of. Therefore, you end up with more answers and fewer duplicate links.
    – Rachel
    Commented May 3, 2014 at 4:43
  • See also meta.stackexchange.com/questions/213370/…
    – Raedwald
    Commented May 3, 2014 at 12:51

3 Answers

17

I like the general idea, although your idea of requiring someone to write out a pairwise-explanation of why each match isn't a duplicate will probably not fly.

However, having the list of likely matches very visible is important, and I think requiring the asker to explicitly dismiss that list could help.

Something like:

These questions appear to be very similar to yours:

  1. Thwart publishing duplicate and low quality questions
  2. Give more attention to user-identified duplicates
  3. Marking question as duplicate of duplicate
  4. Handling the answers of duplicates

Before posting your question, please review these questions and

  • Confirm that your question isn't already answered
  • If those questions do not answer your question, make sure your question clearly explains how it is different

[ ] I have reviewed the similar questions list and edited my question accordingly

    [Submit]

(That last bit being a checkbox that must be clicked before the submit button becomes active.)
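
As a tiny sketch of that gating rule (the `AskForm` class and its field names are made up for illustration, not taken from the real ask page):

```python
from dataclasses import dataclass


@dataclass
class AskForm:
    """Hypothetical state of the ask-a-question form."""
    title: str = ""
    body: str = ""
    reviewed_checked: bool = False  # the "I have reviewed..." checkbox

    def submit_enabled(self) -> bool:
        # The button stays disabled until there is a question to post
        # and the asker has ticked the confirmation checkbox.
        has_content = bool(self.title.strip() and self.body.strip())
        return has_content and self.reviewed_checked
```

Nothing here checks that the asker really read the list; it only makes dismissing the list an explicit act, which is the whole (modest) goal.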

This won't stop people who just can't see the connection between their problem and a similar one that uses slightly different wording. Neither would a more complicated system, however (as A. Webb says in the comments, someone could just write a meaningless explanation for why their question is unique).

The real difference is that a simple reminder + checkbox to confirm you've read it doesn't slow down or inconvenience the people who have done their research, searching through dozens of questions, before posting their own.

You don't want to stop the good questions, after all...

3
  • This is good, as long as it is not an extra step between writing the question and saving it on the server under my profile. If I've spent time writing up a question, I do not want to lose that. The draft feature is unreliable, is unknown to many users, doesn't follow me between computers, and only allows one draft. First, save the question body. Then, invite the asker to review duplicates identified using the body. Finally, make the new question public, if it was not a duplicate.
    – Ben Voigt
    Commented Sep 1, 2014 at 22:26
  • It's not a good idea to just produce a "click-through" prompt. You should maybe force the user to have to wait 30 seconds before clicking through.
    – gparyani
    Commented Sep 12, 2014 at 21:22
  • 2
    @damryfbfnetsi That needlessly hurts good users who have already spent time searching before composing the question.
    – Jason C
    Commented Nov 6, 2014 at 8:23

9

Be careful - that algorithm will be flawed in the beginning, and making it mandatory will make the pains more apparent.

From what I've seen of duplicate questions, they typically fall into two categories:

  • Questions that are solvable by an answer in a duplicate question, and
  • Questions that are related to a duplicate question, but do not have an answer that satisfies.

The latter type is the one that burns both duplicate-close-voters and askers alike - they see, "Oh hey, this question is already answered, let me see what it says," only to find that their question, while similar, isn't completely answered by the answers provided in the duplicate.

I personally don't feel that it should be mandatory for that reason; if the question isn't answerable in the same fashion as the duplicate question, then we're not doing the asker any favors by closing it as a duplicate. We may even be actively hurting them with that.

Here's a for-instance: out of all of the questions that showed up as a potential duplicate of this one, how many of them had an answer that you felt was satisfactory? What if the system were to deny you the privilege of posting this question?

4
  • The more different the questions are the easier it is to name one thing that makes this question different. That's all I'm expecting the question asker to do.
    – Ben Voigt
    Commented May 2, 2014 at 20:28
  • Also, if the question asker does the work to evaluate the potential duplicate, which we can make easier by not requiring them to leave the ask-a-question page, they should be able to save the close-voters from repeating that effort, or at least give them a summary of the difference that they can quickly confirm or not. Should prevent the "burning".
    – Ben Voigt
    Commented May 2, 2014 at 20:59
  • 2
    If the asker has done their homework, then they have seen the almost-duplicate questions and should explain prominently in their question why the "almost-" qualifier applies. I don't feel that this is unreasonably burdensome, and I definitely don't lose any sleep over my nontrivial false positive rate when marking duplicates. Commented May 3, 2014 at 5:02
  • 2
    If the algorithm is flawed in the beginning, it should be activated in MSO before being released in the wild. Use for MSO questions would give a degree of real-world testing as well as allow feedback on user experience with it. Commented May 3, 2014 at 14:05

0

Kill the list found from the title only.

That list has helped me plenty of times.

If the user hasn't bothered putting their title into either google or the stackoverflow.com search field, I don't care about saving them a little time on their writeup.

I did. I don't get proper results back. This was noticed on MSE a long time ago.


Current situation:

  • Search google -> no result
  • Search SO -> no result
  • Write question -> see duplicate in title -> find answer

I have my answer and I didn't burden the community.

Proposed situation:

  • Search google -> no result
  • Search SO -> no result
  • Write question -> no result -> post question

I don't have my answer and I post my question. The duplicate pops up in the related bar, I get downvoted, my question gets closed, and whilst I have my answer, I don't really feel great. Particularly the part where people will ask you things like "did you even search for x y z" (yes, I did) when the duplicates show up in the RELATED section.

I mean, this will just create strife and increase the time it takes for me to get an answer that already exists. Why did you think this was a good idea?

2
  • I don't think you understood the proposal. It's not to take the list away without replacement, it's to make that list from the much better algorithm used for finding related questions. The one building the list that you never even noticed (yes, it DOES appear on the question writing screen), because it's off to the side and in a color that makes it look like a sidebar advertisement. My proposal is to bring that, more accurate, list front and center.
    – Ben Voigt
    Commented Oct 30, 2014 at 22:38
  • Actually, reviewing my year-old proposal (sometimes I forget which suggestions I made in which meta question) and adding some illustrations, you clearly didn't read the whole thing. There would be no downvoting, there would be no comments "didn't you search". Because you are now reviewing the suggested/related questions after saving your question (or half-a-question) on the server, protecting you against losing your work, but before anyone else can see it. If the related list, which you DO see while writing the question, has your answer, no one but you knows you almost asked a dupe.
    – Ben Voigt
    Commented Oct 30, 2014 at 23:01
