321

Related to Meta Super User efforts:

Old unanswered inactive questions with low views/votes

and Meta Server Fault efforts:

Cleaning house, really old, unloved questions

We already remove negatively voted, unanswered old questions automatically after 30 days, network-wide, with no human intervention required.

I was thinking we might extend that to remove old, unanswered zero score questions... based on the following criteria:

  • fewer than (question age in days * 1.5) views
  • 0 score or lower, or 1 score or lower if the original author is deleted
  • no answers
  • 1 comment or fewer
  • asked more than 1 year ago (thus based on creation date, not last activity date, so Community poking a question, or a user editing it, does not give a doomed question 365 more days of zombie "life")

See the results of this query on:

Stack Overflow (10509) | Server Fault (787) | Super User (650)

This query has been refined based on the comments, and this is the final version that will roll out tonight. Every site in the network (except metas) will automatically delete old questions that meet these criteria from this point forward.

23
  • 85
    I can't help but imagine that those are the criteria for a proposed "Uber-Tumbleweed" badge. Commented Feb 5, 2011 at 5:32
  • 11
    Why does the view count matter if the score is <= 0, there are no answers, and it hasn't been touched in a year? If it meets the last three, it's a pretty dead question...
    – Andrew
    Commented Feb 5, 2011 at 5:33
  • 13
    @andrew viewcount = measure of the internet public's interest in the question, regardless of whether or not it is answered Commented Feb 5, 2011 at 5:41
  • 14
    @Jason: "Uber-Tumbleweed" isn't strong enough. More like "Pariah". ;) Commented Feb 5, 2011 at 5:42
  • 1
    The majority of 'thank you' related flags have been coming from zero score 'answers' that are several months old. This should really help cut down on that.
    – user50049
    Commented Feb 5, 2011 at 8:38
  • 8
    @Jeff since I manually deleted about 150 questions, I can tell that the number of views might be kind of low. Plus it gets inflated by everyone here checking them out to 'judge' them! No activity over a long period of time means the question's dead, regardless of the views
    – Ivo Flipse
    Commented Feb 5, 2011 at 22:23
  • 3
    Hopefully merged and closed questions don't get included in the 'unanswered' count. Commented Feb 5, 2011 at 23:13
  • 1
    Is it bad that I went through that entire list on an edit spree?
    – John
    Commented Feb 5, 2011 at 23:50
  • 1
    @John - I think I approved some of those - they seemed good edits to me, but ... they are likely to be in vain. I've posted a separate question about edit rep for deleted questions: meta.stackexchange.com/questions/78147 Commented Feb 6, 2011 at 0:24
  • 4
    Now that I've gotten my first Tumbleweed badge, I'd like to think that I wouldn't have to keep asking my unanswered questions every year. Maybe you could avoid false positives by only looking at questions from low rep (<1k) or inactive (no activity for over 6 months) users.
    – Gabe
    Commented Apr 9, 2011 at 8:32
  • 4
    @Arjan soft delete for sure, we just undeleted a question that was deleted by the automatic process. :) Commented Jun 10, 2012 at 11:11
  • 2
    @Shog9 I'm wondering, purely as a curiosity, how many questions were deleted the first time the script was run?
    – user206222
    Commented Jun 21, 2013 at 8:07
  • 3
    The first time ever? I don't know; that was over 2 years ago now. I just edited this to reflect the addition of closed question culling after 9 days, which ran for the first time an hour ago with 27891 questions deleted on Stack Overflow, @Emrakul
    – Shog9
    Commented Jun 26, 2013 at 5:13
  • 2
    I have seen questions which were solved in comments without any answer posted. Would they be deleted as well? Commented Apr 6, 2014 at 18:50
  • 1
    @gnat no it's not. Commented May 4, 2014 at 20:58

12 Answers

420

Abandoned, unanswered questions can be a nuisance for readers when they appear in search results. While every question deserves a chance to be answered, at some point the annoyance to those searching for a solution outweighs the increasingly small chance that an answer will be provided.

For this reason, the Community user will automatically delete old abandoned / dead questions in the following circumstances:

If the question is more than 30 days old, and ...

  • has −1 or lower score
  • has no answers
  • is not locked
  • has not been migrated from another site

...or...

  • it was closed and migrated to a different site (i.e. it is a migration stub)

...or...

  • it was migrated from a different site, and then rejected

... it will be automatically deleted. Internally, these are termed "dead" questions (RemoveDeadQuestions, RemoveMigrationStubs in the case of a migration stub, or RemoveRejectedMigrations in the case of a rejected migration).

For this criterion only, when calculating the question's score, only downvotes that were cast more than two days ago are considered. This is to prevent serial downvotes from causing automatic deletions and allow the voting fraud scripts to run before the checks above run.
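The first 30-day branch and its two-day downvote rule can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Question` fields, `effective_score`, and `is_dead` are hypothetical names, not the actual `RemoveDeadQuestions` implementation, and the migration-stub and rejected-migration branches are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class Question:
    # Hypothetical, simplified view of a question (not the real schema).
    created: datetime
    upvotes: int = 0
    downvote_times: List[datetime] = field(default_factory=list)
    answer_count: int = 0
    locked: bool = False
    migrated_in: bool = False  # arrived via migration from another site


def effective_score(q: Question, now: datetime) -> int:
    """Score that ignores downvotes cast within the last two days."""
    cutoff = now - timedelta(days=2)
    settled_downvotes = sum(1 for t in q.downvote_times if t < cutoff)
    return q.upvotes - settled_downvotes


def is_dead(q: Question, now: datetime) -> bool:
    """First 30-day branch only: old, negatively scored, unanswered."""
    return (
        now - q.created > timedelta(days=30)
        and effective_score(q, now) <= -1
        and q.answer_count == 0
        and not q.locked
        and not q.migrated_in
    )


now = datetime(2013, 7, 1)
fresh_downvote = Question(created=datetime(2013, 5, 1),
                          downvote_times=[datetime(2013, 6, 30)])
old_downvote = Question(created=datetime(2013, 5, 1),
                        downvote_times=[datetime(2013, 6, 1)])
print(is_dead(fresh_downvote, now))  # False: the only downvote is one day old
print(is_dead(old_downvote, now))    # True: the downvote is a month old
```

The point of the two-day window is visible in the first example: a downvote cast yesterday does not yet count toward deletion.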

If the question is more than 365 days old, and ...

  • has a score of 0 or less, or a score of 1 or less in case the owner's account is deleted
  • has no answers
  • is not locked
  • has view count <= the age of the question in days times 1.5
  • has 1 or 0 comments
  • isn't on a meta site

... it will be automatically deleted. These are "abandoned" questions (RemoveAbandonedQuestions).

These checks are run every week across all sites.
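As a rough sketch, the 365-day "abandoned" criteria listed above can be written as a single predicate. The function and parameter names are made up for illustration; the real `RemoveAbandonedQuestions` job presumably runs against the database rather than plain values like these.

```python
def is_abandoned(age_days, score, view_count, answer_count, comment_count,
                 locked=False, owner_deleted=False, on_meta=False):
    """Hypothetical predicate mirroring the 365-day criteria listed above."""
    # A deleted owner raises the score ceiling from 0 to 1.
    max_score = 1 if owner_deleted else 0
    return (
        age_days > 365
        and score <= max_score
        and answer_count == 0
        and not locked
        and view_count <= age_days * 1.5
        and comment_count <= 1
        and not on_meta
    )


# A 400-day-old question may have at most 400 * 1.5 = 600 views.
print(is_abandoned(400, 0, 600, 0, 1))   # True: exactly at the view limit
print(is_abandoned(400, 0, 601, 0, 1))   # False: one view over the limit
print(is_abandoned(400, 1, 100, 0, 0, owner_deleted=True))  # True
```

Because the view limit scales with age, an old question needs a steady trickle of views (more than 1.5 per day on average) to escape this check on views alone.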

If the question was closed more than 9 days ago, and ...

  • not closed as a duplicate
  • has a score of 0 or less
  • is not locked
  • has no answers with a score > 0
  • has no accepted answer
  • has no pending reopen votes
  • has not been edited in the past 9 days
  • has not been migrated from another site

... it will be automatically deleted. These are "abandoned closed", and show as RemoveAbandonedClosed.

This check is run every day across all sites.
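The "abandoned closed" criteria can be sketched the same way. Everything here is an assumption made for illustration: the `ClosedQuestion` fields are hypothetical, and the exact boundary handling (strictly more than 9 days) is a guess rather than something stated in the post.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

WINDOW = timedelta(days=9)


@dataclass
class ClosedQuestion:
    # Hypothetical, simplified view of a closed question.
    closed_at: datetime
    score: int
    last_edited_at: Optional[datetime] = None
    answer_scores: List[int] = field(default_factory=list)
    closed_as_duplicate: bool = False
    locked: bool = False
    has_accepted_answer: bool = False
    pending_reopen_votes: int = 0
    migrated_in: bool = False


def is_abandoned_closed(q: ClosedQuestion, now: datetime) -> bool:
    recently_edited = (q.last_edited_at is not None
                       and now - q.last_edited_at <= WINDOW)
    return (
        now - q.closed_at > WINDOW
        and not q.closed_as_duplicate
        and q.score <= 0
        and not q.locked
        and all(s <= 0 for s in q.answer_scores)  # no answer scoring > 0
        and not q.has_accepted_answer
        and q.pending_reopen_votes == 0
        and not recently_edited
        and not q.migrated_in
    )


now = datetime(2013, 6, 26)
stale = ClosedQuestion(closed_at=datetime(2013, 6, 10), score=0,
                       answer_scores=[0, -1])
edited = ClosedQuestion(closed_at=datetime(2013, 6, 10), score=0,
                        last_edited_at=datetime(2013, 6, 20))
print(is_abandoned_closed(stale, now))   # True
print(is_abandoned_closed(edited, now))  # False: edited 6 days ago
```

Note that answers are allowed here, as long as none of them has a positive score or is accepted.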


The particular script which applied to a given question is shown in the timeline entry for its deletion, in the "comment" column:

Example of a question deleted under RemoveDeadQuestions

See also: The official "How does deleting work? What can cause a post to be deleted, and what does that actually mean?" FAQ meta post.

42
  • 122
    So I can delete an old question with no votes and answers by downvoting it, without a check from anyone else. Maybe there should be a delay after the downvote before the delete kicks in. Commented May 20, 2011 at 9:43
  • 127
    And the OP can prevent a question from being deleted by making two comments to the question. Commented Aug 14, 2011 at 16:12
  • 16
    "has 1 or 0 comments": does that mean "has <2 undeleted comments" or "has <2 comments, including deleted comments"? Courtesy ping @JonathanLeffler. Same question for "has no answers": "has no undeleted answers" or "has no answers, not even deleted ones"?
    – msh210
    Commented Feb 17, 2012 at 19:35
  • 8
    Note that for the first rule it doesn't matter when the vote was cast. If the downvote is just a few days old, the post will still be deleted when the clean up script runs. Makes sense, I guess, so just for the record.
    – Arjan
    Commented Jun 10, 2012 at 12:26
  • 6
    As for my last year's comment: I feel deleting after a single (retaliation) downvote might not make much sense after all, at least in some cases.
    – Arjan
    Commented Mar 23, 2013 at 12:24
  • 15
    The message is clear, especially from the first part : the site wants answers, but does not want questions. Go figure. Commented May 23, 2013 at 19:12
  • 70
    Unanswered questions are a dead-end in search, @Nicolas: they're worth keeping around for a while on the chance that someone will find and answer them, but beyond a certain point if no one is expressing any interest in them they're just noise.
    – Shog9
    Commented Jun 25, 2013 at 20:26
  • 11
    If a question is automatically deleted after 9 days, Can this go towards a post ban?
    – user310756
    Commented Jun 29, 2013 at 7:48
  • 16
    @Shog9 — I bounce on your comment “Unanswered questions are a dead-end in search…” You think about the visitors of the page. I think about the person who has asked the question. When a visitor expecting an answer lands on a page with the question without answer, the failure lies in the search, not in the page itself. Here what has failed — and needs to be improved — is the search engine, the external one (Google…) or the internal one (the site). Commented Apr 27, 2014 at 7:37
  • 10
    Why are you holding a candle for someone who has gone months with no answers, no comments, no votes no attention at all @Nicolas? Do you seriously think we're doing anyone any favors at this point by keeping their question around so we can continue to ignore it? And your solution to the search problem is... To hide them? So, at some point, we identify questions getting barely any attention and make sure they get less attention in the future? How is such hell-banning not worse than deletion?!!
    – Shog9
    Commented Apr 27, 2014 at 16:44
  • 26
    @Shog9: Why keep the question after 1 year? For the same reason as keeping the question after 1 day: so that the question can be answered. There are questions which are very specific yet very pertinent. Commented Jun 8, 2014 at 11:23
  • 44
    I often find unanswered questions to be useful—they serve as examples of almost-working code. It's usually clear from the question what the OP is still missing, and what he's already worked out often answers my question. Commented Jul 2, 2014 at 20:36
  • 30
    I'm really against automatic deletion of questions. What a lack of respect to people asking good but not popular questions. Commented Mar 9, 2015 at 19:46
  • 10
    @Dualinity 30 days requires a negative score, and this is a big part of it: some users explicitly found the question to be bad. At 365 days, questions are deleted even with a 0 score: there is no evidence of them being bad, but it's clear they do not attract any activity or even passing interest.
    – user259867
    Commented Jun 22, 2015 at 1:44
  • 50
    @NathanArthur: if a question has helped you, whether it has an answer or not, upvote it.
    – jmoreno
    Commented Sep 18, 2015 at 16:30
58

I would prefer a crowdsourced approach, because I fear a purely algorithmic solution will never cover enough dead content to actually make a difference.

How about expanding the scope to:

  • 200 views (although I'm not sure I see the point of a view limit at all. A crap question doesn't get better when viewed 500 times, and negatively voted questions often get looked at a lot just for the entertainment value)
  • Last activity more than six months ago (or even three)
  • Has 0 score or lower, or 1 score with at least 1 downvote (to catch pity upvotes)
  • Has no answers with more than 1 upvote
  • Has no accepted answer (obviously)

and instead of deleting them, relaxing the "vote to delete" rules for those questions, say, to two required 2k+ votes instead of five, and having the Community user bump a steady trickle of them to the front page? Like, one every five minutes. For most users, they will come up only if they're in their interesting tags, so they will hardly be noticed.

Maybe mark them with a message like

This question has seen very little activity in the past x months, has a low vote score, and is unlikely to improve. If you consider this question useless in its current state, please consider casting a vote to delete.

1
  • 13
    I like the message; perhaps even create a tab for this on the review page to start crowdsourcing votes to delete.
    – Ivo Flipse
    Commented Feb 6, 2011 at 15:38
22

238 seemed very few... I've been lurking round the Ant tag for a while, and expected there should be some of those questions showing up, but there are none. Playing with the query led to the discovery that changing

p.AnswerCount < 1

to

( p.AnswerCount < 1 OR p.AnswerCount IS NULL ) 

increased the yield from 238 to around 6 600, and brought in some of the Ant questions I expected to see. I think these are very likely to be safe to delete.
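The jump in yield comes from SQL's treatment of NULL: a comparison such as `AnswerCount < 1` is neither true nor false for a NULL value, so those rows are silently filtered out. A minimal sketch of the effect, using Python's built-in `sqlite3` with a made-up `posts` table and made-up rows (not the real Data Explorer schema or data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, answer_count INTEGER)")
conn.executemany(
    "INSERT INTO posts (id, answer_count) VALUES (?, ?)",
    [(1, 0), (2, None), (3, None), (4, 2)],  # None is stored as NULL
)

# NULL < 1 evaluates to NULL, which WHERE treats as "not true".
strict = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE answer_count < 1"
).fetchone()[0]

# Adding IS NULL brings the NULL rows back in.
relaxed = conn.execute(
    "SELECT COUNT(*) FROM posts "
    "WHERE answer_count < 1 OR answer_count IS NULL"
).fetchone()[0]

print(strict, relaxed)  # 1 3
```

Only the row with an explicit 0 matches the strict filter; the two NULL rows appear once `IS NULL` is added.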

To try and gauge what viewcount would be a valid cutoff I removed that criterion - the result was ~12000 candidates, of which 119 had 1000 views or more.

Inverting the sort, so that the highest view counts are shown first, was interesting: three zero-vote no-answer questions, all with over 10K views, all mergees. Followed by only 9 more questions with over 2.5K views. Clicking through those revealed mostly merged and closed, but one or two that might be worth keeping, plus some woeful tagging (just serial-port?).

In the 'mid range' ~500 views - there's again mostly cruft.

For stale zero-answer zero-vote questions from OP accounts that have been closed, the view count doesn't matter - the question can be deleted. I think there should be some onus on the OP to 'tend to their question's needs' - if the OP has not logged in to the site for some period (6 months?) then these ought to be candidates for deletion, probably with looser criteria.

For active users with 'stale' questions, why not e-mail them saying their question is a deletion candidate, giving them a chance to try and salvage it?

Edit: another take on this:

How about joining to the Users table and using a 90-day cutoff, applying to the post and user activity? (An outer join shows more matches, presumably as for some posts the OP account has been removed.) There are 16 000 of these with under 100 views, rising to 21 000 if the limit is 500 views (i.e. almost 10% of the headline SO unanswered population). Looking at the high-view questions they do seem pretty dead - no obvious difference to the ones with few views - remember these

  • have no answers
  • have no upvotes
  • and the OP hasn't been active in three months

... so their prospects for revival aren't good. The SO 'ZUV-ZA' questions, older than 90 days, have accumulated 1.7 million views in total.

2
  • 12
    What's the difference between the two states? Does NULL indicate that there never was an answer, while 0 indicates that there was at least one answer but they've all been deleted? A quick check of the first three questions on Jeff's query indicates that it does appear to be the case.
    – ChrisF Mod
    Commented Feb 5, 2011 at 22:14
  • 1
    @Chris - don't know, but I suspect you are correct. Commented Feb 5, 2011 at 23:19
17

We've been trying several things on SU to find such redundant questions. My findings were:

  • Views are not always very important, especially if there are no or few upvotes. If 200 users that actually took the effort of looking at a question didn't find it interesting enough to upvote, it's probably not worth answering to begin with. Furthermore, if everyone reading this question starts 'investigating' and looking at these low view questions, they won't be low viewed for much longer...

  • Votes on the question aren't important if there are no answers. This often indicates that while the question is 'interesting', the user often is asking for the impossible. Either way, no answer means it's probably no longer worth it to visit the question.

  • No activity of the OP. A lot of these 'crappy' questions are asked by drive-by users who ask only one question and are never seen again. This means they never provide requested feedback and never accept an answer (if any). If you check for questions asked by users who haven't been on the site for x months (we tried 12 to be safe) and the question had 'low' views, no (upvoted) answers or wasn't upvoted itself, they can go.

  • We don't need one query to delete everything, because if there's one thing we found it's that there are many variations of bad questions. Low views + no upvotes + no answers. Some upvotes + no active OP + no answers. By checking for multiple conditions we can find a broader range of questions, without resorting to complex queries.
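The "several simple checks instead of one complex query" idea from the last point might look something like this. It is a sketch only: the rule names, the dictionary keys, and the thresholds (200 views, 12 months of OP inactivity, borrowed from the figures mentioned above) are illustrative assumptions, not the queries actually used on Super User.

```python
from typing import Callable, Dict, List

Question = Dict[str, int]
Rule = Callable[[Question], bool]


def low_views_unvoted_unanswered(q: Question) -> bool:
    # "Low views + no upvotes + no answers" (200 is an assumed cutoff).
    return q["views"] < 200 and q["score"] <= 0 and q["answers"] == 0


def absent_op_unanswered(q: Question) -> bool:
    # "Some upvotes + no active OP + no answers" (12 months assumed).
    return q["op_inactive_months"] >= 12 and q["answers"] == 0


RULES: List[Rule] = [low_views_unvoted_unanswered, absent_op_unanswered]


def matching_rules(q: Question) -> List[str]:
    """Names of every simple rule that flags this question."""
    return [rule.__name__ for rule in RULES if rule(q)]


quiet = {"views": 150, "score": 0, "answers": 0, "op_inactive_months": 3}
orphaned = {"views": 900, "score": 2, "answers": 0, "op_inactive_months": 14}
print(matching_rules(quiet))     # ['low_views_unvoted_unanswered']
print(matching_rules(orphaned))  # ['absent_op_unanswered']
```

Each rule stays small enough to review on its own, and a question becomes a candidate if any one of them matches.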

More generally, we should ask: if a question hasn't gotten any worthy answers or attention, what's the point of keeping it around? Some problems go away on their own, because people learn to deal with them, update the software, or the problem simply becomes outdated. I would simply start deleting them after several months, because the chances of someone solving one after such a long time are simply too small. Sure it happens, sometimes, but should we keep so much crap around just for those edge cases?

Perhaps in the future users should get a message that their question hasn't received much attention and that it will be deleted automatically unless they add more information to get it answered. My guess is that most users would think: I don't care about the question anymore, you might as well delete it. If not, then they'd better show that they can make it worthwhile, else it would still get deleted the next month.

The most important thing is that it's a privilege to ask a question here. If you're not willing to improve your question to help others help you, then your question isn't worth keeping around. Because in the end, users come to our sites to find an answer; if we can't provide one, then we shouldn't lure them in here either!

Which I think is also a reason to automatically delete all (non-duplicate) closed questions after a certain period where the decision could be appealed. After that they are a dead end and useless to the site.

3
  • 9
    +1: the OP should nurture their question. If no-one else can manage an answer, or even an up-vote, and a year has gone by... Commented Feb 5, 2011 at 23:18
  • you should look at the revised query I just edited in. Commented Feb 7, 2011 at 4:30
  • @Jeff It's a nice start, so I'm not complaining! ...for now :-)
    – Ivo Flipse
    Commented Feb 7, 2011 at 9:07
15

For those of you who are also "moderators" of a small community around a less frequently used tag, you may want to check that the generally low view count in that community doesn't lead to the deletion of valid, yet unanswered questions.

I've created a data explorer query that allows you to see the unanswered questions for your tag that might be auto-deleted soon.

If this reveals questions you want to save from deletion, you could do so by casting an upvote.

9

Another suggested set of criteria...

I've actually been trying to do a little tidying-up for my tag of choice lately and have been deleting many questions that meet these criteria:

  • no answers
  • a few months old (i.e. no real activity lately)
  • closed (but generally not if it was closed as a duplicate, unless it's an exact repost, since duplicates still seem to be considered valuable)

If it has no answers and it's closed then it certainly won't be getting any more. The views and votes seem kind of irrelevant in such a case: if the views and votes were both high but the question still wasn't able to muster a reopening, it probably deserves to be put down.

3
  • 1
    This is a good set of rules, but doesn't catch the many "meh" questions for which not enough close votes were gained in time. Still, this is likely to catch many, many more than a measly 238
    – Pekka
    Commented Feb 5, 2011 at 15:15
  • 1
    If you just look at closed 0-voted and below questions with no answers, I count over 1000 of these on a simple closed:1 answers:0 search: stackoverflow.com/… . In fact, many of the 1-voted questions for that query look to be offtopic or otherwise of questionable value. The higher-voted questions appear to either be duplicates or answers masquerading as questions. Commented Feb 5, 2011 at 21:44
  • fair point, but this is getting into closed culling, which I consider a different topic. Commented Feb 6, 2011 at 13:28
9

I would favor a moderation based approach. It's not a lot of questions to deal with on an ongoing basis (ie, having 238 over 30 months is only 1-2 new unanswered questions per week).
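The "1-2 per week" estimate can be checked with a couple of lines of arithmetic. The 52-weeks-per-year figure is an approximation introduced here; the 238 and 30 months come from the paragraph above.

```python
WEEKS_PER_YEAR = 52  # approximation

questions = 238
months = 30

weeks = months / 12 * WEEKS_PER_YEAR  # 2.5 years is about 130 weeks
per_week = questions / weeks

print(round(per_week, 2))  # 1.83
```

So the original query's 238 matches works out to a little under two new candidates per week.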

It would be nice if these questions were emphasized. Perhaps dropped into an "unloved questions" bin in some set of moderation tools available to 3k+ users. They can then either choose to close them, edit them, or answer them in the hopes of getting a few rep and perhaps a necro badge.

But the load of these particular questions is low, and one year is a long time. I'd rather such questions be dealt with in a month so we reduce the number of people shunted from google to here to find an unanswered question. Further, I'd propose that the more views it gets, the more important it is to make sure it's either answered or closed. We shouldn't let these dangling questions dangle potential users.

4
  • 1
    we already randomly poke a certain # of unanswered questions every hour (4 on SO, 2 on SU/SF, 1 everywhere else), and have for years.. so this is completed. Note that this also changes the LastActivityDate so "poked" questions would not appear in this query. Commented Feb 5, 2011 at 21:36
  • 1
    @Jeff I disagree that poking them would give them as much attention as a specific "unloved questions" bin. Providing a specific place for moderators and editors to view them - in essence encouraging users to specifically attack these problem questions - would get rid of them as a problem altogether, and at a significantly quicker rate than the proposal you provide. Broken windows, and all that.
    – Pollyanna
    Commented Feb 6, 2011 at 3:38
  • 2
    so you're proposing the existing "really no answers" tab on unanswered..? blog.stackoverflow.com/2010/11/… Commented Feb 6, 2011 at 13:21
  • @Jeff I'm proposing that it be given specifically important status by being shown in the tools. Giving the 10k rep users an explicit job to take care of them. They are visible for people who are looking for them, but let's specifically assign it as a job for those with the ability and experience to edit, close, and delete.
    – Pollyanna
    Commented Feb 7, 2011 at 2:41
5

Starting completely from scratch with the new query. If you care what I said before, look in the edit history.

Some of the questions that come up are badly tagged:

All of these are ill-written pleas for help with something the author seems to have no clue about. By now they have either solved the problem or given up, and the low number of views suggests that not many other people are having the same trouble.

No great harm if they are removed.


Looking for one in tags I sometimes visit I find

In addition to this, a lot of the titles and tagging patterns look suspiciously off-topic to me.

I haven't looked at enough posts to be definitive, but I see nothing to make me suspect this list is full of stuff we don't want to lose.

1
  • could you look again, the original query was in error -- and is now correct as of my last edit. Commented Feb 7, 2011 at 4:51
5

I found this to be related to a question I asked a while back: Option to order unanswered questions by fewest views per time?

In a comment to the answer I got from Robert Harvey I wrote the following:

If a question has many views, but no answers, then it might be because the question is too hard. If the question has no answers and no views, it might just be that no one has seen the question. Also, if someone looks at a question in the view it would be moved further down the list, so it would always be the questions that have been shown the least attention that turn up in this view.

That comment summarizes the problem pretty well. If you need more context you could always take a look at the question.

I'm not opposed to auto-deleting old, unanswered zero-score questions after a year, but I believe that my proposal could significantly reduce the number of unanswered zero-score questions.

Until today I was unaware of the existence of http://data.stackexchange.com. I'm not familiar with stackoverflow's database schema and I don't work with SQL on a daily basis, so I figure that someone else could implement my query much faster than I could.

If I could ask for two things those would be: my proposed view implemented as a query on http://data.stackexchange.com, so that I and others can see what kind of questions are returned to evaluate the usefulness of such a view; and a comment from Jeff Atwood, just to know if he thinks it's something that will be implemented.

EDIT: Managed to compose the following query: https://data.stackexchange.com/stackoverflow/s/1125/least-noticed-questions

I guess it will be more useful if you filter it to only include tags of your interest. You probably also want to exclude questions with accepted answers.

2
  • 1
    if your question hasn't been answered within a year, it's so unlikely that it will receive an answer that it's better to delete it and if you still require an answer, ask an up to date version with all the knowledge you've gained in that past year. See it as a reset, rather than a punishment
    – Ivo Flipse
    Commented Mar 22, 2011 at 13:28
  • 1
    @Ivo Flipse I don't see it as a punishment and I'm not opposed to deleting these questions. I just feel like we could reduce the number of unanswered questions by adding my proposed view.
    – Erik B
    Commented Mar 22, 2011 at 16:19
3

A lot of these seem to be cases where a brand new user posted a question. There were some comments asking for more detail. And then nothing ever happened again.

I would suggest first deleting these items.
Then secondly, possibly widening the view count criteria to 200 and only deleting items where the asker has had no activity since that question or if it is the asker's only question.

3
2

From my own digging around I'm getting an idea as to the function needed to identify crappy questions worthy of purging. A lot of the ideas have already been mentioned here, but one I discovered on SF was views-per-month. That's a better proxy than absolute views paired with date.
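A views-per-month figure is easy to derive from the two values it replaces. The helper below is a hypothetical sketch (the name `views_per_month` and the average month length are assumptions), not the query used on Server Fault.

```python
DAYS_PER_MONTH = 30.4375  # average month length, an assumption


def views_per_month(view_count: int, age_days: float) -> float:
    """Normalize a raw view count by the question's age in months."""
    if age_days <= 0:
        raise ValueError("age_days must be positive")
    return view_count / (age_days / DAYS_PER_MONTH)


# Same absolute view count, very different levels of interest.
recent = views_per_month(300, 60)
old = views_per_month(300, 900)
print(round(recent, 1), round(old, 1))  # 152.2 10.1
```

Two questions with 300 views each look identical by raw count, but the normalized rate separates the one still drawing traffic from the one that has gone quiet.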

The hazard is purging dated information that'll be useful for people actually looking for legacy information. That's why questions with zero answers are considered. My own query uses zero-or-one, but that's for hand viewing things; I wouldn't recommend using that as an auto-purge yet. It has been an interesting exploration.

A suite of search-queries would work best for this problem.

  • Low-quality questions ignored by the search-engines (what I was driving for)
  • Questions posted by drive-bys with little or no uptake by other users
  • Pre-migration off-topic closes (move-to-SF/SU wasn't always possible! Close those old Questions). Probably a one-time query.

As we continue to purge thank-you/anyone-fix-this?/incoherence answers, some of these older questions will become zero-answered. There are a couple of SF users that seem to be making an effort to mod-flag those questions, and I appreciate the work.

3
  • 1
    take a look at the revised query I just edited in. Commented Feb 7, 2011 at 4:29
  • 1
    @JeffAtwood Very interesting! Some of the higher-view, older posts would have legs if asked today. I wonder if the views-per-day coefficient needs a bit of tweaking downwards. Will check tomorrow. Unfortunately for SF, a LOT of people outright ignore the Community-poked questions so these older ones need more of a kick to get answers by now. Commented Feb 7, 2011 at 6:06
  • yes, we ignore LastActivityDate in the revised query.. it's all based on CreationDate Commented Feb 7, 2011 at 6:25
-1

Portions of a question I posted in Ask Ubuntu:

I found three Q&A's addressing "Abandoned Questions" in Ask Ubuntu Meta:

Other ones I've just discovered on July 31, 2017 but not yet reviewed:

As you can see there used to be a great deal of interest in "Abandoned Questions".

Additionally there is:

In attempts to deal with abandoned questions I wrote a SEDE query and started to go through and close vote ones that were unanswered questions >5 years old and therefore dealing with an EOL (End of Life) version of Ubuntu. As such it would be impossible to duplicate the OP's problem without installing an EOL version.

Although for some questions the OP hasn't signed on in 5+ years, in many other cases it's been > 6 months since OP signed on. In other cases I was able to comment to OP and ask if problem was still there. Many times the OP replied it wasn't reproducible and I voted to close the question as such.

However with not enough queue reviewers the Close Vote queue quickly filled up. As such close voting thousands of abandoned questions on EOL versions (prior to Ubuntu 14.04) is not viable. This led to my Ask Ubuntu meta question:

Therefore I'd be very keen to see a Community Bot created that we could actively participate in defining the parameters for.

2
  • 5
    'Much like this thread appears to be "abandoned"' — it isn't. It's been completed and the roomba bot is largely feature-complete. If you want an adjustment to it you should really post a new feature-request/auto-delete question with the reasoning. Commented Jul 31, 2017 at 15:55
  • @NathanTuggy Thank you for the link. I'll review the 40 odd posts there. Commented Aug 1, 2017 at 3:22
