35

Currently the roomba has a criterion that deletes quite a few open questions that are considered abandoned after a year:

If the question is more than 365 days old, and ...

  • has a score of 0 or a score of 1 with a deleted owner
  • has no answers
  • is not locked
  • has a viewcount <= the age of the question in days times 1.5
  • has 1 or 0 comments

... it will be automatically deleted.
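
Written out as code, the rule above is just a conjunction of simple checks. Here is a minimal Python sketch of it; the `Question` class and its field names are made up for illustration and are not the actual roomba implementation, and it encodes only the conditions listed above.

```python
from dataclasses import dataclass


@dataclass
class Question:
    # Hypothetical fields, named for this sketch only.
    age_days: int        # days since the question was posted
    score: int
    owner_deleted: bool
    answer_count: int
    locked: bool
    view_count: int
    comment_count: int


def abandoned_after_a_year(q: Question) -> bool:
    """Return True if q meets every condition of the 365-day rule listed above."""
    score_ok = q.score == 0 or (q.score == 1 and q.owner_deleted)
    return (
        q.age_days > 365
        and score_ok
        and q.answer_count == 0
        and not q.locked
        and q.view_count <= q.age_days * 1.5
        and q.comment_count <= 1
    )


example = Question(age_days=400, score=0, owner_deleted=False,
                   answer_count=0, locked=False, view_count=500, comment_count=1)
print(abandoned_after_a_year(example))  # True: 500 views <= 400 * 1.5 = 600
```

Give the same question a score of 1 with its owner still around and the check fails, which is exactly the stray up vote problem described below.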

This is good and I'm sure it gets a fair amount of crap that was never that interesting.

I've been poking around on SO with the following query: closed:yes score:1..1 locked:no answers:0 duplicate:no hasaccepted:no and looking at the questions... and, well, going to town with down votes on a lot of old questions where I find things like this (now all 10k links):

Unix indent command settings
Any recommendations of arguments to unix indent command settings for C/C++ files ?

Who is the lead objective-c language designer
I'm trying to find the lead language designer for Objective-C. Anyone has any contact information, etc?

To get xml of google
How to get xml tags of google?

Expandable Stick bar in asp.net
How we make expandable stick bar in asp.net

That last one had an amusing comment on it:

Who is voting up this kind of question? — Carl Norum Feb 14 '11 at 7:36

If these questions hadn't gotten a drive-by up vote, they would have been deleted after nine days. As it was, they stayed around for several years with no one really caring to fix or down vote them (not even the people who closed them).

The suggestion

Add a variation on the 365 day roomba rule (this doesn't replace the existing one, it just adds another one):

If the question has been closed and has gone without an edit for more than N years (N >= 1), and ...

  • has a score of N or lower
  • has no answers
  • is not locked
  • has a view count <= the age of the question in days times 1.5
  • has N comments or fewer

... it will be automatically deleted.

Note some points on this:

  • It is working off of the close and edit dates, not the post date. This is to prevent things that are old from getting closed today and disappearing the next day without warning.

  • There is a sliding scale for N. As the post gets older, the query gets more aggressive with the score, views, and comments. This mirrors the way the existing 365 day script works on views.
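
To make the sliding scale concrete, here is a rough Python sketch of the proposed check for a given N. The `ClosedQuestion` class, its field names, and `matches_for_n` are hypothetical, and taking the later of the close and edit dates as the start of the quiet period is my reading of "closed, without an edit for more than N years", not a spec.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ClosedQuestion:
    # Hypothetical fields, named for this sketch only.
    created: date
    closed: date
    last_edit: date
    score: int
    answer_count: int
    locked: bool
    view_count: int
    comment_count: int


def matches_for_n(q: ClosedQuestion, n: int, today: date) -> bool:
    """Return True if q meets the proposed rule for this value of N (in years)."""
    if n < 1:
        raise ValueError("N must be at least 1")
    # The clock starts at the later of the close and the last edit, so an old
    # question that gets closed today cannot disappear tomorrow.
    quiet_days = (today - max(q.closed, q.last_edit)).days
    age_days = (today - q.created).days
    return (
        quiet_days > 365 * n
        and q.score <= n
        and q.answer_count == 0
        and not q.locked
        and q.view_count <= age_days * 1.5
        and q.comment_count <= n
    )


today = date(2014, 9, 17)
old = ClosedQuestion(created=date(2009, 3, 1), closed=date(2011, 2, 14),
                     last_edit=date(2011, 2, 14), score=1, answer_count=0,
                     locked=False, view_count=900, comment_count=1)
print([n for n in range(1, 9) if matches_for_n(old, n, today)])  # [1, 2, 3]
```

The made-up question here has been closed and untouched for a bit over three and a half years, so it qualifies for N = 1, 2 and 3 but not yet for 4. Had it been closed today instead, it would not qualify for any N.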

If you dig into that query a bit, you'll quickly find a lot of crap. It's sitting there, and it's not helping anyone. This isn't an SO-only problem, though SO's sheer size makes it the most obvious one. AskUbuntu, SuperUser, ServerFault... are just some of the older, bigger sites where the stuff under the rug is a bit more obvious.

The questions this would delete, people don't care about. They aren't found (low views). The original asker doesn't care to fix it up. The only reason they are still around is they got a stray up vote which keeps them around like relics of an earlier time in a roadside museum and curio shop that no one stops to look at.

Related roomba script documentation and discussion:

9
  • 8
    I always hate clicking on a link from Google to find a closed question with five upvotes and no answers (and not even closed as duplicate).
    – Troyen
    Commented Sep 17, 2014 at 18:22
  • You don't need hasaccepted:no when using answers:0. More to the point, I support the idea in general (and I'd ignore more comments than N), but somehow the ever-expanding lists of clauses begin to look too complex. Maybe there should be a single formula for the shelf life of a closed question, which computes its expiration date based on Q score and A scores, plus some input from view count and comment count. Granted, this would make DV-to-D voting next to impossible.
    – user259867
    Commented Sep 18, 2014 at 4:45
  • 1
    I've never heard them referred to as "roomba scripts" before. I love it.
    – hairboat
    Commented Sep 18, 2014 at 5:25
  • @Thursday true. What you are seeing there is my 'template' version that I tweak (like question score and answer count) when looking around old low score questions. This doesn't take the place of the 9 day script, but rather tries to identify old, abandoned closed questions that lack answers that people haven't cared enough about to fix but somehow got up votes on them. They may even be ok questions, but then nobody is looking at them and it just increases the clutter around questions and makes it harder for people to find what they are looking for.
    – user213963
    Commented Sep 18, 2014 at 15:13
  • 2
    @abbyhairboat the 'Turbocharging the Roomba' is the first mention of it, so blame Shog.
    – user213963
    Commented Sep 18, 2014 at 15:15
  • I've just now found a resource recommendation question towards the top. Luckily it couldn't get answered, but it still should've been deleted long ago. I'll keep looking through the list for anything similar.
    – Jamal
    Commented Dec 24, 2014 at 23:03
  • @Jamal a lot of those should have been deleted long ago. I'm not sure why they got upvotes, but they did. Sort by date created and go to the oldest ones and you can see hundreds of questions that one really has to scratch one's head about how they ever got an up vote. They should be deleted - but the scripts aren't aggressive enough, people were too free with upvotes on... well... crap, and there are far too many higher visibility things that need delete votes.
    – user213963
    Commented Dec 25, 2014 at 0:58
  • @MichaelT: At least many of them are old, so they'll soon be gone. I don't mind scanning the list until I run out of votes for the day.
    – Jamal
    Commented Dec 25, 2014 at 1:34
  • 1
    @Jamal if you time it right (vote after vote reset, before roomba scripts) you can cast the votes again as you get 'refunded' the regular votes on deleted questions the same day. I'd still like to see the roomba delete them without having to further inflate my down vote count.
    – user213963
    Commented Dec 25, 2014 at 1:35

2 Answers

6
+100

So, with the bounty and suggested query, I decided to play with it some too. https://data.stackexchange.com/stackoverflow/query/260701

-- For each N, count closed, unanswered, never-locked questions matching the proposed rule.
select N.y, count(distinct q.Id)
from Posts q
cross join
  (select 2 y union all
   select 3 union all
   select 4 union all
   select 5 union all
   select 6 union all
   select 7 union all
   select 8) N                          -- candidate values of N, in years
where q.PostTypeId = 1                  -- questions only
  and q.ClosedDate is not null          -- closed
  and DATEDIFF(DAY, q.ClosedDate, GETDATE()) > (365 * N.y)        -- closed more than N years ago
  and DATEDIFF(DAY, q.LastActivityDate, GETDATE()) > (365 * N.y)  -- no activity for more than N years
  and q.AnswerCount = 0
  and q.Score <= N.y
  and q.ViewCount <= DATEDIFF(DAY, q.CreationDate, GETDATE()) * 1.5
  and q.CommentCount <= N.y
  and not exists (                      -- never locked (PostHistoryTypeId 14 = Post Locked)
    select 1
    from PostHistory ph
    where ph.PostId = q.Id
      and ph.PostHistoryTypeId = 14
  )
group by N.y

The 'N' table gives back 7 rows, with the numbers 2 through 8. It's one of those 'ugh, not sure how to elegantly get the years' things. But if it's good enough for Oded, it's good enough for me there.

This has an 'if it has ever been locked, it isn't going to be deleted by this script' clause. There's probably a more elegant way of doing this, or something that better matches the way the existing roomba scripts work.

The key bit in there is the use of N.y in the where clause, which lets one query cover multiple years.

The results for this are:

N.y  questions
2         3172
3         1518
4          278
5           24

Note that there may be overlap between the rows. If something would get deleted with the criteria for N.y = 2 and it is three years old, it will show up for both years two and three.
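
The overlap is easy to see in a toy version of the cross join. This is only a Python sketch with invented rows: it keeps just the age, score, and comment conditions (answers, views, and locks are left out), and it measures the quiet period in years rather than with DATEDIFF.

```python
# Hypothetical rows: (id, years closed and inactive, score, comment count).
questions = [
    (1, 2.5, 1, 0),
    (2, 3.5, 1, 2),
    (3, 3.2, 3, 3),
    (4, 2.1, 4, 0),  # score too high for any N it is old enough for
]


def matches(question, n):
    """Simplified per-N check: quiet for more than n years, score and comments <= n."""
    _, quiet_years, score, comments = question
    return quiet_years > n and score <= n and comments <= n


# Same shape as the query: one count per N from 2 through 8, empty groups dropped.
counts = {n: sum(matches(q, n) for q in questions) for n in range(2, 9)}
counts = {n: c for n, c in counts.items() if c}

# Distinct questions that match at least one N.
matched_ids = {q[0] for q in questions for n in range(2, 9) if matches(q, n)}

print(counts)               # {2: 2, 3: 2}
print(sorted(matched_ids))  # [1, 2, 3]
```

The per-N counts add up to four, but only three distinct questions matched, because question 2 is counted under both N = 2 and N = 3. So the rows in the results above can't simply be summed to get a total.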

So, what would those posts be? Glad you asked. This query tells the tale and shows the key criteria that are causing issues today. There are posts that have 2 or 3 comments preventing deletion. And when you look at those, they're mostly discussion (and often meta discussion) or an old, undeleted, "duplicate ..." comment.

2

I agree with the premise of this, but it should kick in much sooner than a year.

If the question has been closed and gone without an edit for more than 4 weeks and has no answers, it will automatically be deleted.

This gives questions that are quickly closed approximately 1 month to be worked out. I think that's a very generous time frame. If the asker hasn't deleted the question or worked with the community to improve it in that time frame, then it should be deleted.
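
As a rough Python sketch of that rule: the function and parameter names are hypothetical, and counting the four weeks from the later of the close and the last edit is my assumption about how "closed and gone without an edit" would be measured.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_delete(closed_at: Optional[datetime], last_edit_at: datetime,
                  answer_count: int, now: datetime) -> bool:
    """Return True if a closed, unanswered question has sat unedited for more than 4 weeks."""
    if closed_at is None:
        return False  # open questions are out of scope for this rule
    quiet_since = max(closed_at, last_edit_at)
    return answer_count == 0 and now - quiet_since > timedelta(weeks=4)


now = datetime(2015, 10, 1)
print(should_delete(datetime(2015, 8, 1), datetime(2015, 8, 1), 0, now))   # True
print(should_delete(datetime(2015, 8, 1), datetime(2015, 9, 20), 0, now))  # False: edited 11 days ago
```

An edit restarts the clock, so an asker who is actively working on the question keeps it alive.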

This makes the assumption that quick-closure is a good thing. I support the idea of fast closure of questions to avoid attracting low quality answers. Once a question has answers, it becomes harder to edit it into a high quality question without invalidating the existing answers. This approach should encourage people to not only vote to close questions quickly (since they won't need to use delete votes to make them go away from the site entirely), but also encourage people to seek assistance in improving their questions in a timely manner.

3
  • 1
    For questions closed quickly, I agree. I've sometimes asked long-tail questions that sit around for a while before getting answers; were one of them to be closed (which produces no notification), I might not notice and would be bummed to find it gone later. (When old questions of mine have been closed it's been because of changes in site scope; questions that are too broad, opinion-based, or unclear usually get closed quickly, which they should.)
    Commented Oct 1, 2015 at 21:02
  • @MonicaCellio It sounds like there should be a feature-request for being notified when your question is closed. That should be the case today, but would be more important if a closed question meant possibly being put into a queue for deletion.
    Commented Oct 2, 2015 at 11:57
  • 1
    I'm pretty sure that's been asked here and shot down, but I'll go looking and if I can't find it I plan to request it. The reason given is that you should be paying attention to your questions, but that really only holds water IMO for recent questions. We can't expect people to monitor every single question for possible closures and deletions.
    Commented Oct 2, 2015 at 15:26
