  • 69
    We've run across the entire network since its inception - all of the accuracy numbers you see in the above post are network-wide. Some reasons are tuned for specific sites, and some are disabled on certain sites. It's a fun balancing game, but we've gotten pretty good at it.
    – Undo
    Commented Feb 20, 2017 at 15:41
  • 14
    For example, here is some code which checks for health-related spam, but it works only on some sites of the network which are often targeted. And here is another 'filter', which is active on all but a few sites that are likely to yield many false positives.
    – Glorfindel Mod
    Commented Feb 20, 2017 at 15:44
  • @Undo Thanks, that wasn't clear after reading; the post only mentions Stack Overflow specifically when talking about SD's flagging behavior.
    – TylerH
    Commented Feb 20, 2017 at 15:44
  • @Undo And to focus on that topic a bit, do you have numbers per site? I'm curious if there are any sites with 100% accuracy, and also curious what the site w/ the lowest accuracy is.
    – TylerH
    Commented Feb 20, 2017 at 15:51
  • 10
    Ask Patents is probably the worst site, with currently only 64% accuracy. But remember that those posts generally won't be autoflagged; that only happens when they reach a certain threshold.
    – Glorfindel Mod
    Commented Feb 20, 2017 at 15:53
  • 16
    But AP is just... weird, so that's not exactly surprising.
    – ArtOfCode
    Commented Feb 20, 2017 at 15:54
  • @Glorfindel "but remember" Where might I have seen the threshold before if I am to remember it? Are you talking about each user's individual threshold? If that's the case, does that mean users set their own threshold preference before the bot can flag as them? If so, what if there is a user who sets their threshold to, say, 60% while everyone else sets theirs higher? Are the settings published? It wouldn't be random in that case... SD would always use the 60% account and two others.
    – TylerH
    Commented Feb 20, 2017 at 15:59
  • 11
    @TylerH sorry, I should have elaborated. My link shows all posts reported by SmokeDetector, often detected for just a single reason. Autoflags will only be cast if a post is detected for multiple reasons, and they need to be 'effective' reasons, too. You can't set a threshold resulting in lower than 99.5% accuracy.
    – Glorfindel Mod
    Commented Feb 20, 2017 at 16:02
  • 2
    @Glorfindel Thanks for the info!
    – TylerH
    Commented Feb 20, 2017 at 16:04
  • It looks a bit tricky to apply. As in, I'm assuming I'd have to install Linux first? And then run this in the background on my PC?
    Commented Feb 23, 2017 at 6:30
  • 6
    @SirAdelaide You don't need to do anything. We (Charcoal) host the bot (see here for current location) and metasmoke (which does all the flagging); all you need to do is sign up and allow us to use your account for flagging. We then use the SE API to flag the posts. But yes, the bot does run on a form of Linux/Mac, due to compatibility issues with bash and git, which we use extensively. Feel free to drop into Charcoal HQ if you have any more questions.
    Commented Feb 23, 2017 at 6:52
  • @Undo How does that balancing work with newly created beta sites?
    Commented Feb 27, 2017 at 20:10
  • 3
    @NathanMerrill In practice, newly created beta sites have extremely low traffic anyway. Since our regexes are balanced for ~160 sites already, new ones usually don't fall much outside of what we've already seen. Usually, the only times we need to tune explicitly are for health-focused sites. We catch a lot of skin care spammers across the network, but those patterns, by their nature, see high false-positive rates on health sites. That's always caught quickly and dealt with in a thirty-second-deploy cycle or two.
    – Undo
    Commented Feb 27, 2017 at 20:19
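
The per-site tuning and autoflag rules described in the comments above can be sketched roughly as follows. This is a minimal illustration, not SmokeDetector's or metasmoke's actual code: the `Reason` class, the `only_sites`/`except_sites` fields, the site names, and the `min_reasons=2` reading of "multiple reasons" are all assumptions made for the example. Only the 99.5% accuracy floor and the "multiple, effective reasons" requirement come from the thread.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Lowest accuracy a user may choose, as stated in the thread.
MIN_ACCURACY_PERCENT = 99.5


@dataclass(frozen=True)
class Reason:
    """Hypothetical detection reason with per-site scoping."""
    name: str
    effective: bool = True  # only 'effective' reasons count toward autoflags
    only_sites: Optional[FrozenSet[str]] = None  # if set, active only on these sites
    except_sites: FrozenSet[str] = frozenset()   # sites where the reason is disabled

    def active_on(self, site: str) -> bool:
        """Return True if this reason applies on the given site."""
        if site in self.except_sites:
            return False
        return self.only_sites is None or site in self.only_sites


def validate_threshold(accuracy_percent: float) -> float:
    """Reject user thresholds below the accuracy floor."""
    if accuracy_percent < MIN_ACCURACY_PERCENT:
        raise ValueError(
            f"threshold must be at least {MIN_ACCURACY_PERCENT}% accuracy"
        )
    return accuracy_percent


def should_autoflag(matched: List[Reason], site: str, min_reasons: int = 2) -> bool:
    """Autoflag only when several effective reasons apply on this site.

    Assumption: 'multiple reasons' is modelled as a simple count; the real
    system may combine reasons differently.
    """
    counted = [r for r in matched if r.effective and r.active_on(site)]
    return len(counted) >= min_reasons


# Example: one reason limited to often-targeted sites, one disabled on a
# site where it would yield many false positives (site names are made up).
health_spam = Reason("health spam", only_sites=frozenset({"fitness"}))
broad_filter = Reason("broad filter", except_sites=frozenset({"health"}))
```

With these definitions, a post matching both reasons on the hypothetical "fitness" site would be autoflagged, while the same post on a site where only one of the reasons is active would merely be reported.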