107 events
when · what · by · license · comment
Jan 6, 2023 at 19:15 history protected CommunityBot
May 7, 2021 at 1:23 comment added chivracq Sorry, '-1' from me directly, for using the very vague term "machine" in the question title and not defining it anywhere; skimming the post, I didn't read further, although the question is certainly very interesting. "System" would sound more appropriate to me, but would still need a defined scope. (And the answer is "yep, of course!": I "do" it myself as the only mod on a small tech forum (for 6 years now), 30% manually and 70% through bots/scripts that I wrote entirely myself, with no plug-ins. Spammers are "welcome" if they "behave"; the forum is about web automation!)
Jul 3, 2018 at 9:00 history edited iBug says Reinstate Monica CC BY-SA 4.0
Update links
Mar 7, 2018 at 10:05 history edited ArtOfCode CC BY-SA 3.0
added 107 characters in body
Mar 6, 2018 at 16:44 comment added Andy There is a difference between the posts detected and the posts flagged. Flagged posts are done based on a combination of the reasons you see. Posts fall into multiple categories and it's those multiple categories that determine if a post will be flagged or not. Individual reasons are less accurate on their own.
Mar 6, 2018 at 16:27 comment added WGroleau Of the roughly 76 active checks on that dashboard, I can see that at least 24 are so far from accurate that they need to be retired. Probably more than half of them should be, including any that are less than 95% accurate.
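A toy sketch of the combination logic Andy describes in the comment above: each detection reason carries a weight, and a post is only autoflagged when the weights of all its matched reasons sum past a threshold, so an individually inaccurate reason never fires alone. The weights and threshold below are invented for illustration; the real values live in metasmoke.

```typescript
// Toy model of reason-weight autoflagging (illustrative numbers only).
interface Reason {
  name: string;
  weight: number; // hypothetical weight; real weights are maintained in metasmoke
}

const AUTOFLAG_THRESHOLD = 280; // hypothetical threshold

function shouldAutoflag(matchedReasons: Reason[]): boolean {
  // A post matching several reasons accumulates their weights;
  // one weak reason on its own never crosses the threshold.
  const total = matchedReasons.reduce((sum, r) => sum + r.weight, 0);
  return total >= AUTOFLAG_THRESHOLD;
}

// Two medium-weight reasons together trigger a flag...
console.log(shouldAutoflag([
  { name: "bad keyword in body", weight: 160 },
  { name: "URL in title", weight: 150 },
])); // true

// ...but either one alone does not.
console.log(shouldAutoflag([{ name: "URL in title", weight: 150 }])); // false
```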
Mar 1, 2018 at 14:19 history edited Sonic the Anonymous Hedgehog CC BY-SA 3.0
added 183 characters in body
S Nov 8, 2017 at 19:35 history bounty ended canon
S Nov 8, 2017 at 19:35 history notice removed canon
S Nov 6, 2017 at 20:45 history bounty started canon
S Nov 6, 2017 at 20:45 history notice added canon Reward existing answer
May 23, 2017 at 12:36 history edited CommunityBot
replaced http://stackoverflow.com/ with https://stackoverflow.com/
Mar 20, 2017 at 10:30 history edited CommunityBot
replaced http://meta.stackexchange.com/ with https://meta.stackexchange.com/
Mar 1, 2017 at 19:18 history edited ArtOfCode CC BY-SA 3.0
deleted 1 character in body
Mar 1, 2017 at 18:22 history edited Shog9
edited tags
Mar 1, 2017 at 15:34 answer added moooeeeep timeline score: 4
Mar 1, 2017 at 12:51 comment added ArtOfCode @SteveBennett They're not unwilling to help us out with resources and tools, but they don't have a lot of spare developer time to do it with - so, if we can do it ourselves (even if it's not quite optimal), that's often a better way.
Mar 1, 2017 at 5:24 comment added Magisch @SteveBennett SE is in a tough spot here. They're very busy, and giving us extra tools takes work on the part of the devs. We try not to take up too much of their attention where it isn't warranted. There are things in the works, but they take time.
Mar 1, 2017 at 0:29 comment added Steve Bennett @Magisch Yeah, I guess I don't understand the premise here. You're doing a ton of work for SE. Are they actually unwilling to give you the tools and access you need for your stuff to work better? Why?
Feb 28, 2017 at 14:21 comment added ArtOfCode @Bohemian That's not something we're currently set up to do. I suppose it would be possible, with a bunch of work, but it's not the project's focus and I would think long and hard before doing it. It would also require all users to install a userscript, unless Stack Exchange were willing to do some work on their end too.
Feb 27, 2017 at 22:37 comment added user315433 @Bohemian Makes sense, although that's a job for SE developers; a community project can't do it. And SE has tried, without much success, to implement such a system.
Feb 27, 2017 at 21:29 comment added Bohemian @Andy you're all not understanding what I am saying. There is no flag! Or flagger! Let me spell it out: 1) a user thinks a post is "spam" and clicks "Flag" > "Spam"; 2) in real time, a SmokeDetector service is synchronously invoked via an AJAX call from the user's browser to assess the spamminess of said post; 3) if the post is not "spammy" enough, a warning message ("this doesn't look like advertising, are you sure?") is shown. The idea is to intercept the flag as it is being made, not afterwards.
Feb 27, 2017 at 19:54 comment added Bohemian @M.A.R. No. Analyze in real time the current post which is in the process of being flagged as spam to verify that it indeed looks like spam and challenge the user who is flagging to justify the flag if it doesn't look like spam.
Feb 27, 2017 at 19:52 comment added M.A.R. @Boh but for that we need to see who flagged something, no?
Feb 27, 2017 at 19:47 comment added Bohemian @M.A.R. I meant to engage it during the flagging process, to stop inappropriate spam flagging during the flagging process
Feb 27, 2017 at 19:39 comment added M.A.R. @Bohemian I'm not sure that's possible, since I don't think any such tool exists in the API. We can't get data for flags on a post.
Feb 27, 2017 at 18:57 comment added Bohemian Can SmokeDetector be set up to detect and reject invalid spam flags? The vast majority of spam flags I see as moderator on SO are on posts that are not spam; in fact, usually there's no mention of any site, product or anything like that. For some reason users regularly flag nonsense as "spam", which wastes moderators' time. If it can detect spam, then the reverse is also true which can be used to warn users when they try to flag non-spam as spam with a message like "This doesn't look like spam - are you sure there is advertising here?".
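A minimal client-side sketch of the interception flow Bohemian proposes in the thread above, assuming a hypothetical spam-check endpoint and response shape (no such service exists today; this illustrates the proposal, not SmokeDetector's actual behavior):

```typescript
// Hypothetical pre-flag spam check (sketch of the proposal only).
const SPAM_CHECK_URL = "https://example.invalid/api/spam-check"; // placeholder

interface SpamVerdict {
  looksLikeSpam: boolean; // hypothetical response field
}

async function confirmSpamFlag(postBody: string): Promise<boolean> {
  // Step 2 of the proposed flow: synchronously assess the post's spamminess
  // from the flagger's browser before the flag is submitted.
  const res = await fetch(SPAM_CHECK_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ body: postBody }),
  });
  const verdict: SpamVerdict = await res.json();

  // Step 3: if the post doesn't look spammy, challenge the flagger.
  if (!verdict.looksLikeSpam) {
    return window.confirm("This doesn't look like advertising. Are you sure?");
  }
  return true; // looks like spam; let the flag through
}
```

As ArtOfCode notes above, shipping something like this would require either every flagger installing a userscript or changes on Stack Exchange's side.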
Feb 27, 2017 at 9:44 comment added angussidney @EJP I'm not entirely sure what you're saying here. The whole point of review audits is to check whether or not you're being accurate with your reviewing. If review audits were always spam, then robo-reviewers would know to always click spam so that they don't get caught. Anyway, as David said earlier: review audits have nothing to do with Charcoal or SmokeDetector.
Feb 27, 2017 at 9:34 comment added user207421 @DavidPostill So why can't something useful be used as an audit, instead of a pointless spam check? There is a cognitive dissonance here. Reviewers are being asked to waste time to prove that they aren't wasting time.
Feb 27, 2017 at 8:56 comment added Magisch @SteveBennett Failing SE giving us special treatment in this regard, that's all we can do. Charcoal is a community effort, not affiliated with SE, so we have to operate largely within the boundaries of normal users. We're also not "appropriating" human accounts in order to act; the users explicitly give us their permission to do so.
Feb 27, 2017 at 5:07 comment added Steve Bennett Just seems a bit gross that this whole thing lives outside SE, and has to appropriate human accounts in order to act. But hey, if you're willing to work in those conditions...
Feb 26, 2017 at 22:53 comment added AStopher Obligatory xkcd comic.
Feb 26, 2017 at 21:29 comment added Undo @Mahesh I'm seeing a data-method="post" on that button. It shouldn't have been changed in the last 24 hours; I'd be interested to see what HTML you're seeing.
Feb 25, 2017 at 19:49 comment added Magisch @RudolfL.Jelínek To put M.A.R.'s comment in perspective: querying metasmoke, only ~2,900 of the ~45,000 spam posts we've caught so far contain "diet" Source
Feb 25, 2017 at 18:35 comment added M.A.R. @Rudolf I haven't taken a look at data, but off the top of my head, spam containing diet makes up a small proportion of what constitutes spam that SE sites get, and there are quite a lot of posts on, for instance, Health.SE and Biology.SE that use this legitimately. We all wish it were that simple. :)
Feb 25, 2017 at 15:37 comment added ArtOfCode @Mahesh mind throwing an issue on the metasmoke repo, so I remember to check on that?
Feb 25, 2017 at 14:19 comment added Mahesh @ArtOfCode I got that from following your wizard. Looks like the wizard's button was just a plain link and sent a GET request.
Feb 25, 2017 at 13:46 comment added user320844 Just delete posts with the word "diet" to catch 90% (95%?) of the spam?
Feb 25, 2017 at 6:17 history edited angussidney CC BY-SA 3.0
edited body
Feb 25, 2017 at 4:32 comment added angussidney @user2284570 You're right. And while we do take multiple precautions to avoid false positives being deleted (humans must flag, report must hit multiple reasons to be autoflagged, etc), there is always going to be some false positives. As we said in the post, if you'd like to see us retract flags, please go vote for this [feature-request].
Feb 25, 2017 at 4:02 comment added Andy Every system is going to have false positives. To reduce that, we require humans to cast flags too.
Feb 25, 2017 at 3:56 comment added user2284570 @Andy: are you sure your bot is definitely unaffected by the poodle/human facial recognition problem? Or, more recently, the semi-trailer truck being recognized as a road sign? I know that even in the case of spell checking, where bots reach better-than-human accuracy, they will always make errors that humans never would. In my case my actions were seen as deliberately harmful.
Feb 25, 2017 at 2:27 answer added Christopher King timeline score: 4
Feb 25, 2017 at 1:11 comment added angussidney @JasonC I think someone picked a theme for #3 while the others just used the google docs blank theme
Feb 25, 2017 at 1:02 comment added Jason C Why is RFC 3 so much prettier than the first two? Did somebody drink too much coffee that night?
Feb 24, 2017 at 23:56 comment added ArtOfCode @EJP We're not affiliated with Stack Exchange; none of us are SE staff, and this is a community project - we have nothing to do with review audits.
Feb 24, 2017 at 16:23 comment added endolith “Once they became self-modifying, spam-filters and spam-bots got into a war to see which could act more human, and since their failures invoked a human judgement about whether their material were convincingly human, it was like a trillion Turing-tests from which they could learn. From there came the first machine-intelligence algorithms, and then my kind.”
Feb 24, 2017 at 15:40 comment added DavidPostill @EJP Because the whole purpose of review audits is to test whether you are paying attention ... and in any case there is no link between review audits and Charcoal ...
Feb 24, 2017 at 15:18 comment added user207421 If you're so good at spotting spam why are you asking reviewers to do it as well, in review audits?
Feb 24, 2017 at 11:10 comment added ArtOfCode @Mahesh we're in prod mode, but with traces turned on for debugging. Your link is a POST route only, so GET requests result in routing errors.
Feb 24, 2017 at 10:38 answer added Nemo timeline score: 18
Feb 24, 2017 at 7:53 comment added Mahesh @Andy Getting a routing error on metasmoke.erwaysoftware.com/flagging/run_ocs. Also, it looks like Rails is left in dev mode; it should've been a 404 instead of stack traces.
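For readers following this routing thread (continued in the comments above): the wizard rendered the action as a plain link, so the browser issued a GET against a route registered only for POST; Rails' `data-method="post"`, which Undo points to, has the UJS script intercept the click and submit a POST instead. A minimal Express analogy of the mismatch, not metasmoke's actual Rails code:

```typescript
import express from "express";

const app = express();

// The action exists only as a POST route, mirroring the POST-only Rails route.
app.post("/flagging/run_ocs", (_req, res) => {
  res.send("one-click spam flags cast");
});

// A plain <a href="/flagging/run_ocs"> issues a GET, which matches nothing
// above and falls through to a 404 rather than a debug stack trace.
app.use((_req, res) => {
  res.status(404).send("Not Found");
});

app.listen(3000);
```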
Feb 23, 2017 at 19:43 comment added BoltBait Did Smokey mark this post as spam? If not, it is broken.
Feb 23, 2017 at 15:40 comment added James Is this post spam?
Feb 23, 2017 at 14:22 comment added JohnLBevan Thanks @Andy; agreed that without SE explicitly contributing to further the automation of Smokey (e.g. by creating several accounts with a fixed level of reputation across all sites), that makes sense.
Feb 23, 2017 at 14:12 comment added Andy @JohnLBevan The maintenance of such a network would be challenging. We'd need multiple accounts for every site on the network (plus managing that every time a public beta launched) plus the required reputation needed to flag on each site. We'd spend more time managing the accounts than we would fighting spam.
Feb 23, 2017 at 13:50 comment added JohnLBevan Thanks @Andy; is there any argument against having 100 dedicated Smokey "normal user" bot accounts?
Feb 23, 2017 at 13:45 comment added Andy @JohnLBevan, Everything we do is done via the API. If there is a major problem, SE has the ability to see what is done using our application key. As for the unrestricted access, that seems more dangerous because someone needs to be able to use those credentials. Since Smokey is run by community members and not SE itself (like the Community User), that would mean a user has moderator (or higher) level access. We've been careful to build the system to not allow users with diamonds that ability to flag spam on their sites. We want to keep a human in the loop.
Feb 23, 2017 at 12:10 comment added Toby Who the hell writes at 36pt?
Feb 23, 2017 at 11:11 comment added JohnLBevan Great system & write up. One question; if SE are onboard with this, why do you need real user's accounts to flag things; couldn't they give Smokey an unrestricted account; or if that's problematic give it a few hundred designated accounts. That seems safer as it's then clear what's done by the bot vs a human, and avoids any risk of future misuse of this privilege (not that you would, but when talking of spam and security that option should be taken into account).
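A sketch of what flagging through the API, as Andy describes above, roughly looks like, assuming the authenticated Stack Exchange API write methods `flags/options` (to discover which flags the user may raise) and `flags/add`. The key and token are placeholders; the per-user access token is exactly the permission users explicitly grant. This is illustrative, not Smokey's actual code:

```typescript
// Sketch of casting a spam flag via the Stack Exchange API v2.2.
// APP_KEY and ACCESS_TOKEN are placeholders; each flag is cast with a
// token an individual user has explicitly granted to the application.
const API = "https://api.stackexchange.com/2.2";
const APP_KEY = "app-key-here";         // placeholder application key
const ACCESS_TOKEN = "user-token-here"; // placeholder per-user token

async function castSpamFlag(questionId: number, site: string): Promise<void> {
  // 1. Ask the API which flag options this user may raise on this post.
  const optRes = await fetch(
    `${API}/questions/${questionId}/flags/options` +
      `?site=${site}&key=${APP_KEY}&access_token=${ACCESS_TOKEN}`
  );
  const options = (await optRes.json()).items as {
    option_id: number;
    title: string;
  }[];

  const spam = options.find((o) => o.title.toLowerCase().includes("spam"));
  if (!spam) return; // user can't spam-flag this post (already flagged, etc.)

  // 2. Cast the flag; write methods take form-encoded POST bodies.
  await fetch(`${API}/questions/${questionId}/flags/add`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      option_id: String(spam.option_id),
      site,
      key: APP_KEY,
      access_token: ACCESS_TOKEN,
    }),
  });
}
```

Using the application key, as Andy notes, means SE can audit everything done through it, while the humans who granted their tokens stay in the loop.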
Feb 23, 2017 at 11:05 answer added Matthieu M. timeline score: 11
Feb 23, 2017 at 9:48 comment added angussidney @fedorqui actually, we didn't think it through that much; we just wanted to halve the number of spam flags required :) However, what you've linked there does reinforce our decision to go with 3 flags
Feb 23, 2017 at 9:41 comment added fedorqui 'SO stop harming' Amazing job, well done! It is probably worth explaining the reason for using three flags. I assume it is from What are the “spam” and “rude or abusive” (offensive) flags, and how do they work?: 3 flags on a question (spam or rude or abusive): question is banished from the front page and all question lists except search results.
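The arithmetic behind that choice, combining the thresholds fedorqui links with the 3-flag cap mentioned elsewhere in these comments: 3 spam flags hide a question from the front page and question lists, 6 flags delete the post, so capping the bot at 3 automatic flags halves the human flags required while still keeping at least 3 humans in every deletion. A toy sketch:

```typescript
// Toy model of the spam-flag thresholds discussed above.
const HIDE_THRESHOLD = 3;   // question leaves the front page / question lists
const DELETE_THRESHOLD = 6; // post is deleted outright
const MAX_BOT_FLAGS = 3;    // cap on automatic flags per post

function humanFlagsStillNeeded(botFlags: number): number {
  const capped = Math.min(botFlags, MAX_BOT_FLAGS);
  return Math.max(DELETE_THRESHOLD - capped, 0);
}

// Even at the full cap, humans must cast at least half the deleting flags.
console.log(humanFlagsStillNeeded(3)); // 3
```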
Feb 23, 2017 at 9:08 history edited angussidney CC BY-SA 3.0
fix incorrect info
Feb 23, 2017 at 9:07 comment added angussidney @JacobRaihle thanks for pointing that out, you can blame Jon Ericson for editing in the wrong info :)
Feb 23, 2017 at 8:53 comment added Jacob is on Codidact The description of the first graph seems wrong - the green line is the total amount of reports, the blue line is the amount of true positives, and the number of reports decreases as the minimum weight increases.
Feb 23, 2017 at 8:01 review Suggested edits (completed Feb 23, 2017 at 8:16)
Feb 23, 2017 at 7:04 comment added angussidney @billynoah -1, your spam is too grammatically correct
Feb 23, 2017 at 6:14 comment added Eaten by a Grue This would be cool except for the fact that I JUST LOST 20 LBS IN LESS THAN 2 WEEKS WITH THIS NEW DIET! Click HERE to learn more!
Feb 23, 2017 at 5:20 comment added TigerhawkT3 Given that the bot has higher accuracy than the average user, and the bot arrived at that accuracy via feedback from a small group of certain users, it seems like doing something about those incompetent flaggers with a FallibleHumanFlaggerDetector is the next logical step. :P
Feb 22, 2017 at 23:30 comment added Largato The 3rd graph is not a hat. That's a boa constrictor digesting an elephant.
Feb 22, 2017 at 23:08 answer added StudyStudy timeline score: 12
Feb 22, 2017 at 20:35 comment added Undo @ShiranDror The first number is from long before we started automated flagging, so depending on what you're asking there is no overlap (none of the posts from the first number had flags from the bot, as the bot wasn't running then). If you're asking what proportion in the first number would have been flagged, we don't have the necessary data to find that out.
Feb 22, 2017 at 20:29 comment added ArtOfCode All of them, @Shiran. We only cast 3 flags on a post at maximum, and it takes 6 flags to remove a post - so the other 3 flags are cast by humans.
Feb 22, 2017 at 20:28 comment added Shiran Dror You showed the percent of posts flagged by users and accepted by moderators (95.4%) and that the bot accurately flags (99.5%). How many of these posts were flagged both by the bot and users? @ArtOfCode
Feb 22, 2017 at 20:25 comment added ArtOfCode What do you mean by the overlap, @ShiranDror?
Feb 22, 2017 at 20:25 comment added Shiran Dror What is the overlap between human and bot spam flags?
Feb 22, 2017 at 19:17 history edited Undo CC BY-SA 3.0
deleted 6 characters in body
Feb 22, 2017 at 18:34 history edited This_is_NOT_a_forum CC BY-SA 3.0
Active reading. [<https://en.wiktionary.org/wiki/meantime#Noun> <https://en.wiktionary.org/wiki/noon#Noun>]
Feb 22, 2017 at 17:58 comment added Andy @user3791372, As a general rule, spammers are lazy. Very few spammers will read this or dig into it more. The few that do are the ones that were already actively working to avoid detection anyway.
Feb 22, 2017 at 17:57 comment added user3791372 Isn't information like this helping the spammers evade detection? Kinda like how those empty repos on GitHub, with a readme that is just one huge advertisement for a commercial service and a few fake issues, evade GitHub's cleanup?
Feb 22, 2017 at 17:50 history edited Undo CC BY-SA 3.0
update graph
Feb 22, 2017 at 17:30 history edited Jon Ericson CC BY-SA 3.0
Feature the post and clarify the first chart's labels.
Feb 22, 2017 at 16:51 history edited ArtOfCode CC BY-SA 3.0
added 116 characters in body
Feb 22, 2017 at 16:48 history edited Undo CC BY-SA 3.0
added 252 characters in body
Feb 22, 2017 at 16:38 history edited ArtOfCode CC BY-SA 3.0
added 432 characters in body
Feb 21, 2017 at 9:28 history edited Shadow Wizard
edited tags
S Feb 21, 2017 at 4:18 history suggested Pandya
As the post deals with flagging, added relevant tag.
Feb 21, 2017 at 3:24 comment added Andy @JasonC This? or this?
Feb 21, 2017 at 3:19 review Suggested edits (completed Feb 21, 2017 at 4:18)
Feb 21, 2017 at 2:48 comment added Jason C And I'm very happy to see a public post about it here!
Feb 21, 2017 at 2:00 history edited angussidney CC BY-SA 3.0
edited body
Feb 20, 2017 at 20:35 comment added Monica Cellio Charcoal team: excellent work! Thank you for all the effort you've put into this (and will continue to put in). This is freaking awesome.
Feb 20, 2017 at 20:25 answer added Petter Friberg timeline score: -17
Feb 20, 2017 at 20:10 comment added Daniel Fischer Thanks, Andy. So more or less exact numbers. (Though if the number is about the posts, the actual flagging stats are probably better, since mistakenly spam-flagged posts tend to only get one flag [and frivolously spam-flagged posts only get more than one flag if the flagger uses socks], while actual spam usually gets several user flags.)
Feb 20, 2017 at 20:00 comment added Andy @DanielFischer, Those numbers/percentages were provided by a community manager. That stat means that of all the spam posts raised, X% were deleted as spam. On Stack Overflow, 85% of the posts flagged as spam were deleted as spam. Across the rest of the network, 95% of the posts flagged as spam were deleted as such.
Feb 20, 2017 at 19:59 answer added SpockPuppet timeline score: 23
Feb 20, 2017 at 19:55 comment added Daniel Fischer What precisely does an "accuracy of P%" in spam-flagging mean? For Smokey it's probably the percentage of flags raised (earlier of reports) on posts that were subsequently deleted as spam. But for the human users? Have you exact numbers of spam flags raised and data on the subsequent fate of the posts?
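A short sketch of the accuracy measure as Daniel Fischer and Andy pin it down in this thread: accuracy is the fraction of spam-flagged posts subsequently deleted as spam. The counts below are illustrative only:

```typescript
// Accuracy as discussed above: flagged-and-deleted / flagged.
function spamFlagAccuracy(flaggedAsSpam: number, deletedAsSpam: number): number {
  return deletedAsSpam / flaggedAsSpam;
}

// Illustrative counts: 85% on Stack Overflow and 95% across the rest of
// the network correspond to e.g. 850 and 950 deletions per 1000 flags.
console.log(spamFlagAccuracy(1000, 850)); // 0.85
console.log(spamFlagAccuracy(1000, 950)); // 0.95
```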
S Feb 20, 2017 at 17:51 history suggested Mithical CC BY-SA 3.0
added link to which rooms
Feb 20, 2017 at 17:50 review Suggested edits (completed Feb 20, 2017 at 17:51)
Feb 20, 2017 at 17:24 history migrated from meta.stackoverflow.com (revisions)
Feb 20, 2017 at 15:59 answer added rene timeline score: 113
Feb 20, 2017 at 15:56 comment added Brad Larson "An interesting thing this graph shows is that time to deletion during spam hour was higher when we didn't cast any automatic flags. It was removed faster outside of spam hour." - Guessing this correlates with the time zones that moderators tend to be active, which is something we've seen before when it comes to how long spam lives when flagged.
Feb 20, 2017 at 15:39 answer added TylerH timeline score: 72
Feb 20, 2017 at 15:33 comment added Seth Good job everyone! Looks amazing. Smokey itself is already fantastic, and the automated flagging looks neat! I hope that the proposed change to the API makes it sooner than later.
Feb 20, 2017 at 15:24 comment added ArtOfCode /me is leaving a comment here so I'm pingable. I'm one of those elusive "system administrators" this talks about.
Feb 20, 2017 at 15:22 history asked Andy CC BY-SA 3.0