
Meta notes: this was originally posted as an answer in response to "What is the network policy regarding AI Generated content?". Some of the requests for more information have been addressed in "GPT on the platform: Data, actions, and outcomes", and I have written my response to that in an answer post there.


We recently performed a set of analyses on the current approach to AI-generated content moderation.

How exactly did you perform it? (partially answered)


The conclusions of these analyses strongly indicate to us that AI-generated content is not being properly identified across the network, and that the potential for false-positives is very high.

Can you please show us the data? (partially answered)

And what does hard data have to do with "potential for false-positives"? (I don't see what you could possibly be extrapolating such a conclusion from.) Are you speculating about whether deleted content was really AI-generated? Are you also speculating that the only reason deleted content with AI-related flag reasons was deleted was that it was AI-generated? There are other possible reasons, such as not following the rules for referencing content written by others.


we also suspect that there have been biases for or against residents of specific countries as a potential result of the heuristics being applied to these posts.

Can you please spell out what exactly is substantiating your suspicions? Is it usernames? Avatar images? Profile location fields?

If a large percentage of the flags are on content written by people suspected to be from a specific set of countries, how exactly does that indicate antagonistic bias against those users? Have you ruled out the possibility that users from those countries are just proportionally violating the current AI-generated-content policies more than people from other countries? Or that a larger proportion of users from those countries are violating the policy compared to proportions of users in other individual countries? The reality is that there are countries where lying and fraud are just part of business / culture. Why would you rule out the possibility that certain cultures or subcultures care less about following the current AI-generated-content policy and won't mind violating our policies to do... whatever it is they're trying to?

For example, Wikipedia has a page on List of countries by intentional homicide rate. If you look at that page, you will find that the rate distribution by country is not, in fact, uniform (flat)... Does that make the people who sourced or compiled that information racist? Obviously not. So even if I observe that usernames or profile pictures of users whom I have strong reason to suspect of violating per-site policies (even by self-admission/"confession") come from a particular demographic (which I do), that does not make me racist either, as long as those usernames and profile pictures do not form part of my analysis (which they do not).
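
To make that concrete, here's a minimal sketch (all numbers hypothetical, my own toy example, not anything SE has published) of how one could actually test whether flags are skewed relative to the contributor base rate, rather than eyeballing usernames:

```python
# A minimal sketch (hypothetical numbers throughout): before calling a skewed
# flag distribution "bias", check it against the base rate of contributors.
from scipy.stats import chisquare

# Hypothetical share of active contributors per country (the base rate).
contributor_share = {"A": 0.40, "B": 0.35, "C": 0.25}

# Hypothetical counts of AI-content flags observed per country.
observed_flags = {"A": 520, "B": 310, "C": 170}

total = sum(observed_flags.values())
# Expected flag counts if flags were distributed purely by contributor share,
# i.e. the "no bias, no per-country behaviour difference" null hypothesis.
expected = [contributor_share[c] * total for c in observed_flags]

stat, p = chisquare(list(observed_flags.values()), f_exp=expected)
print(f"chi-square = {stat:.1f}, p = {p:.2g}")
# A small p-value only says the flag distribution deviates from the base
# rate. It cannot tell you WHY: moderator bias and a genuinely higher
# violation rate in some countries produce exactly the same skew.
```

Even a significant deviation in a test like this wouldn't distinguish moderator bias from a genuinely higher violation rate in some countries; both look identical in the data.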

Adding to that, there is data that shows that developers in different countries have different sentiments about AI tools: Your 2023 SO dev survey.

  • 83.6% of respondents from India use or plan on using AI tools, and from Brazil, 78.0%. (the top two countries in that response category)
  • 82.2% of respondents from Brazil view AI tools favourably. (the top country in that response category)
  • 15.4% of respondents from India think the most important benefit to AI tools is improved accuracy in coding, and from Brazil, 13.4%. (the top two countries by proportion of that response)
  • 55.2% of respondents from India trust in the accuracy of AI, and from Brazil, 45.0%. (the top two countries in that response category)

Finally, internal evidence strongly suggests that the overapplication of suspensions for AI-generated content may be turning away a large number of legitimate contributors to the site.

The purpose of the suspensions and ban policies on Stack Overflow is rate-limiting bad content. If those people want to contribute "legitimately" (in accordance with site-policies), they can wait for the suspension to pass. I'm pretty sure the people who actually want to do the right thing will generally care enough and have enough grit to actually learn from their mistake and try again. The people who don't will give up. I see that as an absolute win. A one-week suspension is less than a slap on the wrist.

Would you please consider making that "internal evidence" less internal? (an answer was attempted; I find the attempt unsatisfactory / flawed / incomplete)


In order to help mitigate the issue, we've asked moderators to apply a very strict standard of evidence to determining whether a post is AI-authored when deciding to suspend a user.

This standard of evidence excludes the use of moderators' best guesses based on users' writing styles and behavioral indicators, because we could not validate that these indicators are actually successfully identifying AI-generated posts when they are written.

What exactly is this standard? You say what it excludes, but not what it includes.

Considering that the question post is titled "What is the network policy regarding AI Generated content?", and the body asks "Earlier this week, Stack Exchange released guidance to moderators on how to moderate AI Generated content. What does this guidance include?", the instant-self-answer doesn't really answer the question.

Scanning tools may not be highly accurate, but they had at least one good property: being concrete.

There are good, non-biased ways to evaluate the probability that content is AI-generated that are not based on hunches: look at the writing style and phraseology and compare them with the post owner's past writing. If there's a huge difference, that's a big red flag.
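
As an illustration of what "compare it with the post owner's past writing" could look like mechanically, here's a minimal, stdlib-only sketch. The sample posts and the trigram-profile approach are my own toy example, not anything SE or its moderators actually use:

```python
# Illustrative only; real stylometry is much more involved. This compares
# character trigram profiles of a new post against the author's past posts.
from collections import Counter
import math

def trigram_profile(text: str) -> Counter:
    """Frequency count of character trigrams, a cheap style fingerprint."""
    text = " ".join(text.lower().split())  # normalize whitespace and case
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

past_posts = ["plz help my loop dont work, i tried everythng...",
              "thx, that fixd it. but y does python do that??"]
new_post = ("Certainly! The issue arises because the loop variable is "
            "shadowed. Here is a revised, idiomatic implementation.")

profile_past = trigram_profile(" ".join(past_posts))
similarity = cosine_similarity(profile_past, trigram_profile(new_post))
print(f"style similarity to past writing: {similarity:.2f}")
# A low score is only a red flag worth a closer human look,
# never proof of AI authorship on its own.
```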

Also... why do you point at problems with "bias" and then ban an approach that you haven't made any argument about being biased? What's the connection? (You later say scanners have high false-positive rates, but I don't see any clear connection between false positives and bias.)

... If anything, a scanner would prevent bias against users based on things like their username, avatar image, and profile location field (assuming you don't plug that info into the scanner, which I'm almost certain nobody does).

And if people don't plug that info into the scanner, what else is there except the content, written in near-perfect English by the one and the same ChatGPT? Unless someone prompted along the lines of: "Please write me an answer to the following question in the style of a <ethnic/racial/geographical group> person". And let me tell you as matter-of-factly as I can that, from what I've seen in usernames, I don't think these policy violators would feel very inclined to do that.


This standard would exclude most suspensions issued to date.

Again, can we please see the numbers and facts?


We've also identified that current GPT detectors have an unacceptably high false positive rate for content on our network and should not be regarded as reliable indicators of GPT authorship. While these aren't the sole tools that moderators rely upon to identify AI-generated content, some of the heuristics used have been developed with their assistance.

Can you please show us the numbers and your methodology? (some attempt made at answering)


As always, moderators who identify that a user has a problematic pattern of low-quality posts should continue to act on such users as they otherwise would. Indicators moderators currently use to determine that a post was authored with the help of AI can in some cases form a reliable set of indicators that the content quality may be poor, and moderators should feel free to review posts as such. If someone is repeatedly contributing low-quality content, we already have policies in place to help handle it, including a suspension reason that can, in those cases, be used.

A friendly reminder to readers that we have a list of ChatGPT or other AI-related discussions and policies for our sites, and that at the time of this writing, ChatGPT is banned on SO, and this decision is (was?) supported by SE. Quoting from the SO banner announcement:

We’ve just published a new Help Center article outlining our expectations and rationale for GPT-generated content on Stack Overflow and decided, together with moderators, to add a banner for all users pointing to it. We've also explicitly allowed more leeway for moderators in how they handle suspensions for this matter.

It seems you've changed your tune now.

I suspect there's a deeper reason behind what you've given in your new policy post and I'm curious about what it is... traffic dropping?

Why not make us part of the conversation? Maybe we could help. A post here on MSE like "We're concerned about traffic dropping ever since the release of ChatGPT. How can Stack Exchange and Stack Overflow for Teams continue to stay relevant and competitive in the knowledge-sharing space?" is one way to do that.

  • 31
    They're burning through community goodwill like nuts right now... even if the deletion of feedback from the policy post truly is just a formality, e.g. allowing answers was a genuine mistake, it's still in suuuuuper bad taste, and the fact that staff/ CMs haven't already jumped in to clarify why they went missing is wild to me. It's a terrible look. Add that to the note on SE violating their own stated policy on amendments to the mod agreement, and you've got a pretty sizeable fire on your hands.
    – zcoop98
    Commented May 30, 2023 at 23:02
  • 2
    @zcoop98 FWIW, CesarM left a comment on my deleted answer "Deleting this because we don't typically host answers on policy posts, if you wish you can create a new question on MSE for it." Not sure that's worth a lot, but maybe something.
    – Chris
    Commented May 30, 2023 at 23:05
  • 3
    @Chris-RegenerateResponse Even that answer deletion explanation is questionable. I just jumped on your question post in relation to this (I hope you don't mind).
    – starball
    Commented May 30, 2023 at 23:13
  • @starball Not at all.
    – Chris
    Commented May 30, 2023 at 23:17
  • 2
    Wouldn't hurt for someone to start a feedback post then. Commented May 31, 2023 at 0:22
  • 9
    Even the claim of detectors' having high false positive rates for S.E. content is perplexing. The detectors return a percentage score purporting to index the likelihood of the supplied text's having been written by ChatGPT; a moderator could then choose a threshold over which to classify a post as having been written by ChatGPT & take action accordingly. Even if the company were under the mistaken impression that this is what moderators have been doing, they'd need to know the threshold to calculate a relevant false positive rate. Commented May 31, 2023 at 10:40
  • 9
    At any rate, they have shared a false positive rate from their study (on the Stack Moderators site) but not the threshold for which they've calculated it. (You can obtain as high a false positive rate as you like by setting the threshold low enough.) Commented May 31, 2023 at 10:40
  • 5
    Might the countries thing also have something to do with the fact that Stack Overflow reputation has real monetary value in some places? It can be a precondition for work, providing more incentive to use things like AI. Commented May 31, 2023 at 20:29
  • 1
    Yes, it does seem like the beginning of the end. For what? Some short-term gains (the numbers must go up again)? It would have been wise to wait until after the AI hangover. Have we just wasted 15 years on another cycle of this? What will come after? What will the new thing be? Commented Jun 1, 2023 at 2:34
  • I understand that this topic is fraying nerves and straining relationships, but I'm too curious not to ask: would it be possible to get out of this thicket and heal up by changing the posture toward AI (or suspected AI) posts to begin with? I'm sure smarter people than I must have considered this, but why not treat AI or likely-AI posts with the usual incentives for contributing good stuff to the site?
    – danh
    Commented Jun 25, 2023 at 3:42
  • @danh wouldn't that be like embracing LMGTFY? except LMCGPTTFY?
    – starball
    Commented Jun 25, 2023 at 7:16
  • @starball - Yes, I guess so. It's a good analogy, too, because the site copes well enough with merely googled answers, and without strife.
    – danh
    Commented Jun 25, 2023 at 13:15

1 Answer


Moved here as suggested by starball (the OP). Context: in response to What is the network policy regarding AI Generated content?


I strongly oppose this new policy

I'm trimming the quotes slightly, but I've done my best to keep their meaning.

... The conclusions of these analyses strongly indicate to us that AI-generated content is not being properly identified across the network, and that the potential for false-positives is very high. Through no fault of moderators ..., we also suspect that there have been biases for or against residents of specific countries as a potential result of the heuristics being applied to these posts ... internal evidence strongly suggests that the overapplication of suspensions for AI-generated content may be turning away a large number of legitimate contributors to the site.

Please cite this evidence. If a user is wrongly suspended (which AFAIK is very rare here), there is a well-established process to handle this (mod messages). If a specific mod has been wrongly issuing a large number of suspensions, we also have established processes to deal with that as well (To my knowledge, all mods involved with AI-content flags require convincing reasons that something is fake and aren't randomly suspending users).

... we've asked moderators to apply a very strict standard of evidence to determining whether a post is AI-authored when deciding to suspend a user. This standard of evidence excludes the use of moderators' best guesses based on users' writing styles and behavioral indicators, because we could not validate that these indicators are actually successfully identifying AI-generated posts when they are written. This standard would exclude most suspensions issued to date.

A few questions

  • A number of users have already been suspended for ChatGPT-answers. Will those currently suspended be unsuspended?
  • While I'm not going to name names, I'm aware of at least one high-rep (over 10k) user who got a suspension for that. Will high-rep users who were previously suspended for that be permitted to run in moderator elections? (Normally, suspended users can't for one year after the suspension unless they get an exception.)
  • What alternative methodology do you propose for finding AI-content? Unless a user outright admits it, there isn't much to go on beyond the text of a post.
  • Also, in a different MSE answer (by a Staff member), they stated "And .. if any site experiences a volume of GPT posts that are cumbersome to manage ... we are always happy to help apply the tools we have at our disposal.". Are you retracting that statement?

We've also identified that current GPT detectors have an unacceptably high false positive rate for content on our network and should not be regarded as reliable indicators of GPT authorship. While these aren't the sole tools that moderators rely upon to identify AI-generated content, some of the heuristics used have been developed with their assistance.

This is a somewhat poor metric. It's known that the detectors have a high FP rate. Can you provide statistics on the FP rate for suspensions? Mods aren't purely using detector results, so looking at wrongful suspensions would be a far better metric.
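
To illustrate why a detector's FP rate in isolation is close to meaningless (as the comments above about thresholds point out), here's a minimal sketch with made-up detector scores; note how the "false positive rate" is whatever the chosen threshold makes it:

```python
# A minimal sketch (made-up scores) of why a detector's false positive rate
# is uninterpretable without the decision threshold: lowering the threshold
# inflates the FP rate as much as you like.

# Hypothetical detector scores for posts known to be human-written.
human_scores = [0.05, 0.12, 0.30, 0.45, 0.52, 0.61, 0.70, 0.88]

def false_positive_rate(scores, threshold):
    """Share of human-written posts the detector would flag as AI."""
    flagged = sum(1 for s in scores if s >= threshold)
    return flagged / len(scores)

for threshold in (0.9, 0.7, 0.5, 0.3):
    fpr = false_positive_rate(human_scores, threshold)
    print(f"threshold {threshold:.1f} -> FP rate {fpr:.0%}")
# threshold 0.9 -> FP rate 0%
# threshold 0.7 -> FP rate 25%
# threshold 0.5 -> FP rate 50%
# threshold 0.3 -> FP rate 75%
```

Measuring the FP rate of suspensions instead would require ground truth about which suspended users actually posted AI-generated content, which is exactly the data being asked for here.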

We've reminded moderators that suspensions (and typically mod messages as well) are for real, verifiable malfeasance only, and should not be enacted on the basis of hunches, guesses, intuition, or unverified heuristics. Therefore, we are not confident that either GPT detectors or best-guess heuristics can be used to definitively identify suspicious content for the purposes of suspension.

Moderators were never allowed to randomly suspend users. That... isn't new. What was allowed was suspending users when they continuously posted answers that were fake in a deliberate attempt to gain rep and cheat the system. Again though, can you share an exact percentage of the number of users wrongly suspended for ChatGPT usage vs. the total number of ChatGPT-related suspensions (or exact numbers)?

As always, moderators who identify that a user has a problematic pattern of low-quality posts should continue to act on such users as they otherwise would. Indicators moderators currently use to determine that a post was authored with the help of AI can in some cases form a reliable set of indicators that the content quality may be poor, and moderators should feel free to review posts as such. If someone is repeatedly contributing low-quality content, we already have policies in place to help handle it ...

Um... what other methods should they use? Votes are a terrible metric [here], as ChatGPT content tends to get a few upvotes/accepts because it looks convincing even when it is totally wrong. So... what do you suggest there?

  • 12
    I really would like to have some stats, even stats from the moderators about how many people got suspended for posting allegedly AI generated content. The company and everyone else is discussing wild stuff, but nobody presents any numbers at all. So what should I believe? Maybe I am an AI??? Commented May 31, 2023 at 7:00
  • 3
    If users got suspended for contributing good AI-generated content, I hope they get unsuspended. If they got suspended for flooding the site with junk, they simply should stay suspended because of that; it doesn't matter how the junk was generated.
    – Philippos
    Commented May 31, 2023 at 13:00
  • 2
    "Can you provide statistics on the FP rate for suspensions" The assertions made in the post you are quoting indicate that the new rules would "exclude most suspensions issued to date", meaning "most" of the users suspended wouldn't have been, under the new rules. Is it safe to say then that their "statistics" are that "most" of the suspensions were false positives?
    – Kevin B
    Commented Jun 1, 2023 at 20:24
  • 2
    Well... not exactly. Clearly they're claiming that most suspensions to date would be invalid. But I'm wondering: of the users suspended where the reason involved a detector, what percentage never posted ChatGPT content to Stack Exchange and so were wrongly suspended? I'm also looking for information on how they determined that suspensions based on detectors have a high rate of FPs, as mods also look at other factors.
    – cocomac
    Commented Jun 1, 2023 at 20:29
  • 2
    I mean, I know what you're asking for; however, that's not what they seem to be basing this decision on. Therefore it's likely this data simply doesn't exist.
    – Kevin B
    Commented Jun 1, 2023 at 20:41
  • 1
    "If a user is wrongly suspended (which AFAIK is very rare here), there is a well-established process to handle this (mod messages). If a specific mod has been wrongly issuing a large number of suspensions, we also have established processes to deal with that as well" that presumes that the user would ever want to do anything with SE after this. In fact, I suspect that the only ones that do are very argumentative by nature and that makes the disqualified to present a solid case.
    – Braiam
    Commented Jun 2, 2023 at 16:38
  • BTW, what stats are there that this is not happening? That content identified as AI is actually generated by AI, and posted as is?
    – Braiam
    Commented Jun 2, 2023 at 16:38
  • 1
    "What was allowed was suspending users when they continuously posted answers that were fake in a deliberate attempt to gain rep and cheat the system" SO policy is a one and you are done. You post stuff that gets identified as AI, you are immediately suspended. That's what their policy does.
    – Braiam
    Commented Jun 2, 2023 at 16:39

