
With the public release of ChatGPT, which uses AI to create conversational responses to questions, there has been discussion of banning such answers from the SE network. At least a couple of SE sites (Stack Overflow & ELU) have introduced bans at a local level, citing both the poor quality of such answers and the fact that posting them is a form of plagiarism.

Should History SE have a policy on AI-generated answers? Or are we happy that the current requirements for references (which aren't supplied in AI answers) and general quality checks are sufficient to deal with them?

  • Related on Meta: Is there a list of ChatGPT discussions and policies for our sites?
    – justCal
    Commented Dec 18, 2022 at 16:05
  • Please review this answer, Open admission of AI authorship, in the context of this question. Should we approve this as approved site policy?
    – MCW Mod
    Commented Apr 20, 2023 at 12:51
  • I am unclear as to the purpose of reviewing the above-linked (deleted) answer. It appears to contain another fiction created by the chatbot: I can find no record that the first source listed, "The Archaeology of Crucifixion" by Vassilios Tzaferis, published in the journal Near Eastern Archaeology in 2001, actually exists, though the individual, the journal, and the article title all appear in separate contexts related to the subject. As in this question we closed, the AI just jumbles things together to make up something that sounds correct.
    – justCal
    Commented Apr 21, 2023 at 12:55

4 Answers


I will play devil's advocate here and recommend a stated policy against posting AI-generated answers.

Our main users may attempt to enforce a no-sources/no-votes policy, but when a question hits HNQ, we get a large influx of non-regulars who pop in and vote, often with no knowledge of the policies we try to uphold here. We have all seen poor questions, which later get closed, first reach the HNQ list. If non-moderators can point to a specific policy in comments, perhaps it might slow down these random votes until moderators can step in. (This assumes, of course, that AI content can be recognized as such; I look forward to seeing the tools mentioned by @MCW.)

The problem with assuming it will be closed by the users goes back to an old discussion we had here about bad questions/answers. If it has no sources but looks OK otherwise, users may not upvote it, but they may not downvote it or vote to close either.

I would in fact recommend a strict policy, requiring that such content be deleted and the suspensions mentioned by T.E.D. be levied. Remove any incentive to 'play games' with the system simply for the purpose of earning rep.

I watched one video about detecting this type of content, which fed the text into another AI to evaluate, here. In the comments below the video there were already discussions of how to beat the system. Please set a policy now and get ahead of those individuals.

For those who think this will not be a problem, there is an answer (since deleted by @MCW) from a few days ago which had two upvotes and only one downvote. It looked OK, so most users ignored it. It tests at the above-linked site with a 99.97% probability of being fake. (The deleted answer is here, visible only if you have enough rep to view deleted answers.)

Another recent answer (12/17), posted (and then self-deleted) here, also fails the test. (Only users with sufficient rep to see deleted answers will see this post.)

So you can see this is an ongoing issue. (It is also worth noting that both of the examples I cite came to the site with the association bonus, so this abuse is not limited to unregistered/new users.)

I will add Steve Bird's comment from below, which points out another danger of allowing this type of content to go unchecked:

If the ChatGPT experiment starts to produce "good enough" answers, it would also be a tool that could be open to abuse by trolls. As we've seen in the past, posting a few 'good' answers to build up rep can be misused to sock puppet and upvote push questions. Being able to quickly generate an acceptable answer with little effort would be a boon to them.

  • If the ChatGPT experiment starts to produce "good enough" answers, it would also be a tool that could be open to abuse by trolls. As we've seen in the past, posting a few 'good' answers to build up rep can be misused to sock puppet and upvote push questions. Being able to quickly generate an acceptable answer with little effort would be a boon to them.
    – Steve Bird
    Commented Dec 14, 2022 at 23:43

Concur - @T.E.D. beat me to it. I believe that existing policy covers this situation. I've processed several flags for AI-generated answers. They tend to fall short of our quality standards - they lack references.

Note that the flags I've processed have referenced some tools that analyze text for the probability of AI generation, and I think it would be a very good idea to collect those tools and link them to this question as a resource.

  • I'm not sure I understand. Can you spell it out for me?
    – MCW Mod
    Commented Dec 17, 2022 at 17:42
  • If I share a link (or links) to answers which fail the AI test, for demonstration purposes in my above answer, would it be considered 'unfriendly' to point out that the user(s) in question are using AI to generate fake answers?
    – justCal
    Commented Dec 17, 2022 at 17:45
  • IMHO, so long as the focus is on improving the site, not on scolding the user, it is not unfriendly.
    – MCW Mod
    Commented Dec 17, 2022 at 17:55
  • @justCal, I find your comment on this intriguing.
    – MCW Mod
    Commented Dec 19, 2022 at 16:44
  • Interesting case; it's the first time I have seen a test prediction fall outside of 95 to 98% (either way). The openAI detector (I believe another AI itself) is apparently still learning and fallible, since I see no reason to doubt the user's response to my query. I am having problems with the site or my computer this morning, but on rechecking, the single edit you made seems to flip-flop the result. If the result is anything under 95% at this point, I think a cautious response is warranted.
    – justCal
    Commented Dec 19, 2022 at 17:48
  • But it supports my thesis that high-quality questions & answers will pass the AI detector. Low-quality questions are similar to AI content - no sources, no specifics, no definitions. There is doubtless a more elegant way to say that; perhaps it is "H:SE relies on scholarship, which AI cannot yet emulate"... not enough time to refine the language.
    – MCW Mod
    Commented Dec 19, 2022 at 18:02

I'd like to see contrary answers (because I have a bad tendency to err on the side of laziness), but my current opinion is that our existing rules - that answers must address the question, not be utter nonsense, and be supported with references - will likely handle this just fine.

If existing users start using it to post bad answers, it doesn't seem totally unreasonable to treat them like any other user who starts posting "badly received" answers (a process that starts with gentle direction and ends in suspensions of geometrically increasing length). When brand new accounts start posting spam or nonsense, they just get destroyed.

OTOH, if the AI starts generating good answers, that's a different kettle of fish. Personally, I'm inclined to think a (legally-posted) good answer is a good answer. But it's my understanding that ChatGPT isn't up to that level for non-trivial questions, and of course it doesn't provide references when it generates text.

  • Would compiling a list of AI generated answers in my response be acceptable, or would it violate the 'be nice' policy?
    – justCal
    Commented Dec 17, 2022 at 17:25
  • @justCal - That's actually kind of funny. And it is a serious flaw in my argument that answers here on Meta in fact do not generally require references, so existing policy doesn't help here. However, I think your good old-fashioned hand-crafted answer, and the large number of upvotes it appears to be getting, is probably making the point just fine. :-)
    – T.E.D. Mod
    Commented Dec 18, 2022 at 1:45

I also agree with MCW and T.E.D. here. If this were a site such as Worldbuilding or Writing where an answer can pretty much be someone's opinion, then there could be an issue. As it is, a good answer here requires references and an answer without references (or nonsensical references) draws attention to itself like a magnet and can be downvoted, custom mod flagged for plagiarism, or flagged as Not An Answer (NAA) if it makes no attempt to answer the question.

I tried out ChatGPT myself and posed it some historical questions. While it did have some basic insights that could be found in a high school history textbook, it was completely incapable of doing any sort of reasoning about more complex topics. When I posed such a question, it would say something like, "There are many possible reasons for [x], some have said [y]...." and then completely fail to cite anyone. Those are low-quality answers if I ever saw one.

I did review our New Answers to Old Questions feed under the 10k tools, and it looks like our current rate of answers is slow enough that each of them can be manually checked. This doesn't apply to sites like Stack Overflow (the originator of the ban), which gets as many posts in an hour as we get in a month. There really is a need there to apply blunt tools that cut down the flow of crap to a reasonable amount.

  • Since you bring up Worldbuilding, they have a discussion of the issue in their Meta.
    – justCal
    Commented Dec 17, 2022 at 17:56
  • @justCal - Gotta admit I find FuzzyChef's short argument there pretty compelling: "If the asker wanted to ask ChatGPT the question, they could have done so themselves. If they're posting here, it's because they want a human to answer."
    – T.E.D. Mod
    Commented Dec 18, 2022 at 3:58
