This question is concerned with finding a possible middle ground in the ongoing debate about banning and detecting AI-generated content, so it will probably get criticized from all sides. I'm prepared.
Nevertheless, I ask myself whether it would be possible to simultaneously:
- ban (identify and remove) AI-generated content that is taken from somewhere as-is, without any fact-checking or improvement whatsoever, and
- keep content that, among other sources, draws on AI-generated content, fact-checks it, and improves whatever can be improved, potentially adding the author's own ideas, and
- do so with a reasonably low error rate in differentiating between the two?
I think this is very difficult, because people who just copy & paste AI-generated content can always claim that they did the fact-checking without really doing it. While some people would certainly fact-check AI content, there are even more lazy people out there, so one cannot expect that only one of the two behaviors will occur at any given time.
It would be desirable because we decided to ban simply copy-and-pasted AI content on the grounds that its quality is too low, yet in many comments we nevertheless agreed that fact-checked and polished content that draws on AI-generated material is not a problem. A reliable way to differentiate between the two would therefore be helpful.
On the other hand, the company recently made a U-turn on banning AI content, questioning the accuracy of AI-generated-content detection altogether (leading to a moderator strike). If we cannot reliably detect AI-generated content at all, there is also no hope of differentiating between the two usages. And even if we could, the second group (AI-generated but human fact-checked and edited content) likely sits somewhere between AI-generated and human-generated content in its characteristics, so one would expect it to be even harder to detect or differentiate.
So maybe this cannot be achieved and there is no middle ground: either all AI-generated content has to be banned or none (meaning the strike will end with one side giving up). Or it is somehow possible after all, and that is the reason for this question.
I think the largest differences are:
- AI-generated content that has been human-checked and edited will contain a working solution, while AI-generated content alone will typically not work (otherwise it would be high quality and much less controversial)
- Human-checked and edited AI-generated content might be more precise and to the point than raw AI-generated content (because humans can trim the content more efficiently)
If we had to fact-check each answer ourselves to be sure it was already fact-checked, this solution would be too inefficient (and typically not feasible for moderators on, for example, Stack Overflow). However, maybe we could require that people actually describe how they made sure their answer works. Is this even possible? Can one credibly prove that one's own answer is more than something copy-and-pasted from some obscure source on the internet?
Maybe personal history could be taken into account to gain more confidence (somebody with a long track record of well-received answers is probably less likely to just copy & paste). Or perhaps the time between posting answers, or the number of saved drafts (even though these signals might not be reliable)?
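As a rough illustration of how such weak signals could feed into a single heuristic, here is a minimal sketch. Everything in it is invented for illustration: the function name, the weights, and the normalization thresholds are hypothetical, not an existing Stack Exchange mechanism or a proposed policy.

```python
# Hypothetical heuristic: combine weak signals into a rough confidence
# score that an answer was actually fact-checked before posting.
# All weights and thresholds below are made up for illustration.

def fact_check_confidence(reputation: float,
                          minutes_since_last_answer: float,
                          draft_revisions: int) -> float:
    """Return a score in [0, 1]; higher = more likely fact-checked."""
    # Long track record of well-received answers -> more trust.
    track_record = min(reputation / 10_000, 1.0)
    # Very rapid posting suggests copy & paste without verification.
    pacing = min(minutes_since_last_answer / 30, 1.0)
    # Several saved drafts hint at editing and verification effort.
    editing = min(draft_revisions / 5, 1.0)
    # Arbitrary illustrative weights that sum to 1.
    return 0.5 * track_record + 0.3 * pacing + 0.2 * editing
```

For example, an established user who spent an hour and several drafts on an answer would score near 1.0, while a brand-new account posting within minutes would score near 0. Such a score could at best flag answers for human review, since each individual signal is easy to game.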
I think this question is interesting in a broader context as well, because copy-and-pasted AI-generated answers are not the only low-quality answers. It might improve the quality of all answers if we had a way to be more certain of how thoroughly they were fact-checked before posting.
For the sake of this question I would exclude issues with attribution and assume that all sources are always properly attributed. This question is just about the content.
I searched within Is there a list of ChatGPT or other AI-related discussions and policies for our sites? for similar discussions. The closest are How can we determine whether an answer used ChatGPT?, ChatGPT assisted questions, and Is re-worded ChatGPT answer allowed?, which touch upon the broader subject of detectability.