12

We have had our first post from the ChatGPT AI answering bot. Or at least, a post flagged by an AI-detecting AI as 99% probably AI-generated. The answer has not been deleted yet: it isn't a very good answer, but it doesn't qualify as NAA and nobody has flagged it VLQ.

There is some angst about AI-generated questions across SE at the moment. SO in particular has a ban on it...
Temporary policy: ChatGPT is banned
That post gives some good reasons for the ban. Various other sites are also banning it, and some have decided to cautiously accept it.

So, this is to solicit the community's opinion. Do we wish to ban answers detected as being AI generated? Do we want to treat them like any other answer?

With only one such answer so far, we aren't suffering any dilution of quality at the moment.

9
  • 6
    I asked ChatGPT its opinion on this matter; it equivocated. "Ultimately, the decision of whether to ban or allow AI-generated answers will depend on the individual needs and preferences of each user or organization."
    – dave
    Commented Dec 11, 2022 at 0:41
  • 2
    How reliable is that detector? Cf. <aiweirdness.com/writing-like-a-robot> Commented Dec 11, 2022 at 12:40
  • 2
    @user3840170 Exactly! And that is another reason why I didn't want to just knee-jerk delete the answer.
    – Chenmunka Mod
    Commented Dec 11, 2022 at 14:42
  • In that case it’s worth distinguishing between banning AI-written answers in principle (regardless of how that is determined) and blanket-deleting answers suspected by a specific tool of being AI-written. Commented Dec 12, 2022 at 13:42
  • IDK. I've never seen any of it because y'all are freaked out and delete it all. Bad answers are what the DV button is for. Answers that don't belong is what flags are for. Is there a SE site that does allow them so I can go see how stupid our new robot overload is? Or can somebody give me their phone number so I can sign up? ;) How much is a million people's phone numbers worth....
    – Mazura
    Commented Dec 17, 2022 at 3:50
As long as there's a human in the loop, that human has to abide by the rules of SE: no sock puppets, and if enough of your answers get deleted we are no longer accepting answers from this account. - When it can solve captchas and create its own accounts, then this is all over anyway.
    – Mazura
    Commented Dec 17, 2022 at 3:57
Actually, here on Retrocomputing we aren't deleting them on sight. At least not unless everybody wants us to.
    – Chenmunka Mod
    Commented Dec 17, 2022 at 9:37
  • A new year and another ChatGPT post on the site. I've deleted it.
    – Chenmunka Mod
    Commented Jan 2, 2023 at 8:38
  • Marked status-review to make CMs aware of our policy forbidding such posts, as per “The "answers must be cited" version of the banner will be enabled network-wide as the default option for all sites, unless”.
    – wizzwizz4 Mod
    Commented Feb 8 at 19:33

9 Answers

23

I'm opposed. Perhaps this is just a knee-jerk Blade Runner anti-AI reaction on my part, but I want to see answers (and questions) from someone who has an actual interest in the subject, not just from someone operating an automated internet tool to synthesize a response.

I understand that SE says that the content of Q+A is the thing that matters, but I don't think that's the entire truth for Retrocomputing, which owes a substantial part of its appeal to first-hand experience.

1
  • But it’s not like we can verify the authenticity of recollections by ostensibly-human posters either… Commented Dec 11, 2022 at 16:52
9

No.

My concern is mostly with fully automated responses: given the cost of GPT versus human writers, the sheer volume of text that can be produced will soon favour GPT over humans.

I have fewer concerns about a response that was partially synthesized by statistical completion under human supervision. For example: taking the output of the prompt "Give an introductory paragraph about the typical workflow using a mainframe computer in the 1960s", reviewing it, and working it into a larger, human-originated synthesis about the shift to CRT terminals in the 1970s.

When I write, I often outline like that. Ah, I will need a paragraph here to develop this idea, and another one to develop that idea, and then I need to link them somehow... automating that is more of a writing aid than an AI-written post, IMHO. But since it will be difficult to distinguish machine-aided from fully automated sources, until we've had some time to work through the intellectual, social, and other implications of this technology, I will apply the precautionary principle and say, at least for now: no.

8

Dilution of quality isn't the only problem. For some of our questions, the answers are externally-verifiable. For example:

For other questions, the answers are the sources:

And some questions are in-between:

For the first kind of question, we can tell when answers are wrong, and it's not so bad. For the second and third kinds, an authoritatively-written answer could be taken as authoritative; for these, I think we should ban the use of ChatGPT with extreme prejudice. Which is better?

  • No information is available online; or
  • Information is available online, but it's wrong.

My proposal

  • For questions about retrocomputing in the modern day, we could allow "AI-generated" answers for now.
  • For questions about well-known, easily-verifiable history (e.g. "what date was this released?"), we could likewise allow these tools.
  • For questions about obscure history, no.
  • For questions about people's motivations, no.
  • For questions where the AI might hallucinate a historical event, a lost artefact, or a quotation from somebody, no.

ChatGPT, and tools like it, are very good at making things up. This could easily lead to citogenesis. How would we know that this has occurred? It's not just the quality of our site that's at stake, here, but our ability to preserve knowledge at all.

xkcd Citogenesis; see transcript.

Where Citations Come From:

Citogenesis Step #1

Through a convoluted process, a user's brain generates facts. These are typed into Wikipedia.
[[A guy with short hair sits at a desk, typing on a laptop.]]
Guy: (typing) The "scroll lock" key was designed by future Energy Secretary Steven Chu in a college project.

Step #2

A rushed writer checks Wikipedia for a summary of their subject.
[[A woman with a ponytail sits at a desk, typing on a desktop.]]
Woman: (typing) US Energy Secretary Steven Chu, (Nobel Prizewinner and creator of the ubiquitous "scroll lock" key) testified before Congress today...

Step #3

Surprised readers check Wikipedia, see the claim, and flag it for review. A passing editor finds the piece and adds it as a citation.
[[A man sits on a couch with a laptop in his lap, typing.]]
Man: Google is your friend, people. (typing) <ref>{{cite web|url=

Step #4

Now that other writers have a real source, they repeat the fact.
[[A flow chart, with "Wikipedia citation" in the center. The word "Wikipedia" is in black, the word "citations" is white with a red background. A black arrow leads from "brain" to "Wikipedia."
A black arrow labeled "words" leads from "Wikipedia" to "careless writers," and a red arrow labeled "citations" leads back to "Wikipedia citations."
A black & red arrow leads from "Wikipedia" to "cited facts" which leads to "slightly more careful writers," which leads to "more citations," which leads back to "Wikipedia" (all black & red arrows).]]
References proliferate, completing the citogenesis process.

Title text:

I just read a pop-science book by a respected author. One chapter, and much of the thesis, was based around wildly inaccurate data which traced back to ... Wikipedia. To encourage people to be on their toes, I'm not going to say what book or author.

1
Good proposal; I don't disagree. External verification is always welcome, and ChatGPT won't provide any (until v2.0?), but on our site we do welcome anecdotal evidence from those of us who used the stuff for real in years gone by. Here the lines blur slightly. We need to keep our eyes open.
    – Chenmunka Mod
    Commented Dec 9, 2022 at 18:28
7

GPT answers should be banned because that’s modern technology. Retro systems use MBR partitioning.

More seriously, though. Any policy that aims to combat misinformation should target it regardless of whether it was LLM-written or not, as a matter of both fairness and practicality:

  • As for the former, it has been pointed out (especially on Stack Overflow) that LLM-written answers don’t create qualitatively different problems. The volume may be higher, but the problems remain fundamentally the same: answers that are incoherent, subtly wrong, misguided, or miss the point of the question entirely; posts that go ‘I don’t know the answer, but here’s something vaguely, tangentially related, just so I can scoop some Internet points from sympathy/A-for-effort upvotes’. And it’s actually those issues that make LLM-written answers problematic, more than being LLM-written on its own; if an LLM always wrote perfectly accurate and relevant answers, it would be much less problematic, putting aside the question of plagiarism.

    I wouldn’t mind that much, for example, if someone were to use an LLM like a ghostwriter instead of a source of knowledge: that is, knowing an answer yourself, you could prompt the model ‘write an answer explaining that X, Y and Z’, refine the response and then post it; as opposed to asking the model for an answer and then copy-pasting it. It could be a great tool for getting around a writer’s block.

  • As for the latter, we don’t really have a reliable way to tell whether an answer was written by ChatGPT or not. False positives are not unheard of (though so far the evidence is mostly anecdotal: see https://meta.stackoverflow.com/q/422066/3840170 and https://www.aiweirdness.com/writing-like-a-robot/). If we take the output of a specific LLM-detection tool too seriously, we may end up throwing the baby out with the bathwater.

Even if a policy banning answers generated from language models makes sense as a matter of plagiarism, the lack of reliable methods of detection means it would amount to a symbolic declaration at best, and an excuse for witch-hunts at worst, mired in false positives and true positives claimed to be false positives.

As such, being LLM-written should generally not be a primary reason to delete an answer. So what interventions would I recommend against such answers? I can think of a few:

  • Be more generous with downvotes, deletion votes, and ‘not an answer’ flags (and perhaps even ‘rude or abusive’ flags), and stricter in handling them, especially watching out for:

    • Links that do not resolve to anything relevant
    • References to never-existent products and companies
    • Anachronisms
    • Long-winded, incoherent rambling
    • Posts that answer a different question from what was asked, even if they appear superficially related
    • Answers giving advice that is misguided or subtly (or not-so-subtly) wrong

    In particular: often, when a post that misses the point of the question is flagged ‘not an answer’, moderators wash their hands of it, hiding behind the policy that ‘flags are not for technical inaccuracies’, even though what was reported was not a mere technical inaccuracy but a failure of basic reading comprehension. That needs to change.

  • Be more stingy with upvotes. Review and fact-check before upvoting. (Even if it’s a human poster!) Don’t upvote answers just because they are long and nicely formatted.

  • Raise the reputation threshold needed to upvote answers, and the penalty when receiving a downvote for an answer. This should reduce the impact of careless drive-by upvoting from HNQs and thwart attempts at reputation farming.

  • Speak up: leave comments when an answer seems fishy. Demand verifiable sources for claims. Be sceptical (though maybe not entirely dismissive) of personal recollections.

These are no panacea, but then, I don’t expect one to appear any time soon.

2
  • 1
    Re references to non-existent products and companies. Well, that's going to make it hard for me to post answers about anything from DEC, ICL, or English Electric.
    – dave
    Commented Dec 18, 2022 at 23:49
Besides the fact that this completely misses the topic of AI answers (as stated), I have to disagree on most points. Fact-checking isn't something moderators are about to do, nor should it result in any action other than up- or downvoting; in fact, even downvoting is nothing that should be tied to simple details being off. An answer may be quite useful despite having errors. That usefulness is what we need to keep in mind. It's not about being right, but being helpful.
    – Raffzahn
    Commented Dec 24, 2022 at 23:34
5

Allow these answers only when clearly marked as such

It's possible (perhaps unlikely, but possible) that AI learning systems could pick up real information that's not known to the crowd here, and present it along with links to credible sources for future readers.

But it's equally likely for them to misunderstand or invent something, present it as established fact, and be completely indistinguishable from the former case.

So machine-generated answers should clearly indicate how they were created, allowing readers to use their own judgement and follow up the references. Anything detected as generated but without the disclaimer should be summarily deleted.

If nothing else, it's clearly fraudulent to pass off someone else's work (or some machine's) as one's own, and that's not something I would like to encourage.

3

No ML (so-called “AI”) content

Besides the very good reasons listed in the SO ban, the output of these models is derived by a program, running on a deterministic computer, from its inputs; it is therefore almost always illegal, because it's a combination of dubiously and unlawfully sourced material, including material under conflicting copyleft and other licences.

0
3

I think it's by now clear it's about this answer.

Let me summarize my reaction when I saw it:

  • I thought it was really bad
  • Only mentioning common 'knowledge'
  • Yes, 'knowledge' is quoted, as all the information presented is platitudes of the worst kind, without any reflection
  • It doesn't touch the subject of the question, which asks about 'why', at all
  • It's more like a rant going astray around a keyword: ELIZA on speed.
  • Or rather, just a random collection of statements without any inherent value.

I guess it's pretty clear I do not give that answer any credit. So why didn't I downvote or flag?

  1. No downvote, as it's right in that soft area of so-so answers: not stating anything wrong, but not providing any content either.
  2. No flagging, as it's neither good nor explicitly bad, just deep in the clueless area.
  3. Being friendly to newbies. For such common keywords we often get new users who simply dump their 'knowledge', useful or not, and this was a new, low-score account.
  4. While I was close to writing a comment about it, I abstained, as it seemed a very fruitless endeavour to argue with people writing texts like that. Not just because of the OP, but also because of possible secondary arguments about me being nasty.

Long story short, I simply decided to ignore it.


It does, like many other answers in such 'common knowledge' areas, repeat what 'must be true'. In the end, that's exactly what AIs can do: provide non-content, non-offensive, not-wrong texts, like the schmooze those coffee-table magazines use to sell their paparazzi pictures.

And that is also exactly the reason why AI-generated texts have no place on RC.SE. AIs cannot provide any information. They do not and never will. RC.SE is a medium intended to increase knowledge, for each single question and for the site as a whole.

There is no filtering of the collected information against knowledge or even basic logic. They only replay what they 'hear' most about around a given topic, resulting not in a scientific summary of that topic, but in collected and averaged gossip. It's well known from process control that the output of systems working on feedback of averaged values will always degrade.

This is the exact counter-thesis to what RC.SE is intended for. We do not want to average knowledge, but to collect the peaks of knowledge.
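That degradation-by-averaging claim can be illustrated with a toy simulation (entirely my own sketch; the `degrade` function, its parameters, and the starting distribution are invented for illustration). Repeatedly replacing each value with a noisy average of sampled values, as a feedback loop does, wipes out the outlier "peaks" while the population collapses toward its mean:

```python
import random
import statistics

def degrade(values, rounds, sample_size=5, noise=0.01, seed=0):
    """Replace each value with a noisy average of randomly sampled values,
    feeding each round's averaged output back in as the next round's input."""
    rng = random.Random(seed)
    for _ in range(rounds):
        values = [
            statistics.mean(rng.sample(values, sample_size)) + rng.gauss(0, noise)
            for _ in values
        ]
    return values

# 90 commonplace answers plus 10 rare "peaks of knowledge".
start = [0.0] * 90 + [10.0] * 10
end = degrade(start, rounds=20)

print(max(start))  # the original peak value
print(max(end))    # after repeated averaging, the peaks have collapsed toward the mean
```

The overall mean is roughly preserved, but the distinctive high values that made the collection worth consulting are gone: exactly the "averaged gossip" failure mode described above.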

Bottom line: There is no place in RC.SE for any kind of automated answers


On a side note, it's a good example of why it's important to have your personal information filled out - something AIs can't really provide.

2

I didn’t find an answer that I felt was the best place to put the following information, so I’m adding an answer. This information is quoted from the ChatGPT web site [emphasis mine]:

Limitations

  • ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as: (1) during RL training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows.

  • ChatGPT is sensitive to tweaks to the input phrasing or attempting the same prompt multiple times. For example, given one phrasing of a question, the model can claim to not know the answer, but given a slight rephrase, can answer correctly.

  • The model is often excessively verbose and overuses certain phrases, such as restating that it’s a language model trained by OpenAI. These issues arise from biases in the training data (trainers prefer longer answers that look more comprehensive) and well-known over-optimization issues.

  • Ideally, the model would ask clarifying questions when the user provided an ambiguous query. Instead, our current models usually guess what the user intended.

  • While we’ve made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behavior. We’re using the Moderation API to warn or block certain types of unsafe content, but we expect it to have some false negatives and positives for now. We’re eager to collect user feedback to aid our ongoing work to improve this system.

-4

The question has been up now for a week and has attracted some good, well-argued answers. So let me put up an answer that summarises these.

Incidentally, as I write this, the generated answer that started this off is still on the site, with 3 upvotes and 1 downvote. I deliberately haven't linked to it.

The consensus seems to be that we don't want them. However, the feeling is not to delete answers just because they appear to be AI-generated (can we trust the flagging mechanism any more than we trust the AI?). Treat them like any other answer in terms of Very Low Quality and Not an Answer flags. Either of those flags will put the answer into the review queue, and those with the privilege may Vote To Delete. Suspicion of AI generation may be taken into account by the reviewers.

We are not under the barrage of answers that some other sites are receiving. I believe we are not really suitable for the AI, at present, so the issue is not a major problem here.

I'll keep an eye on answers posted and this question can stay open to receive your views.

11
  • 7
    That does not seem to be a fair summary of the posts I just read here. The top voted post says no bots at all. The rest seem to only want to allow them in limited circumstances when properly marked or only after changing policies. As for the example Answer--since it has so few votes, it's not really a good way to gauge consensus. Plus the people who care about the health of the site check out Meta, while those who vote tend to only care at the individual question level.
    – trlkly
    Commented Dec 17, 2022 at 16:53
  • 4
By number of votes, the consensus is "no". By number of answers, the consensus is also "no". There are two answers saying "in limited circumstances" (one mine), and both disagree on what those limited circumstances are. user3840170's answer leans closest to "yes", but actually it's arguing for stricter attitudes all round. I really don't see where you got this consensus from.
    – wizzwizz4 Mod
    Commented Dec 18, 2022 at 20:43
As per what I see as the consensus, I think I'm going to delete that answer, unless you have a compelling argument that it should stay. People don't seem to want them here.
    – wizzwizz4 Mod
    Commented Dec 20, 2022 at 1:23
  • So this is the one? <retrocomputing.stackexchange.com/a/25761/15334> I guess I can reveal that I was the lone downvoter. The thought that this was an AI-generated post didn’t even cross my mind, but even then, it was the kind of evasive, generic, missing-the-point non-answer that I wouldn’t accept even from an ostensibly human poster, and that unfortunately I have been seeing on Stack Overflow a lot, long before ChatGPT even existed. (I’m kind of surprised that I haven’t flagged it NAA as well.) Commented Dec 24, 2022 at 21:17
  • @user3840170, yes, that's the one. And as far as I know, the only one so far. I didn't think it was so bad as to warrant unilateral mod deletion, but I won't miss it
    – Chenmunka Mod
    Commented Dec 24, 2022 at 21:34
While I agree that this 'summation' does not really capture what has been said, please stop downvoting. Downvoting has no place on Meta, as a meta answer can, by definition, not be useless.
    – Raffzahn
    Commented Dec 24, 2022 at 22:38
  • 3
    @Raffzahn Voting is different on meta.
    – wizzwizz4 Mod
    Commented Jan 15, 2023 at 18:16
@wizzwizz4 Can't find the 'feature request' here on this, can you? The point is that voting with multiple votes isn't as easy, and will for sure not work here, not least as 3 downvotes are enough to suppress an answer, which is quite a hefty malus.
    – Raffzahn
    Commented Jan 16, 2023 at 0:51
  • 2
    @Raffzahn You could ask a separate meta question – or a Meta Stack Exchange question – but it's been how Stack Exchange meta sites have worked for the past decade.
    – wizzwizz4 Mod
    Commented Jan 16, 2023 at 2:36
@wizzwizz4 They work that way because they use the same mechanic as the main site, not because it's useful or appropriate. Just ask yourself: do we need software enforcing good behaviour, or are we able to act sensibly on our own? Do we?
    – Raffzahn
    Commented Jan 16, 2023 at 2:47
  • 2
    @Raffzahn If you want to change how meta works, you'll need to get community consensus. This is how meta works on every other Stack Exchange site, and as far as I can tell, it's how the other users here are using it, too.
    – wizzwizz4 Mod
    Commented Jan 17, 2023 at 15:54
