
For the impatient reader, the guidance is in the title: please don't use computer-generated text for questions or answers on Physics.

  • Computer-generated text violates our expectation that users are posting substantially original content.

  • Computer-generated text which is not identified as such is plagiarized, and should be flagged as such.

  • Computer-generated answers are, in our experience, quite likely to leave the initial question unanswered. The most common failure modes are non-sequiturs, complete nonsense, or plausible-sounding fictional information. Such non-answers are an abuse of our users' time, and should be flagged as "rude or abusive."

Stack Exchange has, in mid-2023, expressed concern that automated detectors for computer-generated text may discriminate against non-native writers of English. The moderator team here on Physics have reason to believe that our approach to moderating suspected chatbot output has, in the first half of 2023, not suffered from the false-positive bias which is suspected for the network as a whole. We therefore have no plans to substantially change our moderation strategies.

In particular, the moderation team do not intend to ignore or decline flags on low-quality content because the flagger happens to speculate about the involvement of computer-generated text. We will instead continue to evaluate posts in the interest of cultivating and maintaining our community.

The original version of this post follows below.


In the past couple of weeks [in late 2022], a new generation of computer language-generating tools has become available to the public. The main bit of news is about a product called "ChatGPT," but that's just the most recent iteration of a class of software "chat bots." (Whether it's appropriate to refer to these systems as "artificial intelligence" is a philosophical question.)

Within a few days of ChatGPT's release, Stack Overflow issued a temporary don't-use-this policy, stating

Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking or looking for correct answers.

The primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce. There are also many people trying out ChatGPT to create answers, without the expertise or willingness to verify that the answer is correct prior to posting.

On Stack Overflow, the blanket ban has mostly been a volume problem. Physics is a much smaller community, and we have so far detected only a smattering of such posts. However, the ones we have found have been pretty terrible, ranging from low-information word salad to obvious physical errors. For example, the sentence

In the case you described, the Drapher's point [sic] corresponds to a temperature of approximately 3,631 K and a wavelength of approximately 3631 nanometers.

should raise the eyebrows of anyone whose physics education has gotten as far as Wien's Law. (It may not, however, raise the eyebrows of anyone who has tried to teach Wien's Law to reluctant intro-astronomy students.)
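For anyone who wants the one-line check: Wien's displacement law ties a blackbody's temperature to its peak wavelength through the constant b ≈ 2.898 × 10⁻³ m·K, so the two numbers quoted above cannot both describe the same blackbody:

```latex
% Wien's displacement law: \lambda_{\mathrm{peak}} \, T = b,
% with b \approx 2.898 \times 10^{-3}\ \mathrm{m\,K}.
\lambda_{\mathrm{peak}} = \frac{b}{T}
  \approx \frac{2.898 \times 10^{-3}\ \mathrm{m\,K}}{3631\ \mathrm{K}}
  \approx 798\ \mathrm{nm}
```

Conversely, a blackbody peaking at 3631 nm would have T = b/λ ≈ 798 K; the chatbot has simply recycled the same digits for a temperature and a wavelength that Wien's law says cannot go together.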

In another post, the asker ended their question with "I asked an AI, but it didn't help me," followed by a properly-quoted paragraph which hadn't helped them because it didn't make any sense. I had a little flashback to when my children were small, and would sometimes run excitedly up to me, saying, "this thing! i found it on the floor! it tastes so gross! you have to try it!"

Some posts have even crossed the line from well-intentioned to deceptive. On one now-deleted post, a commenter asked the user who posted the answer to include references. The post was edited to include

Some references for spin fluctuation are:

  • "Pairing in Type-II Superconductors Induced by Spin Fluctuations" by D. J. Scalapino, E. Loh, Jr., and J. E. Hirsch, Physical Review Letters, Vol. 50, No. 4 (1983)
  • "Spin Fluctuation-Mediated Pairing in Type-II Superconductors" by D. J. Scalapino, E. Loh, Jr., and J. E. Hirsch, Physical Review B, Vol. 34, No. 6 (1986)
  • "Spin Fluctuation-Mediated Superconductivity: A Review" by D. J. Scalapino, Physics Reports, Vol. 250, No. 3 (1995)

It's instructive to compare these "references" to a search of this time period at the Physical Review, which should turn up the first two. Scalapino and Hirsch coauthored a number of papers on superconductivity in the 1980s, including one in PRL v50 (1983) and another in PRB v34 (1986). However, Loh doesn't seem to have joined the group until 1986, and none of the team's coauthored titles includes the phrase "spin fluctuations." Likewise, the best candidate for the third reference has a different issue number and title. Is it a good use of anyone's time to pursue this detective work into thirty- and forty-year-old literature to see whether these rhymes-with-correct citations address the question at hand? Almost certainly not.

Note that my request to "please don't do this" isn't a fancy new policy tailored to the existence of an exciting new chatbot which superficially appears intelligent. Our community has a number of established posting standards which these low-quality contributions violate:

  • Originality. User contributions on this site are expected to be primarily the poster's own original work. If properly cited, including a small passage from a third party is fine, but complete answers are not.

  • Attribution. Content which originally appeared elsewhere, including your own content, must be posted with attribution. Plagiarized content may be hidden until appropriate attributions are added, or may be removed altogether. It isn't common, but some serial plagiarists have found their site-use privileges suspended.

  • Respect for others. If a user posts a question or an answer, our community needs to be able to expect that the post is a good-faith effort to learn things, or to help other people to learn things. Note that the network-wide policy is that "abuse of the system or the community," including cat-on-keyboard gibberish posts, can reasonably be flagged using the "rude or abusive" option, where enough flags will automatically delete the post and apply a reputation penalty. Surreptitiously involving Physics users in your tests of some chatbot software is rude. Generating "citations" without any idea whether they refer to real documents or not, much less whether the cited documents are relevant, is an abuse of other people's time.

  • Added to the network-wide list: meta.stackexchange.com/a/384923
    – PM 2Ring
    Commented Dec 19, 2022 at 15:15
  • at least the chatbot is less fantastical than snarxiv.org and more realistic than snarxiv.org/vs-arxiv
    Commented Dec 20, 2022 at 0:58
  • A recent BBC article, bbc.com/news/technology-65202597, noted that OpenAI says on their blog that ChatGPT "sometimes writes plausible-sounding but incorrect or nonsensical answers". We could do without such answers.
    – Jon Custer
    Commented Apr 14, 2023 at 14:23
  • This is not a question.
    – kludg
    Commented Apr 15, 2023 at 11:10
  • @kludg Meta posts are used for announcements by moderators as well as for questions. This also opens up the topic for discussion, in the form of answers, if you agree or disagree with the post, and allows you to upvote or downvote depending on whether you agree or disagree with what it is saying.
    – Chris Mod
    Commented Apr 23, 2023 at 5:33
  • @Chris The purpose of this post is to prevent discussions that may be opened in future by flagging them as duplicates of this post. That is why I dislike this post.
    – kludg
    Commented Apr 23, 2023 at 8:16
  • @kludg Historically we have done pretty well at distinguishing between actual duplicates of policy questions, versus questions about how a policy applies in a particular case, versus questions about whether a policy should be revisited based on new information or based on the evolution of the community's interests. For an example, see the links in this answer about homework.
    – rob Mod
    Commented Apr 23, 2023 at 16:31
  • @JonCuster We have posted a response.
    – rob Mod
    Commented Jun 1, 2023 at 18:31

1 Answer


I think that a distinction should be made between questions and answers. While I agree that computer-generated answers violate the cited standards, I disagree in the case of questions. Information from an AI may trigger a desire to understand whether it is valid, exactly as with information provided by humans or found in books (which are the ultimate source for the AI too).

  • No problem with downvotes. I expected them. However, would it be too much to spend one minute on an understandable comment?
    Commented Dec 31, 2022 at 18:54
  • Remember that, on Meta, there is some blurriness between whether votes mean "this post is useful" versus "I agree with this proposal." One of the downvotes is mine, expressing disagreement but not disapproval. I discussed one example in the main post. I have seen other examples elsewhere, where "what is wrong with this computer-generated explanation" would be longer than the computer-generated explanation. It might be possible to construct a targeted question based on a chatbot's output, but it would require enough skill that I'd prefer it were an exception to a general guideline.
    – rob Mod
    Commented Jan 1, 2023 at 18:16
  • Here is such a question (which will probably eventually become a high-rep-only link) which clearly breaks our guidance about soliciting peer review.
    – rob Mod
    Commented Jan 2, 2023 at 15:19
  • For those without sufficient rep, the example above is a wall of text (about 3000 words long) that was a response by ChatGPT to define a quantum gravity model, with the user asking PSE members to review it.
    – Kyle Kanos
    Commented Jun 3, 2023 at 16:26
  • The request that computer-generated text not be used appeared, incomprehensibly, in response to a post I've been considering that has, so far, contained absolutely no computer-generated content, whatsoever, unless the meaning of "computer-generated" might be interpreted to mean that I (a human containing, as far as I know, no artificial parts except for tooth fillings) had typed a question (not yet submitted) on the keyboard of a computer.
    – Edouard
    Commented Dec 22, 2023 at 17:00
  • @Edouard Misidentification is a big concern. If you would like to discuss your experience, a fresh meta question is better than a buried comment on an old one.
    – rob Mod
    Commented Jan 6 at 4:21
  • @rob - I've marked your comment as "read" (in the past tense), but, unfortunately, I no longer have any recollection about whoever or whatever included me in the discussion at hand. The only "ChatGPT" (-I hope I'm spelling or formatting it correctly) content that I've ever read was on Quora, not PSE.
    – Edouard
    Commented Jan 6 at 15:40
  • @rob I guess I should've said "knowingly read", in the last sentence, rather than simply "read": I don't have total recall, and I write a lot. Since high school, many decades ago, I've been aware that the sources of material taken from other writers must be identified, except in personal communications.
    – Edouard
    Commented Jan 6 at 15:45
