0

Setting aside the issues with spam and other illegitimate uses, is it necessary to provide attribution for works generated entirely by a machine? The US Copyright Office states that such works are not subject to copyright, so the legal aspect is clear.

Note that I'm not necessarily talking about copying answers from ChatGPT/GPT-3+ verbatim. One could also start from an answer generated by ChatGPT and then edit it to become a high-quality SE answer with reference links.

2
  • I'm not sure the legal aspects are clear. For one thing, the US Copyright Office may not have jurisdiction everywhere in the world where this site operates. What I would have liked is an answer from the company, but unfortunately they do not give legal counsel. So let's simply look at the TOS, which says "...all such Public Content must have appropriate attribution...". This looks fairly clear to me. Yes, whenever somebody did not write substantial parts of a post themselves, some sort of attribution must be given in order to comply with the TOS, I think. Commented Jun 7, 2023 at 8:04

4 Answers

44

The US Copyright Office states that such works are not subject to copyright, so the legal aspect is clear.

That doesn't mean it shouldn't be attributed, though. By that argument, anything you copy that is not copyrighted does not require attribution (such as public domain works). Regardless of the technical legality of doing such a thing, that is not acceptable behavior on our network.

Think of it in the inverse. If you do not indicate that the content was not created by you, it is assumed that you are claiming it as your own work, even though you did not create it. How would anyone else know otherwise?

Attribution is about more than just protecting copyright. It emphasizes that pieces were created by someone else and not you, which can be important information in determining the validity of an answer, where it can be used elsewhere, or how to properly attribute another reproduction. If it wasn't created by you, attribution is always required here. We do not make exceptions because the content being attributed is not copyrighted.

13
  • 2
    No, public domain works are different because there's still an actual person somewhere who did the work in question (even if they're long dead). In this case the work was created by a machine algorithm and it's not possible to "steal" it. Commented Dec 12, 2022 at 21:22
  • 10
    @JonathanReez What's your point? By definition, a public domain work is not eligible for copyright and thus also not possible to steal. My point is that whether something is copyrighted is not really relevant to us. What matters is if you created it.
    – animuson StaffMod
    Commented Dec 12, 2022 at 21:25
  • 3
    No, a public domain work has expired copyright or a waived copyright. Otherwise it would normally be subject to copyright. For this reason you don't need to provide attribution to "mother nature" if you create a pattern based off a butterfly's wing - the butterfly's patterns are not copyrightable in the first place. Commented Dec 12, 2022 at 21:34
  • 3
    And you did create the work! You've entered a prompt into ChatGPT, which spit out an answer. You've analyzed it to see if it looks reasonable, added a few edits, then posted it on SE. It's a tool that you've used to create an original work. Commented Dec 12, 2022 at 21:35
  • 10
    @JonathanReez At best, the person who created the tool could be considered the creator of the content. Telling a computer "create this" does not constitute you creating anything. That would be akin to telling an assistant to go make a cup of coffee and then claiming you made it because you issued the order. If the text was generated and you didn't make substantial edits to make it your own, then it is right to attribute the tool that created it.
    – animuson StaffMod
    Commented Dec 12, 2022 at 21:41
  • 3
    @animuson no, the creator of ChatGPT doesn't get credit just like the creator of Word doesn't get credit just because it auto-capitalized and reformatted your text automatically. Google Translate doesn't get credit for translating text. Any fully automated system becomes a generic tool, no different from other tools. Commented Dec 12, 2022 at 21:43
  • 19
    @JonathanReez Our policy is that attribution is required if you didn't create it. That's all. I frankly couldn't care less that it is ownerless, but we would not consider you to be the owner of text generated by such a tool because you did not actually write it. This tool is not comparable to how Word or Google Translate process input, and we're not going to get into a long comment discussion about copyright. Our policy is clear. You didn't write it. It requires attribution. Period.
    – animuson StaffMod
    Commented Dec 12, 2022 at 21:49
  • 2
    Okay, policy clarified, accepting this answer. Commented Dec 12, 2022 at 21:52
  • 10
    I think this might be clearer if you remove the argument in the middle paragraph, it's not really necessary and it can mislead people into thinking copyright matters for this issue, which is one of the major sources of confusion on this topic. The SE rules are really much closer to the academic rules of attribution than anything copyright-related. The law is irrelevant here (or rather anything that breaks the law is handled by a different rule), it only matters that you properly attribute content that is not your own. Commented Dec 12, 2022 at 22:07
  • 1
    "In this case the work was created by a machine algorithm..." @JonathanReez The algorithm is still using text that was created by other people.
    – BSMP
    Commented Dec 14, 2022 at 18:08
  • 2
    @BSMP it was trained on text written by other people but doesn't use any single piece of text directly. Just like every single piece of text written by humans is inspired/based on text that they've previously read/heard. Commented Dec 14, 2022 at 19:12
  • 2
    A chisel, pen, paintbrush, typewriter, camera, word processor, computer, spell checker, grammar checker, AI assistant, etc. are all just artists' tools. Cameras automated the work of painters. Photography has become an art form in its own right. AI is automating many forms of art, but it is still just an artist's tool. AI does not wake up in the morning saying "I think I will do an illustration this morning, this afternoon I will produce a children's book, and tonight I will write a few poems." It takes human interaction to do such things. Commented Jan 16, 2023 at 1:35
  • I think we should compare it to IDEs/linters/beautifiers. They add spaces/tabs to code or pretty-print it. Sometimes, they even fix some bugs based on built-in syntax trees. "If it wasn't created by you, attribution is always required here." We didn't fix/beautify the code ourselves, but we don't exactly attribute the little intelligence shown by the IDE. Why do we need to do so for artificial intelligence? I think the answer needs to focus on that, especially the ratio of work done by us to the work done by a machine in creating the final output. That ratio seems to determine policy here.
    – TheMaster
    Commented Feb 20 at 1:17
24

If you didn't write the text yourself, you need to quote it and provide a reference to its source. That's it.

Copyright is irrelevant; that is handled by different rules and lawyers if you violate it. The attribution rules here are not about copyright at all. They are solely about indicating which parts of your post are not your own.

2

Is attribution required for machine-generated text when posting on Stack Exchange?

Maybe if you're based in China. From China bans AI-generated media without watermarks:

Providers of deep synthesis services shall add signs that do not affect the use of information content generated or edited using their services. Services that provide functions such as intelligent dialogue, synthesized human voice, human face generation, and immersive realistic scenes that generate or significantly change information content, shall be marked prominently to avoid public confusion or misidentification.

It is required that no organization or individual shall use technical means to delete, tamper with, or conceal relevant marks.

I'm unclear whether this applies to ChatGPT users based in China, as ChatGPT is hosted in the United States.

Also, the licenses of some text-generation models require disclosure. For example, the language model BLOOM is licensed under the BigScience RAIL License v1.0, which specifies:

You agree not to use the Model or Derivatives of the Model: [...] To generate or disseminate information or content, in any context (e.g. posts, articles, tweets, chatbots or other kinds of automated bots) without expressly and intelligibly disclaiming that the text is machine generated;

0
2

Note that I'm not necessarily talking about copying answers from ChatGPT/GPT-3+ verbatim. One could also start from an answer generated by ChatGPT and then edit it to become a high-quality SE answer with reference links.

I think this part is not answered by other answers. The leading policy here stated:

If it wasn't created by you, attribution is always required here

There are issues with this policy statement.

Programmers generally use IDEs/linters/beautifiers. These tools add spaces/tabs to code or pretty-print it. Sometimes, they even fix some bugs based on built-in syntax trees. However, we generally don't attribute the tool. What if we used ChatGPT to lint and beautify the code instead of traditional tools? Do we suddenly need to attribute it now, because we used ChatGPT to do the same?

What about grammar/capitalization? If we used ChatGPT instead of Google Docs/Grammarly (an add-on) to do the same, do we suddenly need to attribute it?

What if the entire post is written by ChatGPT, but you research it for quality, add links, and fix bugs in the code? Do we need to attribute ChatGPT?

All of the above are mixes of machine-generated text and human content.

A better policy would be

If it wasn't created primarily by you, attribution is required here

How would we judge what is primarily created by you? In other words, who is the major contributor to this work? If you did more than 50% of the work and the machine did the rest, I believe we can attribute it to you: it's a case of a human using a tool to create the content, and that's it. As with any quality standard, this would be a subjective, case-by-case evaluation.

3
  • 1
    I view attribution not only as acknowledgement of the original source but also as a way to verify what was posted. If I use X as my knowledge source to create a post here and attribute it to X, then a consumer of the post has better grounds to trust it, since they can go directly to the knowledge source itself and determine whether it's trustworthy. With posts containing AI content this might get a bit murky, but I think the least a poster can do is acknowledge which AI tools were used to generate the post, so that the consumer can determine how much trust they want to put into it, in case AI was...
    – user13267
    Commented Feb 20 at 2:44
  • ...used for more than cosmetic changes such as formatting
    – user13267
    Commented Feb 20 at 2:45
  • "such and such AI has participated in the creation of this question/answer" Commented May 16 at 15:50
