I believe any AI-generated content needs to be banned network-wide, because it is an attribution/plagiarism nightmare scenario

Some of the other answers have focused on the difficulties with moderating AI-generated content for correctness, accuracy, usefulness, etc., and I don't meaningfully disagree with those answers. But I do think there is one big, big problem, especially for a website network like Stack Exchange: the near-impossibility of properly attributing credit for the words that the AI is producing.

The main problem is that the vast majority of publicly available AI-generating algorithms, including ChatGPT, do not properly credit the sources that were used to train and tune their internal models. This means that any answers generated by the AI are, de facto, plagiarism.

Many of these AI algorithms/organizations sidestep the plagiarism issue in ways that avoid obvious legal culpability: ChatGPT, for example, vaguely attributes this data to "human AI trainers". But it remains the case that any use of these algorithms will constitute plagiarism until/unless stringent rules are applied to these organizations requiring that they A) directly attribute every text that was used to train the model, B) obtain those texts only from authors who have given explicit, documented consent to have their texts used in these models, and C) document these contributions and make that data publicly available.

That doesn't necessarily mean it will never be appropriate to reference an AI-generated text per the normal Stack Exchange rules about quoting sources and using quote blocks in an answer (I have some skepticism about how often that will be appropriate), but I do think this means that, pending major ethical overhauls in AI-generated content, it will never be appropriate for AI-generated text to manifest as the body of an answer posted on this network.

EDIT: I don't know why this answer in particular is attracting users who seem ignorant about the relationship between Neural-Network AIs and actual biological human cognition, but to address a few recurring comments I've been seeing:

No, Neural Net AI Algorithms are not capable of Original Thought in the same way that humans are

Addendum: I am not saying that no AI algorithm is, or ever will be, capable of original thought; but Neural Networks certainly are not.

Neural Networks get their name from a conceptual similarity between how biological neurons function and the abstract model of "Neurons" deployed in Neural Networks. Unfortunately, this has led to false equivalencies being drawn between the two, implying that human neurons and AI neurons are essentially equivalent or "the same, but varying in speed/power/etc.", and it needs to be understood that this is fundamentally untrue.
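
To make the gap concrete, here is roughly what a single artificial "neuron" amounts to in these models. This is a minimal, hypothetical sketch of the standard weighted-sum-plus-nonlinearity formulation, not any particular library's implementation:

```python
import math

def artificial_neuron(inputs, weights, bias):
    # An artificial "neuron" is just a weighted sum of its inputs plus a bias,
    # pushed through a fixed squashing function (a sigmoid here). There is no
    # biology involved -- it is a small piece of arithmetic.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Hypothetical example: three inputs and three learned weights.
print(artificial_neuron([0.2, 0.7, 0.1], [0.4, -1.3, 2.0], bias=0.5))
```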

There are a lot of technical reasons why the comparison isn't particularly cogent, but to reduce it to the simplest form possible: the biggest reason that Neural Networks cannot produce original thought is that they're not trying to. Neural Networks are designed with 'emulation' of existing data as an end-goal. To use ChatGPT as an example, its goal is not to create original ideas; its goal is to produce text that its model detects, with high consensus, is similar to the text that a human* has already produced. Stable Diffusion and other AI-Art-Generating algorithms operate on similar principles, attempting not to produce original paintings, but instead to produce an image that is similar to images that humans have already created.

* As defined by the model's training data, which I am assuming consists entirely of human-produced content. It should be acknowledged that this may not be the case: the sloppiness with which sources are pulled into the model can in many cases pull in other AI-generated content, which would train the model to talk not like a human but like an AI.
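
As a rough illustration of what "produce text similar to text a human has already produced" means mechanically: language models of this kind are typically trained to maximize the probability of the next token that actually appears in human-written training text. The sketch below is a simplified, hypothetical illustration of that objective, not ChatGPT's actual implementation:

```python
import math

def next_token_loss(predicted_probs, actual_next_token):
    # predicted_probs: the model's probability for each candidate next token.
    # actual_next_token: the token the human-written training text really used.
    # The loss is low when the model assigns high probability to what the
    # human actually wrote -- i.e., when its output resembles existing text.
    return -math.log(predicted_probs[actual_next_token])

# Hypothetical example: after "The cat ...", the training text continues "sat".
probs = {"sat": 0.7, "ran": 0.2, "flew": 0.1}
print(next_token_loss(probs, "sat"))   # ~0.357: a good "match" to existing text
print(next_token_loss(probs, "flew"))  # ~2.303: penalized for diverging from it
```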

A really good case study for this "don't produce something original, produce something that resembles something that has already been created" function of neural networks is a test I performed using Stable Diffusion, where I described a hypothetical D&D character ("Githyanki Woman in Red Trenchcoat") and had the algorithm try to generate an image. The produced results give us some important insights. The algorithm attempted to produce 4 images, and of those 4, two were clearly modeled on cosplay photos (with some distressing facial distortion, which I'm guessing is the influence of the 'Githyanki' species modifier) and the other two were... images of physical tabletop tokens, stand and everything.

It's not difficult to figure out what happened. The metadata of 'Githyanki' (a rare but playable humanoid species in some editions of Dungeons and Dragons) connected the algorithm to images that were created in Heroforge and other TTRPG token creators, along with cosplay photos of people modelling Critical Role characters (or their own original D&D characters). This is how the algorithm tried to establish 'similarity' between the images it created and the images it associates with the prompt provided.

But it's also very clear that none of the images produced are images that a human artist would create. In fact, the only way a human would produce any of those images themselves is if they had attempted to cheat the prompt—grabbing an image from a token generating website or a cosplay photoshoot and passing it off as their own work.

In other words, the AI did an exemplary job of replicating what a human committing blatant plagiarism might attempt to produce.


The reason I'm going into this long explanation and case study, aside from reinforcing my point that modern AI-generation algorithms are only really capable of committing mass-scale plagiarism, is to emphasize that these algorithms are not capable of original thought, and that arguments like "it's just like a human brain!" or "humans don't have to cite their training, why should AI have to?" are invalid. Some prompts are easier for the AI to replicate than others; certainly my example prompt broke the limits of what the AI was capable of replicating. But it's critical to understand that even when the AI does a much better job of reproducing what the prompt asked it to generate, it's still doing exactly the same thing as when it broke under the weight of my request: copying the data it found in its training data.

Now, the ethical problems with these algorithms can be solved: these algorithms could purge their databases, begin taking data only from artists who have explicitly consented to have their art/writing ingested, properly cite each work ingested, and make that data publicly available and easy to access, solving the widespread plagiarism issue I started out by addressing. I would even go a step further and argue that, to be truly ethical, the algorithm would also have to be able to cite the specific works whose influences composed each specific output, but I'm guessing that might be technologically infeasible.

Important: something being "technologically infeasible" is not the same thing as saying "so we don't/shouldn't have to care about it". If AI-Art or AI-Writing can't be done ethically, I would argue it shouldn't be done at all, and whether running these algorithms ethically is something that can actually be done isn't in my purview.

But we need to dispose of this notion that these algorithms are, in any meaningful sense, engaging in original thought. They're not, and they're not truly replicating human thought, either at the macro level of human consciousness or at the micro level of individual brain neurons. Neural Networks are designed to emulate and reproduce existing works based on what the algorithm detects as being similar to those existing works, and the results can in some cases be very uncanny replications, but they aren't original creations.
