Score: 98

Although I'm a moderator on two network sites who has had to deal with GenAI answers, I'm writing this from the perspective of a lower-reputation user (under 2k reputation, with limited access to moderation tools) who has expertise in the subject matter of the site in question.


I was recently checking on questions that I've answered across the network. I check in to see whether new answers have been posted or an answer other than mine was accepted, to vote (if necessary) on new answers, and to leave comments on the question or any answers. I noticed that one question that I answered had a different accepted answer, so I read it. Several aspects of that answer stood out as suspicious, so I dug deeper.

The answer, although poorly formatted, had several hallmarks of GenAI posts. I don't want to get into the heuristics that I used here, since they were ones previously shared among moderators under the previous GenAI policy to determine if a post is suspicious. The first part of the answer appeared to have been generated using something like the question as a prompt, and the second part using something like a comment left by the question asker.

When I probed further, I noticed that the user had been answering a lot of questions recently. I opened several of them and saw a pattern of characteristics that I tend to associate with GenAI, as well as a distinct change in writing patterns compared to early 2021 and before. Although the user has a history of decent to good quality posts, there was a distinct change after tools like ChatGPT and Bard became available to the public, including a huge change in answer volume from late 2022 through today.

I used the Stack Exchange Data Explorer (SEDE) to investigate the change in answer volume. Prior to 2022, the user never posted more than 10 answers in a day, and there were days, weeks, and in one case years between posts. However, since the availability of GenAI tools, the user has been posting several answers most days, with several days over 20 answers and a peak of nearly 40 answers in a single day.
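
For anyone who wants to reproduce this kind of check, a minimal SEDE sketch along these lines does the job (assuming the standard public data schema; ##UserId## is a SEDE query parameter, and PostTypeId 2 marks an answer):

    -- Daily answer counts for one user; run on data.stackexchange.com.
    SELECT CAST(p.CreationDate AS DATE) AS [Day],
           COUNT(*)                     AS Answers
    FROM Posts p
    WHERE p.OwnerUserId = ##UserId##    -- SEDE parameter, filled at run time
      AND p.PostTypeId = 2              -- 2 = answer
    GROUP BY CAST(p.CreationDate AS DATE)
    ORDER BY Answers DESC;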

Part of the reason that I, as an expert, use the SE platform is to build a portfolio of answers that demonstrate my expertise. I've used this "portfolio" in more than a couple of interviews in the past to demonstrate my ability to write as well as my knowledge of relevant topics. Being able to show that my answers were upvoted and accepted goes to that demonstration of knowledge, expertise, and experience.

But now, I am "competing" with GenAI answers, and I have concerns about the follow-on effects. If GenAI posts are created quickly, and especially if they are upvoted or accepted, will anyone bother to read and upvote later answers written by humans? Will new experts bother to add new answers to existing questions over time? There has been past discussion that reputation is seen as expertise - will this rep inflation for people who are not truly experts prevent actual experts from being seen, voted on, and accepted? Will people who are using GenAI be wrongly seen as experts because they can write convincing answers?

Experiencing the GenAI problem from an expert and non-curator/non-moderator perspective has opened my eyes to a whole new side of the problem. Do experts want to participate in a venue where their answers get buried by GenAI? The thought crossed my mind, and if it's crossing my mind, it's probably crossing the minds of people who are even more expert than I am and who have better things to do with their time and knowledge.

If the communities on the network are to thrive, they need experts. And this expert has no desire to compete with GenAI. For me, this experience underscores the need to have strong policies and low thresholds for removing content that is likely to be produced by GenAI, and to ensure that the Stack Exchange network is a place for human experts to produce content that shows off their knowledge and expertise.


In the comments, Chindraba_on_strike brought up a good point about engagement that I wanted to elevate.

Engagement metrics are important for businesses that have platforms like the SE network. Driving away experts will have a negative impact on a lot of these engagement metrics. Consider what is likely to happen if GenAI answers are not handled quickly and decisively:

  1. Experts, like myself, who do not want to compete with GenAI answers leave the network. This reduces the number of people who can successfully perform actions like asking for appropriate clarifications on questions, making assertions about the subjectivity or answerability of questions, vetting the usefulness of questions to the general population and voting on them, and vetting and voting on answers.
  2. As experts leave, a number of engagement metrics go down. Experts who regularly visit stop visiting, so page view and active user metrics go down. Since experts provide answers, new answers are likely to decrease. Depending on the volume of questions, the recurring user metrics also go down.
  3. As experts leave, the "learner" class also leaves. If getting answers on the network is no better than getting answers from ChatGPT, Bard, Copilot or other Generative AI solutions, why shouldn't the people with questions just ask those things directly? Page view and user metrics continue to drop. New question metrics drop. As both questions and answers drop, votes also drop since there's less to vote on.
  4. A continuing cycle of fewer questions to keep experts engaged and fewer experts to answer them decreases engagement until communities are dead.

Holding on to experts should be a primary focus, since catering to that class of user is paramount to keeping other classes of users active and engaged.
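
These engagement trends can be watched on SEDE rather than guessed at. As a minimal sketch (assuming the standard public schema), one such metric is the number of distinct answerers per month:

    -- Distinct answerers per month: one proxy for expert engagement.
    SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, p.CreationDate), 0) AS [Month],
           COUNT(DISTINCT p.OwnerUserId) AS ActiveAnswerers,
           COUNT(*)                      AS Answers
    FROM Posts p
    WHERE p.PostTypeId = 2               -- answers only
    GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, 0, p.CreationDate), 0)
    ORDER BY [Month];

A sustained decline in ActiveAnswerers while question volume holds steady would be an early sign of the cycle described above.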

17 comments
  • 19
    Seems that the question translates into: what does the company want the sites to become? A place for them to profit from any engagement, or a place to profit from helping others. Not keeping GenAI content in check (who cares how, in this case) might increase "engagement" of random or casual users, and thus some metric of profit. If, however, it drives experts away from answering, removing the probability of "useful" answers in the future, it will also contribute to a decrease in engagement as there are fewer answers worth finding. It's the company's call, and their future profit at risk.
    – Chindraba
    Commented Jul 1, 2023 at 13:47
  • 1
    The most constructive discussion I've had lately was with generative AI, on a specific topic where many humans are most probably biased. I would vote for identifying AI-generated content and displaying its suggestions for comparison; then the quality of facts and interaction will decide the long-term future.
    – beyondtime
    Commented Jul 1, 2023 at 13:55
  • 9
    @Chindraba_on_strike I don't see how you get long-term engagement from random/casual users. You need the long-term engagement of experts who can find and answer questions. Having these experts to answer and help vet answers gives the "learners" a reason to ask their question here. Otherwise, why would the "learner" not just go to ChatGPT or Bard or whatever to begin with? Driving away experts will decrease engagement, since regular contributors will stop contributing (so answer and vote related metrics go down) and learners will stop asking (so question and acceptance metrics go down). Commented Jul 1, 2023 at 13:56
  • 12
    @ThomasOwens That's the crux of the problem. Short-term metrics vs. long-term metrics. Long-term is slow to grow while providing sustainability. Short-term can grow fast, and provide no path to sustainability. Both are an option. Which option does the company really wish to navigate?
    – Chindraba
    Commented Jul 1, 2023 at 14:37
  • 3
    Here is a similar observation. Commented Jul 1, 2023 at 15:15
  • 2
    To add to the list of detection signals: there is also the level of verboseness of answers before and after December 2022. Commented Jul 1, 2023 at 15:17
  • 3
    @This_is_NOT_a_forum That's not a good heuristic. There are plenty of long answers from well before November 2022. Although a particular user's verbosity changing may be an indicator, I'd consider it very weak. Some questions simply require more verbosity than others. Commented Jul 1, 2023 at 15:21
  • 11
    @beyondtime apologies if I've misinterpreted your comment, but you say: "then the quality of facts and interaction will decide the long-term future." The issue is that GenAI is extremely good at being convincingly wrong. It's enough to fool non-experts some of the time, and if it gets ahead, it could snowball to the top of the answer pile, even if low quality. Commented Jul 1, 2023 at 15:33
  • 8
    Yes, verboseness is not a good heuristic, on its own, for any specific post. However, verboseness, which can be approximated as answer length, is an indicator in bulk that there's a significant change over the whole body of answers on Stack Overflow with the advent of ChatGPT (a bulk query of this kind is sketched just after these comments). Verboseness/answer length, like SE's "gold standard", isn't something that can be applied to a single post, but it is a decent indicator of the relative impact of AI generation in bulk (note: the graph in the linked post is without answers already deleted as AI-generated).
    – Makyen
    Commented Jul 1, 2023 at 16:34
  • 26
    I would hesitate to call automatically generated texts "answers"... "responses" seems to be a more appropriate word
    – gnat
    Commented Jul 1, 2023 at 19:06
  • 2
    This question is totally interesting, but also very broad. It basically asks for the implications of recent advances in AI on possible futures of this site. Just one possible aspect: if the goal is only to build a library of knowledge then it probably doesn't matter where the knowledge comes from (except for legal angles). But there is much more to it. Commented Jul 1, 2023 at 19:16
  • 3
    @DanubianSailor That may be true on some sites, but it is not universally true. It's also not the intent of an upvote - the hover text explains it well: an upvote means that the answer is useful. The usefulness of an answer depends on the competence of the answerer to not only give factual information, but to do so in a helpful manner. Commented Jul 3, 2023 at 14:47
  • 2
    @ThomasOwens the problem is, and has been for a long time, maybe even from the beginning, that people upvote for any reason, e.g. if the answer has helped them learn something or put them in the right direction. And that's independent of the fact that the answer might be plainly wrong - but the OP is the only person who can say that, because others may have similar, but ultimately different, problems. Commented Jul 3, 2023 at 17:12
  • 5
    @DanubianSailor I don't agree that the OP is the only person who can say if an answer is right or wrong. They can say which answer helped them the most, but that's accepting an answer. If someone with the appropriate knowledge cannot judge the correctness of the answer, then there are insufficient constraints on the problem and the question should be closed until those details are added. There are plenty of times when I've been able to upvote multiple answers because they all accomplish the right outcome, but with different drawbacks - they are all technically correct. Commented Jul 3, 2023 at 17:34
  • 3
    @ThomasOwens the problem is that the appropriate knowledge is neither checked nor required for voting, which leads to the opinion that SO is filled with 'random nonsense'. Many people open an SO link only when nothing else works, for exactly that reason. And that's from well before ChatGPT. Commented Jul 4, 2023 at 7:02
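
Elevating one more point from the comments: Makyen's bulk-verbosity indicator is straightforward to approximate on SEDE. A minimal sketch (assuming the standard schema; verbosity approximated as characters in the stored post body):

    -- Average answer length per month: a bulk indicator only,
    -- never a verdict on any single post.
    SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, p.CreationDate), 0) AS [Month],
           AVG(LEN(p.Body)) AS AvgAnswerLength
    FROM Posts p
    WHERE p.PostTypeId = 2               -- 2 = answer
    GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, 0, p.CreationDate), 0)
    ORDER BY [Month];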

3 Answers

Score: 16

a whole new side of the problem

New? Looking specifically at this particular sub-problem of the whole mess, it may well be the one we are most intimately familiar with. Some users share senior-expert-level answers and others spam low-effort posts all the time, and that has a major undesirable impact on the voting distribution. That observation is not that strongly connected to the recent advancements in language processing software.

Do experts want to participate in a venue where their answers get buried by GenAI?

Decreasingly, as the problem gets worse. I invite those who find this concerning only now to renew their interest in the overall topic of "buried expert answers" - you might find that there was already plenty of reason to be concerned before.

We already had this problem 10 years ago, and we had to work on it anyway. No progress in the field of artificial stupidity has changed the order of magnitude of its importance.

Will people who are using GenAI be wrongly seen as experts because they can write convincing answers?

Not much more than we already see post scores that don't quite match expertise. Some people post high quality answers every single time. I bookmark their https://site.example/users/[uid]/?tab=answers&sort=newest as one of my entry points to the sites, because reading all of their answers is worth it. Some other people answer mostly to learn new things, with accordingly spotty results; I am guilty of that. And then some other people... just enjoy posting. That is a wide range, with or without SpamGPT.

What guarantees that the entire - continuous! - dynamic range is sufficiently covered with appropriate curation (primarily votes) is not just cutting off the garbage end. It's applying proper compression, so that I am less likely to be reviewing 39 garbage posts for every good post. If I gave 20 votes to good posts and 20 votes to bad ones, and reviewed only 4 spam posts while doing so, the good posts would be able to compete easily in terms of total votes.

Will this [..] prevent experts from being seen, voted, and accepted?

Only if progress on the known mitigation strategies stays at its current rate. Go ahead, treat my votes as engagement if you prefer. But please give me appropriate tools so I can stay engaged. Discovering content using a simple search, where I can choose from a few sorting presets, is good, but not good enough. SEDE has plenty of proof that with just a few words of SQL there is plenty of room for improvement. I am just not willing to read multiple answers devoid of human knowledge every day, even if it means I miss out on newly active authors that have not made it into my hand-selected list yet.
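
To illustrate what a few words of SQL can buy, here is a minimal sketch (the user ids are placeholders, not real accounts) that surfaces the newest answers from a hand-selected author list instead of a global sort preset:

    -- Newest answers from a hand-selected list of authors.
    SELECT TOP 100
           p.Id, p.ParentId, p.Score, p.CreationDate
    FROM Posts p
    WHERE p.PostTypeId = 2
      AND p.OwnerUserId IN (101, 202, 303)  -- placeholder user ids
    ORDER BY p.CreationDate DESC;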

5 comments
  • 3
    My take is similar to yours: Some other people participate mostly to learn new things, with accordingly spotty results, not necessarily for the lure of reputation or recognition of "expertise"
    – prusswan
    Commented Jul 1, 2023 at 18:08
  • 4
    "algorithmic content discovery tools" groans Please, SE, don't use AI for this.
    – Laurel
    Commented Jul 2, 2023 at 0:24
  • 6
    As the author of that particular 10-year-old post encouraging people to vote more, yes, it was a frustration then and if the sites I am most active on still had a community, it would still be a frustration now. To me, voting is the critical thing that makes these sites useful, the crowd-sourced rating of content that allows you to quickly pick out the good stuff. Commented Jul 2, 2023 at 1:57
  • 1
    @GertArnoldisonstrike Rewritten. I think the simplest TL;DR of my point would be "How to hold on to experts now? Do what needed to be done anyway." - Whatever else is needed beyond just better filtering is probably also focused on the integral part of the quality/volume plot, far, far away from what is happening with ChatGPT.
    – anx
    Commented Jul 2, 2023 at 19:58
  • 1
    @Ward-ReinstateMonica "voting is the critical thing that makes these sites useful" Fully agreed. The votes are an essential part of the content. They separate the pearls from the sand. On some questions I could even write multiple answers arguing for or against something, but the votes would then give meaning to these answers. Commented Jul 3, 2023 at 21:55
Score: 16

Current AI-generated answers are often "eloquent bullshit", as people put it. Being simply language models, they show only limited understanding and have no ability to test their knowledge in any meaningful way - abilities that are essential to correctly answer truly original, new questions. Additionally, AI frequently suffers from misleading hallucinations. Consequently, AI-generated answers are very often wrong or low quality; at least, this is the majority opinion of people on the Stack Exchange sites as far as I can see.

If this is true, you currently don't need to worry as an expert. AI will not be able to compete with you outside of basic tasks anytime soon. Your knowledge is still a unique selling point, and even if a wrong AI answer is occasionally upvoted, it typically takes only a single comment to point out the flaw in a post, and voting will on average represent quality. There isn't any fundamental difference between low-quality human answers and low-quality AI-generated answers, except for fewer spelling mistakes and maybe higher volume (not now, but maybe in the future).

AI is a useful tool and will be used by people as a more flexible search tool to retrieve existing knowledge. Take for example the top-scored questions in a popular tag like Python on Stack Overflow. The highest-voted questions, which also represent the questions with the highest traffic, like What does the "yield" keyword do in Python?, What does if __name__ == "__main__": do?, or Does Python have a ternary conditional operator?, are relatively simple to answer. Current AI models may be able to do that already with a high degree of certainty, and they can adapt to the user and patiently explain content in different ways over and over and over again. AI scales better, is very interactive, and human expert time may be seen as wasted explaining the basics multiple times. I fear that these low-hanging fruits, which nevertheless represent what most people typically need to know (Pareto principle), will be taken over by AIs, if not now then in the next few years.
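
Those top questions are easy to enumerate. A minimal SEDE sketch (assuming the standard schema, where question tags are stored inline in Posts.Tags):

    -- Top-scored questions in a tag; Tags looks like '<python><list>'.
    SELECT TOP 10
           p.Id, p.Title, p.Score, p.ViewCount
    FROM Posts p
    WHERE p.PostTypeId = 1               -- 1 = question
      AND p.Tags LIKE '%<python>%'
    ORDER BY p.Score DESC;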

As a general rule, if you build a knowledge library, it's all about generating new knowledge and caring for existing knowledge, less about promoting individual persons. All that reputation stuff is added on top of that, and of course it is also an incentive to cheat, for example to use AI as a tool to generate answers and then pass them off as one's own (plagiarism). This plagiarism is of course against the rules. AI-generated answers should at the very least be marked as such and be community wiki.

Now, it could be that AI becomes much more powerful than it is now, including methods to test knowledge, and then becomes true competition for human experts. We are not there yet, but that time may come. If human experts then consider giving up in this race, well, that's a possibility and needs to be discussed in much more detail in the future.

To summarize a bit: Many complaints about AI and the Stack Exchange sites are about implementations - using AI to answer instead of using it as a search tool, not marking AI usage clearly as such, giving out incentives to plagiarize AI content (which itself is kind of plagiarized)... On the other hand, AI is not going to go away. Even if it's banned here, people can use it directly and, for example, decide not to visit this network; the impact of AI will be felt anyway. We simply have to decide how we want to deal with it. I think AI should be seen as a tool and be combined with Stack Overflow as such. If, for example, AI-generated responses were automatically created, marked as such, and used as templates for human-supervised answers, that could work, even though it would mean shifting the work somewhat from creating original content to fact-checking AI-generated content.

11 comments
  • 25
    I think you make a good point that experts are already competing with wrong answers that sound good and get upvotes. Right now however, the AI answers have the advantage of volume. Eventually people will be able to recognize AI generated text for what it is, but in the near term it will continue to put stress on curation systems that are already strained because they aren’t scaling well.
    – ColleenV
    Commented Jul 2, 2023 at 10:04
  • 2
    I agree with @ColleenV. Looking for unanswered questions is probably going to be harder. Not only does it require reading a question well (as always), but now also reading more answers (marked or not) that look deceptively good, but maybe aren't (who knows). Commented Jul 2, 2023 at 17:10
  • 1
    @GertArnoldisonstrike That's why, at the very least, AI-generated answers should be marked as such and should be community wiki. If, for example, the company did that automatically, there might not be much incentive left for tricksters to post their own version. Not sure though. It also touches on detectability, a hot topic at the moment. Commented Jul 2, 2023 at 19:50
  • We have entered an age of coexistence of AI and Human-I, and we must figure out what exactly that means and how we can get the best out of it (or what the worst could be). An AI that is much more intelligent than us, even if it were benevolent, would be deeply disturbing, because it would make a big part of what makes us who we are less needed. It would have implications on all scales, for the purpose of life and everything... Commented Jul 3, 2023 at 10:53
  • 3
    @ColleenV people haven't been able to recognize bad advice that sounds good for a long time; ChatGPT makes things even worse because it's optimized to be convincing. Commented Jul 3, 2023 at 14:18
  • 1
    @DanubianSailor "optimized to be convincing" You don't fact check answers and if there is a flaw in it comment on it? For programming questions it's often possible to simply run the code. For other questions one could compare with external sources. The lack of references is often a weak point of current AI generated answers. Maybe simply don't trust any answer that doesn't back up its claims with references? Commented Jul 3, 2023 at 15:45
    @NoDataDumpNoContribution lack of references? The whole point of SO is that there's no reference. If there were any reference, nobody would have to ask anything on SO, since everything is answered in tutorials. If the only point of SO were to duplicate existing documentation, ChatGPT is simply a better tool for that task. Commented Jul 3, 2023 at 16:54
  • 3
    I'd argue the "lack of references" bit is exceptionally easy to address on sites like Stack Overflow or Super User, where questions are practical and generally immediately verifiable (does it work? yes/ no). The truly existential threat, in my view, is for sites where answers aren't simple "yes/ no" verifiable; sites where advice is given, where things are more open ended... where bad answers can and will cause much greater damage than "oops my code didn't compile". Those sites just aren't as big or popular as SO though, so they presumably haven't seen as big a tidal wave of bad answers.
    – zcoop98
    Commented Jul 3, 2023 at 16:54
  • 2
    @DanubianSailor One of the hallmarks of a good answer is that it provides citations from official sources/documentation showing why the answer is correct. Much documentation is scattered around, uses terminology unfamiliar to the questioner, is just badly written, or (for SO) is just the code - the answer puts the documentation in a form that the questioner can understand.
    – mmmmmm
    Commented Jul 3, 2023 at 19:35
  • @DanubianSailor "The whole point of SO is that there's no reference." Sorry for the misunderstanding. I didn't mean, the solution to the problem is behind that link, but rather, the solution uses the following well known elements on which one can read more about at these locations. You probably do not invent the wheel new every time and if you do not, linking to existing material that an answer builds upon is not only good practice but also makes checking the whole answer easier. Commented Jul 3, 2023 at 21:49
    2023 marked the end of an era in which basically all expert content was 100% human-generated. One could get a bit nostalgic. The expectation is surely of a transformation similar in magnitude to when the internet was invented. Commented Jul 5, 2023 at 11:45
Score: -45

Some experts will leave, but others will remain. They just have to try a little harder than answering "easy" questions on popular tags that offer a high reputation-to-effort ratio. This is just how SE works and is nothing personal.

Fewer experts are not necessarily a bad thing, as learners will have to work harder. Some may even become experts faster as a result.

It might be better for the network to be less reliant on a smaller group of experts, and for the general population to step up. Every user needs to be responsible for curation of their own knowledge.

13 comments
  • 33
    I think the last thing experts like to do is answer "easy" questions on popular tags that offer a high reputation-to-effort ratio. It's extremely boring because these questions tend to be repetitive too. Opening the floodgates of GenAI will make it harder to identify "hard" questions that have no thoughtful answer yet. That's the problem here. Experts have to work harder to find questions that require experts. It's hard enough already. Commented Jul 1, 2023 at 19:19
    @GertArnoldisonstrike but it is those questions that really get reputation (aka "expertise") for most people. By now you should have realized popularity plays a big part in reputation gain
    – prusswan
    Commented Jul 1, 2023 at 19:40
    What you describe sounds like human involvement here would eventually squeeze down to only those answering difficult questions in obscure tags, correct? That's got to be an orders-of-magnitude smaller community, something like what we have now at MathOverflow?
    – gnat
    Commented Jul 1, 2023 at 20:36
    @gnat-onstrike- as a normal user, I don't think it is necessarily a bad thing. Popularity can be a bane
    – prusswan
    Commented Jul 1, 2023 at 20:44
  • 7
    The term "experts" have been used in very broad ways, i.e., on some posts is used to refer to people answering questions, and on other posts is used to refer to people that have recognition from other persons in their respective knowledge and industry domains. I don't think that the use of "experts" is the same in the OP post and on this post. What do you mean by "experts"? What do you mean by a "normal user"? One, what communities do you have an interest in? What are your expectations about those communities?
    – Rubén
    Commented Jul 1, 2023 at 20:49
  • 4
    well yeah, not necessarily. Though I doubt the company is going to be happy about such a decrease in human participation. Especially considering that such a small and advanced community would be much easier to move, like the MathOverflow folks did in the past. For example, they may be tempted to move to some other "GPT-free" site where they wouldn't need to prove that their reputation is different from one generated by GPT. If (when?) this happens, the end result would be like Thomas described, wouldn't it? "Experts, like myself, who do not want to compete with GenAI answers leave the network..."
    – gnat
    Commented Jul 1, 2023 at 20:54
    @Rubén-PeopleFirst given the current site design, I assume experts to include people who care about high rep (since it is regarded as a measurement of expertise). There are certainly experts with low rep, but there is no good rule to establish who they are
    – prusswan
    Commented Jul 1, 2023 at 21:05
  • 1
    @gnat-onstrike- it looks more like getting overwhelmed with GenAI content, not really competing. If they had to compete with GenAI over difficult questions, they should have no problems "winning" with an astute community. But certainly not with something like print(sys.version)
    – prusswan
    Commented Jul 1, 2023 at 21:09
  • 13
    @prusswan It's well known that the reputation system is very bad as an expert discriminator.
    – Rubén
    Commented Jul 1, 2023 at 21:09
  • 12
    What makes you think that Chat-GPT spammers won't answer "obscure" questions?
    – bobeyt6
    Commented Jul 1, 2023 at 22:50
  • 18
    I would doubt that all experts are motivated purely to maximise their internet points. Commented Jul 2, 2023 at 2:08
    @bobeyt6isstricken and why are people so afraid of that? Being made redundant?
    – prusswan
    Commented Jul 6, 2023 at 2:43
  • 3
    @prusswan I'm not sure what you are asking, but I'll take the opportunity to expand on my comment. Bad answers to harder questions are harder to vet since they require more expertise, especially if they are convincingly written. With experts gone, there will be no way to tell and 'learners' will be fed misinformation.
    – bobeyt6
    Commented Jul 6, 2023 at 2:48

