14

Currently most* of the questions in the regex tag have the following form:

I want to match 3 numbers, followed by foo and 2 letters. How can I get the number?

Such questions can be easily translated into a regex by looking up how to match numbers, literals, letters: (\d{3})foo\w{2}.
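
To make that translation concrete, here is a minimal sketch in Python using the standard `re` module. The helper name `extract_number`, the `STRICT_PATTERN` variant, and the sample strings are hypothetical and only for illustration. One caveat, under the assumption that "letters" means ASCII letters only: `\w` also accepts digits and underscores, so `[A-Za-z]{2}` would be the stricter choice.

```python
import re

# Pattern from the question: 3 digits (captured), the literal "foo",
# then 2 "word" characters.
PATTERN = re.compile(r"(\d{3})foo\w{2}")

# Stricter variant (an assumption about what "letters" should mean):
# ASCII letters only.
STRICT_PATTERN = re.compile(r"(\d{3})foo[A-Za-z]{2}")


def extract_number(text, pattern=PATTERN):
    """Return the captured 3-digit string, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


print(extract_number("id 123fooab"))               # 123
print(extract_number("123foo_9"))                  # 123, since \w accepts "_" and "9"
print(extract_number("123foo_9", STRICT_PATTERN))  # None
```

The capturing group `(...)` is what answers "how can I get the number": `match.group(1)` returns only the part matched inside the parentheses.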

And the answers are just that: a piece of code. Nothing more. No explanation of how it was written, 3 identical answers within a few minutes, and a few upvotes for that low quality.

I believe that such questions should be closed, but in a way that tells the OP how to write regular expressions themselves.

So I suggest that we create a canonical regex question with the basic rules, and explain how to apply them. Then close all new (and old) regex questions that match this pattern as duplicates of this new one.

* subjective

10
  • 1
    I wouldn't call the answers low quality...well depending on the regular expression given. An answer's quality is a summation of the answering individual's personal knowledge. Deeming such answers as low quality is not only infringing on the given answer, but also on the history of knowledge seeking to be able to create such an answer. We may be able to come up with a valid regular expression off the top of our heads now, but how long did it take us to get that way? Commented Dec 30, 2013 at 14:17
  • 3
    @ChristopherW The answerer might be very knowledgeable about regexes but the answer could be crap. A bad regex answer is an answer with pure regex without any explanation. So a give me the codez question + complex regex answer without explanation = ???
    – HamZa
    Commented Dec 30, 2013 at 14:23
  • 4
    Related: Quality problems in regex answers.
    – Jerry
    Commented Dec 30, 2013 at 14:33
  • 3
    @ChristopherW: The difference is what other people can learn from it. It's very difficult for a regex question to be anything other than a boring "here you go". (There are exceptions, such as this gem from tchrist.) I'm not sure that Johannes' suggestion is correct, but the status quo is certainly not correct either. Commented Dec 30, 2013 at 14:34
  • 1
    When I saw this mess, I thought "something has to be done", and this is what I see as probably the best solution. If you have a different approach, tell me. But I think that in this case doing anything is better than doing nothing. I don't think that things will magically change over time without intervention. Commented Dec 30, 2013 at 14:41
  • One of the rules: Do not try to parse (X|XHT|HT|SG)ML with regex!
    – mirabilos
    Commented Dec 30, 2013 at 14:54
  • 1
    We already have a canonical answer for that. The <center> cannot hold. Commented Dec 30, 2013 at 15:30
  • I suppose I could have worded that better. (I also could have read the question more closely. Mea culpa.) So long as good, non-gimme-t3h-codez regex questions are left alone (all what, three of them?) I have no problems with the solution proposed in the question. Commented Dec 30, 2013 at 16:15
  • 1
    @hamza Oh don't get me wrong. I think help vampires should be treated like real vampires. I just think that blanketing all the answers as low quality is just incorrect. How do you judge quality when the problem is relatively easy (to people who understand regular expressions)? As for the help vampires? Torch and pitchforks. I think a regex page would definitely help, I agree in that aspect. Or how about we create a script that creates a regex that parses regex questions that creates a regex to post as a regex answer...that would be cool (yo dawg, I heard you like regexes) Commented Dec 31, 2013 at 0:40
  • Related: the question for which an answer starts with "You can't parse [X]HTML with regex.". Commented Jul 22, 2020 at 12:53

2 Answers

7

Closing all "give me teh codez" style questions as dupes of the same regex Q&A (which would basically be a general regex tutorial) seems like it could be...less than helpful (though there is a precedent in the .NET tags for this type of thing).

Is there a pattern of very common categories of regex questions? Or a way to group regex questions by specific technique / goal? I don't spend any time in the regex tag, so I truly don't know.

If so, I would say

  • analyze the "give me teh codez" style regex questions, to find a pattern of question types,
  • find a canonical question for each category / general regex technique
    • or find some existing dupe target for that category / technique, and edit it into a canonical answer
  • add a list of such question / answer links to the regex tag wiki (under some heading like "Common regex questions")
  • start closing things as duplicates of those

For example, I saw this regex question earlier where someone found a clear duplicate. Could this be one category of regex questions? I don't know.

If there is not a way to categorize the regex questions, maybe your approach is the best way to do this. I just thought I would throw this idea that popped into my head out there.

5
  • 3
    Or just close those that lack any attempt as "minimal understanding". Commented Dec 30, 2013 at 16:22
  • 1
    "give me teh codex" is obviously for the most ancient "give me teh codez" questions (note to self: proofread more). Commented Dec 30, 2013 at 16:25
  • I'm talking specifically about the things that can be translated 1:1 from English to regexp without any hard work. Commented Dec 30, 2013 at 16:36
  • @JanDvorak Yeah, that close reason certainly applies. I was just thinking, as SO is supposed to be the go-to place for answers to common programming questions, it seems like we have an opportunity to really improve the state of the regex info on the net if we have a lot of dupes pointing to actual solutions (rather than a bunch of questions closed as "minimal understanding"). I realize it's easier to just close them with that reason, especially when the asker appears to be lazy, so I don't blame people for doing so (I've certainly done it). Commented Dec 30, 2013 at 16:38
  • @JanDvorak It's such a grey line. I don't like the code-only answers that it produces, though. For example, stackoverflow.com/questions/22097252/… doesn't really deserve "minimal understanding", I don't think, but the answer shouldn't just be a regex with a link to working proof.
    – Cruncher
    Commented Feb 28, 2014 at 14:19
1

I don't think it's possible to write a Q/A wiki expansive enough to cover every regex question that comes in, or even most of them. Often those kinds of questions are looking for a very particular pattern and a general Q/A would be insufficient for their needs. There are tons of resources online... clearly these users aren't interested in learning about regexes, they just want to get paid and are looking for the fastest, easiest route to achieve that goal. I can't say I blame them for that.

I do, however, blame the users who answer these kinds of questions. I suspect the majority of them are after easy rep (since these questions are often simple to answer and usually get a couple of upvotes and an "accept"). Getting them to close a question (which nets 0 rep) rather than answer it (and earn 10 or 25 rep) is nearly impossible. Shutting down the easy rep mill would be extremely difficult.

See https://meta.stackexchange.com/q/188408/191410

2
  • Well, the scope would be easy: simple regex things that are in the majority of all regex engines, like quantifiers, character classes ([]), grouping... Commented Dec 30, 2013 at 16:58
  • 1
    @JohannesKuhn - You have a "...". Would you like to complete the list? "..." is a cop-out. The point is, even if you could write a complete manual for using regex (which covers all the different regex engines and their unique quirks), getting people to actually close those questions would be very difficult. There's rep to be earned in answering, not in closing.
    – JDB
    Commented Dec 30, 2013 at 18:12
