59
\$\begingroup\$

Often, answers to questions asking for "programs" or talking about "programming languages" utilize things like sed, awk, … in order to get around having to write an actual shell script.

Therefore, a question comes to my mind:

What qualifies as a programming language?

Sure, ultimately the OP can define this themselves. But what is a reasonable understanding of a "default" in case the question does not clarify this (and how should it be clarified)?

Do coreutils count as "languages" and if so, how to handle different sets of coreutils on different systems?

To give a concrete example for a questionable usage of the term programming language, see my answer here. It uses a rot13 binary as the interpreter. In case this is invalid, but "coreutils" are valid, how can we define the difference?

\$\endgroup\$
18
  • 2
    \$\begingroup\$ Once we've settled this it should be added here. \$\endgroup\$ Commented Aug 16, 2014 at 20:52
  • \$\begingroup\$ I think the important difference between the sed/awk case and rot13 is that the former two take some file specifying a transformation (sed does so via the -f parameter) whereas rot13 does not. Or is rot13 specified to ignore any excess arguments? In that case, an empty file could be considered a "rot13 program" if one wants to bend the rules. \$\endgroup\$
    – FireFly
    Commented Aug 16, 2014 at 21:14
  • \$\begingroup\$ As for different sets of coreutils on different systems, this is partly settled by standards, and partly by specifying what version a program is for (e.g. "GNU sed" or "bash" rather than "sed" or "sh"). Besides, the situation is pretty similar with e.g. BASIC in that a program for one platform might not work on another one. \$\endgroup\$
    – FireFly
    Commented Aug 16, 2014 at 21:17
  • \$\begingroup\$ @FireFly Maybe one way to put it is that the "code" must do something rather than (just) having something done to it? Or that the code must contain the logic that characterizes it as an answer for the question? \$\endgroup\$
    – Ingo Bürk
    Commented Aug 16, 2014 at 21:33
  • \$\begingroup\$ On that note maybe we should also discuss how to count bytes of such solutions. Generally it looks like people will count the entire command (awk …), when, to be fair, only the argument containing the logic should count. \$\endgroup\$
    – Ingo Bürk
    Commented Aug 16, 2014 at 21:38
  • 1
    \$\begingroup\$ One way to look at this is to focus on the program not the language. Is the input to the alleged language a computer program? If so, then perhaps it's a language; if not then it's not. Re Turing completeness, anything that is Turing complete or anything where it's hard to show it's not Turing complete is in my view a language, but that definition might not be exhaustive. \$\endgroup\$
    – abligh
    Commented Aug 17, 2014 at 9:31
  • \$\begingroup\$ @abligh In my opinion both coreutils and regular expressions would fail that test. I'm not sure this is what we want, though. \$\endgroup\$
    – Ingo Bürk
    Commented Aug 17, 2014 at 9:38
  • \$\begingroup\$ @IngoBürk It really depends on the regex flavour. There are Turing complete ones and some that come close to it, so at least those wouldn't really fail this test. \$\endgroup\$ Commented Aug 17, 2014 at 11:07
  • \$\begingroup\$ It may be more useful to ask "what is not a programming language", as it is easier to make a list of things which have been used as programming languages here but aren't (morse code, rot13, English) than to compile an exhaustive list of programming languages. PS: sed is a programming language and it's great for string manipulation golfing, codegolf.stackexchange.com/a/28655/16402 and codegolf.stackexchange.com/a/32526/16402 \$\endgroup\$
    – user16402
    Commented Aug 18, 2014 at 16:46
  • \$\begingroup\$ My answer below is rather myopic, in that it addresses the literal question being posed, but doesn't address the deeper question of what is a useful programming "language" and why playing with them helps us. Programming languages are tools that we use to create abstractions that make it easier to create, understand and maintain things that are useful. Not all of those tools are Turing-equivalent. SQL and regex for example. Playing with these tools and watching other people explore off the beaten path with these tools is very useful to me. Plus the occasional Aha! as a bonus. \$\endgroup\$ Commented Aug 22, 2014 at 6:21
  • \$\begingroup\$ I'm willing to give any Turing-complete language a pass, including brainfuck and golf-script, just on principle. I have occasionally learned something interesting from those answers. If someone posts something interesting, clever or useful in Haml & Sass I'll give it an upvote. Even RPG, if it were clever or interesting (doubtful, but there may be someone lurking in the shadows who will prove me wrong). \$\endgroup\$ Commented Aug 22, 2014 at 6:21
  • \$\begingroup\$ So what is the process here, generally. When do we decide on a definition nad put it into the list of standard definitions? \$\endgroup\$
    – Ingo Bürk
    Commented Aug 22, 2014 at 14:13
  • \$\begingroup\$ At this point, I think we're still just sitting on a park bench BSing. This issue has come up multiple times in the BS (Before Scott) era and other than some items in the Standard loopholes which are no longer funny list, doesn't appear to have been resolved. Resolution would involve coming up with a good answer ("good" probably means simple to state and apply and fairly inclusive) and convincing the PP&CG public, especially the mods and oldtimers, that it's better than what we have now. \$\endgroup\$ Commented Aug 22, 2014 at 16:30
  • \$\begingroup\$ @Wheat Wizard This question has been asked years earlier. \$\endgroup\$
    – Ingo Bürk
    Commented Aug 3, 2017 at 5:17
  • \$\begingroup\$ I'm aware. Older questions can be marked as duplicates. Since the question here is resolved by the newer question I have foraged it as a duplicate. \$\endgroup\$
    – Wheat Wizard Mod
    Commented Aug 3, 2017 at 5:19

6 Answers 6

78
\$\begingroup\$

My previous answer was criticised for not drawing a line in a sand, so following some discussion on chat I propose a line.

Executive Summary

A purported programming language should be accepted as such if and only if it is capable of addition of natural numbers and primality testing of natural numbers.

More precise description

The language must:

  1. Support a representation of natural numbers and of tuples. (We're talking about languages rather than implementations, so we will leave to one side the issue of type widths).
  2. Be able either to transform inputs into outputs (transformational model) or to distinguish an "accepted" input from a "rejected" input (decision model).
  3. Be able to take two natural numbers and add them. In the transformational model, this means transforming an input tuple of two numbers into an output which correctly represents their sum. In the decision model this means deciding whether an input contains the representation of a tuple of three natural numbers such that the third is the sum of the first two.
  4. Be able to take a natural number and say whether or not it is a prime. In the transformational model this means transforming a natural number into the representation of 0 or 1 according to whether it is a composite or a prime number. In the decision model it means accepting precisely those inputs which represent a prime.

Observations.

In roughly decreasing order of importance.

  • This definition does not require a language to be Turing-complete (although it certainly permits any Turing-complete language). This is intentional. Turing-completeness is unnecessary for many problems on this site, and there are interesting non-Turing-complete languages and interesting languages whose status with respect to Turing-completeness is unknown.
  • My original aim was to capture the intuition of accepting primitive-recursive languages. However, I considered that primality testing is a lot easier to explain and to demonstrate, while requiring enough power that it probably doesn't include any undesirable "language". The additional requirement to support addition is to exclude arguments about the factor tool qualifying.
  • Although some might object to the inclusion of the decision model, it corresponds more closely to the formal definitions of well-known complexity classes than the transformational model does.
  • If you're using an obscure language, please add an entry to the list of installation and testing instructions so that other people can test your code.
  • This definition allows regex flavours which include backreferences. I don't consider this a problem.
  • This definition excludes HQ9+. I don't consider this a problem either, for two reasons:
    1. It was created as a joke rather than a language, and has ceased to be funny in the context of this site.
    2. I think that every interesting problem which HQ9+ can "solve" has already been asked, so I don't think this will exclude any interesting answers in the future.
\$\endgroup\$
19
  • 3
    \$\begingroup\$ I like this answer a lot. Simple, clear rules that pretty much any language I intuitively consider a language passes while others don't. Also rules you don't need to really check for most languages because it's obvious, even for very specific languages (where turing-completeness would've been a problem). \$\endgroup\$
    – Ingo Bürk
    Commented Aug 22, 2014 at 5:07
  • 18
    \$\begingroup\$ +1 your answer, I like it a lot, although it does pave the way for a new "language" HQ9AddP. It would mean I would win the following argument, though it did give us a good laugh at the time: codegolf.stackexchange.com/a/23003/15599 \$\endgroup\$ Commented Aug 23, 2014 at 13:52
  • 1
    \$\begingroup\$ So... it needs to be able to solve at least those problems whose recursive complexities are of order less than ω? \$\endgroup\$ Commented Aug 18, 2015 at 17:29
  • 2
    \$\begingroup\$ @SuperJedi224, I don't know what your definition of recursive complexity is (and any attempt to Google it just gets a tonne of basic pages about big-O notation), but I don't think so. It doesn't have to be Turing-complete. (It might even be the case that PRIMES is in LOGSPACE - I don't think anyone's proven or disproven that). \$\endgroup\$ Commented Aug 18, 2015 at 18:44
  • 2
    \$\begingroup\$ A turing complete system can solve (in theory, in practice this would require insanely impractical amounts of time and memory) problems whose recursive complexities are on orders quite a bit higher than ω... (Knuth's Arrow Notation and the Ackermann Function, for example, both have recursive complexities of order ω, and Graham's Sequence has recursive complexity of order ω+1). ω<sub>1</sub><sup>CK</sup> is by definition the least order of recursive complexity that no computable model can solve, even given unbounded time and memory. \$\endgroup\$ Commented Aug 22, 2015 at 12:15
  • 4
    \$\begingroup\$ @SuperJedi224, without a link to a good definition and description of "recursive complexity", you're wasting your time (and mine). \$\endgroup\$ Commented Aug 22, 2015 at 12:43
  • 1
    \$\begingroup\$ What does supporting a representation of tuples mean exactly? For example, how does regex support tuples? \$\endgroup\$
    – jimmy23013
    Commented Oct 12, 2015 at 7:06
  • 4
    \$\begingroup\$ @jimmy23013, any mechanism which allows items to be grouped and ungrouped would do. For computational models which support arbitrarily large integers, a classic way of representing tuples is using powers of primes (e.g. you represent the pair (a, b) as 2^a * 3^b and extract a and b with division and modulus). For regexes with backreferences, one easy way to demonstrate that they meet the criteria is to use unary with digit 1 for numbers and separate elements of a tuple with whitespace. Then a regex for addition would be ^(1*) (1*) \1\2$ \$\endgroup\$ Commented Oct 12, 2015 at 8:04
  • 1
    \$\begingroup\$ That is, to support tuples in the input, but not necessarily anywhere else in the computation? \$\endgroup\$
    – jimmy23013
    Commented Oct 12, 2015 at 8:06
  • 1
    \$\begingroup\$ @jimmy23013, yes (although I'm curious to know what model you've found which supports them in the input but not elsewhere). \$\endgroup\$ Commented Oct 12, 2015 at 8:15
  • 1
    \$\begingroup\$ @PeterTaylor Regex? It can only use the tuples in the input, but not generate new ones. \$\endgroup\$
    – jimmy23013
    Commented Oct 12, 2015 at 8:17
  • 4
    \$\begingroup\$ @PeterTaylor I think this is what SuperJedi is referencing: Ordinal Complexity of Recursive Definitions. The authors seem to have invented the terminology, given that the first instance of "complexity" is in quotes. \$\endgroup\$
    – primo
    Commented Oct 18, 2015 at 19:22
  • 5
    \$\begingroup\$ I would recommend adding a deterministic requirement to this - a language must be capable of producing consistent output for a consistent combination of code and input, and those deterministic parts are the ones that must meet these requirements. This does not disallow RNG components - it just requires that the language has the capability to deterministically add, determine primality, and either follow the transformational or decision model. \$\endgroup\$
    – user45941
    Commented Nov 14, 2015 at 18:17
  • 2
    \$\begingroup\$ Does this definition require the language to support unbounded memory? (This situation has come up with a language which requires the type width of every variable used to be specified, in unary, within the program itself.) It sort-of implies that it does, but on the other hand, that would exclude many more normal programming languages (e.g. in C, the mere existence of size_t places a hard bound on how large memory can be in any given program). \$\endgroup\$
    – user62131
    Commented Jan 5, 2017 at 18:07
  • 2
    \$\begingroup\$ @ais523, I don't think this criterion really says much about memory capacity. It's implied that you have more than two bits of memory, but I certainly don't think that unbounded memory is implied. \$\endgroup\$ Commented Jan 6, 2017 at 0:14
46
\$\begingroup\$

We're asking the wrong question

We're having the XY problem. The question is really, "What formats should we allow in answers?" For this purpose, I think that markup languages and limited output languages should be treated the same as programming languages.

Having fewer features puts a language at a disadvantage. If a language does your challenge, yet cannot implement something like primality testing, that's more impressive, not less. It's silly to put a minimum on the functionality of the language. This is just making user-made languages cheated their way into being Turing complete, like with Bubblegum and CHIQRSX9+.

If a challenge is fixed-output and is dominated by answers that are basically the output string, that's a problem with the challenge. Little will change with a ban on "print-this" languages the output their text, since plenty of fully-fledged languages have really short ways to quote and print text.

\$\endgroup\$
2
  • 7
    \$\begingroup\$ Re fixed-output questions, see also "downvote the question rather than post a protest answer consisting of the literal output". (There were some ridiculous extremes in the recent French flag question: I downvoted and flagged one answer which was a raw PNG file). I think the motivation of this question is insisting that people actually use a language, rather than claiming that they have a 0-byte answer with $unix-tool-which-does-exactly-what's-needed. But the site is PPCG, and answers which don't involve programming should be off-topic. \$\endgroup\$ Commented Nov 19, 2015 at 10:35
  • 2
    \$\begingroup\$ @PeterTaylor That has happened before when we had the Super Mario graphical output challenges. I started a separate meta thread back then and the consensus seems to be that kolmogorov-complexity challenges don't require answers in programming language. :/ \$\endgroup\$ Commented Nov 19, 2015 at 12:32
9
\$\begingroup\$

There may be some grey areas (regexes having been mentioned), but none of the cases explicitly named in the question falls into one.

Definitely languages

sed

Wikipedia says:

sed (stream editor) is a Unix utility that parses and transforms text, using a simple, compact programming language.

It has commands and is Turing-complete.

awk

Wikipedia says:

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool.

It has variables, types, control flow... There is a book called The AWK Programming Language.

Definitely not a language

rot13

Forget Turing-completeness (equivalent to mu-recursive functions): rot13 doesn't even allow you to implement primitive-recursive functions. In fact, it doesn't allow you to implement anything: it implements a single bijective function (or, at best, 26 closely related bijective functions), and the only thing you can do is choose the input to that function.

(NB although you didn't mention it, I've seen it in some "answers", so: cat falls in this category as well for the same reasons as rot13. And since I've just seen a comment in which you suggest claiming it as a language, date is definitely not either).

\$\endgroup\$
4
  • 7
    \$\begingroup\$ Sed is Turing-complete?!?! That's what I like about PP&CG, it jostles my certainties. Here's the most interesting page Google lead me to: A proof that Unix utility "sed" is Turing complete. \$\endgroup\$ Commented Aug 17, 2014 at 15:28
  • \$\begingroup\$ @Scott random application of string-rewriting rules (or s/// regex replacement) is Turing-complete. Check out the esolangs Thue and /// on the esolang wiki (can't provide link at the moment). \$\endgroup\$
    – FireFly
    Commented Aug 17, 2014 at 16:04
  • 6
    \$\begingroup\$ None of this defines a difference, though. With these arguments I have trouble accepting HQ9+ as a programming language. It does nothing that really differs from rot13. We can call this gray area, but that doesn't get us anywhere. Why is one allowed but not the other? \$\endgroup\$
    – Ingo Bürk
    Commented Aug 17, 2014 at 16:41
  • 2
    \$\begingroup\$ @IngoBürk, so do I. \$\endgroup\$ Commented Aug 17, 2014 at 16:54
7
\$\begingroup\$

Yes, awk(1) and bc(1) and dc(1) are all programming languages. One can do factorial in AWK.

Yes, sed(1) is a programming language. I solved "Bring out the inner llama of a sentence" in sed. My program is too long to win code golf, because sed has no numeric operations, but it does answer the question. Programs in sed can do input and output, and make conditional jumps. I made a while loop, starting with the label :w and ending with the conditional jump /[%z]/bw, which branches to w if the line contains any % or z characters. With input and output and conditional jumps, one can write programs in sed.

No, ed(1) is not a programming language. I believe this because I never learned how to do conditional jumps in ed; it has no labels or branches. People can write scripts in ed, and ed can do some conditional logic, so perhaps one can disagree and claim that yes, ed is a programming language.

No, most other shell utilities are not programming languages. Commands like ls or sort or rot13 are not programming languages by themselves. If the question is rot13, I can't use rot13, because I must not use a library function, or command, to do all the work. I can answer tr A-Za-z N-ZA-Mn-za-m and count it as a shell script, 22 characters. As tr is no programming language, I must count the call to tr as part of my program.

Answers need not be portable. It is fine to write a shell script that works with GNU coreutils, but fails with BSD or Solaris. This is no worse than a PowerShell answer that only works with Windows! One platform is enough.

\$\endgroup\$
2
  • \$\begingroup\$ There's a misunderstanding about the rot13 abswer. Your tr program would not be a valid answer there, the goal was not to write a rot13 program. I wasn't using rot13 as the submission, I was using it as the "language". The actual submission was the input "A". \$\endgroup\$
    – Ingo Bürk
    Commented Aug 18, 2014 at 4:37
  • \$\begingroup\$ Another argument for sed is programming language: one can code a SedSokoban game in it. (Though I'm afraid “one” can be taken literally in this case…) \$\endgroup\$
    – manatwork
    Commented Aug 18, 2014 at 9:06
6
\$\begingroup\$

Aside from allowing programming languages as in Peter's definition, I think we should also allow declarative languages.

In computer science, declarative programming is a programming paradigm, a style of building the structure and elements of computer programs, that expresses the logic of a computation without describing its control flow.

Common declarative languages include those of database query languages (e.g., SQL, XQuery), regular expressions, logic programming, functional programming, and configuration management systems.

This means that we should also allow languages such as HTML (+SVG), CSS.

\$\endgroup\$
3
  • \$\begingroup\$ My definition already includes most declarative languages, including the combination of HTML+CSS3. It doesn't include HTML by itself, but I can't think of any question where you'd want to include HTML by itself. \$\endgroup\$ Commented Jan 1, 2015 at 14:26
  • \$\begingroup\$ @PeterTaylor I see. While I also cannot think of such a question, there are surely questions where you'd want to use SVG, which is not included by your definition, I think. So I edited my answer. \$\endgroup\$
    – ProgramFOX
    Commented Jan 1, 2015 at 14:32
  • \$\begingroup\$ @PeterTaylor I know this postdates your comment, but the "create a checkbox" challenge certainly qualifies - and HTML did participate. \$\endgroup\$ Commented Mar 17, 2017 at 18:37
3
\$\begingroup\$

Clearly a general purpose programming language -
Turing equivalence is sufficient to decide it's a programming language, whether well-defined (C, Python, Haskell, etc.) or ad-hoc (e.g., bash + Linux utilities, PowerShell + Windows utilities, etc.).

Useful exceptions -

  • SQL
  • state machine languages, including all flavors of regex without code injection
  • Until I read the Wikipedia article on Turing completeness, I'd never heard of useful almost Turing-complete programming languages before. I, and I think most people reading PP&CG, would accept something like Charity as a programming language.

Clearly not a general purpose programming language -
No ability to construct a branch and/or a loop control structure. This definition is woefully incomplete, since the breadth of state manipulation is also an issue.

\$\endgroup\$
7
  • \$\begingroup\$ What about the countless languages that are suspected to be Turing complete, but have not been proven to be so? \$\endgroup\$
    – Ingo Bürk
    Commented Aug 17, 2014 at 7:14
  • 2
    \$\begingroup\$ I personally think that regular expressions shouldn't even be ruled out. If the problem is trivial to solve with regular expression it probably wasn't interesting enough. In all other cases solving it with a regular expression is often just a skillful (if not more) than solving it with a "normal" language. Furthermore, at least PCRE is Turing complete, and other flavours (like .NET's and Ruby's) are similarly powerful. The only thing that's a bit problematic with regular expressions is to adhere to input/output specifications. But if that works, then people should be allowed to use regex. \$\endgroup\$ Commented Aug 17, 2014 at 8:49
  • \$\begingroup\$ I agree with @MartinBüttner. Turing-completeness is not a good criterium when it comes to regular expressions, neither is classical branching/looping. There are fantastic regexp answers that I don't think should be excluded. \$\endgroup\$
    – Ingo Bürk
    Commented Aug 17, 2014 at 9:00
  • \$\begingroup\$ I'm not a snob and would upvote an elegant or eloquent or jaw-dropping regex-only answer. I'd do the same for an answer that used any useful state machine language. \$\endgroup\$ Commented Aug 17, 2014 at 14:25
  • \$\begingroup\$ @MartinBüttner Being pedantic, I have it on "good authority" (meaning I haven't paid much attention to it, but it sounds right) that perl regular expressions, without code insertion, are not Turing-complete. If you know otherwise, I'd be grateful for a reference or two. \$\endgroup\$ Commented Aug 17, 2014 at 14:42
  • \$\begingroup\$ @ScottLeadley Turns out I can't find any conclusive information about it. But I can't find a prove for it either way. Apparently it can match context-sensitive languages (but my very short search hasn't turned up a proof for that either). \$\endgroup\$ Commented Aug 17, 2014 at 14:47
  • 1
    \$\begingroup\$ ANSI SQL is in fact Turing Complete, but it's a tarpit when you need it to be Turing complete. \$\endgroup\$
    – Joshua
    Commented May 4, 2016 at 15:55

Not the answer you're looking for? Browse other questions tagged .