What you are essentially asking about is the difference between the computational power and what is commonly called the expressive power (or just expressiveness) of a language (or system of computation).
Computational Power
The computational power refers to what kinds of problems the language can compute. The most well-known class of computational power is that which is equivalent to a Universal Turing Machine. There are a lot of other systems of computation, such as Random Access Machines, λ-calculus, SK combinator calculus, µ-recursive functions, `WHILE` programs, and many others. And as it turns out, all of these can simulate each other, which means that they all have the same computational power.
This gives rise to the Church-Turing Thesis (named after Alonzo Church, who created the λ-calculus, and Alan Turing, who created the Universal Turing Machine). The Church-Turing Thesis is a hypothesis about computability with two aspects:
- all computing systems capable of general computation are equally powerful, and
- a human being following an algorithm can compute exactly the functions that a Turing Machine (and thus any of the other systems) can compute.
The second is more important in the field of philosophy of mind than computer science, though.
However, there are two things the Church-Turing Thesis doesn't say that are very relevant to your question:
1. how efficient the various simulations are, and
2. how convenient the encoding of a problem is.
A simple example for (1): on a Random Access Machine, copying an array takes time proportional to the length of the array. On a Turing Machine, however, it takes time proportional to the square of the length of the array, because the Turing Machine does not have random memory access: it can only move across the tape one cell at a time, so it needs to traverse the n elements of the array n times to copy them. Different models of computation can therefore have different performance characteristics, even in the asymptotic case, where we try to abstract away from implementation details.
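To make (1) concrete, here is a toy cost model, a sketch rather than a real machine simulation, that just counts elementary steps under the simplifying assumptions described in the comments:

```ruby
# Hypothetical cost model for copying an n-element array.

def ram_copy_steps(n)
  # Random access: one read + one write per element.
  2 * n
end

def tape_copy_steps(n)
  # Single head, one cell per move: carrying each element from the
  # source region to the destination region costs roughly 2n head
  # moves (walk over, walk back), for each of the n elements.
  steps = 0
  n.times { steps += 2 * n }
  steps
end

[8, 16, 32].each do |n|
  puts "n=#{n}: RAM ~#{ram_copy_steps(n)} steps, tape ~#{tape_copy_steps(n)} steps"
end
```

Both functions compute the same result (a copied array, conceptually), but the step counts grow as n versus n², which is exactly the asymptotic gap described above.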
Examples for (2) abound: both λ-calculus and Python are Turing-complete. But would you rather write a program in Python or in λ-calculus?
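To get a feel for (2), here is what arithmetic looks like under a λ-calculus-style encoding, sketched with Ruby lambdas standing in for λ-terms. A Church numeral n is a function that applies f to x n times:

```ruby
# Church numerals encoded as Ruby lambdas.
zero = ->(f) { ->(x) { x } }
succ = ->(n) { ->(f) { ->(x) { f.(n.(f).(x)) } } }
add  = ->(m) { ->(n) { ->(f) { ->(x) { m.(f).(n.(f).(x)) } } } }

# Decode back to an ordinary integer so we can inspect the result.
to_i = ->(n) { n.(->(x) { x + 1 }).(0) }

two   = succ.(succ.(zero))
three = succ.(two)
puts to_i.(add.(two).(three))  # => 5
```

Computing 2 + 3 this way is every bit as "computable" as writing `2 + 3`, but the convenience of the encoding is obviously not the same.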
There is also a third wrinkle I have skirted around until now: all of those original systems were designed by logicians, philosophers, or mathematicians, not by computer scientists … simply because computers, and thus computer science, didn't exist yet. They all go back to the early 1930s, even before Konrad Zuse's very first experiments (which weren't programmable or Turing-complete anyway). And they only talk about "computable functions on the natural numbers".
Now, as it turns out, there's a lot you can express as functions on natural numbers – after all, our modern computers even get by with a lot less than that (basically 3-4 functions on the numbers 0 and 1, and that's it), but, for example, what function does an operating system compute?
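The parenthetical claim about "3-4 functions on the numbers 0 and 1" can be sketched: assuming a single NAND gate as our only primitive, every other Boolean function, and from there binary arithmetic, is derivable. (The helper names below are made up for illustration.)

```ruby
# Everything derived from a single NAND gate on the values 0 and 1.
def nand(a, b)
  (a & b) == 1 ? 0 : 1
end

def n_not(a)
  nand(a, a)
end

def n_and(a, b)
  n_not(nand(a, b))
end

def n_or(a, b)
  nand(n_not(a), n_not(b))
end

def n_xor(a, b)
  n_and(n_or(a, b), nand(a, b))
end

# A half adder: sum bit and carry bit for two one-bit numbers.
def half_add(a, b)
  [n_xor(a, b), n_and(a, b)]
end

p half_add(1, 1)  # => [0, 1]  (1 + 1 = binary 10)
```

Chain enough of these together and you get the adders and logic units inside a real CPU, all built from that one function on 0 and 1.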
This notion of I/O, of side-effects and interaction with the environment, is not captured by the idea of "functions over natural numbers". And yet it is rather important, since, as Simon Peyton Jones once put it, "All a pure function with no side-effects does is make your CPU hot", to which an audience member replied, "Actually, that is a side-effect, too!"
Edwin Brady, the designer of Idris, (only half) jokingly uses the term "Tetris-complete" (I don't know whether he invented it) to express this difference between "can compute any computable function on natural numbers" and "can be used to write non-trivial programs that interact with the environment". Fittingly, he demonstrates this with an implementation of a Space Invaders clone in Idris, but he says that he is confident that Tetris reduces to Space Invaders.
Another thing to point out is that Turing-equivalence is not only insufficient for talking about actually writing "useful" programs; on the other hand, it may not even be necessary. E.g. SQL only became Turing-equivalent with ANSI SQL:1999, but it was useful long before that. In fact, some might argue that making it Turing-equivalent hasn't added to its usefulness at all. There are many Domain-Specific Languages that aren't Turing-equivalent. Data Description Languages typically aren't (and shouldn't be). Total Languages obviously can't be Turing-equivalent, yet you can still write event loops, web servers, and operating systems in them. There are also languages which are Turing-equivalent but where this is actually considered a mistake.
So, all in all, Turing-equivalence is not terribly interesting, unless you want to statically analyze programs.
Expressiveness
Assuming that our computation system is computationally powerful enough to solve our problem at all, the next thing we have to do is express our algorithm for solving that problem in some sort of formal notation for that system. In other words: we need to write a program in some computer language. That's where the notion of expressiveness comes in.
It refers, essentially, to how "easy" or "enjoyable" it is to write our program in our particular programming language. As you can see, the notion is quite vague, subjective, and more psychological than technical.
However, there are attempts at more precise definitions. The most famous one (and the most rigorous one I know of) is by Matthias Felleisen in his paper On the Expressive Power of Programming Languages (the first two pages contain a gentle introduction, the rest of the paper is more meaty).
The main intuition is this: when translating a program from one language to another, some of the changes you need to make are locally contained (such as turning `FOR` loops into `WHILE` loops, or loops into conditional `GOTO`s), and some require a change to the global structure of the program.
When you can replace one feature of one language with a different feature of a different language using only local transformations, then these features are said to have no effect on the expressive power; such a feature is called syntactic sugar.
On the other hand, if it requires a change of the global structure of the program, then the language you are translating to is said to be unable to express the feature. And the language you are translating from is said to be more expressive (with respect to this feature).
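A minimal sketch of such a local transformation, here in Ruby: rewriting a `for`-style loop as a `while` loop touches only the loop itself, leaving the rest of the program unchanged.

```ruby
# Before: a for-style loop summing 1..5.
total = 0
for i in 1..5
  total += i
end

# After: the same program with a purely local rewrite of the loop.
# Nothing outside the loop had to change.
total2 = 0
i = 1
while i <= 5
  total2 += i
  i += 1
end

puts total == total2  # => true (both compute 15)
```

Because the rewrite never escapes the loop, neither construct is more expressive than the other in Felleisen's sense; one is just sugar for the other.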
Note that this gives an objectively measurable definition of expressiveness. Note also that the notion is context-dependent on the feature, and that it is comparative. So, if every program in language A can be translated to language B with only local changes, and there is at least one program in language B which cannot be translated to A with only local changes, then language B is strictly more expressive than language A. The more likely scenario, however, is that many programs in both languages can be translated back and forth, but there are some programs in each language that cannot be translated to the other. This means that neither language is strictly more expressive than the other; they just have different features that allow different programs to be expressed in different ways.
This gives a formal definition of what it means to be "more expressive", but it still doesn't capture the psychological notions behind the phenomenon. For example, syntactic sugar, according to this model, does not increase the expressive power of a language, because it can be translated away using only local changes. However, we know from experience that having `FOR`, `WHILE`, and `IF` available, even if they are just syntactic sugar for conditional `GOTO`, makes expressing our intent easier.
The fact is, different languages have different features that make expressing different ways of thinking about a problem easier or harder. And some people might find one way of expressing their intent easier and others a different way.
An example I found in the Ruby tag on Stack Overflow: many users who follow that tag claim that loops are easier to understand than recursion, that recursion is only for advanced functional programmers, and that loops are more intuitive for newcomers. Yet I have seen multiple cases of complete newcomers who intuitively write code like this:
```ruby
def rock_paper_scissors
  get_user_input
  determine_outcome
  print_winner
  rock_paper_scissors # start from the top
end
```
This usually leads to several people commenting that "this doesn't work", that "they are doing it wrong", and that the "correct way" is this:
```ruby
def rock_paper_scissors
  loop do
    get_user_input
    determine_outcome
    print_winner
  end
end
```
So, clearly, there are some people for whom tail recursion is a more natural way to express the concept of "looping" than loop constructs.
Summary
The fact that two languages are Turing-equivalent says one and exactly one thing: that they can compute the same set of functions on natural numbers as a Turing Machine can. That's it.
It doesn't say anything about how fast they compute those functions. It doesn't say anything about the ease of expressing those functions. And it doesn't say anything about what else they can do besides computing functions on natural numbers (e.g. linking to C libraries, reading input from the user, writing output to the screen).
> does that mean that the class of problems that each programming language can solve actually varies by language, even though these languages are all Turing-complete?
Yes.
- There are problems that aren't covered by the term "Turing-complete" (which only concerns itself with computing functions on natural numbers) such as printing to the screen. Two languages can be Turing-complete but one can allow printing to the screen and the other not.
- Even if both languages can solve the same problems, that doesn't say anything about how complex the encoding is, or how easy it is to express that encoding. E.g. C can solve every problem Haskell can, simply by writing a Haskell interpreter in C … but you have to write that Haskell interpreter first in order to solve a problem this way!