51

Stroustrup claims that Cfront, the first C++ compiler, was written in C++ (Stroustrup FAQ).

However, how can the first C++ compiler possibly have been written in C++?

The code that makes up the compiler needs to be compiled too, and thus the first C++ compiler couldn't have been written in C++, could it?

2

5 Answers

61

The key is right here:

The first C++ compiler (Cfront) was written in C++. To build that, I first used C to write a "C with Classes"-to-C preprocessor. "C with Classes" was a C dialect that became the immediate ancestor to C++. That preprocessor translated "C with Classes" constructs (such as classes and constructors) into C. It was a traditional preprocessor that didn't understand all of the language, left most of the type checking for the C compiler to do, and translated individual constructs without complete knowledge. I then wrote the first version of Cfront in "C with Classes".

So the first version of Cfront wasn't written in C++ but in the intermediate language, "C with Classes". The ability to create C compilers and preprocessors directly in C led to many of the innovations (and massive security holes) in C. So you write your new preprocessor that turns your "C with Classes" code into straight C (because straight C can do anything), then you use "C with Classes" to write a C++ compiler (not that you couldn't do it in C, it would just take a while), and then you use that C++ compiler to write a more efficient and complete compiler in C++. Got it?
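To make the "translate classes into C" step concrete, here is a rough sketch of the idea. It is not Cfront's or the preprocessor's actual output; the names `Counter_c`, `Counter_c_ctor`, and so on are made up for illustration. The class becomes a plain struct, and each member function becomes a free function that takes the object as an explicit first argument.

```cpp
// --- What the programmer writes, using a class ---
class Counter {
public:
    explicit Counter(int start) : value(start) {}
    void add(int n) { value += n; }
    int get() const { return value; }
private:
    int value;
};

// --- A hand-written C-style equivalent (hypothetical names) ---
// The data members move into a plain struct.
struct Counter_c {
    int value;
};

// The constructor and member functions become free functions that
// receive the object through an explicit pointer.
void Counter_c_ctor(Counter_c* self, int start) { self->value = start; }
void Counter_c_add(Counter_c* self, int n) { self->value += n; }
int Counter_c_get(const Counter_c* self) { return self->value; }

// Runs both versions side by side and reports whether they agree.
bool lowering_agrees(int start, int n) {
    Counter a(start);
    a.add(n);

    Counter_c b;
    Counter_c_ctor(&b, start);
    Counter_c_add(&b, n);

    return a.get() == Counter_c_get(&b);
}
```

Calling `lowering_agrees(1, 2)` exercises both forms with the same inputs. Everything in the second half uses only constructs that plain C also has (structs, pointers, free functions), which is why a C compiler can take over from there.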

9
  • 5
    +1 for including a link to one of my favorite tales of things that can be done (and shouldn't).
    – jwernerny
    Commented Sep 1, 2011 at 16:58
  • 4
    The compiler was written in valid C++ code, but it used only the few C++ features that the "C with Classes" preprocessor supported. Because it used a subset of the full language, it could also be compiled by the result (the first working version of Cfront). After performing this "bootstrap" step, he probably never needed to use the preprocessor again. Commented Jan 2, 2013 at 2:30
  • 2
    @jwernerny - I've always found that article unsatisfying. He glosses over the most difficult and non-trivial part: "The bug would match code in the UNIX 'login' command. The replacement code would miscompile the login command so that it would accept either the intended encrypted password or a particular known password." But how would this be done? Has it ever actually been demonstrated?
    – detly
    Commented Feb 12, 2013 at 5:07
  • 3
    "led to many of the innovations (and massive security holes) in C": As far as I know these tricks can be used in any language, not just in C. So any other language can have the same security holes.
    – Giorgio
    Commented Feb 12, 2013 at 8:13
  • 2
    @detly: It sounds trivial now, but in 1983 this was a novel attack made viable by a lack of implementation diversity. We were more trusting of binaries back then, partially because compiling everything from source was a much bigger ordeal than it is now.
    – Blrfl
    Commented Feb 13, 2013 at 22:45
17

It was bootstrapped. As soon as a C++ feature was added to cfront, then cfront could also use that feature from that point on (but not to implement that very feature). This worked because cfront had the ability to convert C++ code to C code. So if some new platform came out, you could use cfront on another platform to convert cfront from C++ to C, and then use the new platform's C compiler to finish the compilation from C to object code.
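The rule "a feature can be used once the previous compiler supports it, but not to implement itself" can be checked mechanically. Below is a toy model, not a description of how cfront was really built: `Stage`, `can_bootstrap`, and the feature names (`feature_x`, `feature_y`) are all invented for the example.

```cpp
#include <set>
#include <string>
#include <vector>

// One step in a bootstrap chain (hypothetical model).
struct Stage {
    std::string name;
    std::set<std::string> uses;     // features this stage's own source code uses
    std::set<std::string> accepts;  // features this stage can compile once built
};

// Returns true if every stage's source uses only features that the
// previously built stage (or the initial toolchain) already accepts.
bool can_bootstrap(const std::set<std::string>& initial,
                   const std::vector<Stage>& stages) {
    std::set<std::string> available = initial;
    for (const Stage& stage : stages) {
        for (const std::string& feature : stage.uses) {
            if (available.count(feature) == 0) return false;
        }
        available = stage.accepts;  // the new stage is now the compiler
    }
    return true;
}

// An example chain with made-up feature names.
std::vector<Stage> example_chain() {
    return {
        {"preprocessor", {"c"}, {"c", "classes"}},
        {"compiler v1", {"c", "classes"}, {"c", "classes", "feature_x"}},
        {"compiler v2", {"c", "classes", "feature_x"},
         {"c", "classes", "feature_x", "feature_y"}},
    };
}
```

Starting from a toolchain that accepts only `"c"`, `can_bootstrap({"c"}, example_chain())` succeeds because each stage is written in what the stage before it accepts. A single stage that uses `"classes"` in its own source while only a C compiler exists would fail the check, which is the "not to implement that very feature" restriction.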

9

I think B.S. answers that question:

The first C++ compiler (Cfront) was written in C++. To build that, I first used C to write a "C with Classes"-to-C preprocessor. "C with Classes" was a C dialect that became the immediate ancestor to C++. That preprocessor translated "C with Classes" constructs (such as classes and constructors) into C. It was a traditional preprocessor that didn't understand all of the language, left most of the type checking for the C compiler to do, and translated individual constructs without complete knowledge.

I then wrote the first version of Cfront in "C with Classes". Cfront was a traditional compiler that did complete syntax and semantic checking of the C++ source. For that, it had a complete parser, built symbol tables, and built a complete internal tree representation of each class, function, etc. It also did some source level optimization on its internal tree representation of C++ constructs before outputting C. The version that generated C, did not rely on C for any type checking. It simply used C as an assembler. The resulting code was uncompromisingly fast.

First he created something he called "C with Classes", implemented as a simple preprocessor that translated it into C. It was basically C++, but the preprocessor did little or no checking. He then used that to write Cfront, a more powerful translator from C++ to C, complete with type checking, symbol tables, etc.

5
  • 1
    so basically when we compile a C++ program, it gets converted into C, then after it's converted into C, it gets compiled again to machine code?
    – Pacerier
    Commented Sep 2, 2011 at 7:06
  • @Pacerier: Originally, yes, but not now I think. Commented Sep 2, 2011 at 13:24
  • i don't quite understand your comment. do you mean now there are compilers that skip the second step and simply take the C++ source and compile to machine code?
    – Pacerier
    Commented Sep 3, 2011 at 4:00
  • 7
    @Pacerier: Well, they don't go directly to assembly language or machine code. Usually they first go to a machine-independent intermediate representation (triples or quads) and analyze that for optimization. From that they generate assembly or machine code. If you pick up a book on compiler design (Aho & Ullman) I'm sure you'll find it interesting. Commented Sep 3, 2011 at 14:06
  • 1
    It is important to note that the C++ he was building was also a fraction of the language that now exists. It had no templates, no new libraries, used C casting only and if I recall correctly, had no exceptions.
    – user53141
    Commented Aug 21, 2013 at 20:17
3

I'll add this answer since no answer covered this aspect.

You technically don't need software to compile code. As long as you have the compiler's specification, you can do the compilation by hand. This is not how the first C++ compiler was compiled; I'm just saying it's possible.

Compare with assembly language. In the early days there was no assembler software to convert assembly code to machine code. The conversion was done by hand, but writing in assembly language still gave programmers a better overview.

-1

A computer is like a king or queen who is fluent in machine language but has to ask compilers, assemblers, and interpreters to translate languages like C++ into machine language for the reigning monarch's ears.

Computers do not directly understand C++. When code is compiled into machine language, three languages are involved:

  1. The source language
  2. The language in which the compiler is written
  3. The target language

A C++ compiler compiles C++ into machine language.

The source language is C++.

The destination language is machine language.

The language in which the compiler is written can be anything available which you enjoy using.

The compiler can be written in Python, FORTRAN, x86 assembly, or anything else you like.

The compiler essentially takes a text file as input and outputs an executable binary file (a file written in machine language).

If you make many new versions of the same language, you can build the compiler for each newer version using the compiler for an older version.

Originally, people:

  1. compiled C++ into C
  2. compiled C into x86 Assembly
  3. assembled x86 Assembly into machine language.

Eventually, people started skipping steps in the middle. For example, some code written in C++ is never translated into C before it becomes an executable binary file.
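As a small illustration of chaining translators like this, here is a sketch in which each step is described only by the language it reads and the language it writes. `Translator` and `run_pipeline` are hypothetical names, and the language labels are just strings.

```cpp
#include <string>
#include <vector>

// A translator described only by its input and output language.
struct Translator {
    std::string from;
    std::string to;
};

// Feeds the output of each step into the next one. Returns the final
// language, or an empty string if a step cannot read what the previous
// step produced.
std::string run_pipeline(const std::string& source,
                         const std::vector<Translator>& steps) {
    std::string current = source;
    for (const Translator& step : steps) {
        if (step.from != current) return "";
        current = step.to;
    }
    return current;
}
```

With the three original steps (`{"C++", "C"}`, `{"C", "x86 assembly"}`, `{"x86 assembly", "machine language"}`), `run_pipeline("C++", ...)` ends at `"machine language"`. A single `{"C++", "machine language"}` step reaches the same place, which is what skipping the middle steps amounts to in this model.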

Feel free to edit the following table to make it more historically accurate.

Someone else will pick up where you left off if you get tired.

SOURCE LANGUAGE  | LANGUAGE OF THE COMPILER | DESTINATION LANGUAGE
C++ version 1.0  | C                        | C
C++ version 2.0  | C++ version 1.0          | C
C++ version 3.0  | C++ version 2.0          | C
C++ version 4.0  | C++ version 3.0          | C
