221

I've never really understood why C++ needs a separate header file with the same functions as in the .cpp file. It makes creating classes and refactoring them very difficult, and it adds unnecessary files to the project. And then there is the problem with having to include header files, but having to explicitly check if it has already been included.

C++ was ratified in 1998, so why is it designed this way? What advantages does having a separate header file have?


Follow up question:

How does the compiler find the .cpp file with the code in it, when all I include is the .h file? Does it assume that the .cpp file has the same name as the .h file, or does it actually look through all the files in the directory tree?

13 Answers

173

Some people consider header files an advantage:

  • It is claimed that it enables/enforces/allows separation of interface and implementation -- but usually, this is not the case. Header files are full of implementation details (for example, member variables of a class have to be specified in the header, even though they're not part of the public interface), and functions can be, and often are, defined inline in the class definition in the header, again destroying this separation.
  • It is sometimes said to improve compile time because each translation unit can be processed independently. And yet C++ is probably the slowest language in existence when it comes to compile times. Part of the reason is the many repeated inclusions of the same header: a large number of headers are included by multiple translation units, so they have to be parsed multiple times.
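To make the first point concrete, here is a minimal sketch of what a typical class header ends up exposing. The `Account` class and all of its members are hypothetical names chosen for illustration; they are not from the answer.

```cpp
// account.h (hypothetical header, shown as one file for brevity)
#include <string>
#include <utility>

class Account {
public:
    explicit Account(std::string owner) : owner_(std::move(owner)) {}

    // Defined inside the class body, so the implementation is visible to
    // every translation unit that includes this header.
    void deposit(long cents) { balance_cents_ += cents; }
    long balance() const { return balance_cents_; }
    const std::string& owner() const { return owner_; }

private:
    // Private, yet still spelled out in the header: the compiler needs
    // them to know the size and layout of Account in calling code.
    std::string owner_;
    long balance_cents_ = 0;
};
```

Changing either private member means editing the header, which in turn forces every file that includes it to be recompiled, even though the public interface is unchanged.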

Ultimately, the header system is an artifact from the 70's when C was designed. Back then, computers had very little memory, and keeping the entire module in memory just wasn't an option. A compiler had to start reading the file at the top, and then proceed linearly through the source code. The header mechanism enables this. The compiler doesn't have to consider other translation units, it just has to read the code from top to bottom.

And C++ retained this system for backwards compatibility.

Today, it makes no sense. It is inefficient, error-prone and overcomplicated. There are far better ways to separate interface and implementation, if that was the goal.

However, one of the proposals for C++0x was to add a proper module system, allowing code to be compiled similarly to .NET or Java: into larger modules, all in one go and without headers. This proposal didn't make the cut in C++0x, but I believe it's still in the "we'd love to do this later" category. Perhaps in a TR2 or similar.

4
  • 15
    THIS is the best answer on the page. Thank you! Commented May 31, 2020 at 13:54
  • 21
    This answer should be the accepted one as it really explains why C++ was designed that way, and not "why you might want to separate"
    – SubMachine
    Commented Aug 12, 2020 at 13:51
  • 2
    I love this. Usability should always be placed at the forefront. I hope this is where C++ is heading.
    – kakyo
    Commented Apr 29, 2021 at 3:43
  • 7
    C++20: modules
    – Eljay
    Commented Oct 31, 2021 at 19:49
129

You seem to be asking about separating definitions from declarations, although there are other uses for header files.

The answer is that C++ doesn't "need" this. If you mark everything inline (which is automatic anyway for member functions defined in a class definition), then there is no need for the separation. You can just define everything in the header files.
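As a rough sketch of the "define everything in the header" style, assuming a hypothetical header named `shapes.h` (the file name, `rectangle_area`, and `Square` are all invented for illustration):

```cpp
// shapes.h (hypothetical header-only code; there is no matching .cpp file)
#ifndef SHAPES_H
#define SHAPES_H

// A free function defined in a header needs `inline`. Without it, including
// this header from two .cpp files gives the program two definitions of the
// same function, which typically fails at link time.
inline double rectangle_area(double width, double height) {
    return width * height;
}

class Square {
public:
    explicit Square(double side) : side_(side) {}

    // Member functions defined inside the class body are implicitly inline.
    double area() const { return rectangle_area(side_, side_); }

private:
    double side_;
};

#endif // SHAPES_H
```

The cost is the one listed in the reasons below: every file that includes the header recompiles these definitions, and any change to a function body triggers a rebuild of all of them.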

The reasons you might want to separate are:

  1. To improve build times.
  2. To link against code without having the source for the definitions.
  3. To avoid marking everything "inline".

If your more general question is, "why isn't C++ identical to Java?", then I have to ask, "why are you writing C++ instead of Java?" ;-p

More seriously, though, the reason is that the C++ compiler can't just reach into another translation unit and figure out how to use its symbols, in the way that javac can and does. The header file is needed to declare to the compiler what it can expect to be available at link time.

So #include is a straight textual substitution. If you define everything in header files, the preprocessor ends up creating an enormous copy and paste of every source file in your project, and feeding that into the compiler. The fact that the C++ standard was ratified in 1998 has nothing to do with this; the cause is that the compilation environment for C++ is based so closely on that of C.

Converting my comments to answer your follow-up question:

How does the compiler find the .cpp file with the code in it

It doesn't, at least not at the time it compiles the code that used the header file. The functions you're linking against don't even need to have been written yet, never mind the compiler knowing what .cpp file they'll be in. Everything the calling code needs to know at compile time is expressed in the function declaration. At link time you will provide a list of .o files, or static or dynamic libraries, and the header in effect is a promise that the definitions of the functions will be in there somewhere.
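The "promise" can be sketched in a single file, with comments marking where each part would normally live. The file names and the functions `triple` and `triple_plus_one` are hypothetical, chosen only for illustration.

```cpp
// --- normally in a header such as math_utils.h (hypothetical name) ---
// Declaration only: name, parameter types, return type. No body.
int triple(int x);

// --- normally in the calling .cpp file ---
// This compiles using nothing but the declaration above. The compiler
// emits a call to a symbol it has not seen defined and leaves the
// lookup to the linker.
int triple_plus_one(int x) {
    return triple(x) + 1;
}

// --- normally in math_utils.cpp, compiled separately ---
// In a real build this definition would live in another object file or
// library that is handed to the linker.
int triple(int x) {
    return 3 * x;
}
```

If the definition at the bottom were missing from everything passed to the linker, the calling code would still compile; the build would fail later with an unresolved symbol error.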

7
  • 4
    To add to "The reasons you might want to separate are:", I think the most important function of header files is to separate code structure design from implementation, because: A. When you get into really complicated structures that involve many objects, it is much easier to sift through header files and remember how they work together, supplemented by your header comments. B. When one person is taking care of defining all the object structure and someone else is taking care of implementation, it keeps things organized. Overall I think it makes complex code more readable. Commented Jul 12, 2012 at 16:19
  • The simplest way I can think of to describe the usefulness of separating header and .cpp files is that it separates interface from implementation, which truly helps for medium/big projects.
    – krishna
    Commented Mar 19, 2015 at 7:09
  • 14
    @AndresCanella No, it does not. It makes reading and maintaining not-your-own-code a nightmare. To fully understand what something does in the code you need to jump through 2n files instead of n files. This isn't just Big-O notation; 2n makes a lot of difference in comparison to just n. Commented Apr 10, 2016 at 12:37
  • 4
    I second that it's a lie that headers help. Check the Minix source, for example: it's so hard to follow where it starts, where control is passed, and where things are declared/defined. If it were built from separate dynamic modules, it would be digestible: you could make sense of one thing, then jump to a dependency module. Instead, you need to follow headers, and it makes reading any code written this way hell. In contrast, Node.js makes it clear what comes from where without any ifdefs, and you can easily identify where each thing came from.
    – Dmytro
    Commented Aug 16, 2016 at 18:01
  • 4
    "why are you writing C++ instead of [x]". We don't write C++ because we want to, we write C++ because we have to :P
    – Seb
    Commented Oct 29, 2021 at 0:26
118

C++ does it that way because C did it that way, so the real question is why did C do it that way? Wikipedia speaks a little to this.

Newer compiled languages (such as Java, C#) do not use forward declarations; identifiers are recognized automatically from source files and read directly from dynamic library symbols. This means header files are not needed.

3
  • 18
    +1 Hits the nail on the head. This really doesn't require a verbose explanation.
    – MSalters
    Commented Aug 20, 2009 at 14:01
  • 15
    It didn't hit my nail on the head :( I still have to look up why C++ has to use forward declarations and why it can't recognize identifiers from source files and read directly from dynamic library symbols, and why C++ did it that way just because C did it that way :p Commented Aug 1, 2015 at 0:11
  • 7
    And you are a better programmer for having done so @AlexanderTaylor :) Commented Aug 2, 2015 at 2:13
35

To my (limited - I'm not a C developer normally) understanding, this is rooted in C. Remember that C does not know what classes or namespaces are, it's just one long program. Also, functions have to be declared before you use them.

For example, the following should give a compiler error:

#include <stdio.h>

void SomeFunction() {
    SomeOtherFunction();  /* error: SomeOtherFunction is not declared yet */
}

void SomeOtherFunction() {
    printf("What?");
}

The error should be that "SomeOtherFunction is not declared" because you call it before its declaration. One way of fixing this is by moving SomeOtherFunction above SomeFunction. Another approach is to declare the function's signature first:

#include <stdio.h>

/* Declaration: announces the signature before the first call. */
void SomeOtherFunction();

void SomeFunction() {
    SomeOtherFunction();
}

void SomeOtherFunction() {
    printf("What?");
}

This lets the compiler know: somewhere in the code, there is a function called SomeOtherFunction that returns void and does not take any parameters. So if you encounter code that tries to call SomeOtherFunction, do not panic; just go looking for it.

Now, imagine you have SomeFunction and SomeOtherFunction in two different .c files. You then have to #include "SomeOther.c" in Some.c. Now, add some "private" functions to SomeOther.c. As C does not know private functions, that function would be available in Some.c as well.

This is where .h files come in: they specify all the functions (and variables) that you want to 'export' from a .c file so that they can be accessed in other .c files. That way, you gain something like a public/private scope. Also, you can give this .h file to other people without having to share your source code - .h files work against compiled .lib files as well.
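A minimal sketch of that public/private split, written so it compiles as C++ (the same idea applies in C). The names `next_id`, `bump`, and `last_id` are hypothetical, and the use of `static` to keep helpers local to one file is an addition here, not something the answer mentions.

```cpp
// --- normally in SomeOther.h: the only part other files get to see ---
int next_id();

// --- normally in SomeOther.c / SomeOther.cpp ---
// `static` at file scope gives these names internal linkage: they are not
// listed in the header and cannot be linked against from other files.
static int last_id = 0;

static int bump(int value) {
    return value + 1;
}

// The "exported" function, matching the declaration in the header.
int next_id() {
    last_id = bump(last_id);
    return last_id;
}
```

Code in another file that includes only the header can call `next_id()`, but has no declaration for `bump` or `last_id` and so cannot use them.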

So the main reason is really for convenience, for source code protection and to have a bit of decoupling between the parts of your application.

That was C though. C++ introduced classes and private/public modifiers, so while you could still ask if they are needed, C++ AFAIK still requires declaration of functions before using them. Also, many C++ developers are or were C developers as well and carried their concepts and habits over to C++ - why change what isn't broken?

10
  • 5
    Why can't the compiler run through the code and find all the function definitions? It seems like something that would be pretty easy to program into the compiler.
    – Marius
    Commented Aug 20, 2009 at 13:16
  • 5
    If you have the source, which you often don't have. Compiled C++ is effectively machine code with just enough additional information to load and link the code. Then, you point the CPU at the entry point, and let it run. This is fundamentally different from Java or C#, where the code is compiled into an intermediary bytecode containing metadata on its contents.
    – DevSolar
    Commented Aug 20, 2009 at 13:27
  • 3
    Yup - compiling on a 16 bitter with tape mass storage is non-trivial.
    – MSalters
    Commented Aug 20, 2009 at 14:10
  • 2
    @Puddle I don't think that's the true reason, because in the 70's when C was developed, sharing the source code was the norm rather than the exception. I believe it's because random access to files wasn't easily possible - back then, using magnetic tapes was common, and so the language can be compiled by only ever going forward through the files, never backwards or jumping around. .h files seem like a great way to move declarations forward without introducing an even bigger mess of conflicting implementations. Commented Jun 11, 2018 at 14:46
  • 1
    @MichaelStum but why then? why would they keep it in? language is about understanding the purpose of what the programmer is writing. everybody can understand how to create headers based upon all the classes. it's a meaningless task if it literally does nothing but helps c++ compile. we've moved on and could make that automated if it does nothing else. if it serves no other purpose...
    – Puddle
    Commented Jun 11, 2018 at 14:50
16

First advantage: If you don't have header files, you would have to include source files in other source files. This would cause the including files to be compiled again when the included file changes.

Second advantage: It allows sharing the interfaces without sharing the code between different units (different developers, teams, companies etc..)

11
  • 2
    Are you implying that, e.g. in C#, 'you would have to include source files in other source files'? Because obviously you don't. For the second advantage, I think that's too language dependent: you won't use .h files in e.g. Delphi
    – Vlagged
    Commented Aug 20, 2009 at 13:01
  • You have to recompile the entire project anyways, so does the first advantage really count?
    – Marius
    Commented Aug 20, 2009 at 13:06
  • OK, but I don't think that's a language feature. It is more something practical to deal with the C declaration-before-definition "problem". It is like someone famous saying "that's not a bug, that's a feature" :)
    – neuro
    Commented Aug 20, 2009 at 13:14
  • @Marius: Yes, it really counts. Linking the whole project is different from compiling and linking the whole project. And as the number of files in the project increases, compiling all of them gets really annoying. @Vlagged: You are right, but I didn't compare C++ with another language. I compared using only source files vs. using source and header files.
    – zweihander
    Commented Aug 20, 2009 at 13:17
  • C# doesn't include source files in others, but you still have to reference the modules - and that makes the compiler fetch the source files (or reflect into the binary) to parse the symbols that your code uses.
    – gbjbaanb
    Commented Aug 20, 2009 at 13:18
7

The need for header files results from the compiler's limited knowledge of the type information for functions and/or variables in other modules. The compiled program or library does not include the type information the compiler requires to bind to any objects defined in other compilation units.

In order to compensate for this limitation, C and C++ allow for declarations and these declarations can be included into modules that use them with the help of the preprocessor's #include directive.

Languages like Java or C# on the other hand include the information necessary for binding in the compiler's output (class-file or assembly). Hence, there is no longer a need for maintaining standalone declarations to be included by clients of a module.

The reason for the binding information not being included in the compiler output is simple: it is not needed at runtime (any type checking occurs at compile time). It would just waste space. Remember that C/C++ come from a time where the size of an executable or library did matter quite a bit.

1
6

Well, C++ was ratified in 1998, but it had been in use for a lot longer than that, and the ratification was primarily setting down current usage rather than imposing structure. And since C++ was based on C, and C has header files, C++ has them too.

The main reason for header files is to enable separate compilation of files, and minimize dependencies.

Say I have foo.cpp, and I want to use code from the bar.h/bar.cpp files.

I can #include "bar.h" in foo.cpp, and then write and compile foo.cpp even if bar.cpp doesn't exist. The header file acts as a promise to the compiler that the classes/functions in bar.h will exist at link time, and it has everything it needs to know already.

Of course, if the functions in bar.h don't have bodies when I try to link my program, then it won't link and I'll get an error.

A side-effect is that you can give users a header file without revealing your source code.

Another is that if you change the implementation of your code in the *.cpp file, but do not change the header at all, you only need to compile the *.cpp file instead of everything that uses it. Of course, if you put a lot of implementation into the header file, then this becomes less useful.

5

C++ was designed to add modern programming language features to the C infrastructure, without unnecessarily changing anything about C that wasn't specifically about the language itself.

Yes, at this point (10 years after the first C++ standard and 20 years after it began seriously growing in usage) it is easy to ask why doesn't it have a proper module system. Obviously any new language being designed today would not work like C++. But that isn't the point of C++.

The point of C++ is to be evolutionary, a smooth continuation of existing practise, only adding new capabilities without (too often) breaking things that work adequately for its user community.

This means that it makes some things harder (especially for people starting a new project), and some things easier (especially for those maintaining existing code) than other languages would do.

So rather than expecting C++ to turn into C# (which would be pointless as we already have C#), why not just pick the right tool for the job? Myself, I endeavour to write significant chunks of new functionality in a modern language (I happen to use C#), and I have a large amount of existing C++ that I am keeping in C++ because there would be no real value in re-writing it all. They integrate very nicely anyway, so it's largely painless.

2
  • 2
    How do you integrate C# and C++? Through COM? Commented Aug 20, 2009 at 14:23
  • 2
    There are three main ways, the "best" depends on your existing code. I've used all three. The one I use the most is COM because my existing code was already designed around it, so it's practically seamless, works very well for me. In some odd places I use C++/CLI which gives incredibly smooth integration for any situation where you don't already have COM interfaces (and you may prefer it to using existing COM interfaces even if you do have them). Finally there is p/invoke which basically lets you call any C-like function exposed from a DLL, so lets you directly call any Win32 API from C#. Commented Aug 20, 2009 at 15:58
3

It doesn't need a separate header file with the same functions as in main. It only needs it if you develop an application using multiple code files and if you use a function that was not previously declared.

It's really a scope problem.

2

C++ was ratified in 1998, so why is it designed this way? What advantages does having a separate header file have?

Actually, header files become very useful when examining a program for the first time: checking out the header files (using only a text editor) gives you an overview of the architecture of the program, unlike other languages where you have to use sophisticated tools to view classes and their member functions.

2

If you want the compiler to find symbols defined in other files automatically, you need to force the programmer to put those files in predefined locations (the way Java's package structure determines the folder structure of the project). I prefer header files. You would also need either the sources of the libraries you use or some uniform way to put the information the compiler needs into binaries.

1

I think the real (historical) reason behind header files was making life easier for compiler developers... but then, header files do give advantages.
Check this previous post for more discussions...

1

Well, you can develop C++ perfectly well without header files. In fact, some libraries that use templates intensively do not use the header/code file paradigm (see Boost). But in C/C++ you cannot use something that is not declared. One practical way to deal with that is to use header files. Plus, you gain the advantage of sharing the interface without sharing the code/implementation. And I think this was not envisioned by the C creators: when you use shared header files you have to use the famous:

#ifndef MY_HEADER_SWEET_GUARDIAN
#define MY_HEADER_SWEET_GUARDIAN

// [...]
// my header
// [...]

#endif // MY_HEADER_SWEET_GUARDIAN

that is not really a language feature but a practical way to deal with multiple inclusion.
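To see what the guard does, here is a small sketch that pastes the body of a hypothetical `point.h` twice into one file, which is what the preprocessor effectively produces when two headers both include it. `POINT_H`, `Point`, and `sum_coords` are invented names.

```cpp
// First inclusion of the hypothetical "point.h": POINT_H is not defined
// yet, so the body is compiled and the macro gets defined.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif // POINT_H

// Second inclusion, for example pulled in indirectly by another header:
// POINT_H is already defined, so the preprocessor skips the body and
// Point is not defined a second time.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif // POINT_H

// Code that uses the type sees exactly one definition.
int sum_coords(Point p) {
    return p.x + p.y;
}
```

Without the `#ifndef` lines the second copy would redefine `Point` and the file would not compile. Many compilers also accept `#pragma once` at the top of a header as a non-standard alternative.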

So, I think that when C was created, the problems with forward declarations were underestimated, and now, when using a high-level language like C++, we have to deal with this sort of thing.

Another burden for us poor C++ users ...
