56

I know that virtual functions have the overhead of an extra indirection when calling a method. But I guess that with modern processor speeds it is almost negligible.

  1. Is there any particular reason why all functions in C++ are not virtual as in Java?
  2. From my knowledge, declaring a function virtual in the base class is both necessary and sufficient. But when I write a parent class, I might not know in advance which methods will get overridden. Does that mean that while writing a child class someone would have to edit the parent class? That sounds inconvenient and is sometimes not even possible. (The sketch below shows the mechanics.)
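For readers who want the mechanics spelled out, here is a minimal sketch (all names are hypothetical): only the function declared virtual in the parent is dispatched dynamically through a base reference; a non-virtual function with the same signature in the child merely hides the parent's version.

    #include <iostream>

    struct Base {
        virtual ~Base() = default;
        virtual void greet() const { std::cout << "Base::greet\n"; }  // virtual: overridable
        void name() const { std::cout << "Base::name\n"; }            // non-virtual
    };

    struct Derived : Base {
        void greet() const override { std::cout << "Derived::greet\n"; }  // overrides
        void name() const { std::cout << "Derived::name\n"; }             // merely hides
    };

    int main() {
        Derived d;
        const Base& b = d;
        b.greet();  // Derived::greet - resolved at runtime via the vtable
        b.name();   // Base::name    - resolved at compile time from the static type
    }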

Update:
Summarizing from Jon Skeet's answer below:

It's a trade-off: explicitly making someone realize that they are inheriting functionality (which has potential risks of its own; check Jon's response) and potentially small performance gains, in exchange for less flexibility, more code changes, and a steeper learning curve.

Other reasons from different answers:

Virtual functions generally cannot be inlined, because the call target is only resolved at runtime while inlining must happen at compile time. This has a performance impact when you expect your functions to benefit from inlining.

There might be potentially other reasons, and I would love to know and summarize them.

3
  • It is also possible to inline functions which are not virtual, which allows for lots of compiler optimizations that wouldn't be available in cases where the function is defined as virtual. Commented Jul 8, 2011 at 1:50
  • Hi Thoman, can you explain why it won't be possible to inline virtual functions? Is it a limitation of the available compilers, or is there a theoretical blocker? How does the JVM optimize it? Commented Jul 8, 2011 at 3:02
  • @codeObserver In virtual functions the decision about what method to call is made at runtime. With inline functions the method's body is compiled into the caller, a decision that has to be made at compile time. Commented Jul 8, 2011 at 18:56

11 Answers

78

There are good reasons for controlling which methods are virtual beyond performance. While I don't actually make most of my methods final in Java, I probably should... unless a method is designed to be overridden, it probably shouldn't be virtual IMO.

Designing for inheritance can be tricky - in particular it means you need to document far more about what might call it and what it might call. Imagine if you have two virtual methods, and one calls the other - that must be documented, otherwise someone could override the "called" method with an implementation which calls the "calling" method, unwittingly creating a stack overflow (or infinite loop if there's tail call optimization). At that point you've then got less flexibility in your implementation - you can't switch it round at a later date.
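To make the hazard concrete, here is a minimal sketch in C++ (class and method names are hypothetical): the base class never documented that full() is implemented in terms of brief(), so an innocent-looking override creates unbounded recursion.

    #include <string>

    struct Formatter {
        virtual ~Formatter() = default;
        virtual std::string full(const std::string& s) { return brief(s) + "!"; }  // calls brief()
        virtual std::string brief(const std::string& s) { return s; }
    };

    struct LoudFormatter : Formatter {
        // Not knowing that full() calls brief(), the author overrides brief() in
        // terms of full(). full() virtually dispatches back here: stack overflow.
        std::string brief(const std::string& s) override { return full(s); }
    };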

Note that C# is a similar language to Java in various ways, but chose to make methods non-virtual by default. Some other people aren't keen on this, but I certainly welcome it - and I'd actually prefer that classes were uninheritable by default too.

Basically, it comes down to this advice from Josh Bloch: design for inheritance or prohibit it.

20
  • I haven't had much experience with virtual functions, but couldn't you cause the same situation using only one virtual function plus a different, non-virtual function in a subclass, as long as you call the non-virtual function on an object of the subclass itself, without any casting to the superclass? I suppose my knowledge of the specifics of virtual dispatch is still pretty vague.
    – JAB
    Commented Jul 7, 2011 at 14:26
  • But then C++ lets you override virtual functions and have them called (sometimes, depending). It's that decision which seems problematic to me. Commented Jul 8, 2011 at 8:22
  • 4
    +1 the principle to follow for everything is: always make the common case the default. The designers of Java (very reasonably) thought that virtual methods should be the default; however, that turned out to be a mistake (see "Effective Java" for more info). C# made the correct choice from a design standpoint; however, non-virtual-by-default methods make unit testing an extreme pain because classes can no longer be mocked as they can in Java. C# needs facilities to make unit testing reasonable (mocking classes, access to private methods for testing classes, etc.) Commented Jul 8, 2011 at 8:51
  • 1
    @Jon Skeet, I suppose it might be more correct to say that they are hidden. However, it still seems ill-advised to allow a subclass to "hide" a superclass's methods like that. Commented Jul 8, 2011 at 10:39
  • 1
    @Jon Skeet, I guess I was thinking primarily of functions which have the same signature (or close enough) that they could be virtual overrides. It's far too easy to try to override a method which isn't virtual. But I see that there could be uses of it. Commented Jul 8, 2011 at 11:24
54
  1. One of the main C++ principles is: you only pay for what you use ("zero overhead principle"). If you don't need the dynamic dispatch mechanism, you shouldn't pay for its overhead.

  2. As the author of the base class, you should decide which methods should be allowed to be overridden. If you're writing both classes, go ahead and refactor what you need. But it works this way because there has to be a way for the author of the base class to control its use. (A small sketch follows below.)
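As an illustration of that control, here is a minimal sketch using the C++11 override and final specifiers (all names hypothetical): the base author decides, per member function, whether and how far it may be customized.

    struct Widget {                          // hypothetical base class
        virtual ~Widget() = default;
        virtual void draw() {}               // designed to be overridden
        virtual void validate() final {}     // virtual, but sealed against further overriding
        void log() {}                        // non-virtual: not a customization point
    };

    struct Button : Widget {
        void draw() override {}              // fine: draw() is a designated extension point
        // void validate() override {}       // error: validate() is final
        void log() {}                        // compiles, but merely hides Widget::log
    };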

2
  • Correct me if I'm wrong, but I believe your point #2 is not correct. Making methods non-virtual in the base class does NOT prevent them from being overridden, so the author really can't decide which methods should be allowed to be overridden. It only prevents them from being called polymorphically (i.e. via a base-class pointer). But they can still be overridden and can still be called when called directly on an instance of the derived class (or a pointer to the derived class). Commented Aug 29, 2013 at 22:08
  • 3
    @DanielGoldfarb, non-virtual member functions can't be overridden, period. They can, however, be hidden - but that's a different thing. My point is that preventing overriding is yet another aspect of encapsulation. Hiding a member function will not change the behavior of the base class, and will not add dependencies that might limit the ability to change the base class in the future.
    – Eran
    Commented Aug 30, 2013 at 8:27
32

But I guess that with modern processor speeds it is almost negligible.

This assumption is wrong, and, I guess, the main reason for this decision.

Consider the case of inlining. C++'s sort function performs much faster than C's otherwise similar qsort in some scenarios because it can inline its comparator argument, while C cannot (due to the use of function pointers). In extreme cases, this can mean performance differences of as much as 700% (Scott Meyers, Effective STL).
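A minimal sketch of the two calls side by side (the 700% figure above comes from the cited book, not from this snippet): qsort receives an opaque function pointer, while std::sort is instantiated with the comparator's concrete type, so the trivial comparison can be inlined into the sorting loop.

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    int compare_ints(const void* a, const void* b) {
        const int x = *static_cast<const int*>(a);
        const int y = *static_cast<const int*>(b);
        return (x > y) - (x < y);   // avoids the overflow of x - y
    }

    void sort_both_ways(std::vector<int>& v) {
        // C style: the comparator is called through a function pointer,
        // an out-of-line call the compiler typically cannot inline.
        std::qsort(v.data(), v.size(), sizeof(int), compare_ints);

        // C++ style: the lambda's type is baked into the template
        // instantiation, so the comparison is usually inlined away entirely.
        std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
    }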

The same would be true for virtual functions. We’ve had similar discussions before; for instance, Is there any reason to use C++ instead of C, Perl, Python, etc?

8
  • 2
    Yes, basically a virtual function cannot be inlined, nor can its argument passing be optimized. Commented Jul 7, 2011 at 6:40
  • 1
    Even this statement is starting to be untrue... gcc, for instance, is capable of inlining through a function pointer. Presumably, this can be extended to a virtual method. Commented Jul 7, 2011 at 6:43
  • 1
    gcc does inline virtual functions, provided that you're calling the function on an object of known dynamic type (because the virtual mechanism doesn't need to be used anyway). If you're sorting a container of Base then the dynamic type is known: if Base* then it isn't and you can start worrying about a performance hammering. With function pointers it's similar - if a call to qsort is inlined, then DFA might prove the value of the function pointer, in which case the call could be inlined, although I've never looked into how successful gcc is at doing that. Commented Jul 7, 2011 at 7:25
  • @Steve This is correct. I’m even surprised that the advantage of functors over function pointers still seems to hold, since this is apparently such an obvious optimisation. In fact, I suspect that e.g. calls to qsort are rarely inlined. sort, on the other hand, is a template, so for a given comparator its type is known inside sort even without inlining. I suspect that the same is true for the inlining of virtual function calls on known dynamic types. Commented Jul 7, 2011 at 8:18
  • 1
    @Konrad: qsort is sometimes inhibited from inlining on account of being in a different TU, whereas sort is always available in the TU. It'd be slightly interesting to take implementations that do link-time optimization, and see whether they are any better or worse at inlining qsort and then inlining the comparator, than they are at inlining sort and its comparator (when sort is passed a function pointer rather than a functor object of user-defined type, to keep the comparison fair). Commented Jul 7, 2011 at 8:25
14

Most answers deal with the overhead of virtual functions, but there are other reasons not to make every function in a class virtual, such as the fact that doing so changes the class from standard-layout to, well, non-standard-layout, and that can be a problem if you need to serialize binary data. C# solves that differently, for example, by making structs a separate family of types from classes.

From the design point of view, every public function establishes a contract between your type and the users of the type, and every virtual function (public or not) establishes a different contract with the classes that extend your type. The greater the number of such contracts that you sign, the less room for change you have. As a matter of fact, there are quite a few people, including some well-known writers, who argue that the public interface should never contain virtual functions, as the commitments you make to your clients might differ from the commitments you require from your extensions. That is, the public interface shows what you do for your clients, while the virtual interface shows how others might help you do it.
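That separation is usually called the non-virtual interface (NVI) idiom; here is a minimal sketch with hypothetical names: the public contract stays non-virtual and under the base class's control, while the extension points are private virtuals.

    class Report {                    // hypothetical class
    public:
        void print() {                // public contract: fixed, non-virtual
            header();                 // invariant part the base class controls
            body();                   // dispatched to the derived class's override
        }
        virtual ~Report() = default;
    private:
        void header() { /* fixed framing, never overridable */ }
        virtual void body() = 0;      // "how others might help you do it"
    };

    class SalesReport : public Report {
    private:
        void body() override { /* derived classes customize only this step */ }
    };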

Another effect of virtual functions is that they always get dispatched to the final overrider (unless you explicitly qualify the call), and that means that any function that is needed to maintain your invariants (think of the state of the private variables) should not be virtual: if a class extends it, it will have to either make an explicit qualified call back to the parent or else it will break the invariants at your level.

This is similar to the example of the infinite loop/stack overflow that @Jon Skeet mentioned, just in a different way: you have to document in each function whether it accesses any private attributes, so that extensions can ensure that the function is called at the right time. And that in turn means that you are breaking encapsulation and you have a leaky abstraction: your internal details are now part of the interface (documentation + requirements on your extensions), and you cannot modify them as you wish.

Then there is performance... there will be an impact on performance, but in most cases that is overrated, and it could be argued that only in the few cases where performance is critical would you fall back and declare the functions non-virtual. Then again, that might not be simple in an already-shipped product, since the two interfaces (public + extensions) are already bound.

8

You forget one thing. The overhead is also in memory: you add a virtual table per class and a pointer to that table in each object. Now if a class is expected to have a significant number of instances, that is not negligible: a million instances adds about 4 megabytes of pointers (assuming 4-byte pointers). I agree that for a simple application this is not much, but for real-time devices such as routers this counts.
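A quick sketch of the per-object cost (exact sizes are implementation-defined; the figures in the comments assume a typical 64-bit ABI):

    #include <iostream>

    struct Plain       { int x; };                     // no hidden members
    struct Polymorphic { int x; virtual void f() {} }; // gains a vptr per object

    int main() {
        // Typically prints "4 16" on a common 64-bit ABI: a 4-byte int versus
        // a 4-byte int plus an 8-byte vptr, padded up to alignment.
        std::cout << sizeof(Plain) << ' ' << sizeof(Polymorphic) << '\n';
    }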

1
  • I'm working on embedded devices, some of which have something like 2 KB of RAM. On those you really want to avoid the pointer overhead, as well as the extra time cost of calling methods indirectly through an extra pointer. Good point!
    – Droggl
    Commented Apr 17, 2013 at 8:24
6

I'm rather late to the party here, so I'll add one thing that I haven't noticed covered in other answers, and summarise quickly...

  • Usability in shared memory: a typical implementation of virtual dispatch has a pointer to a class-specific virtual dispatch table in each object. The addresses in these pointers are specific to the process that creates them, which means a multi-process system accessing objects in shared memory can't perform virtual dispatch on an object created by another process! That's an unacceptable limitation given shared memory's importance in high-performance multi-process systems.

  • Encapsulation: the ability of a class designer to control the members accessed by client code, ensuring class semantics and invariants are maintained. For example, if you derive from std::string (I may get a few comments for daring to suggest that ;-P) then you can use all the normal insert / erase / append operations and be sure that - provided you don't do anything that's always undefined behaviour for std::string like pass bad position values to functions - the std::string data will be sound. Someone checking or maintaining your code doesn't have to check if you've changed the meaning of those operations. For a class, encapsulation ensures freedom to later modify the implementation without breaking client code. Another perspective on the same statement: client code can use the class any way it likes without being sensitive to the implementation details. If any function can be changed in a derived class, that whole encapsulation mechanism is simply blown away.

    • Hidden dependencies: when you know neither what other functions are dependent on the one you're overriding, nor that the function was designed to be overridden, then you can't reason about the impact of your change. For example, you think "I've always wanted this", and change std::string::operator[]() and at() to consider negative values (after a type-cast to signed) to be offsets backwards from the end of the string. But, perhaps some other function was using at() as a kind of assertion that an index was valid - knowing it'll throw otherwise - before attempting an insertion or deletion... that code might go from throwing in a Standard-specified way to having undefined (but likely lethal) behaviour.
    • Documentation: by making a function virtual, you're documenting that it is an intended point of customisation, and part of the API for client code to use.

  • Inlining - code size & CPU usage: virtual dispatch complicates the compiler's job of working out when to inline function calls, and can therefore produce worse code in terms of both space/bloat and CPU usage.

  • Indirection during calls: even if an out-of-line call is being made either way, there's a small performance cost for virtual dispatch that may be significant when calling trivially simple functions repeatedly in performance critical systems. (You have to read the per-object pointer to the virtual dispatch table, then the virtual dispatch table entry itself - means the VDT pages are consuming cache too.)

  • Memory usage: the per-object pointers to virtual dispatch tables may represent significant wasted memory, especially for arrays of small objects. This means fewer objects fit in cache, which can have a significant performance impact.

  • Memory layout: it's essential for performance, and highly convenient for interoperability, that C++ can define classes with the exact memory layout of member data specified by network or data standards and by various libraries and protocols. That data often comes from outside your C++ program, and may be generated in another language. Such communications and storage protocols won't have "gaps" for pointers to virtual dispatch tables, and as discussed earlier - even if they did, and the compiler somehow let you efficiently inject the correct pointers for your process over incoming data, that would frustrate multi-process access to the data. Crude-but-practical pointer/size based serialisation/deserialisation/comms code would also be made more complicated and potentially slower. (A small sketch follows after this list.)
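A minimal sketch of that layout concern (type names hypothetical): adding a single virtual function silently inserts a vptr and breaks the match with any fixed wire format.

    #include <cstdint>
    #include <type_traits>

    struct PacketHeader {             // matches a fixed on-the-wire format
        std::uint32_t length;
        std::uint16_t type;
        std::uint16_t flags;
    };

    struct PolymorphicHeader : PacketHeader {
        virtual void dump() {}        // the hidden vptr changes the object layout
    };

    static_assert(std::is_standard_layout<PacketHeader>::value,
                  "safe to memcpy to/from a raw byte buffer");
    static_assert(!std::is_standard_layout<PolymorphicHeader>::value,
                  "no longer matches the wire format");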

5

Pay per use (in Bjarne Stroustrup's words).

0
3

Seems like this question might have some answers: Virtual functions should not be used excessively - Why?. In my opinion, the one thing that stands out is that it just adds more complexity in terms of knowing what can be done with inheritance.

2

Yes, it's because of performance overhead. Virtual methods are called using virtual tables and indirection.

In Java all methods are virtual and the overhead is also present. But, contrary to C++, the JIT compiler profiles the code at run time and can inline those methods that never actually need dynamic dispatch. So the JVM knows where virtual dispatch is really needed and where it isn't, freeing you from making the decision on your own.
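A rough, hand-written C++ analogue of that trick (hypothetical types; a JIT applies it automatically and can undo it if the guess goes stale): guess the common dynamic type, inline that path, and fall back to a normal virtual call otherwise.

    #include <typeinfo>

    struct Shape  { virtual ~Shape() = default; virtual double area() const = 0; };
    struct Circle : Shape {
        double r = 1.0;
        double area() const override { return 3.14159265 * r * r; }
    };

    double area_fast(const Shape& s) {
        if (typeid(s) == typeid(Circle)) {              // cheap type guard
            const Circle& c = static_cast<const Circle&>(s);
            return 3.14159265 * c.r * c.r;              // "inlined" fast path
        }
        return s.area();                                // generic virtual dispatch
    }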

8
  • +1: This is what I was going to say. The JVM can make decisions at runtime that a compiler cannot make statically. Commented Jul 7, 2011 at 7:18
  • 2
    In fact a JIT can do even better, it can inline a method that does use the property, but do a fast type-check for a type that very commonly occurs. It makes the virtual call if the type-check fails. So the code for obj.foo() ends up looking a bit like if (obj.getClass() == Class.forName("BaseClass")) { /* inlined code from BaseClass.foo() */ } else { obj.foo(); };. Except that of course the call to getClass is inlined, to just grab a pointer out of the object, and the result of the call to forName is a pointer to a class object, and that value is inlined into the code too. Commented Jul 7, 2011 at 7:34
  • @Steve Jessop: I do recall an article (I can't remember whether it was by an Azul engineer or by someone directly responsible for the JVM) where the issue was discussed, but I never saw anything with such a detailed example. Could you post some reference where I could read more?
    – Rekin
    Commented Jul 7, 2011 at 7:52
  • 1
    @Steve Jessop: So, in theory a JIT can do that optimization. But the same check can in theory be inserted by a C++ compiler, especially with PGO. So I'd disagree with "a JIT can do even better".
    – MSalters
    Commented Jul 7, 2011 at 8:10
  • 1
    As you imply, there's no reason in principle why a C++ compiler can't emit self-modifying code that plays every trick in the JIT book to do runtime optimization. Actually I don't know of any C++ compiler that does this, and it is useful because it could be that depending on input, the dynamic type is 99.9% Foo or 99.9% Bar, and the fact that (some) JITs continue optimizing after runtime starts is what lets them optimize this run of the program, happening now on the user's machine. Profile-guided C++ compilers in my experience only optimize some standard development run of the program. Commented Jul 7, 2011 at 8:17
1

The issue is that while Java compiles to bytecode that runs on a virtual machine, no such guarantee can be made for C++. It's common to use C++ as a more organized replacement for C, and C has a 1:1 translation to assembly.

If you consider that 9 out of 10 microprocessors in the world are not in a personal computer or a smartphone, and that a lot of those processors need this kind of low-level access, you'll see the issue.

C++ was designed to avoid that hidden dereferencing when you don't need it, thus keeping that 1:1 nature. Some of the first C++ implementations actually had an intermediate step of translating to C before running the result through a C-to-assembly compiler.

4
  • 1
    C has a 1:1 translation to assembly? That would be a surprise. Which CPU has switch(foo)? Hell, which CPU has for? Most use a compare-and-branch instruction.
    – MSalters
    Commented Jul 7, 2011 at 8:08
  • maybe 1:1 is a bad way to put it... but there is typically a direct translation between C constructs and the resultant assembly code, to the point where this was a main design feature of C at the time it was made. C++ was designed to maintain this relationship whenever possible.
    – Ape-inago
    Commented Jul 7, 2011 at 9:22
  • 1
    Whether it's exceptions, templates or virtual functions, none of the points where C++ differs most noticeably from C are at all related to CPU instructions. Not to mention the Standard Library. No, what you observe is really the consequence of a VM-less language. There's just less hidden code as a result.
    – MSalters
    Commented Jul 7, 2011 at 10:14
  • The point I was trying to make was that if you don't use those bits of C++ (templates, inheritance, etc.), much of the resultant assembly turns out to be very similar to C's; the only difference ends up being an implicit structure being passed into member functions, whereas most complicated C code just passes the struct around explicitly. For developers who use C because it is easy to translate into various forms of assembly, C++ is an easy transition because of its "if you don't use it, you don't pay for it" nature. This was intentional, to keep it compatible with C.
    – Ape-inago
    Commented Jul 10, 2011 at 8:26
-5

Java method calls are far more efficient than C++'s due to runtime optimization.

What we need is to compile C++ into bytecode and run it on the JVM.

8
  • 3
    Lol... is this for real? If you look at performance across languages, in most cases good C++ performs better than good Java equivalents. Commented Jul 7, 2011 at 7:32
  • 5
    @spraff: Java can't do any optimization that self-modifying binaries from C++ can't. But JVMs do perform optimizations that no C++ compiler actually does, because JVMs can and do optimize based on profile data from this exact run of the program, whereas no C++ implementation that I know of does that, PGO works off a run that some developer did back at HQ. JVMs use this advantage to somewhat compensate for the C++ compiler's advantages. It'd be interesting to see if a best-of-both C++ implementation were possible, but it certainly would not be as simple as compiling C++ to Java bytecode. Commented Jul 7, 2011 at 8:34
  • 1
    And in fact C++ does already get a little bit of the same kind of benefit, since some CPUs use profile data from this exact run of the program to do e.g. branch prediction. But this doesn't come through the efforts of the C++ compiler. Commented Jul 7, 2011 at 8:36
  • 6
    Let's just be clear that there might be a compiler advantage or a toolchain advantage, but it's not a language advantage. Profiling optimisers for C++ do exist, but people don't bother with them and/or they're commercial and closed. The cynic in me says that if Java does this more willingly, it's because it has to compensate for being inherently slower in the first place. And let's not overlook the fact that a fast execution with added optimisation delay might be slower than a naive execution!
    – spraff
    Commented Jul 7, 2011 at 8:58
  • 1
    @spraff: sure, it depends whether you're talking about properties of the language specification, or properties of actual implementations that exist. It's not as if C++ optimizer-writers think to themselves "ah, this is fast enough, let's get down the pub", there's a real difference between PGO based on a profile of a single run prior to compile-time, and the code mutation that modern JITs do using profile data from this exact run. And agreed, sometimes optimization is counter-productive. It takes time to do it, and there may be pathological cases that make the "optimized" code slower. Commented Jul 7, 2011 at 12:28
