I read in the standards n4296 (Draft) § 1.8 page 7:

An object is a region of storage. [ Note: A function is not an object, regardless of whether or not it occupies storage in the way that objects do. —end note ]

I spent some days on the net looking for a good reason for such exclusion, with no luck. Maybe because I do not fully understand objects. So:

  1. Why is a function not an object? How does it differ?
  2. And does this have any relation with the functors (function objects)?
  • 18
    Language definers get to pick the meanings of terms. For C++, function and object mean strictly different things because the designers chose to define them that way. In other languages, this may not be so.
    – Gene
    Commented May 15, 2017 at 3:56
  • C and C++ were designed to be able to run on both von Neumann architectures (where instructions are stored in RAM) and Harvard architectures (where instructions are NOT stored in RAM; well, sometimes they are, but Harvard machines have two separate memory access paths, and instructions (functions) can only be accessed using instruction-oriented instructions like call, goto, branch, etc., and cannot be accessed using data-oriented instructions like load, store, etc.).
    – slebetman
    Commented May 15, 2017 at 14:06

3 Answers

A lot of the difference comes down to pointers and addressing. In C++¹ pointers to functions and pointers to objects are strictly separate kinds of things.

C++ requires that you can convert a pointer to any object type into a pointer to void, then convert it back to the original type, and the result will be equal to the pointer you started with². In other words, however the implementation does it, the conversion from pointer-to-object-type to pointer-to-void has to be lossless: whatever information the original T* contained can be recreated when you convert from void * back to T*.
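As a minimal sketch of that guarantee (the helper names and the `Point` type below are made up for illustration, not taken from the standard), an object pointer can be pushed through `void *` and compared with the original:

```cpp
#include <cassert>

// Converts an object pointer to void* and back, then reports whether the
// restored pointer compares equal to the original one.
template <typename T>
bool round_trips_through_void(T* original) {
    void* erased = original;                // implicit T* -> void*
    T* restored = static_cast<T*>(erased);  // explicit void* -> T*
    return restored == original;
}

struct Point { double x; double y; };  // hypothetical example type

// Checks the round trip for a built-in type and for a class type.
void check_object_pointers() {
    int number = 42;
    Point point{1.0, 2.0};
    assert(round_trips_through_void(&number));
    assert(round_trips_through_void(&point));
}
```

Calling `check_object_pointers()` from `main` should pass silently on any conforming implementation, since both variables are objects.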

That's not true with a pointer to a function though--if you take a pointer to a function, convert it to void *, and then convert it back to a pointer to a function, you may lose some information in the process. You might not get back the original pointer, and dereferencing what you do get back gives you undefined behavior (in short, don't do that).

For what it's worth, you can, however, convert a pointer to one function to a pointer to a different type of function, then convert that result back to the original type, and you're guaranteed that the result is the same as you started with.
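A small sketch of that function-pointer round trip, assuming a made-up function `add` and two illustrative pointer aliases. The intermediate pointer is never called, because calling a function through the wrong pointer type is undefined behavior; it is only converted back:

```cpp
#include <cassert>

int add(int a, int b) { return a + b; }  // hypothetical example function

using BinaryOp = int (*)(int, int);
using OtherFn  = void (*)();  // deliberately unrelated function pointer type

// Converts &add to an unrelated function pointer type and back, then calls
// the function through the restored pointer.
int call_after_round_trip(int a, int b) {
    BinaryOp original = &add;
    OtherFn disguised = reinterpret_cast<OtherFn>(original);
    // Do not call through `disguised`; it only carries the value.
    BinaryOp restored = reinterpret_cast<BinaryOp>(disguised);
    assert(restored == original);
    return restored(a, b);
}
```

Note that the conversion needs `reinterpret_cast`; unlike the object-pointer-to-`void *` case, there is no implicit conversion between different function pointer types.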

Although it's not particularly relevant to the discussion at hand, there are a few other differences that may be worth noting. For example, you can copy most objects--but you can't copy any functions.
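The standard library's type traits reflect the same split, which gives a quick way to check it at compile time. This is only a sketch; `Widget` and `Callback` are names invented for the example:

```cpp
#include <type_traits>

struct Widget { int value; };  // hypothetical example type
using Callback = int(int);     // a function type (not a pointer to one)

// Object types versus function types.
static_assert(std::is_object<Widget>::value, "a class type is an object type");
static_assert(std::is_object<Callback*>::value, "a pointer to a function is itself an object type");
static_assert(!std::is_object<Callback>::value, "a function type is not an object type");
static_assert(std::is_function<Callback>::value, "Callback is a function type");

// Copying: fine for this object type, not possible for a function.
static_assert(std::is_copy_constructible<Widget>::value, "this object type can be copied");
static_assert(!std::is_copy_constructible<Callback>::value, "a function cannot be copied");
```

A pointer to a function is an object (it is a value you can store and copy); it is the function it points to that is not.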

As far as relationship to function objects goes: well, there really isn't much of one beyond one point: a function object supports syntax that looks like a function call--but it's still an object, not a function. So, a pointer to a function object is still a pointer to an object. If, for example, you convert one to void *, then convert it back to the original type, you're still guaranteed that you get back the original pointer value (which wouldn't be true with a pointer to a function).
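To make that concrete, here is a minimal function object, using a made-up `Adder` class. Its address survives a trip through `void *` like any other object's, and it can still be invoked with call syntax afterwards:

```cpp
#include <cassert>
#include <type_traits>

// A minimal function object: an ordinary class with an overloaded operator().
struct Adder {
    int offset;
    int operator()(int x) const { return x + offset; }
};

static_assert(std::is_object<Adder>::value, "a functor's type is an object type");

// Sends a pointer to the functor through void* and back, then calls it.
int call_after_void_round_trip(int x) {
    Adder add_ten{10};
    void* erased = &add_ten;                        // Adder* -> void*
    Adder* restored = static_cast<Adder*>(erased);  // void* -> Adder*
    assert(restored == &add_ten);
    return (*restored)(x);  // looks like a function call, but is a member call
}
```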

As to why pointers to functions are (at least potentially) different from pointers to objects: part of it comes down to existing systems. For example, on MS-DOS (among others) there were four entirely separate memory models: small, medium, compact, and large. Small model used 16-bit addressing for both functions and data. Medium used 16-bit addresses for data and 20-bit addresses for code. Compact reversed that (16-bit addresses for code, 20-bit addresses for data). Large used 20-bit addresses for both code and data. So, in either compact or medium model, converting between pointers to data and pointers to functions really could and did lead to problems.

More recently, a fair number of DSPs have used entirely separate memory buses for code and for data and (as with MS-DOS memory models) these were often different widths, so converting between the two could and did lose information.


  1. These particular rules came to C++ from C, so the same is true in C, for whatever that's worth.
  2. Although it's not directly required, with the way things work, pretty much the same works out to be true for a conversion from the original type to a pointer to char and back, for whatever that's worth.
  • 1
    Good answer. Might be worth name-dropping the Harvard architecture.
    – Kevin
    Commented May 15, 2017 at 5:23
  • 1
    "converting between pointers to code and pointers to functions": I believe it is a typo. Commented May 15, 2017 at 5:43
  • 3
    @Kevin: I considered doing so, but decided it was likely to do more to obfuscate than illuminate. The problem is that quite a few people use it to refer to things like current x86/x64, where data and code have separate caches, but still live in the same actual memory. On the DSPs I'm talking about, they had separate memory buses talking to entirely separate memory arrays (not that this means it's not Harvard architecture, but it is different from how many use the term). Commented May 15, 2017 at 6:01
  • 2
    And then one day someone has to design and implement dlsym() and things go pretty hairy.
    – Joker_vD
    Commented May 15, 2017 at 10:27
  • 1
    Separate address spaces for code and data are typical for Harvard architectures. Commented May 15, 2017 at 10:36

Why is a function not an object? How does it differ?

To understand this, let's move from bottom to top in terms of abstractions involved. So, you have your address space through which you can define the state of the memory and we have to remember that fundamentally it's all about this state you operate on.

Okay, let's move a bit higher in terms of abstractions. I am not talking about any abstractions imposed by a programming language yet (like object, array, etc.); simply, as a layman, I want to keep a record of a portion of the memory, let's call it Ab1, and of another one called Ab2.

Both have a state fundamentally but I intend to manipulate/make use of the state differently.

Differently...Why and How?

Why?

Because of my requirements (to perform addition of 2 numbers and store the result back, for example). I will be using Ab1 as a long-usage state and Ab2 as a relatively shorter-usage state. So, I will create a state for Ab1 (with the 2 numbers to add), then use this state to populate some of the state of Ab2 (copy them temporarily), perform further manipulation of Ab2 (add them), and save a portion of the resultant Ab2 to Ab1 (the added result). After that, Ab2 becomes useless and we reset its state.

How?

I am going to need some management of both the portions to keep track of what words to pick from Ab1 and copy to Ab2 and so on. At this point I realize that I can make it work to perform some simple operations but something serious shall require a laid out specification for managing this memory.

So, I look for such a management specification, and it turns out there exists a variety of them (some with a built-in memory model, others providing the flexibility to manage the memory yourself) with a better design. In fact, without even dictating how to manage the memory directly, they have successfully defined the encapsulation for this long-lived storage and the rules for how and when it can be created and destroyed.

The same goes for Ab2, but the way they present it makes me feel like it is much different from Ab1. And indeed, it turns out to be. They use a stack for state manipulation of Ab2 and reserve memory from the heap for Ab1. Ab2 dies after a while (once it has finished executing).

Also, what to do with Ab2 is defined through yet another storage portion called Ab2_Code, and the specification for Ab1 similarly involves Ab1_Code.

I would say, this is fantastic! I get so much convenience that allows me to solve so many problems.

Now, I am still looking from a layman's perspective, so having gone through the thought process of it all I don't really feel surprised. But if you question things top-down, things can get a bit difficult to put into perspective. (I suspect that's what happened in your case.)

BTW, I forgot to mention that Ab1 is called an object officially and Ab2 a function stack while Ab1_Code is the class definition and Ab2_Code is the function definition code.

And it is because of these differences imposed by the programming language that you find them so different (your question).

Note: Don't take my representation of Ab1/Object as a long storage abstraction as a rule or a concrete thing; it was from a layman's perspective. The programming language provides much more flexibility in terms of managing the lifecycle of an object. So, an object may be deployed like Ab1, but it can be much more.
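One way to picture the Ab1/Ab2 walkthrough in C++ is the rough sketch below. The names `Operands`, `add_in_place` and `add_with_long_lived_state` are invented for illustration, and putting Ab1 on the heap is just the layman's choice from the text, not something the language requires:

```cpp
#include <cassert>
#include <memory>

// "Ab1_Code": the class definition describing the long-lived state.
struct Operands {
    int first;
    int second;
    int sum;
};

// "Ab2_Code": the function definition. Its locals are the short-lived
// "Ab2" state, which disappears when the call returns.
void add_in_place(Operands& ab1) {
    int left = ab1.first;       // copy from Ab1 into Ab2
    int right = ab1.second;
    int result = left + right;  // manipulate Ab2
    ab1.sum = result;           // save part of Ab2 back into Ab1
}

int add_with_long_lived_state(int a, int b) {
    // "Ab1": an object whose storage is reserved from the heap.
    std::unique_ptr<Operands> ab1(new Operands{a, b, 0});
    add_in_place(*ab1);
    assert(ab1->sum == a + b);
    return ab1->sum;
}
```

Here `ab1` outlives the call to `add_in_place`, while `left`, `right` and `result` exist only for the duration of that call.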

And does this have any relation with the functors (function objects)?

Note that the first part of this answer is valid for many programming languages in general (including C++), while this part has to do specifically with C++ (whose spec you quoted). You have pointers to functions, and you can have pointers to objects too. It's just another programming construct that C++ defines. Notice that this is about having a pointer to Ab1 or Ab2 in order to manipulate them, rather than having another distinct abstraction to act upon.

You can read about its definition, usage here:

C++ Functors - and their uses

  • 3
    Your first two paragraphs are hard to read and could be summarized in a couple of sentences. I guess you're not getting votes because most people just stop reading before reaching "Differently...Why and How?".
    – YSC
    Commented May 15, 2017 at 11:43
  • @YSC : I have made some edits. What do you think of it now? Can you take a look and suggest edits to make it more readable? Commented May 15, 2017 at 15:10
  • It's better :).
    – YSC
    Commented May 15, 2017 at 20:43

Let me answer the question in simpler language (terms).

What does a function contain?

It basically contains instructions to do something. While executing the instructions, the function can temporarily store and / or use some data - and might return some data.

Although the instructions are stored somewhere - those instructions themselves are not considered as objects.

Then, what are the objects?

Generally, objects are entities which contain data - which get manipulated / changed / updated by functions (the instructions).

Why the difference?

Because computers are designed in such a way that the instructions do not depend on the data.

To understand this, let's think about a calculator. We do different mathematical operations using a calculator. Say, if we want to add some numbers, we provide the numbers to the calculator. No matter what the numbers are, the calculator will add them in the same way following the same instructions (if the result exceeds the calculator's capacity to store, it will show an error - but that is because of calculator's limitation to store the result (the data), not because of its instructions for addition).

Computers are designed in a similar manner. That is why, when you use a library function (for example qsort()) on some data that is compatible with the function, you get the result you expect. The functionality of the function doesn't change if the data changes, because the instructions of the function remain unchanged.
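A short sketch of the qsort() point, using `std::qsort` from `<cstdlib>`; the comparison callback and the helper name are made up for the example. The same instructions sort whatever compatible data they are given:

```cpp
#include <cassert>
#include <cstdlib>

// Comparison callback in the form std::qsort expects.
int compare_ints(const void* lhs, const void* rhs) {
    int a = *static_cast<const int*>(lhs);
    int b = *static_cast<const int*>(rhs);
    return (a > b) - (a < b);  // negative, zero, or positive
}

// Sorts the array in place with std::qsort and reports whether the result
// is in ascending order.
bool is_sorted_after_qsort(int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    std::qsort(values, count, sizeof(int), compare_ints);
    for (std::size_t i = 1; i < count; ++i) {
        if (values[i - 1] > values[i]) return false;
    }
    return true;
}
```

Passing two different arrays to `is_sorted_after_qsort` runs exactly the same code both times; only the data differs.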

Relation between function and functors

Functions are sets of instructions, and while they are being executed, some objects may be created temporarily to hold data. Those temporary objects are not functors, though. A functor (function object) is an object of a class type that defines operator(): it holds data like any other object, and it can also be called with function-call syntax.
