36

In a recent code review, it was claimed that

On select systems, calloc() can allocate more than SIZE_MAX total bytes whereas malloc() is limited.

My claim is that that's mistaken, because calloc() creates space for an array of objects - which, being an array, is itself an object. And no object can be larger in size than SIZE_MAX.

So which of us is correct? On a (possibly hypothetical) system with address space larger than the range of size_t, is calloc() allowed to succeed when called with arguments whose product is greater than SIZE_MAX?

To make it more concrete: will the following program ever exit with a non-zero status?

#include <stdint.h>
#include <stdlib.h>

int main()
{
     return calloc(SIZE_MAX, 2) != NULL;
}
19
  • 2
    Another quote: "A good calloc(n, size) will detect products of n * size greater the SIZE_MAX". This looks like an opinion. The Standard does not mention a "good calloc" and says nothing about detecting an n * size product greater than SIZE_MAX. Commented Oct 8, 2018 at 9:54
  • I would assume he means that malloc takes a single argument holding the product of the element size and the element count, so it cannot express a request larger than SIZE_MAX, whereas calloc takes two parameters (so you can request SIZE_MAX elements of 4 bytes each).
    – hellow
    Commented Oct 8, 2018 at 9:55
  • 1
    @hellow, exactly. I don't believe that's a valid call, because such an array violates the rule that size_t can represent the size of any object. Commented Oct 8, 2018 at 9:58
  • 3
    DR266 seems to be related. Only found this: DR-266 RM position is sizeof never overflows. DG - ignore the calloc problem. PJ - size_t must be representable, cannot overflow, by definition. Attempt to overflow s/be a constraint violation / undefined behavior.
    – KamilCuk
    Commented Oct 8, 2018 at 10:14
  • 2
    Here's the link to DR-266.
    – P.P
    Commented Oct 8, 2018 at 10:25

7 Answers

21

Can calloc() allocate more than SIZE_MAX in total?

As the assertion "On select systems, calloc() can allocate more than SIZE_MAX total bytes whereas malloc() is limited." came from a comment I posted, I will explain my rationale.


size_t

size_t is some unsigned type of at least 16 bits.

size_t which is the unsigned integer type of the result of the sizeof operator; C11dr §7.19 2

"Its implementation-defined value shall be equal to or greater in magnitude ... than the corresponding value given below" ... limit of size_t SIZE_MAX ... 65535 §7.20.3 2

sizeof

The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. §6.5.3.4 2

calloc

void *calloc(size_t nmemb, size_t size);

The calloc function allocates space for an array of nmemb objects, each of whose size is size. §7.22.3.2 2


Consider a situation where nmemb * size well exceeds SIZE_MAX.

size_t alot = SIZE_MAX/2;
double *p = calloc(alot, sizeof *p); // assume `double` is 8 bytes.

If calloc() truly allocated nmemb * size bytes and if p != NULL is true, what spec did this violate?

The size of each element, (each object) is representable.

// Nicely reports the size of a pointer and an element.
printf("sizeof p:%zu, sizeof *p:%zu\n", sizeof p, sizeof *p); 

Each element can be accessed.

// Nicely reports the value of an `element` and the address of the element
for (size_t i = 0; i<alot; i++) {
  printf("value p[%zu]:%g, address:%p\n", i, p[i], (void*) &p[i]);
}

calloc() details

"space for an array of nmemb objects": This is certainly a key point of contention. Does the "allocates space for the array" require <= SIZE_MAX? I found nothing in the C spec to require this limit and so conclude:

calloc() may allocate more than SIZE_MAX in total.


It is certainly uncommon for calloc() with large arguments to return non-NULL - compliant or not. Usually such allocations exceed memory available, so the issue is moot. The only case I've encountered was with the Huge memory model where size_t was 16 bit and the object pointer was 32 bit.

2
  • @chux do you have an example libc implementation where this would work? It would require storing the actual size in a type larger than size_t, which I very much doubt is the implementation in any calloc. I just checked a couple libc implementations and both of them put the product in a size_t; one checks for overflow and returns NULL, and the other just returns an array of the overflow-truncated size which you'll access out-of-bounds if you try and iterate it (invoking undefined behavior, of course), so it's certainly not safe to do.
    – Kevin
    Commented Oct 8, 2018 at 18:28
  • 1
    @Kevin As in the answer, such a calloc() existed in ye old days with a HUGE memory model. calloc() that simply multiplies nmemb, size to form the required size, without considering OF, is a weak implementation of calloc(). That libc weakness, especially with a ready fix you noted in the other, is not a prohibition of what the C spec can allow. OP's title question is not: can a non-NULL return with large operands, occur? of course it can - with weak lib code. The question is "Can calloc() allocate more than SIZE_MAX ...?" - the implication: can calloc() do so correctly? Commented Oct 8, 2018 at 19:25
19

SIZE_MAX doesn't necessarily specify the maximum size of an object, but rather the maximum value of size_t, which is not necessarily the same thing. See Why is the maximum size of an array "too large"?

But obviously, it isn't well-defined to pass a larger value than SIZE_MAX to a function expecting a size_t parameter. So in theory SIZE_MAX is the limit for each argument, and in theory calloc would allow for SIZE_MAX * SIZE_MAX bytes to be allocated.

The thing with malloc/calloc is that they allocate objects without a type. Objects with a type have restrictions, such as never being larger than a certain limit like SIZE_MAX. But the data pointed-at by the result from these functions does not have a type. It is not (yet) an array.

Formally, the data has no declared type, but as you store something inside the allocated data, it gets the effective type of the data access used for storage (C17 6.5 §6).

This in turn means that it would be possible for calloc to allocate more memory than any type in C can hold, because what's allocated does not (yet) have a type.

Therefore, as far as the C standard is concerned, it is perfectly fine for calloc(SIZE_MAX, 2) to return a value different from NULL. How to actually use that allocated memory in a sensible way, or which systems that even support such large chunks of memory on the heap, is another story.
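To illustrate the effective-type rule mentioned above, here is a minimal sketch. The function name store_then_load and its return convention are made up for illustration; only calloc and free come from the standard library.

```c
#include <stdlib.h>

/* Hypothetical helper: stores `value` into freshly allocated storage
   and reads it back through `out`. Returns 0 on success, -1 if the
   allocation fails. */
int store_then_load(int value, int *out)
{
    void *raw = calloc(1, sizeof(int)); /* storage with no declared type */
    if (raw == NULL) {
        return -1;
    }

    int *typed = raw;
    *typed = value;  /* this store gives the storage effective type int */
    *out = *typed;   /* so reading it back as int is valid */

    free(raw);
    return 0;
}
```

Until the store happens, nothing in the program says what kind of object lives in the allocated space, which is the point the argument relies on.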

8
  • This does suggest, I think, a peculiar relationship between SIZE_MAX and ptrdiff_t, since on a system where calloc could behave as described, ptrdiff_t would have to be large enough to cope. Commented Oct 8, 2018 at 11:31
  • 1
    @SteveSummit Yeah that's the catch, as explained by the accepted answer in the linked post, that SIZE_MAX and PTRDIFF_MAX always have to follow, and the latter being a signed type. However, given a SIZE_MAX 2^n, the standard doesn't restrict the compiler to have a PTRDIFF_MAX which is 2^(n+1). It's just very inconvenient for the compiler to have such a burdensome type system so in practice it isn't implemented like that. Overall, the C standard doesn't handle the problems with these two types very well, but leaves the thinking "to the implementation".
    – Lundin
    Commented Oct 8, 2018 at 11:47
  • DOS COMPACT memory model could actually do this if the standard library hadn't defined calloc() in such a way that it would have failed.
    – Joshua
    Commented Oct 9, 2018 at 2:35
  • An object without a type seems to be the key to this conundrum - that seems to be the best reasoning so far, and earns you the tick. Commented Oct 9, 2018 at 7:51
  • @Lundin: Technically, PTRDIFF_MAX does not have to follow, and can be arbitrarily smaller than SIZE_MAX because the result of subtracting two pointers to the same array object is not required to always be representable as a ptrdiff_t value, and in this case the behavior is explicitly undefined. n1548 §6.5.6 ¶9. Commented Oct 9, 2018 at 10:39
2

From

7.22.3.2 The calloc function

Synopsis
1

 #include <stdlib.h>
 void *calloc(size_t nmemb, size_t size);

Description
2 The calloc function allocates space for an array of nmemb objects, each of whose size is size. The space is initialized to all bits zero.

Returns
3 The calloc function returns either a null pointer or a pointer to the allocated space.

I fail to see why the space allocated should be limited to SIZE_MAX bytes.

3
  • 6
    My reasoning is that calloc() allocates space for an array of objects. An array is an object, therefore it must be measurable using a size_t. Commented Oct 8, 2018 at 10:09
  • 1
    @TobySpeight "But the data pointed-at by the result from these functions does not have a type. It is not (yet) an array." in this answer relates to the An array is an object concern. Commented Oct 8, 2018 at 19:30
  • 1
    @chux: The behavior of calloc() was established at a time when it didn't matter whether the storage thereof held any particular kind of object, or a union of every kind of object that could possibly fit, or no object whatsoever. Since the language includes no means of measuring any object or objects that are created, however, there is no need to have a type capable of holding such measurement.
    – supercat
    Commented Oct 9, 2018 at 15:28
2

If a program exceeds implementation limits, behavior is undefined. This follows from the definition of an implementation limit as a restriction imposed upon programs by the implementation (3.13 in C11). The standard also says that strictly-conforming programs must adhere to implementation limits (4p5 in C11). But this also applies to programs in general because the standard does not say what happens when most implementation limits are exceeded (so it is the other kind of undefined behavior, where the standard does not specify what happens).

The standard also does not define what implementation limits may exist, so this is a bit of carte blanche, but I think it is reasonable that the maximum object size is actually relevant to object allocations. (The maximum object size is typically smaller than SIZE_MAX, by the way, because the difference of pointers-to-char within the object must be representable in ptrdiff_t.)
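As a rough sketch of that parenthetical, the helper below (practical_max_object_size is a made-up name, not anything the standard defines) picks the smaller of SIZE_MAX and PTRDIFF_MAX. It assumes the only constraints on object size are these two types; a real implementation may impose a smaller limit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the largest byte count that both size_t and
   ptrdiff_t can represent on this implementation. */
size_t practical_max_object_size(void)
{
    /* Compare in uintmax_t so neither constant is truncated. */
    if ((uintmax_t)PTRDIFF_MAX < (uintmax_t)SIZE_MAX) {
        return (size_t)PTRDIFF_MAX;
    }
    return SIZE_MAX;
}
```

On common platforms where ptrdiff_t is the signed counterpart of size_t, this yields PTRDIFF_MAX, roughly half of SIZE_MAX.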

This leads us to the following observation: A call to calloc (SIZE_MAX, 2) exceeds the maximum object size limit, so an implementation could return an arbitrary value while still conforming to the standard.

Some implementations will actually return a pointer which is not null for a call like calloc (SIZE_MAX / 2 + 2, 2) because the implementation does not check whether the multiplication result fits into a size_t value. Whether this is a good idea is a different matter, given that the implementation limit can be checked so easily in this case, and there is a perfectly fine way to report errors. Personally, I consider the lack of overflow checking in calloc an implementation bug, and have reported bugs to implementors when I saw them, but technically, it's merely a quality-of-implementation issue.
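The overflow check is cheap. A minimal sketch, written as a wrapper around calloc rather than as any particular libc's code (checked_calloc is a hypothetical name):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuses any request whose total byte count
   nmemb * size cannot be represented in size_t. */
void *checked_calloc(size_t nmemb, size_t size)
{
    /* Dividing avoids performing the multiplication that would wrap. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return NULL;
    }
    return calloc(nmemb, size);
}
```

With this check, both checked_calloc(SIZE_MAX, 2) and checked_calloc(SIZE_MAX / 2 + 2, 2) return a null pointer instead of a pointer to a truncated allocation.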

For variable-length arrays on the stack, the rule about exceeding implementation limits resulting in undefined behavior is more obvious:

size_t length = SIZE_MAX / 2 + 2;
short object[length];

There is really nothing an implementation can do here, so it has to be undefined.

7
  • Can you back that up with references to the standard? Commented Oct 8, 2018 at 10:51
  • And why do you bring in implementation limits? In J.3.12 in the C standard I do not see any implementation defined limits for calloc other than "Whether the calloc, malloc, and realloc functions return a null pointer or a pointer to an allocated object when the size requested is zero (7.22.3)." Commented Oct 8, 2018 at 10:58
  • As noted in DR-266, translation limits do not apply to runtime/allocated objects. So not sure if translation limits apply to calloc.
    – P.P
    Commented Oct 8, 2018 at 11:00
  • 1
    SIZE_MAX does not necessarily exceed the maximum object size. It is fine for an implementation to have a PTRDIFF_MAX that is 2^33 signed while at the same time it has a SIZE_MAX which is 2^32 unsigned. It's just very inconvenient for the compiler to have a type system like that, but the standard doesn't care [didn't even consider the problem].
    – Lundin
    Commented Oct 8, 2018 at 11:21
  • 1
    @Lundin: Curiously, C11 has imposed a rule that ptrdiff_t must be at least 17 bits, even on a freestanding implementation with less than 32K total of storage (the size of object hosted implementations are required to support was increased to 65,535 bytes, but I'm not sure I see the point--regardless of what the Standard says, implementations that can practically support objects that size will do so, and those that can't, won't). In any case, I'm really unsure what purpose a 17-bit ptrdiff_t would serve on an implementation with less than 32K of total storage.
    – supercat
    Commented Oct 9, 2018 at 15:22
2

Per the text of the standard, maybe, because the standard is (some would say intentionally) vague about this sort of thing.

Per 6.5.3.4 ¶2:

The sizeof operator yields the size (in bytes) of its operand

and per 7.19 ¶2:

size_t

which is the unsigned integer type of the result of the sizeof operator;

The former cannot be satisfied in general if the implementation admits any type (including array types) whose size is not representable in size_t. Note that, however you interpret the text about the pointer returned by calloc pointing to "an array", there is always an array involved with any object: the overlaid array of type unsigned char[sizeof object] which is its representation.
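A small sketch of that overlaid array in use. The function count_nonzero_bytes is a made-up example; what matters is that it walks an object's representation through an unsigned char pointer with a size_t index, which only works if the object's size fits in size_t.

```c
#include <stddef.h>

/* Hypothetical helper: views an object as unsigned char[size] and
   counts the bytes of its representation that are not zero. */
size_t count_nonzero_bytes(const void *object, size_t size)
{
    const unsigned char *bytes = object;
    size_t count = 0;

    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            count++;
        }
    }
    return count;
}
```

A caller would typically pass &obj and sizeof obj, so an object larger than SIZE_MAX could not be described to this function at all.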

At best, an implementation that allows the creation of any object larger than SIZE_MAX (or PTRDIFF_MAX, for other reasons) has fatally bad QoI (quality of implementation) problems. The claim on code review that you should account for such bad implementations is bogus unless you are specifically trying to ensure compatibility with a particular broken C implementation (sometimes relevant for embedded, etc.).

1

Just an addition: With a tiny bit of maths you can show that SIZE_MAX * SIZE_MAX = 1 (when evaluated according to C rules).

However, calloc (SIZE_MAX, SIZE_MAX) is only allowed to do one of two things: Return a pointer to an array of SIZE_MAX elements of SIZE_MAX bytes, OR return NULL. It is NOT allowed to calculate the total size by just multiplying the arguments, getting a result of 1, and allocating one byte, cleared to 0.
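The arithmetic can be checked directly. Since SIZE_MAX is 2^n - 1, its square is 2^(2n) - 2^(n+1) + 1, which leaves 1 after reduction modulo 2^n. The sketch below uses a made-up function name and assumes size_t is at least as wide as unsigned int, so the multiplication is done in an unsigned type and wraps instead of overflowing a promoted int.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiplies the arguments the way a naive calloc might: the result
   is reduced modulo SIZE_MAX + 1, silently discarding the high bits. */
size_t wrapped_product(size_t nmemb, size_t size)
{
    return nmemb * size;
}
```

Here wrapped_product(SIZE_MAX, SIZE_MAX) yields 1, which is why an implementation that sizes the allocation from the raw product would hand back a single byte.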

0

The Standard says nothing about whether it might be possible for a pointer to somehow be created such that ptr+number1+number2 could be a valid pointer, but number1+number2 would exceed SIZE_MAX. It certainly allows for the possibility of number1+number2 exceeding PTRDIFF_MAX (though for some reason C11 has decided to require that even implementations with a 16-bit address space must use a 32-bit ptrdiff_t).

The Standard does not mandate that implementations provide any means of creating pointers to such large objects. It does, however, define a function, calloc(), whose description suggests that it could be asked to attempt to create such an object, and would suggest that calloc() should return a null pointer if it can't create the object.

The ability to allocate any kind of object usefully, however, is a Quality of Implementation issue. The Standard would never require that any particular allocation request succeed, nor would it forbid an implementation from returning a pointer that might turn out to be unusable (in some Linux environments, a malloc() might yield a pointer to an over-committed region of address space; an attempt to use the pointer when insufficient physical storage is available could cause a fatal trap). It would certainly be better for a non-capricious implementation of calloc(x,y) to return null if the numerical product of x and y exceeds SIZE_MAX than for it to yield a pointer which can't be used to access that number of bytes. The Standard is silent, however, on whether returning a pointer that can be used to access x objects of y bytes each should be considered better or worse than returning null. Each behavior would be advantageous in some situations, and disadvantageous in others.
