41

I just noticed that every modern OO programming language that I am at least somewhat familiar with (which is basically just Java, C# and D) allows covariant arrays. That is, a string array is an object array:

Object[] arr = new String[2];   // Java, C# and D allow this

Covariant arrays are a hole in the static type system. They make type errors possible that cannot be detected at compile-time, so every write to an array must be checked at runtime:

arr[0] = "hello";        // ok
arr[1] = new Object();   // ArrayStoreException

This seems like a terrible performance hit if I do lots of array stores.

C++ does not have covariant arrays, so there is no need to do such a runtime check, which means there is no performance penalty.

Is there any analysis done to reduce the number of runtime checks necessary? For example, if I say:

arr[1] = arr[0];

one could argue that the store cannot possibly fail. I'm sure there are lots of other possible optimizations I haven't thought of.

Do modern compilers actually do these kinds of optimizations, or do I have to live with the fact that, for example, a Quicksort always does O(n log n) unnecessary runtime checks?

Can modern OO languages avoid the overhead created by supporting co-variant arrays?

18
  • 7
    I'm confused why you are suggesting C++ is faster than other languages at a feature C++ doesn't even support.
    – eco
    Commented Jan 17, 2012 at 22:03
  • 17
    @eco: C++ is faster with array access because it doesn't support co-variant arrays. Fred wants to know if it's possible for "modern OO languages" to elide the overhead of co-variant arrays to get closer to C++ speeds. Commented Jan 17, 2012 at 22:08
  • 13
    "code that compiles but throws exceptions at run-time is bad news"
    – Jesse
    Commented Jan 17, 2012 at 22:11
  • 4
    @Jesse And millions of people write reliable, highly scalable code in dynamic languages. Imo: If you don't write test cases for your code I don't care whether there are compiler errors or not, I'm not going to trust it anyhow.
    – Voo
    Commented Jan 17, 2012 at 22:27
  • 8
    @Jesse And when else would you expect exceptions but at runtime? The problem isn't code which throws exceptions at runtime - there are plenty of cases where that makes good sense - the problem is code which is guaranteed to be wrong which doesn't get statically caught by the compiler but instead results in an exception at runtime. Commented Jan 17, 2012 at 23:00

8 Answers 8

35

D doesn't have covariant arrays. It allowed them prior to the most recent release (dmd 2.057), but that bug has been fixed.

An array in D is effectively just a struct with a pointer and a length:

struct A(T)
{
    T* ptr;
    size_t length;
}

Bounds checking is done normally when indexing an array, but it's removed when you compile with -release. So, in release mode, there's no real performance difference between arrays in C/C++ and those in D.

4
  • Appears not -- d-programming-language.org/arrays.html says "A static array T[dim] can be implicitly converted to one of the following: ... U[] ... A dynamic array T[] can be implicitly converted to one of the following: U[] ... where U is a base class of T."
    – Ben Voigt
    Commented Jan 17, 2012 at 23:32
  • 2
    @BenVoigt Then the online docs need to be updated. They aren't always 100% up-to-date unfortunately. The conversion will work with const U[], since then you can't assign the wrong type to the array's elements, but T[] definitely does not convert to U[] as long as U[] is mutable. Allowing covariant arrays like it did before was a serious design flaw which has now been corrected. Commented Jan 17, 2012 at 23:57
  • @JonathanMDavis: Covariant arrays are icky, but work well for the particular scenario of code which will only write to an array elements which have been read from that same array. How would D allow one to write a method which could sort arrays of arbitrary type?
    – supercat
    Commented Dec 19, 2013 at 18:25
  • @supercat If you want to write a function which sorts arbitrary types, then templatize it. e.g. dlang.org/phobos/std_algorithm.html#sort Commented Dec 20, 2013 at 4:40
22

Yes, one crucial optimization is this:

sealed class Foo
{
}

In C#, this class can't be a supertype for any type, so you can avoid the check for an array of type Foo.

And to the second question, in F# co-variant arrays are not allowed (but I guess the check will remain in the CLR unless it's found unnecessary in optimizations at runtime)

let a = [| "st" |]
let b : System.Object[] = a // Fails

https://stackoverflow.com/questions/7339013/array-covariance-in-f

A somewhat related problem is array bound-checking. This might be an interesting (but old) read about optimizations done in the CLR (covariance is also mentioned 1 place): Link

1
  • 3
    Scala also prevents this construct: val a = Array("st"); val b: Array[Any] = a is illegal. (However, arrays in Scala are ... special magic ... due to the underlying JVM used.)
    – pst
    Commented Jan 17, 2012 at 22:00
14

Java answer:

I take it you haven't actually benchmarked the code, have you? In general 90% of all dynamic casts in Java are free because the JIT can elide them (quicksort should be a good example for this) and the rest are one ld/cmp/brsequence which is absolutely predictable (if not, well why the hell is your code throwing all those dynamic cast exceptions?).

We do the load much earlier than the actual compare, the branch is correctly predicted in 99.9999% (made up statistic!) of all cases so we don't stall the pipeline (assuming we don't hit the memory with the load, if not well that will be noticeable, but then the load is necessary anyhow). Hence the cost is 1 clock cycle IF the JIT cannot avoid the check at all.

Some overhead? Sure, but I doubt you'll ever notice it..


To help support my answer, please see this Dr. Cliff Click blogpost discussing Java vs. C performance.

0
10

D does not allow covariant arrays.

void main()
{
    class Foo {}
    Object[] a = new Foo[10];
}  

/* Error: cannot implicitly convert expression (new Foo[](10LU)) of type Foo[]
to Object[] */

As you say, it would be a hole in the type system to allow this.

You can be forgiven for the mistake, as this bug was only just fixed in the lastest DMD, released on December 13th.

Array access in D is just as fast as in C or C++.

3
  • According to d-programming-language.org/arrays.html "A static array T[dim] can be implicitly converted to one of the following: ... U[] ... A dynamic array T[] can be implicitly converted to one of the following: U[] ... where U is a base class of T."
    – Ben Voigt
    Commented Jan 17, 2012 at 23:33
  • 1
    @BenVoigt: Out of date docs.
    – BCS
    Commented Jan 18, 2012 at 0:28
  • 1
    @BenVoigt: I have created a pull request to update the documentation. Hopefully this will be resolved soon. Thanks for pointing it out. Commented Jan 19, 2012 at 12:52
5

From test I have done on a cheap laptop, the difference between using int[] and Integer[] is about 1.0 ns. The difference is likely to be due to the extra check for the type.

Generally Objects are only used for higher level logic when not every ns counts. If you need to save every ns, I would avoid using higher level constructs like Objects. Assignments alone are likely to be very small factor in any real program. e.g. creating a new Object on the same machine is 5 ns.

Calls to compareTo are likely to be much more expensive, esp if you using a complex object like String.

2

You asked about other modern OO languages? Well, Delphi avoids this problem entirely by having string be a primitive, not an object. So an array of strings is exactly an array of strings and nothing else, and any operations on them are as fast as native code can be, with no type checking overhead.

However, string arrays are not used very often; Delphi programmers tend to favor the TStringList class. For a minimal amount of overhead it provides a set of string-group methods that are useful in so many situations that the class has been compared to a Swiss Army Knife. So it's idiomatic to use a string list object instead of a string array.

As for other objects in general, the problem does not exist because in Delphi, like in C++, arrays are not covariant, in order to prevent the kind of type system holes described here.

0
1

or do I have to live with the fact that, for example, a Quicksort always does O(n log n) unnecessary runtime checks?

CPU performance is not monotonic, which means that longer programs can be faster than shorter ones (this is CPU dependant, and it is true for the common x86 and amd64 architectures). So it is possible that a program doing bound checking on arrays is actually faster than the program deduced from the previous by removing these bound checking.

The reason of this behaviour is that bound checking modifies the alignment of code in memory, will modify the frequency of cache hits, etc.

So yes, live with the fact that Quicksort always does O(n log n) spurious checks and optimise after profiling.

1

Scala is an OO language that has invariant, rather than covariant, arrays. It targets the JVM, so there's no performance win there, but it avoids a misfeature common to both Java and C# that compromises their type-safety for reasons of backwards compatibility.

Not the answer you're looking for? Browse other questions tagged or ask your own question.