140

The Java team has done a ton of great work removing barriers to functional programming in Java 8. In particular, the changes to the java.util Collections do a great job of chaining transformations into very fast streamed operations. Considering how good a job they have done adding first class functions and functional methods on collections, why have they completely failed to provide immutable collections or even immutable collection interfaces?

Without changing any existing code, the Java team could at any time add immutable interfaces that are the same as the mutable ones, minus the setter methods and make the existing interfaces extend from them, like this:

                  ImmutableIterable
     ____________/       |
    /                    |
Iterable        ImmutableCollection
   |    _______/    /          \   \___________
   |   /           /            \              \
 Collection  ImmutableList  ImmutableSet  ImmutableMap  ...
    \  \  \_________|______________|__________   |
     \  \___________|____________  |          \  |
      \___________  |            \ |           \ |
                  List            Set           Map ...

Sure, operations like List.add() and Map.put() currently return a boolean or previous value for the given key to indicate whether the operation succeeded or failed. Immutable collections would have to treat such methods as factories and return a new collection containing the added element - which is incompatible with the current signature. But that could be worked-around by using a different method name like ImmutableList.append() or .addAt() and ImmutableMap.putEntry(). The resulting verbosity would be more than outweighed by the benefits of working with immutable collections, and the type system would prevent errors of calling the wrong method. Over time, the old methods could be deprecated.

Wins of immutable collections:

  • Simplicity — reasoning about code is simpler when the underlying data does not change.
  • Documentation — if a method takes an immutable collection interface, you know it isn't going to modify that collection. If a method returns an immutable collection, you know you can't modify it.
  • Concurrency — immutable collections can be shared safely across threads.

As someone who has tasted languages which assume immutability, it is very hard to go back to the Wild West of rampant mutation. Clojure's collections (sequence abstraction) already have everything that Java 8 collections provide, plus immutability (though maybe using extra memory and time due to synchronized linked-lists instead of streams). Scala has both mutable and immutable collections with a full set of operations, and though those operations are eager, calling .iterator gives a lazy view (and there are other ways of lazily evaluating them). I don't see how Java can continue to compete without immutable collections.

Can someone point me to the history or discussion about this? Surely it's public somewhere.

19
  • 9
    Related to this - Ayende blogged recently about collections and immutable collections in C#, with benchmarks. ayende.com/blog/tags/performance - tl;dr - immutability is slow.
    – Oded
    Commented Dec 18, 2013 at 15:05
  • 24
    with your hierarchy I can give you a ImmutableList and then change it on you when you don't expect it which can break a lot of things, as is you only have const collections Commented Dec 18, 2013 at 15:16
  • 19
    @Oded Immutability is slow, but so is locking. So is maintaining a history. Simplicity/Correctness is worth speed in many situations. With small collections, speed is not an issue. Ayende's analysis is based on the assumption that you don't need history, locking, or simplicity and that you are working with a large data set. Sometimes that's true, but it's not a one-is-always-better thing. There are trade-offs. Commented Dec 18, 2013 at 15:36
  • 5
    @GlenPeterson that's what defensive copies and Collections.unmodifiable*() are for. but don't treat these as immutable when they are not Commented Dec 18, 2013 at 15:56
  • 15
    Eh? If your function takes an ImmutableList in that diagram, people can pass in a mutable List? No, that's a very bad violation of LSP.
    – Telastyn
    Commented Dec 18, 2013 at 16:59

7 Answers 7

119
+50

Because immutable collections absolutely require sharing to be usable. Otherwise, every single operation drops a whole other list into the heap somewhere. Languages that are entirely immutable, like Haskell, generate astonishing amounts of garbage without aggressive optimizations and sharing. Having collection that's only usable with <50 elements is not worth putting in the standard library.

Further more, immutable collections often have fundamentally different implementations than their mutable counterparts. Consider for example ArrayList, an efficient immutable ArrayList wouldn't be an array at all! It should be implemented with a balanced tree with a large branching factor, Clojure uses 32 IIRC. Making mutable collections be "immutable" by just adding a functional update is a performance bug just as much as a memory leak is.

Furthermore, sharing isn't viable in Java. Java provides too many unrestricted hooks to mutability and reference equality to make sharing "just an optimization". It'd probably irk you a bit if you could modify an element in a list, and realize you just modified an element in the other 20 versions of that list you had.

This also rules out huge classes of very vital optimizations for efficient immutability, sharing, stream fusion, you name it, mutability breaks it. (That'd make a good slogan for FP evangelists)

15
  • 22
    My example talked about immutable interfaces. Java could provide a full suite of both mutable and immutable implementations of those interfaces that would make the necessary trade-offs. It's up to the programmer to choose mutable or immutable as appropriate. Programmers have to know when to use a List vs. Set now. You generally don't need the mutable version until you have a performance issue, and then it may only be necessary as a builder. In any case, having the immutable interface would be a win on its own. Commented Dec 18, 2013 at 17:21
  • 4
    I read your answer again and I think you are saying that Java has a fundamental assumption of mutability (e.g. java beans) and that the collections are just the tip of the iceberg and carving off that tip won't solve the underlying problem. A valid point. I might accept this answer and speed up my adoption of Scala! :-) Commented Dec 18, 2013 at 17:28
  • 8
    I'm not sure immutable collections require the ability to share common parts to be useful. The most common immutable type in Java, an immutable collection of characters, used to allow sharing but doesn't anymore. The key thing that makes it useful is the ability to quickly copy data from a String into a StringBuffer, manipulate it, and then copy the data into a new immutable String. Using such a pattern with sets and lists might be as good as using immutable types that are designed to facilitate the production of slightly-changed instances, but could still be better...
    – supercat
    Commented Jan 17, 2014 at 19:30
  • 4
    It is entirely possible to make an immutable collection in Java using sharing. The items stored in the collection are references and their referents may be mutated - so what? Such behavior already breaks existing collections such as HashMap and TreeSet, yet those are implemented in Java. And if multiple collections contain references to the same object, it's entirely expected that modifying the object will cause a change visible when it is viewed from all collections. Commented Aug 13, 2015 at 21:51
  • 4
    jozefg, it is entirely possible to implement efficient immutable collections on JVM with structural sharing. Scala and Clojure have them as part of their standard library, both implementations are based on Phil Bagwell's HAMT (Hash Array Mapped Trie). Your statement regarding Clojure implementing immutable data structures with BALANCED trees is completely wrong.
    – sesm
    Commented Sep 6, 2015 at 20:54
87

A mutable collection is not a subtype of an immutable collection. Instead, mutable and immutable collections are sibling descendants of readable collections. Unfortunately, the concepts of "readable", "read-only", and "immutable" seem to get blurred together, even though they mean three different things.

  • A readable collection base class or interface type promises that one may read items, and does not provide any direct means of modifying the collection, but does not guarantee that code receiving the reference cannot cast or manipulate it in such a way as to permit modification.

  • A read-only collection interface doesn't include any new members, but should only be implemented by a class which promises that there is no way to manipulate a reference to it in such a way as to mutate the collection nor receive a reference to something that could do so. It does not, however, promise that the collection won't be modified by something else which has a reference to the internals. Note that a read-only collection interface may not be able to prevent implementation by mutable classes, but can specify that any any implementation, or class derived from an implementation, which allows mutation shall be considered an "illegitimate" implementation or derivative of an implementation.

  • An immutable collection is one which will always hold the same data as long as any reference to it exists. Any implementation of an immutable interface which does not always return the same data in response to a particular request is broken.

It is sometimes useful to have strongly-associated mutable and immutable collection types which both implement or derive from the same "readable" type, and to have the readable type include AsImmutable, AsMutable, and AsNewMutable methods. Such a design can allow code which wants to persist the data in a collection to call AsImmutable; that method will make a defensive copy if the collection is mutable, but skip the copy if it's already immutable.

23
  • 2
    Great answer. Immutable collections can give you a pretty strong guarantee related to thread-safeness and how you can reason about them as time goes by. A Readable/Read-only collection does not. In fact, to honor the liskov substition principle, Read-Only and Immutable should probably be abstract base type with final method and private members to ensure that no derived class can destroy the garantee given by the type. Or they should be fully concrete type that either wrap a collection (Read-Only), or always take a defensive copy (Immutable). This is how guava ImmutableList does it. Commented Dec 23, 2013 at 15:12
  • 1
    @LaurentBourgault-Roy: There are advantages to both sealed and inheritable immutable types. If one doesn't want to allow an illegitimate derived class to break one's invariants, sealed types can offer protection against that while inheritable classes offer none. On the other hand, it may be possible for code that knows something about the data it holds to store it much more compactly than would a type that knows nothing about it. Consider, for example, a ReadableIndexedIntSequence type which encapsulates a sequence of int, with methods getLength() and getItemAt(int).
    – supercat
    Commented Dec 23, 2013 at 16:34
  • 1
    @LaurentBourgault-Roy: Given a ReadableIndexedIntSequence, one could produce an instance of an array-backed immutable type by copying all the items into an array, but suppose that a particular implementation simply returned 16777216 for length and ((long)index*index)>>24 for each item. That would be a legitimate immutable sequence of integers, but copying it to an array would be a huge waste of time and memory.
    – supercat
    Commented Dec 23, 2013 at 16:39
  • 1
    I fully agree. My solution give you correctness (up to a point), but to get performance with large dataset, you must have persistent structure and design for immutability from the beginning. For small collection though you can get away with taking an immutable copy from time to time. I remember that Scala did some analysis of various program and found that something like 90% of the lists instanciated were 10 or less items long. Commented Dec 23, 2013 at 16:43
  • 1
    @LaurentBourgault-Roy: The fundamental question is whether one trusts people not to produce broken implementations or derived classes. If one does, and if the interfaces/base classes provide asMutable/asImmutable methods, it may be possible to improve performance by many orders of magnitude [e.g. compare the cost of calling asImmutable on an instance of the above-defined sequence versus the cost of constructing an immutable array-backed copy]. I would posit that having interfaces defined for such purposes is probably better than trying to use ad-hoc approaches; IMHO, the biggest reason...
    – supercat
    Commented Dec 23, 2013 at 16:51
17

The Java Collections Framework does provide the ability to create a read-only version of a collection by way of six static methods in the java.util.Collections class:

As someone has pointed out in the comments to the original question, the collections returned may not be considered immutable because even though the collections cannot be modified (no members can be added or removed from such a collection), the actual objects referenced by the collection can be modified if their object type allows it.

However, this problem would remain regardless of whether code returns a single object, or an unmodifiable collection of objects. If the type allows its objects to be mutated, then that decision was made in the design of the type and I don't see how a change to the JCF could alter that. If immutability is important, then the members of a collection should be of an immutable type.

1
  • 5
    The design of the unmodifiable collections would have been greatly enhanced if the wrappers included an indication of whether the thing being wrapped was already immutable, and there were immutableList etc. factory methods which would return a read-only wrapper around a copy of a passed-in list unless the passed-in list was already immutable. It would be easy to create user-defined types like that but for one problem: there would be no way for joesCollections.immutableList method to recognize that it shouldn't need to copy the object returned by fredsCollections.immutableList.
    – supercat
    Commented Jul 14, 2014 at 16:27
11

This is a very good question. I enjoy entertaining the idea that of all the code written in java and running on millions of computers all over the world, every day, around the clock, about half the total clock cycles must be wasted doing nothing but making safety copies of collections returned by functions, which are garbage-collected milliseconds later.

A percentage of java programmers are aware of the existence of the unmodifiableCollection() family of methods of the Collections class, but most don't bother with it, and I can't blame them: an interface which pretends to be read-write but will slap you in the face with an UnsupportedOperationException if you make the mistake of invoking any of its 'write' methods is quite an evil thing to have!

Now, an interface like Collection which would be missing the add(), remove() and clear() methods would not be an "ImmutableCollection" interface; it would be an "UnmodifiableCollection" interface. As a matter of fact, there could never be an "ImmutableCollection" interface, because immutability is a nature of an implementation, not a characteristic of an interface. I know, that's not very clear; let me explain.

Suppose someone hands you such a read-only collection interface; is it safe to pass it to another thread? If you knew for sure that it represents a truly immutable collection, then the answer would be "yes"; unfortunately, since it is an interface, you do not know how it is implemented, so the answer has to be a no: for all you know, it may be an unmodifiable (to you) view of a collection which is in fact mutable, (like what you get with Collections.unmodifiableCollection(),) so attempting to read from it while another thread is modifying it would result in reading corrupt data.

So, what you have essentially described is a set of not "Immutable", but "Unmodifiable" collection interfaces. It is important to understand that "Unmodifiable" simply means that whoever has a reference to such an interface is prevented from modifying the underlying collection not because the underlying collection is necessarily immutable, but simply because the interface lacks any modification methods.

Now, I happened to be present at Devoxx Belgium 2017 where several people from Oracle gave talks, and one of them was Brian Goetz. I went to him after his talk and asked him precisely this: why are there no unmodifiable interfaces in Java? I must say that the answer he gave me was not particularly convincing: he said that they are not doing it due to a security concern: if you have a mutable collection class implementing a mutable collection interface which extends an unmodifiable collection interface, and you hand out the unmodifiable interface, whoever holds that interface can up-cast it to the mutable interface and modify the original collection.

This is not convincing because it describes a very simplistic scenario which can very easily be taken care of with some simple extra measures: If you have a security concern you can instantiate and hand-out an unmodifiable collection adaptor which wraps (delegates to) your mutable collection without implementing the mutable collection interface. As a matter of fact, you are already doing this structurally when you use the unmodifiableCollecton() family of methods, the only problem is that the interface you are handing out is secretly unmodifiable, meaning that it obtusely exposes mutation methods, but invoke any of them and you die. Thus, unmodifiable interfaces could eliminate this existing obtuseness while keeping everything else structurally the same if need be.

But of course, things would not have to be kept the same. The next step would be to add immutable collection classes, which implement the unmodifiable collection interfaces. When there is a need to guarantee immutability, (such as the case is, for example, when passing a collection from one thread to another,) such collections would be passed around as immutable classes, not as unmodifiable interfaces, so that the receiver knows for sure that what they have in their hands is immutable.

So, in order to have a complete set of collections in java, (or any other declarative imperative language,) we would need the following:

  1. A set of unmodifiable collection interfaces.

  2. A set of mutable collection interfaces, extending the unmodifiable ones.

  3. A set of mutable collection classes implementing the mutable interfaces, and by extension also the unmodifiable interfaces.

  4. A set of immutable collection classes, implementing the unmodifiable interfaces, but mostly passed around as classes, so as to guarantee immutability.

I have implemented all of the above for fun, and I am using them in projects, and they work like a charm.

The reason why they are not part of the java runtime is probably because it was thought that this would be too much / too complex / too difficult to understand.

Personally, I think that what I described above is not even enough; one more thing that appears to be needed is a set of mutable interfaces & classes for structural immutability. (Which may simply be called "Rigid" because the prefix "StructurallyImmutable" is too damn long.)

11
  • Good points. Two details: 1. Immutable collections require certain method signatures, specifically (using a List as an example): List<T> add(T t) - all "mutator" methods must return a new collection that reflects the change. 2. For better or worse, Interfaces often represent a contract in addition to a signature. Serializable is one such interface. Similarly, Comparable requires that you correctly implement your compareTo() method to work correctly and ideally be compatible with equals() and hashCode(). Commented Jun 4, 2015 at 16:48
  • Oh, I did not even have mutate-by-copy immutability in mind. What I wrote above refers to plain simple immutable collections that really have no methods like add(). But I suppose that if mutator methods were to be added to the immutable classes, then they would need to return also immutable classes. So, if there is a problem lurking there, I do not see it.
    – Mike Nakis
    Commented Jun 4, 2015 at 17:16
  • 4
    Suppose someone hands you such a read-only collection interface; is it safe to pass it to another thread? Suppose someone passes you an instance of a mutable collection interface. Is it safe to invoke any method on it? You don't know that that the implementation doesn't loop forever, throw an exception, or completely disregards the interface's contract. Why have a double standard specifically for immutable collections?
    – Doval
    Commented Jun 22, 2016 at 14:59
  • 1
    IMHO your reasoning against mutable interfaces is wrong. You may write a mutable implementation of immutable interfaces, and then it breaks. Sure. But that's your fault as you are violating the contract. Just stop doing that. It's no different from breaking a SortedSet by subclassing the set with a non-conforming implementation. Or by passing an inconsistent Comparable. Nearly anything can be broken if you want. I guess, that's what @Doval meant by "double standards".
    – maaartinus
    Commented Aug 23, 2017 at 23:12
  • 1
    @maaartinus the thing is, unmodifiable (read-only) interfaces have a usefulness even outside the context of immutability. A class may legitimately expose an umodifiable interface of a mutable collection that it contains, and mutate the collection by other means. So, the unmodifiable collection interface cannot possibly impose a contract that says that the backing collection must be immutable, because there are legitimate usages where the backing collection is not immutable.
    – Mike Nakis
    Commented Aug 24, 2017 at 6:34
3

Immutable collections can be deeply recursive, compared to eachother, and not unreasonably inefficient if object equality is by secureHash. This is called a merkle forest. It can be per collection or within parts of them like an (self balancing binary) AVL tree for a sorted map.

Unless all java objects in these collections have a unique id or some bitstring to hash, the collection has nothing to hash to uniquely name itself.

Example: On my 4x1.6ghz laptop, I can run 200K sha256s per second of the smallest size that fits in 1 hash cycle (up to 55 bytes), compared to 500K HashMap ops or 3M ops in a hashtable of longs. 200K/log(collectionSize) new collections per second is fast enough for some things where data integrity and anonymous global scalability is important.

0

This is not immutable, but Read-Only. I.e. a type to represent an collection with only read operations. This is not a bad idea, and was done in C# collection framework. I think Java creators should consider this as a quite useful thing to do in future versions of Java.

-3

Performance. Collections by their nature can be very large. Copying 1000 elements to a new structure with 1001 elements instead of inserting a single element is just plain horrible.

Concurrency. If you have several threads running they may want to get the current version of the collection and not the version that was passed 12 hours ago when the thread started.

Storage. With immutable objects in a multi threaded environment you can end up with dozens of copies of the "same" object at different points of its life cycle. Doesnt matter for a Calendar or Date object but when its a collection of 10,000 widgets this will kill you.

9
  • 12
    Immutable collections only require copying if you can’t share because of pervasive mutability like Java has. Concurrency is generally easier with immutable collections because they don’t require locking; and for visibility you can always have a mutable reference to an immutable collection (common in OCaml). With sharing, updates can be essentially free. You may do logarithmically more allocations than with a mutable structure, but on update, many expired subobjects can be freed immediately or reused, so you don’t necessarily have higher memory overhead.
    – Jon Purdy
    Commented Dec 19, 2013 at 1:32
  • 4
    Couple problems. The collections in Clojure and Scala are both immutable, but support light-weight copies. Adding element 1001 means copying less than 33 elements, plus making a few new pointers. If you share a mutable collection across threads, you have all kinds of synchronization issues when you change it. Operations like "remove()" are nightmarish. Also, immutable collections can be built mutably, then copied once into an immutable version safe to share across threads. Commented Dec 19, 2013 at 2:33
  • 4
    Using concurrency as an argument against immutability is unusual. Duplicates as well. Commented Dec 20, 2013 at 1:41
  • 4
    A little bit miffed about the down votes here. The OP asked why they did not implement immutable collections, and, I provided a considered answer to the question. Presumably the only acceptable answer among the fashion conscious is "because they made a mistake". I actually have some experience with this having to refactor large chunks of code using the otherwise excellent BigDecimal class purely because of poor perfomance due to immutability 512 times that of using a double plus some messing around to fix the decimals. Commented Dec 20, 2013 at 3:33
  • 3
    @JamesAnderson: My problems with your answer: "Performance" - you could say that real life immutable collections always implement some form of sharing and reuse to avoid exactly the issue you describe. "Concurrency" - the argument boils down to "If you want mutability, then an immutable object does not work." I mean that if there is a notion of "latest version of the same thing", then something needs to mutate, either the thing itself, or something that owns the thing. And in "Storage", you seem to say that mutability is sometimes not desired.
    – jhominal
    Commented Dec 23, 2013 at 15:42

Not the answer you're looking for? Browse other questions tagged or ask your own question.