47

Scala has a TrieMap collection.

What is a TrieMap and what is its advantages/disadvantages compared to a HashMap?

2
  • Or maybe: stackoverflow.com/questions/18660769/…
    – Ben Reich
    Commented Apr 7, 2015 at 19:41
  • 3
    Definitely not a duplicate. Your biggest advantage is that Scala TrieMaps provide efficient, consistent iterators that capture all the elements in the trie at one point in time.
    – axel22
    Commented Aug 1, 2016 at 13:20

2 Answers 2

59

A Scala TrieMap is a trie-based concurrent scalable map implementation. Unlike normal trie maps, a Scala TrieMap has an efficient, non-blocking, O(1) time snapshot operation (and a slightly optimized readOnlySnapshot) operation.

Absolute performance of a TrieMap is slightly below the JDK8 ConcurrentHashMap, but the advantage is that it provides consistent iterators, something that concurrent data structures typically do not have. This means that you can capture all the elements in the trie at one point in time (performance numbers and analysis here). You should use the TrieMap if you need to capture all the elements at once (e.g. to list all its elements in the UI, or to consistently analyze them).

6
  • 1
    Slightly unrelated, but there is a open-source port of the scale TrieMap to Java available: github.com/romix/java-concurrent-hash-trie-map.
    – Pinch
    Commented Aug 1, 2016 at 22:04
  • @axel22 trying to get you attention here to explain how to use snapshot properly: stackoverflow.com/questions/66897678/… Basically I am trying to understand how is O(1) time snapshot operation interpreted.
    – cpchung
    Commented Apr 1, 2021 at 17:46
  • The new snapshot (i.e. clone) of the TrieMap will be created in a constant amount of time (i.e. in the same amount of time independent of how many keys you stored). Later, when you try to access some of the keys in that snapshot or in the original TrieMap, the relevant parts of the TrieMap will be lazily duplicated.
    – axel22
    Commented Apr 4, 2021 at 10:29
  • @axel22 the relevant parts of the TrieMap will be lazily duplicated. Does that mean when the key is accessed from the clone/snapshot, it is no longer O(1) get() as in hashmap? What will be the run time complexity for this lazy duplication? the height of the underlying tree like O(lgN)?
    – cpchung
    Commented Apr 11, 2021 at 2:18
  • 1
    The get is never O(1) in the standard Scala TrieMap - it's always expected O(log n), but with a low constant factor (because the nodes are very wide). This is the case both with and without lazy duplication. The similar data structure that has expected O(1) operations is called cache-trie: dl.acm.org/doi/abs/10.1145/3178487.3178498 (but this is not implemented in the Scala stdlib, it's available on github as a standalone repo) If you added snapshots to cache-tries, the get before duplication would be O(1), but after the duplication, it would briefly be O(log n).
    – axel22
    Commented Apr 12, 2021 at 15:16
24

TrieMaps are Maps using trie data structure which are essentially shallow trees. For example, If you have a 32 bit hash you break it up into sections for example 4 times 8 and at every level of the tree you branch to up to 256 sub trees. Obviously this gives O(1) performance due to the fixe size of hash(assuming few collisions).

A trie structure can be made immutable efficently, reusing the structure of a trie to create a new trie with an added or removed element. The relative performance in time/memory affect on GC depend greatly on implementation and load so rather then attempt a generic answer I would say run a benchmark. Though for a single thread with no immutability requirement a classic hashmap will normally produce better average performance and inferior worst case performance.

As a side note I will mention since the trieMap also uses a hash it is also a hashmap so I would recommend calling it a trie backed hashmap vs array backed hashmap.

Not the answer you're looking for? Browse other questions tagged or ask your own question.