I considered two collections with a similar concept: ParHashMap from Scala and ConcurrentHashMap from Java. Both have the same time complexity and both are thread safe and lock-free, but they are based on different concepts under the hood: a trie and a hash table, respectively. This leads to the question: why do we need ParHashMap from Scala when there is ConcurrentHashMap from Java?
1 Answer
ConcurrentHashMap is a thread-safe Map<> implementation. If multiple threads access it at the same time, they will be in sync.
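To illustrate what "in sync" means here, below is a minimal sketch in which several threads update the same key. The class name SharedCounter and the helper countFromManyThreads are made up for this example; only ConcurrentHashMap.merge() and the executor calls are standard JDK API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedCounter {
    // Hypothetical helper: `threads` workers each increment the same key `perThread` times.
    static Map<String, Integer> countFromManyThreads(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    // merge() is atomic per key, so concurrent increments are not lost
                    hits.merge("hits", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for all workers to finish before reading the result
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(countFromManyThreads(4, 10_000)); // prints {hits=40000}
    }
}
```

With a plain HashMap in place of ConcurrentHashMap, the same code could lose updates or corrupt the map, which is the problem this class is meant to solve.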
ParHashMap is a parallel collection. If you execute operations on it (like map(), filter(), aggregate()), Scala will parallelize them for you (similar to Spark, but only within a single JVM).
To summarize, ConcurrentHashMap gives you a primitive for synchronizing threads that you manage yourself, while ParHashMap takes care of both synchronization and execution.
Edit: Note that ParHashMap is not itself necessarily thread-safe. The idea is to call its methods from a single thread and let the parallelism be handled by the parallel data structure itself.
- @ You said: "If you execute operations here (like map(), filter(), aggregate()) Scala will parallelize it for you". Does it work similarly to Spark's parallelization mechanism? – pacman, commented Jan 10, 2017 at 9:44
- Actually, the inventor of Spark said that he drew inspiration from the parallel collections of Scala while inventing the RDD abstraction. – marios, commented Jan 10, 2017 at 17:30
- I think your best bet for a thread-safe mutable hashmap in Scala is to use the implementation from here: stackoverflow.com/a/17542165/1553233. Using the ParHashMap for this is not the right tool for the job (even if you can force it to be). – marios, commented Jan 10, 2017 at 18:53
- Starting with Java 8, ConcurrentHashMap also supports several parallel processing methods; see the methods starting with forEach…, reduce…, or search… (or generally all methods having a first parameter long parallelismThreshold). – Holger, commented Jan 16, 2017 at 16:45
- I didn’t say it’s thread safe. Probably it’s not. I don’t think you would want to call this from multiple threads. It kind of defeats the purpose. – marios, commented Nov 9, 2017 at 1:03
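As a rough sketch of the Java 8 bulk methods mentioned in the comments above, the snippet below uses reduceValuesToInt, search, and forEach with a parallelismThreshold argument. The class name BulkOps, its helpers, and the sample data are invented for illustration. A threshold of 1 requests maximal parallelism; with a map this small, the work may well run on a single thread anyway.

```java
import java.util.concurrent.ConcurrentHashMap;

public class BulkOps {
    // Hypothetical sample data: word -> length
    static ConcurrentHashMap<String, Integer> sample() {
        ConcurrentHashMap<String, Integer> m = new ConcurrentHashMap<>();
        m.put("trie", 4);
        m.put("hash", 4);
        m.put("bucket", 6);
        m.put("segment", 7);
        return m;
    }

    // Sum all values; the first argument is the parallelismThreshold
    static int totalLength(ConcurrentHashMap<String, Integer> m) {
        return m.reduceValuesToInt(1L, Integer::intValue, 0, Integer::sum);
    }

    // search() returns some non-null result of the function, or null if none is found
    static String someKeyLongerThan(ConcurrentHashMap<String, Integer> m, int n) {
        return m.search(1L, (k, v) -> v > n ? k : null);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> m = sample();
        System.out.println(totalLength(m));          // prints 21
        System.out.println(someKeyLongerThan(m, 6)); // prints segment
        // forEach with a threshold may run the action on several threads; order is unspecified
        m.forEach(1L, (k, v) -> System.out.println(k + " -> " + v));
    }
}
```

This covers bulk traversal, reduction, and search only. It is not a drop-in equivalent of Scala's map() or filter(), which return new collections.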
ConcurrentHashMap has always offered essentially lock-free reads, but in Java 7 it had a different internal structure, built on segments that relied on the lock striping concurrency pattern for writes. In Java 8 this changed: the segment-based approach was dropped, and a bucket with too many collisions is now converted into a red-black tree.