62

I am looking for a high-performance, concurrent, MultiMap. I have searched everywhere but I simply cannot find a solution that uses the same approach as ConcurrentHashMap (Only locking a segment of the hash array).

The multimap will be both read, added to and removed from often.

The multimap key will be a String and it's value will be arbitrary.

I need O(1) to find all values for a given key, O(N) is OK for removal, but O(logN) would be preferred.

It is crucial that removal of the last value for a given key will remove the container of values from the key, as to not leak memory.

EDIT: HERE'S THE SOLUTION I BUILT, available under ApacheV2: Index (multimap)

15
  • 7
    There's no Map with O(1) lookup (except the bucket algorithm stuff, as usual). HashMaps have O(cn) with very small c.
    – ziggystar
    Commented Sep 3, 2010 at 12:43
  • 2
    ziggystar, it depends on how well the hashing function distributes the keys. If it does so randomly - which you could expect of a modern hash, for arbitrary strings - then lookup is O(1). This is also what the HashMap Javadoc says. Commented Sep 3, 2010 at 12:59
  • 2
    I have a registry of potentially millions of objects and they may share some properties, and I need an index for said properties, so I can retrieve all objects having a certain property. Commented Sep 3, 2010 at 14:02
  • 1
    Perhaps it is time to search for your answer on cstheory.stackexchange.com? It looks like you'll be rolling your own data structure... Commented Sep 3, 2010 at 18:46
  • 2
    @John V: Here's what I built: gist.github.com/566793 Commented Sep 6, 2010 at 13:16

10 Answers 10

11

Why not wrap ConcurrentHashMap[T,ConcurrentLinkedQueue[U]] with some nice Scala-like methods (e.g. implicit conversion to Iterable or whatever it is that you need, and an update method)?

6
  • 2
    @Viktor - If you have the (key,value) pair, map.get(key).remove(value) should do the trick as long as you leave the empty queue there as a placeholder when you remove a key and catch the exception if it's not there (including the NPE off of get). If you can't leave the queue as a placeholder, then it's difficult to ensure safety unless you do lock the entire map while you clean out the cruft.
    – Rex Kerr
    Commented Sep 3, 2010 at 13:42
  • Can't leave the queue since it will leak memory. I'm actually considering using the ConcurrentHashMap source and bend it to my will, but it seems like a very crude approach. Commented Sep 3, 2010 at 13:45
  • 1
    This may be the best solution, you can constrain locking the entire collection to only occur when you find a Queue which is empty, I'm not sure you can get away from either writing your own implementation, or changing the way you want to use this to handle this nested structure rather than a Multimap. Commented Sep 3, 2010 at 13:49
  • 3
    Can you tolerate locking the queue when you get it, and then holding the queue while you send an update to the map to empty that (k,v) pair? That is, use the lock on the queue rather than on the whole map to avoid collisions? (And add logic to handle the case where you get a queue but then, before you can lock it, it's emptied out and removed from the map--all you have to do is check when you get the lock that it is non-empty. If it is empty, assume it's been removed.)
    – Rex Kerr
    Commented Sep 3, 2010 at 13:50
  • 2
    For those looking for Viktor's implementation, you can find it as akka.util.ConcurrentMultiMap. Thanks a lot, Viktor! Commented Aug 9, 2016 at 12:10
6

Have you tried Google Collections? They have various Multimap implementations.

3
  • 8
    Yes, but I am not looking for just a multimap, I am looking for a high-performance concurrent multimap. When I checked their source code earlier today, their concurrent multimap locked the entire map, which creates a serial structure. Commented Sep 3, 2010 at 11:45
  • 1
    You're right about the implementation of Syncronized - it locks the whole instance - have you identified this collection as a performance bottlekneck? Commented Sep 3, 2010 at 12:43
  • 2
    Yes, this is as central as it gets. Commented Sep 3, 2010 at 14:00
3

There is one in akka although I haven't used it.

0
2

I had a requirement where I had to have a Map<Comparable, Set<Comparable>> where insertion on the Map be concurrent and also on the corresponding Set, but once a Key was consumed from the Map, it had to be deleted, think if as a Job running every two seconds which is consuming the whole Set<Comparable> from an specific Key but insertion be totally concurrent so that most values be buffered when the Job kicks in, here is my implementation:

Note: I use Guava's helper class Maps to create the concurrent Maps, also, this solution emulates Java concurrency in Practice Listing 5.19:

import com.google.common.collect.MapMaker;
import com.google.common.collect.Sets;

import java.util.Collection;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;

/**
 * A general purpose Multimap implementation for delayed processing and concurrent insertion/deletes.
 *
 * @param <K> A comparable Key
 * @param <V> A comparable Value
 */
public class ConcurrentMultiMap<K extends Comparable, V extends Comparable>
{
  private final int size;
  private final ConcurrentMap<K, Set<V>> cache;
  private final ConcurrentMap<K, Object> locks;

  public ConcurrentMultiMap()
  {
    this(32, 2);
  }

  public ConcurrentMultiMap(final int concurrencyLevel)
  {
    this(concurrencyLevel, 2);
  }

  public ConcurrentMultiMap(final int concurrencyLevel, final int factor)
  {
    size=concurrencyLevel * factor;
    cache=new MapMaker().concurrencyLevel(concurrencyLevel).initialCapacity(concurrencyLevel).makeMap();
    locks=new MapMaker().concurrencyLevel(concurrencyLevel).initialCapacity(concurrencyLevel).weakKeys().weakValues().makeMap();
  }

  private Object getLock(final K key){
    final Object object=new Object();
    Object lock=locks.putIfAbsent(key, object);
    if(lock == null){
      lock=object;
    }
    return lock;
  }

  public void put(final K key, final V value)
  {
    synchronized(getLock(key)){
      Set<V> set=cache.get(key);
      if(set == null){
        set=Sets.newHashSetWithExpectedSize(size);
        cache.put(key, set);
      }
      set.add(value);
    }
  }

  public void putAll(final K key, final Collection<V> values)
  {
    synchronized(getLock(key)){
      Set<V> set=cache.get(key);
      if(set == null){
        set=Sets.newHashSetWithExpectedSize(size);
        cache.put(key, set);
      }
      set.addAll(values);
    }
  }

  public Set<V> remove(final K key)
  {
    synchronized(getLock(key)){
      return cache.remove(key);
    }
  }

  public Set<K> getKeySet()
  {
    return cache.keySet();
  }

  public int size()
  {
    return cache.size();
  }

}
2

I made a ConcurrentMultiMap mixin which extends the mutable.MultiMap mixin and has a concurrent.Map[A, Set[B]] self type. It locks per key, which has O(n) space complexity, but its time complexity is pretty good, if you aren't particularly write-heavy.

1

you should give ctries a try. here is the pdf.

3
  • Aleks s coming to visit me at the office next week, I'll talk to him then. Commented Nov 21, 2011 at 22:14
  • iam curious to hear how the talk went. did you find ctries useful?
    – Shlomi
    Commented Feb 15, 2012 at 0:44
  • Not for the multimap purpose, I talked to Bagwell and Prokopec about making an implementation that would work as a multimap, but I don't think there was enough time. Ctries are going into Scala 2.10 Commented Feb 15, 2012 at 22:01
0

It's late for the discussion, yet...

When it comes to high performance concurrent stuff, one should be prepared to code the solution. With Concurrent the statement the Devil is in the details has a complete meaning. It's possible to implement the structure fully concurrent and lock-free.

Starting base would be the NonBlocking Hashtable http://sourceforge.net/projects/high-scale-lib/ and then depending how many values per key and how often need to add/remove some copy on write Object[] for values or an array based Set with semaphore/spin lock.

0

I am a bit late on this topic but I think, nowadays, you can use Guava like this:

Multimaps.newSetMultimap(new ConcurrentHashMap<>(), ConcurrentHashMap::newKeySet)
2
  • 3
    Does Multimaps.newSetMultimap(...) support concurrent updates? According to the Javadoc, "The multimap is not threadsafe when any concurrent operations update the multimap, even if map and the instances generated by factory are." cf. google.github.io/guava/releases/23.0/api/docs/com/google/common/…
    – breandan
    Commented Aug 17, 2017 at 5:19
  • so by reading the docs, and the previous answer will this work ? : Multimaps.synchronizedSetMultimap(Multimaps.newSetMultimap(new ConcurrentHashMap<>(), ConcurrentHashMap::newKeySet)); Commented Sep 23, 2019 at 15:24
0

Use MultiMaps from Gauava. Multimaps.synchronizedMultimap(HashMultimap.create())

1
  • 2
    Your solution is synchronized. Note what the OP was looking for. Commented Jan 25, 2021 at 14:41
-1

Have you taken a look to Javalution which is intended for Real time etc. and of course high performance.

1
  • 1
    Don't see no multimap let alone a concurrent and high performant one :( Commented Sep 3, 2010 at 12:40

Not the answer you're looking for? Browse other questions tagged or ask your own question.