1

I have a map defined

private static HashMap<Object, Object> myMap;

It is populated in a single thread and then that single thread spawns more threads that alter the data inside the map elements, but do not alter the map structure (no removes/puts/etc). Also, each thread alters one and only one unique member of the map (no two threads alter the same member).

My question is: will the main thread see the changes to the hashmap members, once all changes are complete? If not, would adding volatile to the declaration work, or would it only guarantee other threads see changes to the structure? Thanks

EDIT: Code that hopefully highlights what I'm doing in a more clear way

import java.util.HashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TestingRandomStuff {
    public static void main(String[] args) throws Exception {
        HashMap<Object, Object> myMap = new HashMap();
        ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

        //myMap is populated by this thread, and the objects inside are initialized, but are left largely empty
        populate();

        for (Object o : myMap.values()) {
            Runnable r = new Task(o);
            pool.execute(r);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
        } catch (InterruptedException e) {;}

        //How do I gurantee that the objects inside myMap are displayed correctly with all the data that was just loaded by seperate threads,
        //or is it already guranteed?
        displayObjectData();
    }

    public static class Task implements Runnable {
        private Object o;

        public Task(Object o) {this.o = o;}

        public void run() {
            try {
                o.load(); //o contains many complicated instance variables that will be created and written to
            } catch (Exception e) {;}
        }
    }
}
15
  • 1
    Is this just a bad example? Because String is immutable. Can you give a more realistic example?
    – ernest_k
    Commented Jul 18, 2018 at 17:07
  • 1
    No, there's no guarantee other threads will see changes. Making the map volatile won't help because you're not modifying the field.
    – shmosel
    Commented Jul 18, 2018 at 17:07
  • What do you mean by "alter the data inside the map elements"? Strings are immutable, but you aren't removing/putting?
    – rgettman
    Commented Jul 18, 2018 at 17:07
  • 1
    There are no guarantees that threads will see changes to the nested objects. Making the map volatile won't help as @shmosel points out. Adding a synchronized block would be the simplet way to make this work - other solutions require a much deeper understanding of the structure of what you are changing and how you are changing it. Commented Jul 18, 2018 at 17:13
  • 1
    @sufah1 the fact a object is referenced in a ConcurrentHashMap doesn't make the object any more thread safe. Commented Jul 18, 2018 at 17:23

3 Answers 3

4

EDIT: in your example, the map doesn't get accessed in other threads, only objects which are referenced by the map.

The objects themselves should be thread safe due to the way they are being used.

Note: if you used a parallelStream() the code would be simpler.


will other threads see the changes to the hashmap members?

Probably, but there is no guarentee

If not, would adding volatile to the declaration work,

volatile on the field only adds a read barrier on the reference to the Map. (Unless you change the field to point to another Map when you will get a write barrier.)

or would it only guarantee other threads see changes to the structure?

No, only guaranteed see changes to the myMap reference not changes to the Map or anything in the Map. i.e. the guarantee is very shallow.

There is a number of ways you can provide thread safety however the simplest is to synchronized the object in the on write and read. There are tricks you can do with volatile fields however it is high dependant on what you are doing as to whether thi will work.

2
  • 1
    The OP has added a code example after you posted your answer and after reviewing the question’s code example, I consider it completely safe, due to the way it uses an ExecutorService. The confusion stems from the OP’s term “hashmap members”, which is a red herring, as the hash map is populated before the concurrent operation and doesn’t change afterwards, in other words, is completely irrelevant here.
    – Holger
    Commented Aug 2, 2018 at 17:08
  • @Holger Thank you for picking up on that change. It's a common misconception that if an object is referenced in a map it some how has a special relationship with that map. Commented Aug 2, 2018 at 18:37
1

You could use ConcurrentHashMap.

Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset.

This means changes done by a thread are visible to another thread reading value for the same key. But the two can interfere.

9
  • 1
    @Peter Lawrey said in the comments of my post that "the fact a object is referenced in a ConcurrentHashMap doesn't make the object any more thread safe", so I am a tad confused
    – sufah1
    Commented Jul 18, 2018 at 19:08
  • @sufah1, I think the case is that if you have ConcurrentHashMap<String, List<String>> you cannot just do map.get("key").add("another_value"). You might need to do List l = map.get("key"); l.add("another_value"); map.put("key", l);. Also if two threads execute that code then you might get missing updates. In your particular case ConsurrentHashMap will work
    – Ivan
    Commented Jul 18, 2018 at 19:19
  • If your List is not threadsafe then the behaviour of the example is undefined - the CHM is irrelevant. Commented Jul 18, 2018 at 19:42
  • @BoristheSpider, yes, this is what I've said in the comment. But it is valid only if several threads perform said update operation. If one performs update using replace/put and another thread only reads using get then it will work.
    – Ivan
    Commented Jul 18, 2018 at 19:46
  • 1
    Forget about the map. It's not safe to have 2 threads updating the same list at the same time.
    – shmosel
    Commented Jul 18, 2018 at 21:25
1

In Java, there are no objects embedded in other objects, all data structures consisting of multiple objects are reference graphs, or in other words, a collection of objects is a collection of references and a map with key and value objects is a map containing references to these objects.

Therefore, your objects never became “hashmap members”, but are just referenced by your hashmap. So, to discuss the thread safety of your code, the existence of the HashMap is just a red herring, as your multi-threaded code never sees any artifact of the HashMap.

You have code creating several distinct object, then submitting Task instances, each containing a reference to one of these objects, to an ExecutorService to be processed. Assuming that these object do not share mutable state, this is a straight-forward thread safe approach.

After waiting for the completion of all jobs, the main thread can be sure to see the result of all actions made within the jobs, i.e. the modifications made to these objects. It will again be entirely irrelevant whether you use that HashMap or anything else to get a reference to one of these objects to look at these modifications.

It would be different if you were modifying keys of a map in a way that it affects their equality or hash code, but that’s independent from thread safety concerns. You must never modify map keys in such a way, even in single threaded code, and violating this contract would even break thread safe maps. But since your objects are referenced as values, there is no such problem.

There is only one corner case to pay attention to. Your waiting for completion contains the line catch (InterruptedException e) {;}, so there is no full guaranty that after the execution of the statement truly all jobs have been completed, but this is the requirement for the visibility guaranty. Since you seem to assume that interruption should never happen, you should use something like catch(InterruptedException e) { throw new AssertionError(e); }, to be sure that violations of that assumption do not happen silently, while at the same time, get the full visibility guaranty, as now the displayObjectData(); statement can only be reached when all jobs have been completed (or Long.MAX_VALUE seconds elapsed which no-one of us will ever witness).

Not the answer you're looking for? Browse other questions tagged or ask your own question.