Suppose I have a list of hashmaps (<String, Object>
). Values can be either scalar types or a nested HashMap with the same type.
I need to merge a list of this maps into a final HashMap, each map in the list has higher precedence than the one before. Also the merge has to be deep: i.e.:
map1 = {root: {a: a, b: {c: c, d: d}}}
map2 = {root: {a: a, b: {c: x}}
result = {root: {a: a, b: {c: x, d: d}}}
In the previous example, the sub tree b
in map2
does not override the whole subtree in map1
. It just overrides the leaves that are duplicated.
The code I have right now looks like this and works ok:
public static Map<String, Object> merge(List<Map<String, Object>> configs) {
return configs.stream()
.filter(Objects::nonNull)
.reduce(new LinkedHashMap<>(), ConfigMerger::mergeMaps);
}
public static Map<String, Object> mergeMaps(Map<String, Object> map, Map<String, Object> other) {
Map<String, Object> result = new LinkedHashMap<>(map.size() + other.size());
result.putAll(map);
for (Map.Entry<String, Object> entry : other.entrySet()) {
String key = entry.getKey();
Object val2 = entry.getValue();
Object val1 = result.get(key);
if (val1 instanceof Map && val2 instanceof Map) {
Map<String, Object> nestedResult = mergeMaps((Map<String, Object>) val1, (Map<String, Object>) val2);
result.put(key, nestedResult);
} else {
result.put(key, val2);
}
}
return result;
}
The only issue is that result.putAll(map);
is a bit wasteful. I do that for every nested map in each one of the original maps.
I tried a slightly more complicated approach, which is around 20% faster, and yet the most expensive method is the HashSet.addAll
:
// (assumes the list in reversed order)
public static Map<String, Object> merge2(List<Map<String, Object>> configs) {
Map<String, Object> result = new HashMap<>();
Set<String> allKeys = new HashSet<>();
for (Map<String, Object> config : configs) {
allKeys.addAll(config.keySet());
}
for (String key : allKeys) {
if (configs.get(0).get(key) instanceof Map<?, ?>) {
List<Map<String, Object>> nestedConfigs = new ArrayList<>();
for (Map<String, Object> config : configs) {
if (config.get(key) instanceof Map<?, ?>) {
nestedConfigs.add((Map<String, Object>) config.get(key));
}
}
result.put(key, merge2(nestedConfigs));
} else {
for (Map<String, Object> config : configs) {
if (config.containsKey(key)) {
result.put(key, config.get(key));
break;
}
}
}
}
return result;
}
Is there any other approach I could try that could help me squeeze a bit more of performance out of this?
For context, the maps are usually huge (hundreds of subtrees) and the nesting level up to fifty sometimes.