6986

I want to merge two dictionaries into a new dictionary.

x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = merge(x, y)

>>> z
{'a': 1, 'b': 3, 'c': 4}

Whenever a key k is present in both dictionaries, only the value y[k] should be kept.

0

43 Answers

12
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> x, z = dict(x), x.update(y) or x
>>> x
{'a': 1, 'b': 2}
>>> y
{'c': 11, 'b': 10}
>>> z
{'a': 1, 'c': 11, 'b': 10}
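
The one-liner relies on the right-hand side being evaluated before any name is rebound. A minimal sketch of what that means when the same expression runs inside a function (both function names are hypothetical and not part of the answer):

```python
def merge_with_rebinding(x, y):
    # Same expression as above: the right-hand side is evaluated first, so
    # x.update(y) mutates the caller's dict before the local name x is rebound.
    x, z = dict(x), x.update(y) or x
    return z


def merge_with_copy(x, y):
    # Copy first, then update the copy; the caller's dict is never touched.
    z = dict(x)
    z.update(y)
    return z


x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}

safe = merge_with_copy(x, y)       # x is still {'a': 1, 'b': 2} afterwards
x_before = dict(x)
leaky = merge_with_rebinding(x, y)  # returns the caller's own (now mutated) dict
```

After the second call, `leaky is x` holds and `x` itself contains the merged data, so the trick only looks non-destructive when the rebinding happens in the same scope that owns `x`.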
1
  • This method overwrites x with its copy. If x is a function argument this won't work (see example) Commented Feb 22, 2019 at 9:27
12

In Python 3.9

Based on PEP 584, the new version of Python introduces two new operators for dictionaries: union (|) and in-place union (|=). You can use | to merge two dictionaries, while |= will update a dictionary in place:

>>> pycon = {2016: "Portland", 2018: "Cleveland"}
>>> europython = {2017: "Rimini", 2018: "Edinburgh", 2019: "Basel"}

>>> pycon | europython
{2016: 'Portland', 2018: 'Edinburgh', 2017: 'Rimini', 2019: 'Basel'}

>>> pycon |= europython
>>> pycon
{2016: 'Portland', 2018: 'Edinburgh', 2017: 'Rimini', 2019: 'Basel'}

If d1 and d2 are two dictionaries, then d1 | d2 does the same as {**d1, **d2}. The | operator is used for calculating the union of sets, so the notation may already be familiar to you.

One advantage of using | is that it works on different dictionary-like types and keeps the type through the merge:

>>> from collections import defaultdict
>>> europe = defaultdict(lambda: "", {"Norway": "Oslo", "Spain": "Madrid"})
>>> africa = defaultdict(lambda: "", {"Egypt": "Cairo", "Zimbabwe": "Harare"})

>>> europe | africa
defaultdict(<function <lambda> at 0x7f0cb42a6700>,
  {'Norway': 'Oslo', 'Spain': 'Madrid', 'Egypt': 'Cairo', 'Zimbabwe': 'Harare'})

>>> {**europe, **africa}
{'Norway': 'Oslo', 'Spain': 'Madrid', 'Egypt': 'Cairo', 'Zimbabwe': 'Harare'}

You can use a defaultdict when you want to effectively handle missing keys. Note that | preserves the defaultdict, while {**europe, **africa} does not.
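
A small check of that difference, assuming Python 3.9 or later (the key "Atlantis" is just an illustrative missing key):

```python
from collections import defaultdict

europe = defaultdict(lambda: "", {"Norway": "Oslo"})
africa = defaultdict(lambda: "", {"Egypt": "Cairo"})

merged_op = europe | africa           # stays a defaultdict (Python 3.9+)
merged_unpack = {**europe, **africa}  # always a plain dict

# The defaultdict result still supplies the default for missing keys...
missing_from_op = merged_op["Atlantis"]

# ...while the plain dict raises KeyError.
try:
    merged_unpack["Atlantis"]
    unpack_raised = False
except KeyError:
    unpack_raised = True
```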

There are some similarities between how | works for dictionaries and how + works for lists. In fact, the + operator was originally proposed to merge dictionaries as well. This correspondence becomes even more evident when you look at the in-place operator.

The basic use of |= is to update a dictionary in place, similar to .update():

>>> libraries = {
...     "collections": "Container datatypes",
...     "math": "Mathematical functions",
... }
>>> libraries |= {"zoneinfo": "IANA time zone support"}
>>> libraries
{'collections': 'Container datatypes', 'math': 'Mathematical functions',
 'zoneinfo': 'IANA time zone support'}

When you merge dictionaries with |, both dictionaries need to be of a proper dictionary type. On the other hand, the in-place operator (|=) is happy to work with any dictionary-like data structure:

>>> libraries |= [("graphlib", "Functionality for graph-like structures")]
>>> libraries
{'collections': 'Container datatypes', 'math': 'Mathematical functions',
 'zoneinfo': 'IANA time zone support',
 'graphlib': 'Functionality for graph-like structures'}
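
A short sketch of that asymmetry, assuming Python 3.9 or later: the binary operator rejects a list of pairs, the in-place operator accepts it.

```python
libraries = {"math": "Mathematical functions"}
pairs = [("graphlib", "Functionality for graph-like structures")]

# dict | list is not supported and raises TypeError.
try:
    libraries | pairs
    union_failed = False
except TypeError:
    union_failed = True

# Converting the pairs to a dict first makes | usable.
merged_copy = libraries | dict(pairs)

# |= accepts any iterable of key/value pairs, like dict.update().
libraries |= pairs
```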
1
  • 1
    Other answers already covered these
    – wim
    Commented Jun 23, 2022 at 1:50
11

Python 3.9+ only

Merge (|) and update (|=) operators have been added to the built-in dict class.

>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> d | e
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}

The augmented assignment version operates in-place:

>>> d |= e
>>> d
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}

See PEP 584

0
9

I know this does not really fit the specifics of the question ("one liner"), but since none of the answers above went in this direction while many of them addressed performance, I felt I should contribute my thoughts.

Depending on the use case, it might not be necessary to create a "real" merged dictionary from the given input dictionaries. A view might be sufficient in many cases, i.e., an object that acts like the merged dictionary without computing it completely. A lazy version of the merged dictionary, so to speak.

In Python, this is rather simple and can be done with the code shown at the end of my post. Given that, the answer to the original question would be:

z = MergeDict(x, y)

When using this new object, it will behave like a merged dictionary but it will have constant creation time and constant memory footprint while leaving the original dictionaries untouched. Creating it is way cheaper than in the other solutions proposed.

Of course, if you use the result a lot, then you will at some point reach the limit where creating a real merged dictionary would have been the faster solution. As I said, it depends on your use case.

If you ever decide you would prefer a real merged dict, calling dict(z) produces it (though at a much higher cost than the other solutions, so this is only worth a mention).

You can also use this class to make a kind of copy-on-write dictionary:

a = { 'x': 3, 'y': 4 }
b = MergeDict(a)  # we merge just one dict
b['x'] = 5
print b  # will print {'x': 5, 'y': 4}
print a  # will print {'y': 4, 'x': 3}

Here's the straightforward code of MergeDict (written for Python 2, since it relies on iteritems()):

class MergeDict(object):
  def __init__(self, *originals):
    self.originals = ({},) + originals[::-1]  # reversed

  def __getitem__(self, key):
    for original in self.originals:
      try:
        return original[key]
      except KeyError:
        pass
    raise KeyError(key)

  def __setitem__(self, key, value):
    self.originals[0][key] = value

  def __iter__(self):
    return iter(self.keys())

  def __repr__(self):
    return '%s(%s)' % (
      self.__class__.__name__,
      ', '.join(repr(original)
          for original in reversed(self.originals)))

  def __str__(self):
    return '{%s}' % ', '.join(
        '%r: %r' % i for i in self.iteritems())

  def iteritems(self):
    found = set()
    for original in self.originals:
      for k, v in original.iteritems():
        if k not in found:
          yield k, v
          found.add(k)

  def items(self):
    return list(self.iteritems())

  def keys(self):
    return list(k for k, _ in self.iteritems())

  def values(self):
    return list(v for _, v in self.iteritems())
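
On Python 3, the standard library's collections.ChainMap gives a similar lazy view. A minimal sketch, assuming the same x and y as in the question; note that ChainMap searches its maps left to right, so y has to come before x for its values to win:

```python
from collections import ChainMap

x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

# The leading empty dict receives all writes, which leaves x and y untouched
# (the same copy-on-write effect as MergeDict above).
z = ChainMap({}, y, x)

value_b = z['b']       # looked up lazily; found in y before x is consulted
z['a'] = 100           # stored in the leading empty dict only
materialized = dict(z)  # build a real dict only when one is needed
```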
2
  • 2
    I saw by now that some answers refer to a class called ChainMap which is available in Python 3 only and which does more or less what my code does. So shame on me for not reading everything carefully enough. But given that this only exists for Python 3, please take my answer as a contribution for the Python 2 users ;-)
    – Alfe
    Commented May 18, 2016 at 16:10
  • 5
    ChainMap was backported for earlier Pythons: pypi.python.org/pypi/chainmap
    – clacke
    Commented Jul 28, 2016 at 11:19
7

This is an expression for Python 3.5 or greater that merges dictionaries using reduce:

>>> from functools import reduce
>>> l = [{'a': 1}, {'b': 2}, {'a': 100, 'c': 3}]
>>> reduce(lambda x, y: {**x, **y}, l, {})
{'a': 100, 'b': 2, 'c': 3}

Note: this works even if the dictionary list is empty or contains only one element.

For a more efficient merge on Python 3.9 or greater, the lambda can be replaced directly by operator.ior:

>>> from functools import reduce
>>> from operator import ior
>>> l = [{'a': 1}, {'b': 2}, {'a': 100, 'c': 3}]
>>> reduce(ior, l, {})
{'a': 100, 'b': 2, 'c': 3}

For Python 3.8 or less, the following can be used as an alternative to ior:

>>> from functools import reduce
>>> l = [{'a': 1}, {'b': 2}, {'a': 100, 'c': 3}]
>>> reduce(lambda x, y: x.update(y) or x, l, {})
{'a': 100, 'b': 2, 'c': 3}
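
When a single expression is not required, a plain loop does the same work in one pass on any Python 3 version. A minimal sketch (merge_all is a hypothetical helper name, not part of the answer):

```python
def merge_all(dicts):
    """Merge an iterable of dicts into a new dict; later dicts win on collisions."""
    result = {}
    for d in dicts:
        # update() copies each key once, so total work is linear in the
        # number of keys rather than re-copying the accumulator every step.
        result.update(d)
    return result


l = [{'a': 1}, {'b': 2}, {'a': 100, 'c': 3}]
merged = merge_all(l)
empty = merge_all([])  # an empty input simply yields an empty dict
```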
2
  • 1
    This is very inefficient (as bad as time proportional to the number of keys squared). Use operator.ior instead.
    – Ry-
    Commented Mar 19, 2022 at 3:24
  • Thanks - I've updated my answer. I'm not usually too concerned with efficiency here, as I'd use something like this for merging configuration once and at initialisation (also, I currently have to maintain compatibility for Python 3.7+)
    – Josh Bode
    Commented Mar 20, 2022 at 6:54
6

Using a dict comprehension, you can write:

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

dc = {xi:(x[xi] if xi not in list(y.keys()) 
           else y[xi]) for xi in list(x.keys())+(list(y.keys()))}

gives

>>> dc
{'a': 1, 'c': 11, 'b': 10}

Note the syntax for if/else in a comprehension:

{ (some_key if condition else default_key):(something_if_true if condition 
          else something_if_false) for key, value in dict_.items() }
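
A tighter sketch of the same comprehension, testing membership directly against the dict and taking the union of the key views instead of building lists. One caveat: the key union is a set, so the order of keys in the result is not guaranteed to follow insertion order.

```python
x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}

# x.keys() | y.keys() is the set union of both key views;
# "k in y" is a hash lookup, unlike "k in list(y.keys())".
dc = {k: (y[k] if k in y else x[k]) for k in x.keys() | y.keys()}
```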
1
  • 10
    I like the idea of using a dict comprehension, but your implementation is weak. It is insane to use ... in list(y.keys()) instead of just ... in y.
    – wim
    Commented Feb 18, 2014 at 20:18
5

A union of the OP's two dictionaries would be something like:

{'a': 1, 'b': 2, 10, 'c': 11}

Specifically, the union of two entities (x and y) contains all the elements of x and/or y. Unfortunately, what the OP asks for is not a union, despite the title of the post.

My code below is neither elegant nor a one-liner, but I believe it is consistent with the meaning of union.

From the OP's example:

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

z = {}
for k, v in x.items():
    if not k in z:
        z[k] = [(v)]
    else:
        z[k].append((v))
for k, v in y.items():
    if not k in z:
        z[k] = [(v)]
    else:
        z[k].append((v))

{'a': [1], 'b': [2, 10], 'c': [11]}

Lists could be swapped for another container, but the code above also works when either dictionary contains lists (including nested lists) as values.
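
The two loops above can be folded into one with collections.defaultdict. This is a sketch of the same grouping idea, not a change to its meaning:

```python
from collections import defaultdict

x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}

grouped = defaultdict(list)
for d in (x, y):
    for k, v in d.items():
        # A missing key starts out as an empty list automatically.
        grouped[k].append(v)

z = dict(grouped)  # convert back to a plain dict for display
```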

2
  • I've edited the question to not use the word union, for clarity.
    – Carl Meyer
    Commented Sep 30, 2014 at 15:49
  • 1
    Perhaps you mean {'a': 1, 'b': (2, 10), 'c': 11} …?
    – Alfe
    Commented May 18, 2016 at 16:07
5

You can use toolz.merge([x, y]) for this.

1
  • 5
    why should we use a 3rd party to perform such a trivial task when we can do it in native python? Commented May 14, 2019 at 14:36
5

I was curious if I could beat the accepted answer's time with a one line stringify approach:

I tried 5 methods, none previously mentioned - all one liner - all producing correct answers - and I couldn't come close.

So... to save you the trouble and perhaps fulfill curiosity:

import json
import yaml
import time
from ast import literal_eval as literal

def merge_two_dicts(x, y):
    z = x.copy()   # start with x's keys and values
    z.update(y)    # modifies z with y's keys and values & returns None
    return z

x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}

start = time.time()
for i in range(10000):
    z = yaml.safe_load((str(x)+str(y)).replace('}{',', '))  # yaml.load() needs an explicit Loader in PyYAML 6+
elapsed = (time.time()-start)
print (elapsed, z, 'stringify yaml')

start = time.time()
for i in range(10000):
    z = literal((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify literal')

start = time.time()
for i in range(10000):
    z = eval((str(x)+str(y)).replace('}{',', '))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify eval')

start = time.time()
for i in range(10000):
    z = {k:int(v) for k,v in (dict(zip(
            ((str(x)+str(y))
            .replace('}',' ')
            .replace('{',' ')
            .replace(':',' ')
            .replace(',',' ')
            .replace("'",'')
            .strip()
            .split('  '))[::2], 
            ((str(x)+str(y))
            .replace('}',' ')
            .replace('{',' ').replace(':',' ')
            .replace(',',' ')
            .replace("'",'')
            .strip()
            .split('  '))[1::2]
             ))).items()}
elapsed = (time.time()-start)
print (elapsed, z, 'stringify replace')

start = time.time()
for i in range(10000):
    z = json.loads(str((str(x)+str(y)).replace('}{',', ').replace("'",'"')))
elapsed = (time.time()-start)
print (elapsed, z, 'stringify json')

start = time.time()
for i in range(10000):
    z = merge_two_dicts(x, y)
elapsed = (time.time()-start)
print (elapsed, z, 'accepted')

results:

7.693928956985474 {'c': 11, 'b': 10, 'a': 1} stringify yaml
0.29134678840637207 {'c': 11, 'b': 10, 'a': 1} stringify literal
0.2208399772644043 {'c': 11, 'b': 10, 'a': 1} stringify eval
0.1106564998626709 {'c': 11, 'b': 10, 'a': 1} stringify replace
0.07989692687988281 {'c': 11, 'b': 10, 'a': 1} stringify json
0.005082368850708008 {'c': 11, 'b': 10, 'a': 1} accepted

What I did learn from this is that the JSON approach is the fastest way (of those attempted) to return a dictionary from a string-of-dictionary; it takes about 1/4 of the time of what I considered to be the normal method using ast. I also learned that the YAML approach should be avoided at all costs.

Yes, I understand that this is not the best/correct way. I was curious if it was faster, and it isn't; I posted to prove it so.
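
For anyone repeating the comparison, timeit tends to give steadier numbers than manual time.time() bookkeeping. A reduced sketch covering only three of the approaches; the function names and the iteration counts are arbitrary choices, and absolute timings will differ by machine:

```python
import json
import timeit
from ast import literal_eval

x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}


def stringify(x, y):
    # "{...}{...}" -> "{..., ...}"; duplicate keys resolve to the last value
    return (str(x) + str(y)).replace('}{', ', ')


def merge_json(x, y):
    return json.loads(stringify(x, y).replace("'", '"'))


def merge_literal(x, y):
    return literal_eval(stringify(x, y))


def merge_copy_update(x, y):
    z = x.copy()
    z.update(y)
    return z


timings = {}
for fn in (merge_json, merge_literal, merge_copy_update):
    # Best of 3 runs of 1000 calls each, to reduce noise
    timings[fn.__name__] = min(
        timeit.repeat(lambda: fn(x, y), number=1000, repeat=3))

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{seconds:.6f}s  {name}")
```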

1
  • Note that the json approach is faster than ast.literal_eval, but it's also not as comprehensive. It can't handle Python literals not in the JSON spec, so no tuples, sets, frozensets, bools (it can handle JSON bools, but not the result of stringifying a Python bool directly), etc. ast.literal_eval is slower, but at least some of that is a consequence of handling more complex inputs. That said, I'm pretty sure it could be faster if they bothered to optimize it, it's just pretty rare that evaluating strings of Python literals is the chokepoint in code. Commented Feb 28, 2019 at 17:29
4

I think my ugly one-liners are just necessary here.

z = next(z.update(y) or z for z in [x.copy()])
# or
z = (lambda z: z.update(y) or z)(x.copy())
  1. Dicts are merged.
  2. Single expression.
  3. Don't ever dare to use it.

P.S. This is a solution working in both versions of Python. I know that Python 3 has this {**x, **y} thing and it is the right thing to use (as well as moving to Python 3 if you still have Python 2 is the right thing to do).

0
3

One more option is a deep merge. This uses the | operator from Python 3.9+ for the case where the dict new holds a set of default settings and the dict existing holds the settings currently in use. My goal was to merge in any settings added in new without overwriting the settings already in existing. I believe this recursive implementation lets you upgrade a dict with new values from another dict.

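def merge_dict_recursive(new: dict, existing: dict) -> dict:
    # Shallow merge first: values from existing win over the defaults in new
    merged = new | existing

    for k, v in merged.items():
        # Recurse only when new also holds a dict for this key; otherwise the
        # shallow merge above already picked the right value. Using .get()
        # avoids a KeyError for keys present in only one of the dicts, and
        # existing itself is left unmodified.
        if isinstance(v, dict) and isinstance(new.get(k), dict):
            merged[k] = merge_dict_recursive(new[k], existing.get(k, {}))
    return merged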

Example test data:

new
{'dashboard': True,
 'depth': {'a': 1, 'b': 22222, 'c': {'d': {'e': 69}}},
 'intro': 'this is the dashboard',
 'newkey': False,
 'show_closed_sessions': False,
 'version': None,
 'visible_sessions_limit': 9999}
existing
{'dashboard': True,
 'depth': {'a': 5},
 'intro': 'this is the dashboard',
 'newkey': True,
 'show_closed_sessions': False,
 'version': '2021-08-22 12:00:30.531038+00:00'}
merged
{'dashboard': True,
 'depth': {'a': 5, 'b': 22222, 'c': {'d': {'e': 69}}},
 'intro': 'this is the dashboard',
 'newkey': True,
 'show_closed_sessions': False,
 'version': '2021-08-22 12:00:30.531038+00:00',
 'visible_sessions_limit': 9999}
1
  • this is a perfectly valid answer for the question in sight; In my opinion; very good ! :) Commented Jul 15, 2022 at 4:09
1

The question is tagged python-3x but, taking into account that it's a relatively recent addition and that the most voted, accepted answer deals extensively with a Python 2.x solution, I dare add a one liner that draws on an irritating feature of Python 2.x list comprehension, that is name leaking...

$ python2
Python 2.7.13 (default, Jan 19 2017, 14:48:08) 
[GCC 6.3.0 20170118] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> [z.update(d) for z in [{}] for d in (x, y)]
[None, None]
>>> z
{'a': 1, 'c': 11, 'b': 10}
>>> ...

I'm happy to say that the above doesn't work any more on any version of Python 3.
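
A quick way to confirm the Python 3 behaviour: the comprehension variable stays local to the comprehension, so the accumulator is unreachable afterwards. The name acc is used here instead of z purely for illustration.

```python
x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}

# The same trick as above; update() returns None for each step.
results = [acc.update(d) for acc in [{}] for d in (x, y)]

# In Python 3 the loop variable does not leak out of the comprehension.
try:
    acc
    leaked = True
except NameError:
    leaked = False
```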

1

Deep merge of dicts:

from typing import Dict
from copy import deepcopy

def merge_dicts(*from_dicts: Dict, no_copy: bool=False) -> Dict:
    """ Deep merge of any number of dicts, without recursion.

    By default creates a fresh Dict and merges everything into it.

    no_copy = True merges all dicts into the first one in the list without copying.
    Why? Sometimes I need to combine one dictionary from "layers".
    The "layers" are not in use and dropped immediately after merging.
    """

    if no_copy:
        xerox = lambda x:x
    else:
        xerox = deepcopy

    result = xerox(from_dicts[0])

    for _from in from_dicts[1:]:
        merge_queue = [(result, _from)]
        for _to, _from in merge_queue:
            for k, v in _from.items():
                if k in _to and isinstance(_to[k], dict) and isinstance(v, dict):
                    # key collision and both are dicts.
                    # add to merging queue
                    merge_queue.append((_to[k], v))
                    continue
                _to[k] = xerox(v)

    return result

Usage:

print("=============================")
print("merge all dicts to first one without copy.")
a0 = {"a":{"b":1}}
a1 = {"a":{"c":{"d":4}}}
a2 = {"a":{"c":{"f":5}, "d": 6}}
print(f"a0 id[{id(a0)}] value:{a0}")
print(f"a1 id[{id(a1)}] value:{a1}")
print(f"a2 id[{id(a2)}] value:{a2}")
r = merge_dicts(a0, a1, a2, no_copy=True)
print(f"r  id[{id(r)}] value:{r}")

print("=============================")
print("create fresh copy of all")
a0 = {"a":{"b":1}}
a1 = {"a":{"c":{"d":4}}}
a2 = {"a":{"c":{"f":5}, "d": 6}}
print(f"a0 id[{id(a0)}] value:{a0}")
print(f"a1 id[{id(a1)}] value:{a1}")
print(f"a2 id[{id(a2)}] value:{a2}")
r = merge_dicts(a0, a1, a2)
print(f"r  id[{id(r)}] value:{r}")
1
  • Is it really necessary to have explicit for loops nested three levels deep? Commented Sep 29, 2022 at 21:37