624

I would like to make a deep copy of a dict in python. Unfortunately the .deepcopy() method doesn't exist for the dict. How do I do that?

>>> my_dict = {'a': [1, 2, 3], 'b': [4, 5, 6]}
>>> my_copy = my_dict.deepcopy()
Traceback (most recent calll last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'deepcopy'
>>> my_copy = my_dict.copy()
>>> my_dict['a'][2] = 7
>>> my_copy['a'][2]
7

The last line should be 3.

I would like that modifications in my_dict don't impact the snapshot my_copy.

How do I do that? The solution should be compatible with Python 3.x.

3

4 Answers 4

870

How about:

import copy
d = { ... }
d2 = copy.deepcopy(d)

Python 2 or 3:

Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import copy
>>> my_dict = {'a': [1, 2, 3], 'b': [4, 5, 6]}
>>> my_copy = copy.deepcopy(my_dict)
>>> my_dict['a'][2] = 7
>>> my_copy['a'][2]
3
>>>
12
  • 24
    Indeed that works for the oversimplified example I gave. My keys are not numbers but objects. If I read the copy module documentation, I have to declare a __copy__()/__deepcopy__() method for the keys. Thank you very much for leading me there! Commented Feb 24, 2011 at 14:17
  • 5
    Is there any difference in Python 3.2 and 2.7 codes? They seem identical to me. If so, would be better a single block of code and a statement "Works for both Python 3 and 2"
    – MestreLion
    Commented Jun 7, 2014 at 3:59
  • 77
    It's also worth mentioning copy.deepcopy isn't thread safe. Learned this the hard way. On the other hand, depending on your use case, json.loads(json.dumps(d)) is thread safe, and works well.
    – rob
    Commented Feb 7, 2017 at 22:11
  • 13
    @rob you should post that comment as an answer. It is a viable alternative. The thread safety nuance is an important distinction.
    – BuvinJ
    Commented Oct 23, 2017 at 17:29
  • 9
    @BuvinJ The issue is that json.loads doesn't solve the problem for all use cases where python dict attributes are not JSON serializable. It may help those who are only dealing with simple data structures, from an API for example, but I don't think it's enough of a solution to fully answer the OP's question.
    – rob
    Commented Oct 24, 2017 at 15:38
74

dict.copy() is a shallow copy function for dictionary
id is built-in function that gives you the address of variable

First you need to understand "why is this particular problem is happening?"

In [1]: my_dict = {'a': [1, 2, 3], 'b': [4, 5, 6]}

In [2]: my_copy = my_dict.copy()

In [3]: id(my_dict)
Out[3]: 140190444167808

In [4]: id(my_copy)
Out[4]: 140190444170328

In [5]: id(my_copy['a'])
Out[5]: 140190444024104

In [6]: id(my_dict['a'])
Out[6]: 140190444024104

The address of the list present in both the dicts for key 'a' is pointing to same location.
Therefore when you change value of the list in my_dict, the list in my_copy changes as well.


Solution for data structure mentioned in the question:

In [7]: my_copy = {key: value[:] for key, value in my_dict.items()}

In [8]: id(my_copy['a'])
Out[8]: 140190444024176

Or you can use deepcopy as mentioned above.

4
  • 13
    Your solution doesn't work for nested dictionaries. deepcopy is preferable for that reason. Commented Apr 18, 2018 at 18:03
  • 3
    @CharlesPlager Agreed! But you should also notice that list slicing doesn't work on dict value[:]. The solution was for the particular data structure mentioned in the question rather than a universal solution. Commented Apr 23, 2018 at 10:06
  • 2nd solution for dict comprehension would not work with nested lists as values ``` >>> my_copy = {k:v.copy() for k,v in my_dict.items()} >>> id(my_copy["a"]) 4427452032 >>> id(my_dict["a"]) 4427460288 >>> id(my_copy["a"][0]) 4427469056 >>> id(my_dict["a"][0]) 4427469056 ```
    – Aseem Jain
    Commented Mar 4, 2021 at 3:27
  • This is the bets solution for people who do not want to or can not use extra libraries. I am using python in LaTex environment, due to which I could not afford to use a single extra library. So, this solution works like a charm for me. Thanks @theBuzzyCoder Commented Dec 20, 2022 at 3:58
72

Python 3.x

from copy import deepcopy

# define the original dictionary
original_dict = {'a': [1, 2, 3], 'b': {'c': 4, 'd': 5, 'e': 6}}

# make a deep copy of the original dictionary
new_dict = deepcopy(original_dict)

# modify the dictionary in a loop
for key in new_dict:
    if isinstance(new_dict[key], dict) and 'e' in new_dict[key]:
        del new_dict[key]['e']

# print the original and modified dictionaries
print('Original dictionary:', original_dict)
print('Modified dictionary:', new_dict)

Which would yield:

Original dictionary: {'a': [1, 2, 3], 'b': {'c': 4, 'd': 5, 'e': 6}}
Modified dictionary: {'a': [1, 2, 3], 'b': {'c': 4, 'd': 5}}

Without new_dict = deepcopy(original_dict), 'e' element is unable to be removed.

Why? Because if the loop was for key in original_dict, and an attempt is made to modify original_dict, a RuntimeError would be observed:

"RuntimeError: dictionary changed size during iteration"

So in order to modify a dictionary within an iteration, a copy of the dictionary must be used.

Here is an example function that removes an element from a dictionary:

def remove_hostname(domain, hostname):
    domain_copy = deepcopy(domain)
    for domains, hosts in domain_copy.items():
        for host, port in hosts.items():
           if host == hostname:
                del domain[domains][host]
    return domain
0
0

@Rob suggested a good alternative to copy.deepcopy() if you are planning to create standard APIs for your data (database or web request json payloads).

>>> import json
>>> my_dict = {'a': [1, 2, 3], 'b': [4, 5, 6]}
>>> json.loads(json.dumps(my_dict))
{'a': [1, 2, 3], 'b': [4, 5, 6]}

This approach is also thread safe. But it only works for jsonifiable (serializable) objects like str, dict, list, int, float, and None. Sometimes that jsonifiable constraint can be a good thing because it forces you to make your data structures compliant with standard database fixture and web request formats (for creating best-practice APIs).

If your data structures don't work directly with json.dumps (for example datetime objects and your custom classes) they will be need to be coerced into a string or other standard type before serializing with json.dumps(). And you'll need to run a custom deserializer as well after json.loads():

>>> from datetime import datetime as dt
>>> my_dict = {'a': (1,), 'b': dt(2023, 4, 9).isoformat()}
>>> d = json.loads(json.dumps(my_dict))
>>> d
{'a': [1], 'b': '2023-04-09T00:00:00'}
>>> for k in d:
...     try:
...         d[k] = dt.fromisoformat(d[k])
...     except:
...         pass
>>> d
{'a': [1], 'b': datetime.datetime(2023, 4, 9, 0, 0)}

Of course you need to do the serialization and deserialization on special objects recursively. Sometimes that's a good thing. This process normalizes all your objects to types that are directly serializable (for example tuples become lists) and you can be sure they'll match a reproducable data schema (for relational database storage).

And it's thread safe. The builtin copy.deepcopy() is NOT thread safe! If you use deepcopy within async code that can crash your program or corrupt your data unexpectedly long after you've forgotten about your code.

7
  • Please explain why it's thread-safe. Becasuse the doc doesn't say that those methods are thread-safe. Commented Apr 10, 2023 at 12:51
  • I think @Rob is right, those functions merely create two new objects without reusing any of the data from the original object. And the intermediate str object is deleted. Constructing new built-in types is always thread safe. But now that I say it out loud, perhaps you're concerned that the mutable objects might be modified during json.dumps(). I'd have to do some testing to see if dicts and lists are locked while you're iterating through them. I thought iters were thread safe.
    – hobs
    Commented Apr 13, 2023 at 19:53
  • @OlivierGrégoire Python dicts and lists are thread-safe, even for write operations, but this code is only doing reads from dicts and lists, so that makes it even more obviously thread safe: stackoverflow.com/a/6955678/623735
    – hobs
    Commented Apr 20, 2023 at 0:42
  • So if I follow your reasoning, copy.deepcopy() is also thread-safe. Then why is this answer more relevant? Or what makes copy.deepcopy() not thread-safe? Commented Apr 20, 2023 at 5:39
  • 1
    It's just that your first sentence in the answer seem to indicate that copy.deepcopy() isn't thread-safe "@Rob suggested a thread safe alternative to copy.deepcopy():" while your answer is thread-safe. Actually none are thread-safe. So I'd like that you remove your assertion that your answer is thread-safe. Commented Apr 20, 2023 at 20:09

Not the answer you're looking for? Browse other questions tagged or ask your own question.