@Rob suggested a good alternative to `copy.deepcopy()` if you are planning to create standard APIs for your data (database records or web-request JSON payloads).
>>> import json
>>> my_dict = {'a': [1, 2, 3], 'b': [4, 5, 6]}
>>> json.loads(json.dumps(my_dict))
{'a': [1, 2, 3], 'b': [4, 5, 6]}
This approach is also thread safe.
But it only works for JSON-serializable objects such as `str`, `dict`, `list`, `int`, `float`, `bool`, and `None`.
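Anything outside that set raises a `TypeError` when you try to serialize it (message shown from CPython 3.7+; older versions word it slightly differently):

>>> import json
>>> json.dumps({'when': object()})
Traceback (most recent call last):
  ...
TypeError: Object of type object is not JSON serializable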
Sometimes that constraint is a good thing, because it forces your data structures to stay compatible with standard database fixtures and web-request formats (for creating best-practice APIs).
If your data structures don't work directly with `json.dumps()` (for example `datetime` objects and your own custom classes), they will need to be coerced into a string or another standard type before serializing.
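One shortcut worth knowing: `json.dumps()` accepts a `default=` callable that it applies to any object it can't serialize natively, so `default=str` will coerce unknown types to strings for you:

>>> import json
>>> from datetime import datetime as dt
>>> json.dumps({'when': dt(2023, 4, 9)}, default=str)
'{"when": "2023-04-09 00:00:00"}'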
Either way, you'll need to run a custom deserializer after `json.loads()`:
>>> from datetime import datetime as dt
>>> my_dict = {'a': (1,), 'b': dt(2023, 4, 9).isoformat()}
>>> d = json.loads(json.dumps(my_dict))
>>> d
{'a': [1], 'b': '2023-04-09T00:00:00'}
>>> for k in d:
...     try:
...         d[k] = dt.fromisoformat(d[k])
...     except (TypeError, ValueError):
...         pass
...
>>> d
{'a': [1], 'b': datetime.datetime(2023, 4, 9, 0, 0)}
Of course, for nested data you'll need to apply your custom serializer and deserializer recursively. Sometimes that's a good thing.
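Here's a sketch of one way to centralize that, using the `default=` hook on `json.dumps()` and the `object_hook=` on `json.loads()`. The `object_hook` is called on every decoded `dict`, innermost first, so the recursion comes for free (`encode_default` and `decode_hook` are just illustrative names):

>>> import json
>>> from datetime import datetime as dt
>>> def encode_default(obj):
...     # json.dumps() calls this for any object it can't serialize natively
...     if isinstance(obj, dt):
...         return obj.isoformat()
...     raise TypeError(f'{type(obj).__name__} is not JSON serializable')
...
>>> def decode_hook(d):
...     # json.loads() calls this on every decoded dict, so nested
...     # datetime strings are restored without an explicit recursive walk
...     for k, v in d.items():
...         if isinstance(v, str):
...             try:
...                 d[k] = dt.fromisoformat(v)
...             except ValueError:
...                 pass
...     return d
...
>>> nested = {'outer': {'when': dt(2023, 4, 9)}, 'n': 1}
>>> json.loads(json.dumps(nested, default=encode_default), object_hook=decode_hook)
{'outer': {'when': datetime.datetime(2023, 4, 9, 0, 0)}, 'n': 1}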
This process normalizes all your objects to directly serializable types (for example, `tuple`s become `list`s), and you can be sure they'll match a reproducible data schema (for relational database storage).
And it's thread safe. The built-in `copy.deepcopy()` is NOT thread safe! If you use `deepcopy` on shared data in threaded or `async` code, it can crash your program or corrupt your data unexpectedly, long after you've forgotten about the code.
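Here's a minimal, timing-dependent sketch of that failure mode, run as a script (`churn` is just an illustrative name, and it may take several runs to trigger):

import copy
import threading

shared = {i: list(range(100)) for i in range(1000)}

def churn():
    # keep resizing the shared dict while the main thread is copying it
    for i in range(100_000):
        shared[f'tmp{i}'] = i
        shared.pop(f'tmp{i}', None)

t = threading.Thread(target=churn)
t.start()
try:
    for _ in range(100):
        copy.deepcopy(shared)
except RuntimeError as err:
    # typically: "dictionary changed size during iteration"
    print(err)
t.join()

And when it doesn't crash outright, the copy can still be internally inconsistent, because the values were snapshotted at different moments.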