240

For caching purposes I need to generate a cache key from GET arguments which are present in a dict.

Currently I'm using sha1(repr(sorted(my_dict.items()))) (sha1() is a convenience method that uses hashlib internally) but I'm curious if there's a better way.
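For concreteness, this is roughly what I have now (the body of sha1 here is my assumption of what the convenience wrapper looks like; only the repr(sorted(...)) part is the actual approach in question):

```python
import hashlib

def sha1(s: str) -> str:
    # Assumed shape of the convenience wrapper mentioned above.
    return hashlib.sha1(s.encode("utf-8")).hexdigest()

get_args = {"q": "python", "page": "2"}
# sorted() makes the key independent of dict insertion order
cache_key = sha1(repr(sorted(get_args.items())))
```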

7
  • 7
    this might not work with nested dict. shortest solution is to use json.dumps(my_dict, sort_keys=True) instead, which will recurse into dict values. Commented Apr 8, 2014 at 19:15
  • 3
    FYI re: dumps, stackoverflow.com/a/12739361/1082367 says "The output from pickle is not guaranteed to be canonical for similar reasons to dict and set order being non-deterministic. Don't use pickle or pprint or repr for hashing." Commented Dec 16, 2014 at 12:49
  • 2
    Interesting backstory about hashing mutable data structures (like dictionaries): python.org/dev/peps/pep-0351 was proposed to allow arbitrarily freezing objects, but rejected. For rationale, see this thread in python-dev: mail.python.org/pipermail/python-dev/2006-February/060793.html
    – FluxLemur
    Commented Mar 29, 2018 at 16:47
  • 1
    If your data is json format, and you want semantically invariant hashing, checkout github.com/schollii/sandals/blob/master/json_sem_hash.py. It works on nested structures (of course, since json), and does not depend on internals of dict like preserved order (which has evolved over the lifetime of python), and will give the same hash if two data structures are semantically the same (like {'a': 1, 'b':2} is semantically the same as {'b':2, 'a':1}). I haven't used it on anything too complicated yet so YMMV but feedback welcome.
    – Oliver
    Commented Jul 10, 2020 at 15:45
  • 1
    @shelper you forgot the repr() (and possibly a .encode() in python 3) Commented Jul 6, 2021 at 22:02

17 Answers

210

Using sorted(d.items()) isn't enough to get us a stable repr. Some of the values in d could be dictionaries too, and their keys will still come out in an arbitrary order. As long as all the keys are strings, I prefer to use:

json.dumps(d, sort_keys=True)

That said, if the hashes need to be stable across different machines or Python versions, I'm not certain that this is bulletproof. You might want to add the separators and ensure_ascii arguments to protect yourself from any changes to the defaults there. I'd appreciate comments.
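A sketch of that pinned-defaults variant (wrapping the result in SHA-1 is my own addition for the cache-key use case, not part of the answer):

```python
import hashlib
import json

def stable_cache_key(d):
    # sort_keys fixes key order at every nesting level (string keys only);
    # separators and ensure_ascii pin defaults that could change someday.
    blob = json.dumps(d, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha1(blob.encode("utf-8")).hexdigest()
```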

17
  • 10
    This is just being paranoid, but JSON allow most characters to show up in strings without any literal escaping, so the encoder gets to make some choices about whether to escape characters or just pass them through. The risk then is that different versions (or future versions) of the encoder could make different escaping choices by default, and then your program would compute different hash values for the same dictionary in different environments. The ensure_ascii argument would protect against this entirely hypothetical problem. Commented Apr 9, 2014 at 4:31
  • 5
    I tested the performance of this with different dataset, it's much much faster than make_hash. gist.github.com/charlax/b8731de51d2ea86c6eb9
    – charlax
    Commented Sep 18, 2014 at 22:33
  • 3
    @charlax ujson doesn't guarantee the order of the dict pairs, so it's not safe to do that
    – arthurprs
    Commented Jul 3, 2015 at 12:48
  • 11
    This solution only works as long as all keys are strings, e.g. json.dumps({'a': {(0, 5): 5, 1: 3}}) fails.
    – kadee
    Commented Jun 1, 2016 at 11:01
  • 9
    @LorenzoBelli, you can overcome that by adding default=str to the dumps command. Seems to work nicely.
    – mlissner
    Commented May 4, 2018 at 23:10
162

If your dictionary is not nested, you could make a frozenset with the dict's items and use hash():

hash(frozenset(my_dict.items()))

This is much less computationally intensive than generating the JSON string or representation of the dictionary.

UPDATE: Please see the comments below on why this approach might not produce a stable result.
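A quick illustration: within a single interpreter run, equal dicts hash equally regardless of insertion order, but (per the comments) the value is not reproducible across processes, since string hashing is randomized per process via PYTHONHASHSEED:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 2, "a": 1}  # same content, different insertion order

# Equal frozensets are guaranteed to have equal hashes -
# but only within this one interpreter process.
h1 = hash(frozenset(d1.items()))
h2 = hash(frozenset(d2.items()))
```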

10
  • 10
    This didn't work for me with a nested dictionary. I haven't tried the solution below (too complicated). The OP's solution works perfectly fine. I substituted sha1 with hash to save an import.
    – spatel
    Commented Jan 18, 2012 at 7:51
  • 11
    @Ceaser That won't work because tuple implies ordering but dict items are unordered. frozenset is better.
    – Antimony
    Commented Jul 30, 2012 at 11:55
  • 42
    Beware of the built-in hash if something needs to be consistent across different machines. Implementations of python on cloud platforms like Heroku and GAE will return different values for hash() on different instances making it useless for anything that must be shared between two or more "machines" (dynos in the case of heroku)
    – B Robster
    Commented Apr 22, 2015 at 22:40
  • 12
It might be interesting that the hash() function does not produce a stable output. This means that, given the same input, it returns different results with different instances of the same Python interpreter. It looks like some sort of seed value is generated every time the interpreter is started. Commented May 28, 2015 at 11:03
  • 10
Expected. The seed is introduced for security reasons, as far as I remember, to add some kind of hash randomization. So you cannot expect the hash to be the same between two Python processes.
    – Nikokrock
    Commented May 29, 2015 at 13:03
78

EDIT: If all your keys are strings, then before continuing to read this answer, please see Jack O'Connor's significantly simpler (and faster) solution (which also works for hashing nested dictionaries).

Although an answer has been accepted, the title of the question is "Hashing a python dictionary", and the answer is incomplete as regards that title. (As regards the body of the question, the answer is complete.)

Nested Dictionaries

If one searches Stack Overflow for how to hash a dictionary, one might stumble upon this aptly titled question, and leave unsatisfied if one is attempting to hash multiply nested dictionaries. The answer above won't work in this case, and you'll have to implement some sort of recursive mechanism to retrieve the hash.

Here is one such mechanism:

import copy

def make_hash(o):
    """
    Makes a hash from a dictionary, list, tuple or set to any level, that
    contains only other hashable types (including any lists, tuples, sets, and
    dictionaries).
    """
    if isinstance(o, (set, tuple, list)):
        return tuple(make_hash(e) for e in o)

    if not isinstance(o, dict):
        return hash(o)

    new_o = copy.deepcopy(o)
    for k, v in new_o.items():
        new_o[k] = make_hash(v)

    return hash(tuple(frozenset(sorted(new_o.items()))))
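As a quick sanity check (make_hash condensed here so the snippet runs standalone), a nested structure that plain hash() would reject with TypeError now hashes to a deterministic int within one interpreter run:

```python
import copy

def make_hash(o):
    # Condensed copy of the answer's make_hash, for a self-contained demo.
    if isinstance(o, (set, tuple, list)):
        return tuple(make_hash(e) for e in o)
    if not isinstance(o, dict):
        return hash(o)
    new_o = copy.deepcopy(o)
    for k, v in new_o.items():
        new_o[k] = make_hash(v)
    return hash(tuple(frozenset(sorted(new_o.items()))))

nested = {"query": {"q": "python", "tags": ["hash", "dict"]}, "page": 2}
h = make_hash(nested)  # hash({...}) itself would raise TypeError
```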

Bonus: Hashing Objects and Classes

The hash() function works great when you hash classes or instances. However, here is one issue I found with hash, as regards objects:

class Foo(object): pass
foo = Foo()
print (hash(foo)) # 1209812346789
foo.a = 1
print (hash(foo)) # 1209812346789

The hash is the same, even after I've altered foo. This is because the identity of foo hasn't changed, so the hash is the same. If you want foo to hash differently depending on its current definition, the solution is to hash off whatever is actually changing. In this case, the __dict__ attribute:

class Foo(object): pass
foo = Foo()
print (make_hash(foo.__dict__)) # 1209812346789
foo.a = 1
print (make_hash(foo.__dict__)) # -78956430974785

Alas, when you attempt to do the same thing with the class itself:

print (make_hash(Foo.__dict__)) # TypeError: unhashable type: 'dict_proxy'

The class __dict__ property is not a normal dictionary:

print (type(Foo.__dict__)) # type <'dict_proxy'>

Here is a similar mechanism as previous that will handle classes appropriately:

import copy

DictProxyType = type(object.__dict__)

def make_hash(o):
    """
    Makes a hash from a dictionary, list, tuple or set to any level, that
    contains only other hashable types (including any lists, tuples, sets, and
    dictionaries). In the case where other kinds of objects (like classes) need
    to be hashed, pass in a collection of object attributes that are pertinent.
    For example, a class can be hashed in this fashion:

      make_hash([cls.__dict__, cls.__name__])

    A function can be hashed like so:

      make_hash([fn.__dict__, fn.__code__])
    """
    if type(o) == DictProxyType:
        o2 = {}
        for k, v in o.items():
            if not k.startswith("__"):
                o2[k] = v
        o = o2

    if isinstance(o, (set, tuple, list)):
        return tuple(make_hash(e) for e in o)

    if not isinstance(o, dict):
        return hash(o)

    new_o = copy.deepcopy(o)
    for k, v in new_o.items():
        new_o[k] = make_hash(v)

    return hash(tuple(frozenset(sorted(new_o.items()))))

You can use this to return a hash tuple of however many elements you'd like:

# -7666086133114527897
print (make_hash(func.__code__))

# (-7666086133114527897, 3527539)
print (make_hash([func.__code__, func.__dict__]))

# (-7666086133114527897, 3527539, -509551383349783210)
print (make_hash([func.__code__, func.__dict__, func.__name__]))

NOTE: all of the above code assumes Python 3.x. I did not test it in earlier versions, although I assume make_hash() will work in, say, 2.7.2. As far as making the examples work, I do know that func.__code__ should be replaced with func.func_code in Python 2.
11
  • isinstance takes a sequence for the second argument, so isinstance(o, (set, tuple, list)) would work.
    – Xealot
    Commented Feb 13, 2013 at 1:42
  • thanks for making me realize frozenset could consistently hash querystring parameters :)
    – Xealot
    Commented Feb 14, 2013 at 14:26
  • 1
    The items need to be sorted in order to create the same hash if the dict item order is different but the key values aren't -> return hash(tuple(frozenset(sorted(new_o.items())))) Commented Oct 28, 2013 at 13:11
  • 3
In Python 3.7, this solution performs worse than hash(json.dumps(d, sort_keys=True)) by a factor of 18. Proof: pastebin.com/XDZGhNHs
    – kolypto
    Commented May 27, 2019 at 10:15
  • 2
    @kolypto Json solution is dependent on the size of the strings (obviously) while the hash solution time is stable. proof using your code with larger strings
    – Gulzar
    Commented Sep 29, 2022 at 9:17
22

The code below avoids the built-in Python hash() function because it does not provide hashes that are consistent across restarts of Python (see "hash function in Python 3.3 returns different results between sessions"). make_hashable() converts the object into nested tuples, and make_hash_sha256() then hashes the repr() of that result and base64-encodes the SHA256 digest.

import hashlib
import base64

def make_hash_sha256(o):
    hasher = hashlib.sha256()
    hasher.update(repr(make_hashable(o)).encode())
    return base64.b64encode(hasher.digest()).decode()

def make_hashable(o):
    if isinstance(o, (tuple, list)):
        return tuple((make_hashable(e) for e in o))

    if isinstance(o, dict):
        return tuple(sorted((k,make_hashable(v)) for k,v in o.items()))

    if isinstance(o, (set, frozenset)):
        return tuple(sorted(make_hashable(e) for e in o))

    return o

o = dict(x=1,b=2,c=[3,4,5],d={6,7})
print(make_hashable(o))
# (('b', 2), ('c', (3, 4, 5)), ('d', (6, 7)), ('x', 1))

print(make_hash_sha256(o))
# fyt/gK6D24H9Ugexw+g3lbqnKZ0JAcgtNW+rXIDeU2Y=
5
  • 1
    make_hash_sha256(((0,1),(2,3)))==make_hash_sha256({0:1,2:3})==make_hash_sha256({2:3,0:1})!=make_hash_sha256(((2,3),(0,1))). This isn't quite the solution I'm looking for, but it is a nice intermediate. I'm thinking of adding type(o).__name__ to the beginning of each of the tuples to force differentiation.
    – Poik
    Commented Sep 17, 2017 at 20:01
  • If you want to sort the list as well : tuple(sorted((make_hashable(e) for e in o)))
    – Suraj
    Commented Aug 13, 2019 at 15:12
  • make_hash_sha256() - nice!
    – jtlz2
    Commented Oct 31, 2019 at 19:04
  • 3
    @Suraj You shouldn't want to sort the list before hashing because lists which have their contents in different orders are definitely not the same thing. If the order of the items does not matter the problem is that you are using the wrong data structure. You should be using a set instead of a list.
    – scottclowe
    Commented Nov 3, 2019 at 2:11
@scottclowe That is very true. Thanks for adding that point. There are 2 scenarios where you'd still want a list (without specific ordering needs): 1. a list of repeating items, 2. when you have to use JSON directly, as JSON doesn't support a "set" representation.
    – Suraj
    Commented Nov 4, 2019 at 9:11
20

MD5 HASH

The approach that gave the most stable results for me was MD5 hashing combined with json.dumps:

from typing import Dict, Any
import hashlib
import json

def dict_hash(dictionary: Dict[str, Any]) -> str:
    """MD5 hash of a dictionary."""
    dhash = hashlib.md5()
    # We need to sort arguments so {'a': 1, 'b': 2} is
    # the same as {'b': 2, 'a': 1}
    encoded = json.dumps(dictionary, sort_keys=True).encode()
    dhash.update(encoded)
    return dhash.hexdigest()
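One point worth noting, since it comes up in the comments: sort_keys recurses, so nested dicts with shuffled key order hash identically. A quick check (dict_hash repeated here so the snippet is self-contained):

```python
import hashlib
import json
from typing import Any, Dict

def dict_hash(dictionary: Dict[str, Any]) -> str:
    """MD5 hash of a dictionary (copy of the answer's function)."""
    dhash = hashlib.md5()
    encoded = json.dumps(dictionary, sort_keys=True).encode()
    dhash.update(encoded)
    return dhash.hexdigest()

a = {"outer": {"a": 1, "b": 2}}
b = {"outer": {"b": 2, "a": 1}}  # nested keys reordered
```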
6
  • 1
    I find that replacing json by yaml helps since it can serialize classes and their attributes. import yaml ... yaml.dump(dictionary, sort_keys=True).encode()
    – DomDev
    Commented Nov 15, 2022 at 16:56
  • NOTE : This does not work for nested dictionaries in python. DeepHash from DeepDiff Module does! Commented Feb 21, 2023 at 19:58
  • 2
    @FarhanHaiKhan Please explain. The json.dumps looks like it would dump nested dictionaries in a stable way.
    – dfrankow
    Commented Mar 16, 2023 at 13:39
It looks like it should, but I originally tried it on something nested and failed to get a match although the two dicts were identical. The structure appeared to be properly nested, hence my conclusion. Commented Mar 19, 2023 at 19:42
  • 1
    @FarhanHaiKhan sounds like something didn't work for YOU and you've decided the solution is flawed. Feel free to provide a concrete example.
    – Guy
    Commented May 19 at 22:29
16

Here is a clearer solution.

def freeze(o):
    if isinstance(o, dict):
        return frozenset({k: freeze(v) for k, v in o.items()}.items())

    if isinstance(o, list):
        return tuple(freeze(v) for v in o)

    return o


def make_hash(o):
    """
    Makes a hash out of anything that contains only list, dict and hashable
    types, including string and numeric types.
    """
    return hash(freeze(o))
1
  • 3
    If you change if isinstance(o,list): to if isinstance(obj, (set, tuple, list)): then this function can work on any object. Commented Apr 12, 2020 at 3:19
11

While hash(frozenset(x.items())) and hash(tuple(sorted(x.items()))) work, that's doing a lot of work allocating and copying all the key-value pairs. A hash function really should avoid a lot of memory allocation.

A little bit of math can help here. The problem with most hash functions is that they assume that order matters. To hash an unordered structure, you need a commutative operation. Multiplication doesn't work well as any element hashing to 0 means the whole product is 0. Bitwise & and | tend towards all 0's or 1's. There are two good candidates: addition and xor.

from functools import reduce
from operator import xor

class hashable(dict):
    def __hash__(self):
        return reduce(xor, map(hash, self.items()), 0)

    # Alternative
    def __hash__(self):
        return sum(map(hash, self.items()))

One point: xor works, in part, because dict guarantees keys are unique. And sum works because Python truncates __hash__ results to the machine word size, so the total wraps around instead of growing without bound.

If you want to hash a multiset, sum is preferable. With xor, {a} would hash to the same value as {a, a, a} because x ^ x ^ x = x.

If you really need the guarantees that SHA makes, this won't work for you. But to use a dictionary in a set, this will work fine; Python containers are resilient to some collisions, and the underlying hash functions are pretty good.
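A minimal check of the xor variant (class repeated here so the snippet stands alone):

```python
from functools import reduce
from operator import xor

class hashable(dict):
    def __hash__(self):
        # Order-independent: xor over per-item hashes, as above.
        return reduce(xor, map(hash, self.items()), 0)

a = hashable({"x": 1, "y": 2})
b = hashable({"y": 2, "x": 1})  # same items, different insertion order
```

Since dict already supplies `__eq__`, these instances now work as set members and dict keys.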

7

Use DeepHash from DeepDiff Module

from deepdiff import DeepHash
obj = {'a':'1', 'b':'2'}
hashes = DeepHash(obj)[obj]
2
  • NOTE : This works for nested dictionaries in python while the md5 hash does not! Commented Feb 21, 2023 at 19:58
  • 1
    Please edit to: obj = {'a':'1','b':'2'} Commented Mar 2, 2023 at 18:58
5

Updated from 2013 reply...

None of the above answers seem reliable to me. The reason is the use of items(). As far as I know, this comes out in a machine-dependent order.

How about this instead?

import hashlib

def dict_hash(the_dict, *ignore):
    if ignore:  # Sometimes you don't care about some items
        interesting = the_dict.copy()
        for item in ignore:
            if item in interesting:
                interesting.pop(item)
        the_dict = interesting
    result = hashlib.sha1(
        ('%s' % sorted(the_dict.items())).encode('utf-8')  # encode() needed on Python 3
    ).hexdigest()
    return result
3
  • Why do you think it matters that dict.items does not return a predictably ordered list? frozenset takes care of that
    – glarrain
    Commented Jul 11, 2014 at 22:54
  • 2
    A set, by definition, is unordered. Thus the order in which objects are added is irrelevant. You do have to realize that the built-in function hash does not care about how the frozenset contents are printed or something like that. Test it in several machines and python versions and you'll see.
    – glarrain
    Commented Jul 13, 2014 at 3:38
  • Why do you use the extra hash() call in value = hash('%s::%s' % (value, type(value)))??
    – RuiDo
    Commented Jul 6, 2016 at 10:12
4

To get a consistent key order, instead of hash(str(dictionary)) or hash(json.dumps(dictionary)), I would prefer this quick-and-dirty solution:

from pprint import pformat
h = hash(pformat(dictionary))

It will work even for types like datetime that are not JSON-serializable.

3
  • 3
    Who guarantees that pformat or json always use the same order? Commented Jan 30, 2015 at 6:02
  • 1
    @ThiefMaster, "Changed in version 2.5: Dictionaries are sorted by key before the display is computed; before 2.5, a dictionary was sorted only if its display required more than one line, although that wasn’t documented."(docs.python.org/2/library/pprint.html)
    – Arel
    Commented Jan 15, 2016 at 16:21
  • 4
    This doesn't seem valid to me. The pprint modules and pformat are understood by the authors to be for display purposes and not serialization. Because of this, you shouldn't feel safe in assuming that pformat will always return a result that happens to work. Commented Apr 5, 2017 at 22:27
3

You can use the maps library to do this. Specifically, maps.FrozenMap

import maps
fm = maps.FrozenMap(my_dict)
hash(fm)

To install maps, just do:

pip install maps

It handles the nested dict case too:

import maps
fm = maps.FrozenMap.recurse(my_dict)
hash(fm)

Disclaimer: I am the author of the maps library.

2
  • The library doesn't sort the list inside a dict. And hence this could produce different hash. There should be an option to sort a list too. A frozenset should help, but I wonder how would you handle the case with a nested dict containing a list of dicts. As dict are unhashable.
    – Suraj
    Commented Aug 13, 2019 at 15:04
  • 1
    @Suraj : it does handle nested structure via .recurse. See maps.readthedocs.io/en/latest/api.html#maps.FrozenMap.recurse . Ordering in lists is semantically meaningful, if you want order independence you can convert your lists to sets prior to calling .recurse. You can also make use of the list_fn parameter to .recurse to use a different hashable data structure than tuple (.e.g frozenset) Commented Nov 17, 2019 at 14:26
3

You could use the third-party frozendict module to freeze your dict and make it hashable.

from frozendict import frozendict
my_dict = frozendict(my_dict)

For handling nested objects, you could go with:

import collections.abc

def make_hashable(x):
    if isinstance(x, collections.abc.Hashable):
        return x
    elif isinstance(x, collections.abc.Sequence):
        return tuple(make_hashable(xi) for xi in x)
    elif isinstance(x, collections.abc.Set):
        return frozenset(make_hashable(xi) for xi in x)
    elif isinstance(x, collections.abc.Mapping):
        return frozendict({k: make_hashable(v) for k, v in x.items()})
    else:
        raise TypeError("Don't know how to make {} objects hashable".format(type(x).__name__))

If you want to support more types, use functools.singledispatch (Python 3.7):

@functools.singledispatch
def make_hashable(x):
    raise TypeError("Don't know how to make {} objects hashable".format(type(x).__name__))

@make_hashable.register
def _(x: collections.abc.Hashable):
    return x

@make_hashable.register
def _(x: collections.abc.Sequence):
    return tuple(make_hashable(xi) for xi in x)

@make_hashable.register
def _(x: collections.abc.Set):
    return frozenset(make_hashable(xi) for xi in x)

@make_hashable.register
def _(x: collections.abc.Mapping):
    return frozendict({k: make_hashable(v) for k, v in x.items()})

# add your own types here
8
  • This doesn't work, for example, for a dict of DataFrame objects. Commented Mar 12, 2019 at 17:48
  • @JamesHirschorn: Updated to fail loudly
    – Eric
    Commented Mar 13, 2019 at 0:40
  • Better! I added the following elif clause to make it work with DataFrames: elif isinstance(x, pd.DataFrame): return make_hashable(hash_pandas_object(x).tolist()) I will edit the answer and see if you accept it... Commented Mar 15, 2019 at 17:54
  • 1
    OK. I see I was asking for something more than "hashable", which only guarantees that objects that are equal shall have the same hash. I am working on a version that will give the the same value between runs, and independent of python version, etc.. Commented Mar 20, 2019 at 0:49
  • 1
    hash randomization is deliberate security feature enabled by default in python 3.7.
    – Eric
    Commented Mar 20, 2019 at 1:51
0

One way to approach the problem is to make a tuple of the dictionary's items:

hash(tuple(my_dict.items()))
2
  • 1
    This won't work; it is order-dependent.
    – Brannon
    Commented Apr 10 at 14:56
  • @Brannon Good point, it has to be sorted first. Not sure why I didn't catch that
    – Anonymous
    Commented May 5 at 13:42
0

This is not a general solution (i.e. it only works trivially if your dict is not nested), but since nobody here suggested it, I thought it might be useful to share.

One can use a (third-party) immutables package and create an immutable 'snapshot' of a dict like this:

from immutables import Map

map = dict(a=1, b=2)
immap = Map(map)
hash(immap)

This seems to be faster than, say, stringification of the original dict.

I learned about this from this nice article.

2
  • You don’t need a third-party package to create an immutable dict.
    – bfontaine
    Commented Sep 14, 2022 at 18:57
  • note that hash might not be stable across python version or even operating systems
    – Guy
    Commented May 19 at 22:28
0

For nested structures with string keys at the top-level dict, you can use pickle.dumps(obj, protocol=5) and hash the resulting bytes object. If you need safety against untrusted data, use a safe serializer instead of pickle.
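A sketch of what I mean (the function name is mine; note the caveat from the question's comments that pickle output is not guaranteed canonical, so treat this as a same-process, same-version shortcut):

```python
import hashlib
import pickle

def pickle_key(d):
    # NOT canonical across Python versions or differing insertion orders;
    # fast when you control both sides of the cache.
    return hashlib.sha256(pickle.dumps(d, protocol=5)).hexdigest()

key = pickle_key({"user": "alice", "filters": {"active": True}})
```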

-1

Expanding on Shay's answer. I had to do this for a lot of dictionaries, and yaml.dump or json.dump was taking a lot of time. If you don't have nested datatypes, this is a faster way to do it.

import collections
import hashlib
from typing import Any, Dict

def dict_hash(dictionary: Dict[str, Any]) -> str:
    """MD5 hash of a (flat) dictionary."""
    dhash = hashlib.md5()
    # Faster than yaml.dump(dictionary, sort_keys=True) for flat dicts
    sorted_string = str(collections.OrderedDict((k, dictionary[k]) for k in sorted(dictionary)))
    dhash.update(sorted_string.encode())
    return dhash.hexdigest()
-6

I do it like this:

hash(str(my_dict))
3
  • 1
    Can someone explain what is so wrong with this method?
    – mhristache
    Commented Oct 20, 2016 at 12:32
  • 9
@maximi Dictionaries are not stable in terms of order, thus hash(str({'a': 1, 'b': 2})) != hash(str({'b': 2, 'a': 1})) (while it might work for some dictionaries, it is not guaranteed to work on all). Commented Mar 1, 2017 at 9:48
  • for my use case that's more than enough
    – imbr
    Commented Dec 19, 2020 at 12:12
