
I use YAML with PyYAML. Is there a way to avoid the *id002 references after dumping a nested structure? For readability I'd like to see the actual (tuple) values there.

While trying to produce a mini example I noticed that it only happens when I use the same object (same id) more than once:

import yaml

t = ("b", "c")
x = {(1, t):1, (2, t):2, }
print(yaml.dump(x))

So I thought copy.copy() would solve the problem; however, for tuples it doesn't seem to work :( Can I create a new tuple with a different id?

  • Perhaps you could show code so we can repro.
    Commented Nov 22, 2012 at 19:17
  • Does it work with x = {(1, t):1, (2, tuple(t)):2}?
    – Blckknght
    Commented Nov 22, 2012 at 19:34
  • No :( Somehow a=(1,); a is tuple(a) also confirms that there is no new tuple.
    – Gere
    Commented Nov 22, 2012 at 19:36
  • Ah, but tuple(list(t)) works :)
    – Gere
    Commented Nov 22, 2012 at 19:37

2 Answers


The PyYAML dumper uses an ignore_aliases method to prevent primitive types from being "anchored" and "referenced" in this way. You can override that method so that it always returns True, regardless of the object passed in. By default, yaml.dump uses the yaml.Dumper class¹, so that is the class to patch:

import sys
import yaml

t = ("b", "c")
x = {(1, t): 1, (2, t): 2}

# Never emit anchors/aliases, whatever object is being represented
yaml.Dumper.ignore_aliases = lambda *args: True

yaml.dump(x, sys.stdout)

will get you:

? !!python/tuple
- 1
- !!python/tuple [b, c]
: 1
? !!python/tuple
- 2
- !!python/tuple [b, c]
: 2

That way you don't have to work around the problem by making equal tuples look like different objects. You can also pass default_flow_style=False or default_flow_style=True to yaml.dump to get different layouts of the output.
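To compare the two layouts, here is a minimal sketch. It assumes a small yaml.Dumper subclass (the name PlainDumper is made up for this example) instead of patching yaml.Dumper globally, and the exact text printed may differ between PyYAML versions:

```python
import yaml

class PlainDumper(yaml.Dumper):  # hypothetical name, not part of PyYAML
    def ignore_aliases(self, data):
        # Treat every object as "not aliasable"
        return True

t = ("b", "c")
x = {(1, t): 1, (2, t): 2}

# Block style: one item per line, nesting shown by indentation
block_text = yaml.dump(x, Dumper=PlainDumper, default_flow_style=False)

# Flow style: inline {...} and [...] notation
flow_text = yaml.dump(x, Dumper=PlainDumper, default_flow_style=True)

print(block_text)
print(flow_text)
```

Both calls go through the same dumper class, so neither output should contain anchors or aliases; only the layout changes.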

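The reason you could not get this to work is that the representer keys on the result of id(), and copy.copy(t) and tuple(t) both return the very same tuple object (in CPython, at least), so the id does not change. Only a tuple that is actually rebuilt, e.g. with tuple(list(t)), gets a new id.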
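The identity behaviour can be checked directly. This is a minimal sketch, assuming CPython (where copy.copy() and tuple() hand back an existing tuple unchanged); the variable names are illustrative only:

```python
import copy
import yaml

t = ("b", "c")

# On CPython both of these return the very same object, so id() is unchanged
same_via_copy = copy.copy(t) is t
same_via_tuple = tuple(t) is t

# Rebuilding through a list creates a new, equal tuple with its own id
fresh = tuple(list(t))
is_fresh = fresh is not t and fresh == t

# With two distinct tuple objects PyYAML has nothing to alias
x = {(1, t): 1, (2, fresh): 2}
text = yaml.dump(x)

print(same_via_copy, same_via_tuple, is_fresh)
print(text)
```

This matches the observation in the question's comments that tuple(list(t)) avoids the references while tuple(t) does not. It only hides the symptom, though, which is why overriding ignore_aliases is the more robust route.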


¹ In ruamel.yaml (of which I am the author), an enhanced version of PyYAML capable of handling YAML 1.2, you can do:

    import ruamel.yaml

    yaml = ruamel.yaml.YAML()
    yaml.representer.ignore_aliases = lambda *args: True
  • Also, if you are using yaml.safe_dump() to remove the ? !!python/tuple tags from the output, you need to override yaml.SafeDumper.ignore_aliases instead. The example then becomes yaml.SafeDumper.ignore_aliases = lambda *args : True followed by yaml.safe_dump(x, sys.stdout).
    Commented Jun 14, 2019 at 19:11
  • FWIW this is now working (PyYAML 5.3.1). On some less recent versions of PyYAML, setting this caused an exception, at least in my code.
    – JL Peyret
    Commented Apr 1, 2020 at 21:16
  • Thanks! It's annoying that it's 2021 and the latest PyYAML version still requires this trick to work properly.
    – Tianyi Shi
    Commented Apr 4, 2021 at 22:26
  • @RafaelAlves, your answer doesn't work with a yaml.safe_dump() call on my host. Does PyYAML have to be a certain version to make your code work?
    – SQA777
    Commented Apr 11, 2021 at 1:17

This method works for me on Python 2 and Python 3, and does not require monkeypatching:

import yaml

class NoAliasDumper(yaml.SafeDumper):
    def ignore_aliases(self, data):
        return True

t = ("b", "c")
x = {(1, t):1, (2, t):2, }
print(yaml.dump(x, Dumper=NoAliasDumper))

which yields

? - 1
  - [b, c]
: 1
? - 2
  - [b, c]
: 2
  • The answer in the link you provide (ttl255.com/yaml-anchors-and-aliases-and-how-to-disable-them) works for me. To others: when you subclass yaml.SafeDumper, you have to call yaml.dump(), not yaml.safe_dump(), to get rid of the YAML references. Using yaml.safe_dump() will result in this error: "TypeError: dump_all() got multiple values for keyword argument 'Dumper'"
    – SQA777
    Commented Apr 11, 2021 at 1:20
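The point raised in that last comment can be reproduced with a short sketch. It reuses the NoAliasDumper class from the answer above; safe_dump_error is just an illustrative variable name:

```python
import yaml

class NoAliasDumper(yaml.SafeDumper):
    """SafeDumper variant that never emits anchors or aliases."""
    def ignore_aliases(self, data):
        return True

t = ("b", "c")
x = {(1, t): 1, (2, t): 2}

# yaml.safe_dump() already passes Dumper=SafeDumper internally,
# so supplying Dumper= a second time raises TypeError.
try:
    yaml.safe_dump(x, Dumper=NoAliasDumper)
    safe_dump_error = None
except TypeError as exc:
    safe_dump_error = exc

# yaml.dump() accepts the custom dumper without complaint
text = yaml.dump(x, Dumper=NoAliasDumper)

print(safe_dump_error)
print(text)
```

Because NoAliasDumper derives from yaml.SafeDumper, the output stays restricted to standard YAML tags even though yaml.dump() is the function being called.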
