So I have a yaml file with lots of trivia questions and a list of answers. However, whenever I try to load this file and dump the contents in python with pyyaml, it dumps them backwards. I'm not sure if it's my yaml file or if I'm doing something wrong with the library.

Let's say that one of my question/answer pairs looks like this in the yaml file -

{"question": "What is the name of this sequence of numbers: 1, 1, 2, 3, 5, 8, 13, ...", 
 "answer": ["The Fibonacci Sequence", "The Padovan Sequence", "The Morris Sequence"]}

When I use yaml.dump() on that python dictionary, it dumps this -

"answer: [fibonacci, padovan, morris]\nquestion: 'what sequence is this: 1, 1, 2, 3, 5, 8, 13, ...'\n"

I was expecting this -

- question: "What is the name of this sequence of numbers: 1, 1, 2, 3, 5, 8, 13, ..."
  answer: ["The Fibonacci Sequence", "The Padovan Sequence", "The Morris Sequence"]

Am I doing something wrong here?

4 Answers

6

I have a somewhat different answer here. dbaupp's answer is correct if the order of elements is important to you for reasons other than readability. If the only reason you want question to show up before answer is to make the file more human-readable, then you don't need to use !!omap, and can instead use custom representers to get the order you want.

First of all, your problem with the dumper dumping without the - in front is because you're only dumping a single mapping, instead of a list of them. Put your dict inside a list and this will be fixed. So we start with:

d = [{"question": "What is the name of this sequence of numbers: 1, 1, 2, 3, 5, 8, 13, ...", 
 "answer": ["The Fibonacci Sequence", "The Padovan Sequence", "The Morris Sequence"]}]

Now we have a particular order we want the output to be, so we'll specify that, and convert to OrderedDict with that order:

from collections import OrderedDict
order = ['question', 'answer']
do = [ OrderedDict( sorted( z.items(), key=lambda x: order.index(x[0]) ) ) for z in d ]

Next, we need to tell PyYAML what to do with an OrderedDict. In this case, we don't want it to be an !!omap, we just want a mapping with a particular order. For reasons unclear to me, if you give dumper.represent_mapping a dict, or anything with an items attribute, it will sort the items before dumping, but if you give it the output of items() (e.g., a list of (key, value) tuples), it won't. Thus we can use

import yaml

def order_rep(dumper, data):
    # Passing items() instead of the dict itself avoids the key sorting described above.
    return dumper.represent_mapping(u'tag:yaml.org,2002:map', data.items(), flow_style=False)

yaml.add_representer(OrderedDict, order_rep)

And then, our output from print(yaml.dump(do)) ends up as:

- question: 'What is the name of this sequence of numbers: 1, 1, 2, 3, 5, 8, 13, ...'
  answer: [The Fibonacci Sequence, The Padovan Sequence, The Morris Sequence]
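For convenience, here is a minimal sketch that puts the steps above into one Python 3 script. It assumes PyYAML is installed; the variable name `text` is mine, and the exact layout of the nested answer list may differ between PyYAML versions, so the script only checks the key order.

```python
from collections import OrderedDict

import yaml

d = [{"answer": ["The Fibonacci Sequence", "The Padovan Sequence", "The Morris Sequence"],
      "question": "What is the name of this sequence of numbers: 1, 1, 2, 3, 5, 8, 13, ..."}]

# Desired key order for every question/answer pair.
order = ["question", "answer"]
do = [OrderedDict(sorted(z.items(), key=lambda kv: order.index(kv[0]))) for z in d]


def order_rep(dumper, data):
    # Hand PyYAML the (key, value) pairs directly so it does not sort them.
    return dumper.represent_mapping("tag:yaml.org,2002:map", data.items(), flow_style=False)


yaml.add_representer(OrderedDict, order_rep)

text = yaml.dump(do)
print(text)
```

Because the representer emits a plain map tag, loading `text` again gives ordinary dicts, not OrderedDicts.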

There are a number of different ways this could be done. Using OrderedDict isn't actually necessary at all, you just need the question/answer pairs to be of some class that you can write a representer for.

And again, do realize that this is only for human readability and aesthetic purposes. The order here will not be of any YAML significance, as it would if you were using !!omap. It just seemed like this was primarily important to you for readability.
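If the ordering really is only cosmetic, a shorter route may be available on newer setups. The sketch below assumes PyYAML 5.1 or later (where `yaml.dump` accepts a `sort_keys` argument) and Python 3.7 or later (where plain dicts keep insertion order); it is not what the answer above uses, and the sample data is made up for illustration.

```python
import yaml

# Keys are written in the order we want them to appear in the output.
questions = [
    {"question": "What is your name?", "answer": "My name is Bob."},
    {"question": "How are you?", "answer": "I am fine."},
]

# sort_keys=False asks PyYAML to keep the dict's own order instead of sorting keys.
text = yaml.dump(questions, sort_keys=False, default_flow_style=False)
print(text)
```

On older PyYAML versions this argument does not exist, so the custom representer remains the portable option.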

4

If a specific order is preferred when dumping, the code below can be used:

import yaml

class MyDict(dict):
   def to_omap(self):
      return [('question', self['question']), ('answer', self['answer'])]

def represent_omap(dumper, data):
   return dumper.represent_mapping(u'tag:yaml.org,2002:map', data.to_omap())

yaml.add_representer(MyDict, represent_omap)

questions = [
   MyDict({'answer': 'My name is Bob.', 'question': 'What is your name?'}),
   MyDict({'question': 'How are you?', 'answer': 'I am fine.'}),
]
print(yaml.dump(questions, default_flow_style=False))

The output is:

- question: What is your name?
  answer: My name is Bob.
- question: How are you?
  answer: I am fine.
  • +1 this is neat and works well. I like to have representer as a @staticmethod on MyDict, to keep things together. So you do yaml.add_representer(MyDict, MyDict.representer) instead.
    – Day
    Commented Aug 6, 2013 at 16:37
  • But this doesn't work when dumping with yaml.safe_dump. Any idea how I can use safe_dump and a custom representer as above? I get an exception: yaml.representer.RepresenterError: cannot represent an object: {'answer': 'My name is Bob.', 'question': 'What is your name?'}
    – Day
    Commented Aug 6, 2013 at 16:39
  • To answer my own previous comment: Use yaml.SafeDumper.add_representer(...) instead of yaml.add_representer(...)
    – Day
    Commented Aug 6, 2013 at 16:47
  • This is a neat trick; here's my version using OrderedDict and it's only one additional line: yaml.add_representer(OrderedDict, lambda dumper, data: dumper.represent_mapping(u'tag:yaml.org,2002:map', data.items()))
    – berto
    Commented May 24, 2016 at 13:25
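To make the safe_dump point from the comments concrete, here is a small sketch in Python 3. It reuses MyDict and represent_omap from the answer and registers the representer on yaml.SafeDumper as Day suggests; the variable name `text` is mine.

```python
import yaml


class MyDict(dict):
    def to_omap(self):
        # Fixed key order: question first, then answer.
        return [("question", self["question"]), ("answer", self["answer"])]


def represent_omap(dumper, data):
    # A list of pairs is emitted in the order given, without sorting.
    return dumper.represent_mapping("tag:yaml.org,2002:map", data.to_omap())


# Register on SafeDumper so that yaml.safe_dump knows how to handle MyDict.
yaml.SafeDumper.add_representer(MyDict, represent_omap)

questions = [MyDict({"answer": "My name is Bob.", "question": "What is your name?"})]
text = yaml.safe_dump(questions, default_flow_style=False)
print(text)
```

Registering with yaml.add_representer alone targets the default Dumper, which is why safe_dump raised RepresenterError in the comment above.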
2

YAML associative arrays are unordered by definition, and Python dictionaries did not guarantee any order before Python 3.7.

However, if order is important, YAML defines an ordered map, !!omap, which PyYAML by default parses into a list of tuples, e.g.:

>>> yaml.safe_load('''!!omap
... - a: foo
... - b: bar''')
[('a', 'foo'), ('b', 'bar')]

This answer gives some details about how to load an !!omap into a Python OrderedDict.
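Since the linked answer is not reproduced here, the following is only a rough sketch of one simple way to do it, not necessarily the method that answer describes: load the !!omap as a list of pairs and pass that list to OrderedDict. The names `doc`, `pairs` and `ordered` are mine.

```python
from collections import OrderedDict

import yaml

doc = """!!omap
- a: foo
- b: bar
"""

# PyYAML's safe loader turns an !!omap into a list of (key, value) tuples.
pairs = yaml.safe_load(doc)

# OrderedDict accepts an iterable of pairs and keeps their order.
ordered = OrderedDict(pairs)
print(list(ordered.items()))
```

This converts after loading; registering a custom constructor for the omap tag would be the alternative if the conversion should happen inside the loader.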

  • Thank you, so I was doing something wrong and I just didn't know what it was.
    – Matt Habel
    Commented Mar 31, 2012 at 0:42
1

If it's loading them as a dictionary, their order is arbitrary. Dictionaries are not ordered containers (at least not before Python 3.7).

  • I know that; what matters is how I dump them. The dumped string I showed neither represents the answers as strings nor formats them correctly (with the - in front of the first one)
    – Matt Habel
    Commented Mar 31, 2012 at 0:21
  • @Matt, PyYAML is basically the reference implementation of a YAML loader/dumper, and so (especially since this situation is such a common operation) its output will be as per the standard.
    – huon
    Commented Mar 31, 2012 at 0:34
