
I'm using multiprocessing in my project. I have a worker function that puts its results in a queue. Everything works fine, but as the size of x increases (in my case x is an array) something goes wrong. Here is a simplified version of my code:

def do_work(queue, x):
    result = heavy_computation_function(x)
    queue.put(result)   # PROBLEM HERE

def parallel_something():
    queue = Queue()
    procs = [Process(target=do_work, args=i) for i in xrange(20)]
    for p in procs: p.start()
    for p in procs: p.join()

    results = []
    while not queue.empty():
        results.append(queue.get)

    return results

I see in the system monitor that the Python processes are working, but then something happens and all the processes are running yet doing nothing. This is what I get when I interrupt with Ctrl-C:

    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt

I did some tests, and the problem seems to be in putting the results into the queue: in fact, if I don't put the results, everything works, but then there would be no purpose.

    You seem to never be passing the queue object to the new process. Also args of Process should be a tuple. Try changing it to args=(queue, i). Your queue.get also requires some brackets so that it becomes queue.get().
    – Wessie
    Commented Nov 30, 2012 at 17:09

2 Answers


You are most probably generating a deadlock.

From the programming guidelines:

This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.

A possible fix is also proposed on that page. Keep in mind that unjoined processes are not "occupying" resources in any harmful sense, so you can drain the queued data after the processes have completed their work (perhaps using locks) and only join them afterwards.


Well, it looks like it's some bug in Python's Queue module. In fact, using..

from multiprocessing import Manager

queue = Manager().Queue()

..everything works but I still don't know why..:)

  • The difference is that you're instantiating Manager().Queue() instead of simply Queue(). I think this means that Manager.__init__() gets called in the first form but not in the second.
    – Patrick
    Commented Dec 1, 2012 at 7:19
