
I'm having a problem with the Python multiprocessing package. Below is a simple example that illustrates the problem.

import multiprocessing as mp
import time

def test_file(f):
  f.write("Testing...\n")
  print f.name
  return None

if __name__ == "__main__":
  f = open("test.txt", 'w')
  proc = mp.Process(target=test_file, args=[f])
  proc.start()
  proc.join()

When I run this, I get the following error.

Process Process-1:
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
    self.target(*self._args, **self._kwargs)
  File "C:\Users\Ray\Google Drive\Programming\Python\tests\follow_test.py", line 24, in test_file
    f.write("Testing...\n")
ValueError: I/O operation on closed file
Press any key to continue . . .

It seems that somehow the file handle is 'lost' during the creation of the new process. Could someone please explain what's going on?

  • stackoverflow.com/questions/1075443/… ... you may want to dump your output onto a queue, and when all your processes are complete, pop the output off the queue and write it out via the main process
    – pyInTheSky, Feb 15, 2013 at 17:04

1 Answer

I had similar issues in the past. I'm not sure whether it happens within the multiprocessing module or whether open sets the close-on-exec flag by default, but I do know that file handles opened in the main process are closed in the multiprocessing children.

The obvious workaround is either to pass the filename as a parameter to the child process's init function and open it once within each child (if using a pool), or to pass it as a parameter to the target function and open/close on each invocation. The former requires the use of a global to store the file handle (not a good thing) - unless someone can show me how to avoid that :) - and the latter can incur a performance hit (but it can be used with multiprocessing.Process directly).

Example of the former:

filehandle = None

def child_init(filename):
    global filehandle
    filehandle = open(filename,...)
    ../..

def child_target(args):
    ../..

if __name__ == '__main__':
    # some code which defines filename
    proc = multiprocessing.Pool(processes=1, initializer=child_init, initargs=[filename])
    proc.apply(child_target, args)
