
The code below works:

import multiprocessing
import threading
import time

file_path = 'C:/TEST/0000.txt'

class H(object):
    def __init__(self, path):
        self.hash_file = open(path, 'rb')
    def read_line(self):
        print self.hash_file.readline()

h = H(file_path)
h.read_line()

But when I use it in a Process:

import multiprocessing
import threading
import time

file_path = 'C:/TEST/0000.txt'

class Worker(multiprocessing.Process):
    def __init__(self, path):
        super(Worker, self).__init__()
        self.hash_file = open(path, 'rb')
    def run(self):
        while True:
            for i in range(1000):
                print self.hash_file.readline()
                time.sleep(1.5)


if __name__ == '__main__':
    w = Worker(file_path)
    w.start()
    w.join()

it raises an exception:

Process Worker-1:
Traceback (most recent call last):
  File "E:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\ts_file_open.py", line 31, in run
    print self.hash_file.readline()
ValueError: I/O operation on closed file

Because open costs a lot and I only need to read the file, I think opening it once would be enough. But why is this file object closed when the process runs? And I also want to pass this file object to the child process and to child threads of the child process.

  • Try opening the file in your run() method.
    – monkut
    Commented Aug 19, 2014 at 6:10
  • @monkut I know that would work, but why?
    – Mithril
    Commented Aug 19, 2014 at 6:12
    You're creating the Worker instance in one process, and run is called in another.
    – monkut
    Commented Aug 19, 2014 at 6:15
  • @monkut This behavior is a little strange.
    – Mithril
    Commented Aug 19, 2014 at 6:20
  • @monkut I see Popen in the process start method. That seems to be the point.
    – Mithril
    Commented Aug 19, 2014 at 6:21

1 Answer


This fails because you're opening the file in the parent process, but trying to use it in the child. File descriptors from the parent process are not inherited by the child on Windows (because it's not using os.fork to create the new process), so the read operation fails in the child. Note that this code will actually work on Linux, because the file descriptor gets inherited by the child, due to the nature of os.fork.

Also, I don't think the open operation itself is particularly expensive. Actually reading the file is potentially expensive, but the open operation itself should be fast.

  • Per PEP 446, all descriptors are now non-inheritable by default in Python 3.4+ under Windows & Unix. This PEP adds a bunch of new functions. Adding os.set_inheritable(self.hash_file.fileno(), False) in the __init__ method should make it work on Windows in 3.4+ (but it's not necessary on Unix, as it is a multiprocessing call, not a subprocess call). Commented Aug 19, 2014 at 6:47
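As a quick check of the PEP 446 behaviour the comment describes: on Python 3.4+, a freshly created descriptor reports as non-inheritable by default, and os.set_inheritable/os.get_inheritable are among the new functions the PEP added. A small sketch (a pipe is used here just to get a fresh descriptor without touching the filesystem):

```python
import os

rfd, wfd = os.pipe()                 # any fresh descriptor will do
default = os.get_inheritable(rfd)    # False by default on Python 3.4+ (PEP 446)
os.set_inheritable(rfd, True)        # explicitly opt in to inheritance
after = os.get_inheritable(rfd)
print(default, after)
os.close(rfd)
os.close(wfd)
```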
