9

I know it is good practice in Python to call close on a file once it is no longer needed. I tried opening a large number of files without closing them (all in the same Python process), but did not see any exceptions or errors. I tried this on both Mac and Linux. So I am wondering whether Python is smart enough to manage file handles and close or reuse them automatically, so that we do not need to worry about closing files?

Thanks in advance, Lin

1
  • 6
    When you think about it this is equivalent to "If I forget to signal when driving I have not yet had an accident so should I stop using my indicators?" - just because you usually get away with something doesn't make it a good idea. Commented May 7, 2015 at 5:07

5 Answers

20

Python will, in general, garbage collect objects that are no longer in use and no longer referenced. This means it's entirely possible that open file objects that meet the garbage collector's criteria will get cleaned up and, most likely, closed. However, you should not rely on this, and should instead use:

with open(..):

Example (Also best practice):

with open("file.txt", "r") as f:
    data = f.read()  # do something with f; it is closed when the block exits

NB: If you don't close files and leave open file descriptors around on your system, you will eventually hit resource limits, specifically the per-process limit on open files reported by "ulimit -n". You will then start to see OS errors such as "too many open files". (Assuming Linux here, but other OSes behave similarly.)

Important: It's also good practice to close any open files you've written to, so that the data you have written is properly flushed. This helps to ensure data integrity, so that files do not unexpectedly contain corrupted data because of an application crash.

It's worth noting that the important note above is the cause of a very common problem: you write to a file, read it back, and find it empty; then you exit your Python program, open the file in a text editor, and find that it's not empty after all.
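The buffering effect described above can be reproduced with a short sketch. The helper name read_back_before_and_after_close and the temporary file are illustrative assumptions, not part of the original answer; the sketch relies on Python's default buffered text I/O, where a small write stays in memory until the file is flushed or closed.

```python
import os
import tempfile


def read_back_before_and_after_close():
    """Write a short line, then read the file before and after closing the writer."""
    fd, path = tempfile.mkstemp(suffix=".txt")  # hypothetical scratch file
    os.close(fd)
    writer = open(path, "w")
    try:
        # A small write normally sits in the in-memory buffer.
        writer.write("Hi there!\n")
        with open(path) as reader:
            before = reader.read()  # read while the writer is still open
        writer.close()  # closing flushes the buffer to disk
        with open(path) as reader:
            after = reader.read()
        return before, after
    finally:
        writer.close()  # harmless if already closed
        os.remove(path)


before, after = read_back_before_and_after_close()
print(repr(before), repr(after))
```

With default buffering, the first read should come back empty and the second should contain the line, which is exactly the "empty in Python, not empty in the editor" symptom.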

Demo: A good example of the kind of resource limits and errors you might hit if you don't ensure you close open file(s):

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> xs = [open("/dev/null", "r") for _ in xrange(100000)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 24] Too many open files: '/dev/null'
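For Python 3 on a Unix-like system, the same failure can be triggered in a more controlled way. This is only a sketch under assumptions: the function name open_until_limit and the limit of 64 are made up for illustration, and the resource module is not available on Windows. It temporarily lowers the soft limit on open files, opens /dev/null until the OS refuses, then closes everything and restores the old limit.

```python
import errno
import resource


def open_until_limit(path="/dev/null", soft_limit=64):
    """Open `path` repeatedly under a lowered soft limit; return (count, errno)."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit allows.
    if hard == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    handles = []
    try:
        # soft_limit + 1 attempts are always enough to exceed the limit.
        for _ in range(soft_limit + 1):
            handles.append(open(path))
        return len(handles), None
    except OSError as exc:
        return len(handles), exc.errno
    finally:
        for handle in handles:
            handle.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


count, err = open_until_limit()
print(count, errno.errorcode.get(err))
```

The count depends on how many descriptors the process already has open, so it will be somewhat below the chosen limit.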
5
  • This is good information, related, but I'm not sure it answers the question "why do I need to close files?" Commented May 7, 2015 at 5:01
  • Why? Because there's an OS limit on the number of simultaneously open files. This isn't necessarily Python's fault. That said, all files opened by a program are closed by the OS on program termination, so sometimes you can get away with closing them that way. Commented May 7, 2015 at 5:03
  • No one said it's "Python's fault". :) Commented May 7, 2015 at 5:03
  • 3
    Nice inclusion of the extra problems with writable files. (Closing them also avoids the 69105 questions on StackOverflow about "when I read the file it's empty, but then when I look at it in Notepad it's not", because they read the file in Python before closing the writable file handle, but then looked at it in Notepad after the program exited…)
    – abarnert
    Commented May 7, 2015 at 5:36
  • @abarnert Thanks :) That's exactly what I was hoping for in this response based on some of our own experiences! It's like Steve Barnes's comment -- "Just because you don't run into problems now doesn't mean you won't later!" Commented May 7, 2015 at 5:38
5

To add to James Mills' answer, if you need to understand the specific reason you don't see any errors:

Python defines that when a file gets garbage collected, it will be automatically closed. But Python leaves it up to the implementation how it wants to handle garbage collection. CPython, the implementation you're probably using, does it with reference counting: as soon as the last reference to an object goes away, it's immediately collected. So, all of these will appear to work in CPython:

def spam():
    for i in range(100000):
        open('spam.txt') # collected at end of statement

def eggs():
    for i in range(100000):
        f = open('eggs.txt') # collected next loop, when f is reassigned

def beans():
    def toast():
        f = open('toast.txt') # collected when toast exits
    for i in range(100000):
        toast()

But many other implementations (including the other big three, PyPy, Jython, and IronPython) use smarter garbage collectors that detect garbage on the fly, without having to keep track of all the references. This makes them more efficient, better at threading, etc., but it means that it's not deterministic when an object gets collected. So the same code will not work. Or, worse, it will work in your 60 tests, and then fail as soon as you're doing a demo for your investors.
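The difference can be observed directly. The sketch below is an illustration rather than anything from the original answer: the helper names fd_is_open and fd_closed_after_del are assumptions, and the check uses os.fstat on the raw descriptor number. On CPython the descriptor should be closed immediately after del; on an implementation without reference counting it may still be open at that point.

```python
import os
import platform
import tempfile


def fd_is_open(fd):
    """Return True if the OS still considers descriptor `fd` open."""
    try:
        os.fstat(fd)
    except OSError:
        return False
    return True


def fd_closed_after_del():
    """Open a file, drop the only reference, and report descriptor state."""
    tmp_fd, path = tempfile.mkstemp()  # hypothetical scratch file
    os.close(tmp_fd)
    try:
        f = open(path)
        fd = f.fileno()
        open_before = fd_is_open(fd)
        del f  # last reference goes away without an explicit close()
        open_after = fd_is_open(fd)
        return open_before, open_after
    finally:
        os.remove(path)


print(platform.python_implementation(), fd_closed_after_del())
```

Code that needs the same behaviour everywhere should not depend on this result and should close the file explicitly or use a with statement.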

It would be a shame if you needed PyPy's speed or IronPython's .NET integration, but couldn't have it without rewriting all your code. Or if someone else wanted to use your code, but needed it to work in Jython, and had to look elsewhere.


Meanwhile, even in CPython, the interpreter doesn't collect all of its garbage at shutdown. (It's getting better, but even in 3.4 it's not perfect.) So in some cases, you're relying on the OS to close your files for you. The OS will usually flush them when it does so—but maybe not if you, e.g., had them open in a daemon thread, or you exited with os._exit, or segfaulted. (And of course definitely not if you exited by someone tripping over the power cord.)


Finally, even CPython (since 3.2, when ResourceWarning was added) has code specifically to generate warnings if you let your files be garbage collected instead of closing them. Those warnings are off by default, but people regularly propose turning them on, and one day it may happen.
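To see those warnings without changing interpreter flags, they can be enabled and captured with the standard warnings module. This is a minimal sketch assuming CPython 3; the function name warnings_for_unclosed_file is made up for illustration. It drops the last reference to an unclosed file inside warnings.catch_warnings and collects any ResourceWarning that results.

```python
import gc
import os
import tempfile
import warnings


def warnings_for_unclosed_file():
    """Return the ResourceWarnings emitted when an unclosed file is collected."""
    fd, path = tempfile.mkstemp()  # hypothetical scratch file
    os.close(fd)
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # ResourceWarning is ignored by default
            f = open(path)
            del f         # file is finalized without close()
            gc.collect()  # helps on implementations without reference counting
        return [w for w in caught if issubclass(w.category, ResourceWarning)]
    finally:
        os.remove(path)


for w in warnings_for_unclosed_file():
    print(w.category.__name__, w.message)
```

Running a test suite with these warnings enabled is a cheap way to find files that are being left for the garbage collector.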

3

You do need to close (output) files in Python.

One reason is to make sure buffered output is flushed to them. If you don't properly close files and your program is killed for some reason, the file that was left open can end up corrupted.

In addition, there is this: Why python has limit for count of file handles?

3

There are two good reasons.

  1. If your program crashes or is unexpectedly terminated, then output files may be corrupted.
  2. It's good practice to close what you open.
2

It's a good idea to handle file closing. It's not the sort of thing that will give errors and exceptions: it will corrupt files, or not write what you tried to write, and so on.

The most common Python interpreter, CPython, which you're probably using, does, however, try to handle file closing smartly, just in case you don't. If you open a file, and then it gets garbage collected, which generally happens when there are no longer any references to it, CPython will close the file.

So for example, if you have a function like

def write_something(fname):
    f = open(fname, 'w')
    f.write("Hi there!\n")

then Python will generally close the file at some point after the function returns.

That's not that bad for simple situations, but consider this:

def do_many_things(fname):

    # Some stuff here

    f = open(fname, 'w')
    f.write("Hi there!\n")

    # All sorts of other stuff here

    # Calls to some other functions

    # more stuff        

    return something

Now you've opened the file, but it could be a long time before it is closed. On some OSes, that might mean other processes won't be able to open it. If the other stuff has an error, your message might not actually get written to the file. If you're writing quite a bit of stuff, some of it might be written, and some other parts might not; if you're editing a file, you might cause all sorts of problems. And so on.
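One way to avoid these problems is to keep the file open only for as long as it is being written. The sketch below reworks do_many_things under assumptions: the RuntimeError stands in for "all sorts of other stuff" going wrong, and the demo helper and temporary file are hypothetical. Because the with block closes and flushes the file before the later steps run, the message is on disk even though the function fails afterwards.

```python
import os
import tempfile


def do_many_things(fname):
    # Keep the file open only while writing to it.
    with open(fname, "w") as f:
        f.write("Hi there!\n")
    # The file is already closed and flushed at this point.
    raise RuntimeError("a later step failed")  # stand-in for other work failing


def demo():
    """Run do_many_things, swallow its failure, and return what was written."""
    fd, path = tempfile.mkstemp()  # hypothetical scratch file
    os.close(fd)
    try:
        try:
            do_many_things(path)
        except RuntimeError:
            pass
        with open(path) as f:
            return f.read()
    finally:
        os.remove(path)


print(repr(demo()))
```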

An interesting question to consider, however, is whether, in OSes where files can be open for reading by multiple processes, there's any significant risk to opening a file for reading and not closing it.

2
  • Python will generally close the file after the function returns -- Are you sure? It doesn't have to garbage collect on function exit. Commented May 7, 2015 at 5:05
  • I'm referring only to CPython; it should close after the function returns if that removes the last reference to it, which it would in the simple example.
    – cge
    Commented May 7, 2015 at 6:29
