
I'm getting a "too many files are open" error in a program that is supposed to run for a long time. Is there any way I can keep track of which files are open, so I can print that list out occasionally and see where the problem is?


11 Answers


To list all open files in a cross-platform manner, I would recommend psutil.

#!/usr/bin/env python
import psutil

for proc in psutil.process_iter():
    print(proc.open_files())

The original question implicitly restricts the operation to the currently running process, which can be accessed through psutil's Process class.

proc = psutil.Process()
print(proc.open_files())

Lastly, you'll want to run the code using an account with the appropriate permissions to access this information or you may see AccessDenied errors.

  • This only seems to work for disk-based files, not sockets, fifos, etc. Commented Aug 29, 2014 at 13:14
  • @NeilenMarais: it does work for sockets, see examples at pypi.python.org/pypi/psutil Commented Jan 13, 2015 at 10:53
  • New versions of psutil use the name proc.get_open_files()
    – gerardw
    Commented Dec 1, 2016 at 23:01
  • Doesn't work at all for me. Simple test: from psutil import Process; proc = Process(); then with open('fileincurrentdirectory') as f: print(proc.open_files()) prints an empty list, even though I used the name of an actual file in the current directory. Commented Jan 18, 2018 at 14:57
  • ImportError: No module named psutil Commented Apr 7, 2022 at 17:09

I ended up wrapping the built-in open functions at the entry point of my program, and found out that I wasn't closing my loggers.

import io
import sys
import builtins
import traceback
from functools import wraps


def opener(old_open):
    @wraps(old_open)
    def tracking_open(*args, **kw):
        file = old_open(*args, **kw)

        old_close = file.close
        @wraps(old_close)
        def close():
            old_close()
            open_files.remove(file)
        file.close = close
        file.stack = traceback.extract_stack()

        open_files.add(file)
        return file
    return tracking_open


def print_open_files():
    print(f'### {len(open_files)} OPEN FILES: [{", ".join(f.name for f in open_files)}]', file=sys.stderr)
    for file in open_files:
        print(f'Open file {file.name}:\n{"".join(traceback.format_list(file.stack))}', file=sys.stderr)


open_files = set()
io.open = opener(io.open)
builtins.open = opener(builtins.open)
  • @Claudiu - can you please show how to close all open files - def closeall(self): print "### CLOSING All files ###" oldfile.close(self,(f.x for f in openfiles)) openfiles.remove(self,(f.x for f in openfiles)) - will that work?
    – Programmer
    Commented Dec 13, 2014 at 11:52
  • In order not to extend the lifetimes of the file objects (and hence prevent auto-closing of reference-counted objects in CPython), it's IMO worth using a weakref.WeakSet instead of a plain set for openfiles.
    – coldfix
    Commented Apr 23, 2015 at 3:17
  • Thanks, worked brilliantly for me too. Note that I had exceptions when patching a program also using sklearn joblib: the same file could be closed twice by joblib.load, causing an exception in openfiles.remove(self).
    – massyah
    Commented Aug 22, 2016 at 12:38
  • When I use this snippet I get the following error: AttributeError: 'file' object attribute 'close' is read-only. Any idea how I can fix it? $ python --version Python 2.7.10
    – fsquirrel
    Commented Jun 8, 2018 at 14:22

On Linux, you can look at the contents of /proc/self/fd:

$ ls -l /proc/self/fd/
total 0
lrwx------ 1 foo users 64 Jan  7 15:15 0 -> /dev/pts/3
lrwx------ 1 foo users 64 Jan  7 15:15 1 -> /dev/pts/3
lrwx------ 1 foo users 64 Jan  7 15:15 2 -> /dev/pts/3
lr-x------ 1 foo users 64 Jan  7 15:15 3 -> /proc/9527/fd
  • Is this just for CPython or all implementations? I remember seeing, I think, that files open in ipython are listed in /proc/ipython_pid/fd/. Also, in the list above, how do you know what are files you opened and which are files that Python opened (and which you shouldn't close)?
    – Chris
    Commented Jan 26, 2012 at 11:13
  • This is for Linux systems which provide the /proc filesystem. It's independent of language; any program in any language that can access the "files" in /proc can get this information. I haven't messed with ipython, but the basic idea would be to record the contents of /proc/self/fd after initialization and then compare the contents later in the run to look for changes. Commented Jan 26, 2012 at 14:18

Although the solutions above that wrap open are useful for one's own code, I was debugging my client code for a third-party library that included some C extension code, so I needed a more direct way. The following routine works on Darwin and (I hope) other Unix-like environments:

Python 3:

import os
import shutil
import subprocess


def get_open_fds() -> int:
    """Get the number of open file descriptors for the current process."""
    lsof_path = shutil.which("lsof")
    if lsof_path is None:
        raise NotImplementedError("lsof is not available on this system")
    raw_procs = subprocess.check_output(
        [lsof_path, "-w", "-Ff", "-p", str(os.getpid())]
    )

    def filter_fds(lsof_entry: str) -> bool:
        return lsof_entry.startswith("f") and lsof_entry[1:].isdigit()

    fds = list(filter(filter_fds, raw_procs.decode().split(os.linesep)))
    return len(fds)

Python 2:

def get_open_fds():
    '''
    return the number of open file descriptors for current process

    .. warning: will only work on UNIX-like os-es.
    '''
    import subprocess
    import os

    procs = subprocess.check_output(
        ['lsof', '-w', '-Ff', '-p', str(os.getpid())]
    )

    nprocs = len(
        filter(
            lambda s: s and s[0] == 'f' and s[1:].isdigit(),
            procs.split('\n'))
    )
    return nprocs

If anyone can extend this to be portable to Windows, I'd be grateful.


On Linux, you can use lsof to show all files opened by a process.

  • Has Python some internal function for lsof, or do I really have to call the Linux lsof?
    – sumid
    Commented Sep 6, 2011 at 21:02

As mentioned earlier, on Linux you can list a process's fds under /proc/self/fd; here is a simple way to list them programmatically:

import os
import sys
import errno

def list_fds():
    """List process currently open FDs and their target """
    if not sys.platform.startswith('linux'):
        raise NotImplementedError('Unsupported platform: %s' % sys.platform)

    ret = {}
    base = '/proc/self/fd'
    for num in os.listdir(base):
        path = None
        try:
            path = os.readlink(os.path.join(base, num))
        except OSError as err:
            # Last FD is always the "listdir" one (which may be closed)
            if err.errno != errno.ENOENT:
                raise
        ret[int(num)] = path

    return ret

On Windows, you can use Process Explorer to show all file handles owned by a process.


There are some limitations to the accepted answer: it does not seem to count pipes. I had a Python script that opened many sub-processes and was failing to properly close the standard input, output, and error pipes used for communication. The accepted answer fails to count these open pipes as open files, but (at least on Linux) they are open files and count toward the open-file limit. The lsof -p solution suggested by sumid and shunc works in this situation, because it also shows you the open pipes.
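A quick way to see this on Linux is that freshly created pipe descriptors show up in /proc/self/fd just like regular files (a sketch, Linux only):

```python
import os

# Pipes consume file descriptors exactly like regular files do.
r, w = os.pipe()
try:
    fds = set(os.listdir('/proc/self/fd'))   # Linux-only view of open fds
    print(str(r) in fds, str(w) in fds)      # both True while the pipe is open
finally:
    os.close(r)
    os.close(w)
```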


To get a list of all open files on Windows, you can use handle.exe, which is part of Microsoft's Sysinternals Suite. An alternative is the psutil Python module, but I find 'handle' prints out more files in use.

Here is what I made. Kludgy code warning.

#!/usr/bin/env python3
# coding: utf-8
"""Build set of files that are in-use by processes.
   Requires 'handle.exe' from Microsoft SysInternals Suite.
   This seems to give a more complete list than using the psutil module.
"""

from collections import OrderedDict
import os
import re
import subprocess

# Path to handle executable
handle = "E:/Installers and ZIPs/Utility/Sysinternalssuite/handle.exe"

# Get output string from 'handle'
handle_str = subprocess.check_output([handle]).decode(encoding='ASCII')

""" Build list of lists.
    1. Split string output, using '-' * 78 as section breaks.
    2. Ignore first section, because it is executable version info.
    3. Turn list of strings into a list of lists, ignoring first item (it's empty).
"""
work_list = [x.splitlines()[1:] for x in handle_str.split(sep='-' * 78)[1:]]

""" Build OrderedDict of pid information.
    pid_dict['pid_num'] = ['pid_name','open_file_1','open_file_2', ...]
"""
pid_dict = OrderedDict()
re1 = re.compile(r"(.*?\.exe) pid: ([0-9]+)")  # pid name, pid number
re2 = re.compile(r".*File.*\s\s\s(.*)")  # File name
for x_list in work_list:
    key = ''
    file_values = []
    m1 = re1.match(x_list[0])
    if m1:
        key = m1.group(2)
#        file_values.append(m1.group(1))  # pid name first item in list

    for y_strings in x_list:
        m2 = re2.match(y_strings)
        if m2:
            file_values.append(m2.group(1))
    pid_dict[key] = file_values

# Make a set of all the open files
values = []
for v in pid_dict.values():
    values.extend(v)
files_open = sorted(set(values))

txt_file = os.path.join(os.getenv('TEMP'), 'lsof_handle_files')

with open(txt_file, 'w') as fd:
    for a in sorted(files_open):
        fd.write(a + '\n')
subprocess.call(['notepad', txt_file])
os.remove(txt_file)

I'd guess that you are leaking file descriptors. You probably want to look through your code to make sure that you are closing all of the files that you open.
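If the leak is just forgotten close() calls, opening files in a with block is the usual fix, since the file is closed even when an exception escapes:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)  # close the low-level descriptor from mkstemp

# The context manager closes the file when the block exits,
# whether normally or via an exception.
with open(path, 'w') as f:
    f.write('hello\n')

print(f.closed)  # True: the handle was released at the end of the block
os.remove(path)
```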

  • I figured that's what the problem was. However the code is very complex, and this would be an easy way to immediately spot which files aren't being closed.
    – Claudiu
    Commented Jan 7, 2010 at 21:06

You can use the following script. It builds on Claudiu's answer. It addresses some of the issues and adds additional features:

  • Prints a stack trace of where the file was opened
  • Prints on program exit
  • Keyword argument support

Here's the code and a link to the gist, which is possibly more up to date.

"""
Collect stacktraces of where files are opened, and prints them out before the
program exits.

Example
========

monitor.py
----------
from filemonitor import FileMonitor
FileMonitor().patch()
f = open('/bin/ls')
# end of monitor.py

$ python monitor.py
  ----------------------------------------------------------------------------
  path = /bin/ls
  >   File "monitor.py", line 3, in <module>
  >     f = open('/bin/ls')
  ----------------------------------------------------------------------------

Solution modified from:
https://stackoverflow.com/questions/2023608/check-what-files-are-open-in-python
"""
from __future__ import print_function
import __builtin__
import traceback
import atexit
import textwrap


class FileMonitor(object):

    def __init__(self, print_only_open=True):
        self.openfiles = []
        self.oldfile = __builtin__.file
        self.oldopen = __builtin__.open

        self.do_print_only_open = print_only_open
        self.in_use = False

        class File(self.oldfile):

            def __init__(this, *args, **kwargs):
                path = args[0]

                self.oldfile.__init__(this, *args, **kwargs)
                if self.in_use:
                    return
                self.in_use = True
                self.openfiles.append((this, path, this._stack_trace()))
                self.in_use = False

            def close(this):
                self.oldfile.close(this)

            def _stack_trace(this):
                try:
                    raise RuntimeError()
                except RuntimeError as e:
                    stack = traceback.extract_stack()[:-2]
                    return traceback.format_list(stack)

        self.File = File

    def patch(self):
        __builtin__.file = self.File
        __builtin__.open = self.File

        atexit.register(self.exit_handler)

    def unpatch(self):
        __builtin__.file = self.oldfile
        __builtin__.open = self.oldopen

    def exit_handler(self):
        indent = '  > '
        terminal_width = 80
        for file, path, trace in self.openfiles:
            if file.closed and self.do_print_only_open:
                continue
            print("-" * terminal_width)
            print("  {} = {}".format('path', path))
            lines = ''.join(trace).splitlines()
            _updated_lines = []
            for l in lines:
                ul = textwrap.fill(l,
                                   initial_indent=indent,
                                   subsequent_indent=indent,
                                   width=terminal_width)
                _updated_lines.append(ul)
            lines = _updated_lines
            print('\n'.join(lines))
            print("-" * terminal_width)
            print()
  • In recent Python there is no module __builtin__; it can be replaced by builtins. Commented Apr 5, 2023 at 19:39
