230

I need to get the latest file in a folder using Python. While using the code:

max(files, key = os.path.getctime)

I am getting the below error:

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'a'

2
  • 3
    Which file are you trying to find? Add your relevant code to the question. Commented Sep 5, 2016 at 9:00
  • 2
    I'm guessing why it might not be working for you: is "files" a list of filename elements or a single filename string?
    – mpurg
    Commented Sep 5, 2016 at 9:11

11 Answers

541

Whatever is assigned to the files variable is incorrect. Use the following code.

import glob
import os

list_of_files = glob.glob('/path/to/folder/*')  # '*' matches everything; use e.g. '*.csv' for a specific format
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
12
  • 4
    What if instead of a file I want to find the latest created/modified folder ?
    – lucians
    Commented Sep 8, 2017 at 15:36
  • 4
    @Link the same code works for that. To check whether the result is a folder, use os.path.isdir(latest_file). Commented Sep 11, 2017 at 4:23
  • 9
    Weird. I had to use "min" to get the latest file. Some searching around hinted that it's os specific.
    – Graeck
    Commented Dec 12, 2017 at 23:53
  • 30
    This is an excellent answer--THANK YOU! I like to work with pathlib.Path objects more than strings and os.path. With pathlib.Path objects your answer becomes: list_of_paths = folder_path.glob('*'); latest_path = max(list_of_paths, key=lambda p: p.stat().st_ctime)
    – Phil
    Commented Apr 25, 2018 at 22:42
  • 5
    @phil You can still use os.path.getctime as key, even with Path objects. Commented Nov 20, 2018 at 14:11
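
The pathlib variant from the comments above can be sketched roughly as follows. The function names latest_path and latest_dir are made up for illustration, and the sketch assumes Python 3.6+, where os.path.getctime accepts path-like objects.

```python
import os
from pathlib import Path


def latest_path(folder, pattern="*"):
    """Return the newest Path in folder that matches pattern."""
    # os.path.getctime accepts Path objects directly, so no str() is needed.
    return max(Path(folder).glob(pattern), key=os.path.getctime)


def latest_dir(folder):
    """Same idea, restricted to sub-folders (hypothetical helper)."""
    dirs = (p for p in Path(folder).iterdir() if p.is_dir())
    return max(dirs, key=os.path.getctime)
```

Both helpers raise ValueError when nothing matches, exactly like the original max() call.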
76
max(files, key = os.path.getctime)

is quite incomplete code. What is files? It probably is a list of file names, coming out of os.listdir().

But this list contains only the filename parts (a.k.a. "basenames"), because they all share the same directory. To use them correctly, you have to join each one with the path leading to it (the one used to obtain the list).

Such as (untested):

import os

def newest(path):
    # os.listdir() returns bare names; join them with the directory first
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)
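
To illustrate where a one-letter "file name" such as 'a' can come from, here is a small sketch. It is only a guess at the cause, since the question does not show how files was built: if files is a single string, max() walks over its characters. The helper name newest_in and the sample values are made up.

```python
import os
import tempfile


def newest_in(folder):
    """Return the full path of the most recently changed entry in folder."""
    # os.listdir() yields bare names, so join each one with the folder.
    paths = [os.path.join(folder, name) for name in os.listdir(folder)]
    return max(paths, key=os.path.getctime)


# Iterating over a string yields single characters. Passing such a string
# to max(..., key=os.path.getctime) would call getctime('d'), getctime('a'), ...
chars = list("data")  # hypothetical value of `files`

with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "report.csv"), "w") as fh:
        fh.write("x")
    print(newest_in(folder))
```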
10
  • 3
    I am sure the downvoters can explain what exactly is wrong.
    – glglgl
    Commented Sep 6, 2016 at 11:36
  • 6
    Dunno, tested for you, it does seem to work. On top of that, you were the only one to care to explain a bit. Reading the accepted answer made me think that 'glob' thing was needed, whereas it's absolutely not. Thanks
    – Arnaud P
    Commented Dec 13, 2017 at 17:16
  • 5
    @David Of course. Just insert if basename.endswith('.csv') into the list comprehension.
    – glglgl
    Commented Sep 26, 2018 at 12:14
  • 1
    @BreakBadSP If you want flexibility, you are right. If you are restricted to a certain directory, I don't see how yours can possibly be more efficient. But sometimes, readability is more important than efficiency, so yours might indeed be better in that sense.
    – glglgl
    Commented Oct 8, 2018 at 11:03
  • 2
    Thanks for this, I've used this in so many of my ETL functions!
    – Umar.H
    Commented Jun 15, 2019 at 19:57
34

I lack the reputation to comment, but ctime from Marlon Abeykoon's response did not give the correct result for me. Using mtime does the trick though (key=os.path.getmtime).

import glob
import os

list_of_files = glob.glob('/path/to/folder/*')  # '*' matches everything; use e.g. '*.csv' for a specific format
latest_file = max(list_of_files, key=os.path.getmtime)
print(latest_file)

I found two answers for that problem:

python os.path.getctime max does not return latest
Difference between python - getmtime() and getctime() in unix system

1
  • 1
    On a Mac getctime was also the wrong result, with getmtime fixing it for me as well.
    – Kim Miller
    Commented Jul 8, 2021 at 16:54
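
Some background on why the two keys can disagree: according to the Python documentation, getctime() is the time of the last metadata change on Unix-like systems and the creation time on Windows, while getmtime() is the last content modification everywhere. A minimal sketch that selects by mtime, using made-up file names and a temporary folder:

```python
import os
import tempfile
import time


def latest_by_mtime(paths):
    """Return the path whose content was modified most recently."""
    return max(paths, key=os.path.getmtime)


with tempfile.TemporaryDirectory() as folder:
    old = os.path.join(folder, "old.txt")
    new = os.path.join(folder, "new.txt")
    for path in (old, new):
        with open(path, "w") as fh:
            fh.write("x")
    # Backdate old.txt by one hour. os.utime() sets atime and mtime only;
    # ctime cannot be set this way.
    an_hour_ago = time.time() - 3600
    os.utime(old, (an_hour_ago, an_hour_ago))
    demo_result = os.path.basename(latest_by_mtime([old, new]))
    print(demo_result)  # prints new.txt
```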
21

I've been using this in Python 3, including pattern matching on the filename.

from pathlib import Path

def latest_file(path: Path, pattern: str = "*"):
    files = path.glob(pattern)
    return max(files, key=lambda x: x.stat().st_ctime)
1
  • 4
    This would be even better if the max arg default was added to support no files matching the path/pattern - max (and min) raise ValueError in that situation so better to set a default - requires python 3.4+
    – nickjb
    Commented Jun 29, 2022 at 20:18
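
The default suggested in the comment above could look like this. It is a sketch only, and latest_file_or_none is a made-up name.

```python
from pathlib import Path


def latest_file_or_none(path: Path, pattern: str = "*"):
    """Return the newest match, or None when nothing matches."""
    files = path.glob(pattern)
    # default= is returned for an empty iterable instead of raising ValueError.
    return max(files, key=lambda p: p.stat().st_ctime, default=None)
```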
14

I would suggest using glob.iglob() instead of glob.glob(), as it is more memory-efficient. From the documentation:

glob.iglob(): Return an iterator which yields the same values as glob() without actually storing them all simultaneously.

This means max() can consume the matches one at a time, without the full list ever being built.

I mostly use the code below to find the latest file matching my pattern:

LatestFile = max(glob.iglob(fileNamePattern),key=os.path.getctime)


NOTE: max() has two forms. To find the latest file we use this one: max(iterable, *[, key, default])

which needs an iterable as its first argument. To find the maximum of several numbers, use the other form: max(arg1, arg2, *args[, key])

2
  • 3
    I like this max() sort. In my case, I used a different key=os.path.basename since the filenames had timestamps in them.
    – MarkHu
    Commented Dec 11, 2019 at 18:35
  • In your example, if I want to include the folder path for the fileNamePattern, how to do it?
    – FMFF
    Commented Feb 16, 2023 at 17:34
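
Regarding the two comments above, one way to include the folder in the pattern is to join it with the wildcard before calling iglob. This is a sketch; the function names and the example file-name scheme are assumptions.

```python
import glob
import os


def latest_matching(folder, pattern="*"):
    """Newest match by ctime, or None if nothing matches."""
    # The joined pattern looks like /some/folder/*.csv; iglob expands it lazily.
    file_name_pattern = os.path.join(folder, pattern)
    return max(glob.iglob(file_name_pattern), key=os.path.getctime, default=None)


def latest_by_name(folder, pattern="*"):
    """Newest match judged by file name alone, or None if nothing matches."""
    # Useful when names embed a sortable timestamp, e.g. log_20230216.txt.
    file_name_pattern = os.path.join(folder, pattern)
    return max(glob.iglob(file_name_pattern), key=os.path.basename, default=None)
```

Comparing by name only works if the timestamps sort lexicographically (year first, zero-padded).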
6

Try sorting the items by creation time. The example below sorts the files in a folder, newest first, and takes the first element, which is the latest.

import glob
import os

files_path = os.path.join(folder, '*')  # folder: the directory to search
files = sorted(
    glob.iglob(files_path), key=os.path.getctime, reverse=True)
print(files[0])
6

Most of the answers are correct, but if you need the latest two or three files, they either fail or have to be modified.

I found the sample below more useful, because the same code can return the latest 2, 3, or n files.

import glob
import os

folder_path = "/Users/sachin/Desktop/Files/"
files_path = os.path.join(folder_path, '*')
files = sorted(glob.iglob(files_path), key=os.path.getctime, reverse=True) 
print (files[0]) #latest file 
print (files[0],files[1]) #latest two files
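
If only the newest n entries are needed, heapq.nlargest from the standard library is an alternative to sorting the whole listing. This is a swapped-in technique, not part of the answer above, and latest_n is a made-up name.

```python
import glob
import heapq
import os


def latest_n(folder, n, pattern="*"):
    """Return up to n paths, newest first, judged by os.path.getctime."""
    candidates = glob.iglob(os.path.join(folder, pattern))
    # nlargest keeps only n items at a time instead of sorting everything.
    return heapq.nlargest(n, candidates, key=os.path.getctime)
```

Unlike files[0] or files[1], this does not raise IndexError when the folder holds fewer than n entries; it simply returns a shorter list.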
4

A much faster method on Windows (0.05s): call a bat script that does this:

get_latest.bat

@echo off
rem "delims=" keeps file names that contain spaces intact
for /f "delims=" %%i in ('dir \\directory\in\question /b/a-d/od/t:c') do set LAST=%%i
echo %LAST%

where \\directory\in\question is the directory you want to investigate.

get_latest.py

from subprocess import Popen, PIPE
p = Popen("get_latest.bat", shell=True, stdout=PIPE,)
stdout, stderr = p.communicate()
print(stdout, stderr)

If it finds a file, stdout holds the file name as bytes and stderr is None (it is not piped).

Use stdout.decode("utf-8").rstrip() to get the usable string representation of the file name.

5
  • Not sure why this is attracting downvotes; for those who need to do this task quickly, this is the fastest method I could find. And sometimes it is necessary to do this very quickly.
    – ic_fl2
    Commented Nov 1, 2018 at 7:51
  • Have an upvote. I'm not doing this in Windows, but if you're looking for speed, the other answers require an iteration of all files in a directory. So if shell commands in your OS that specify a sort order of the listed files are available, pulling the first or last result of that should be faster. Commented Nov 8, 2018 at 18:11
  • 1
    Thanks I'm actually more concerned with a better solution than this (as in similarly fast but pure python) so was hoping someone could elaborate on that.
    – ic_fl2
    Commented Nov 21, 2018 at 8:00
  • 3
    Sorry, but I had to downvote, and I'll give you the courtesy of explaining why. The biggest reason is that it does not use Python (it is not cross-platform) and is thus broken unless run under Windows. Secondly, this is not a "faster method" (unless faster means quick-and-dirty-not-bothering-to-read-docs): shelling out to another script is notoriously slow.
    – MarkHu
    Commented Dec 11, 2019 at 16:33
  • 1
    @MarkHu Actually this script was born out of the necessity to check a large folder's content quickly from a python script. So in this case faster method means, gets the file name of newest folder the fastest (or faster than a pure python method). Feel free to add a similar script for linux, probably based on ls -Art | tail -n 1. Please evaluate the performance of a solution before making claims about it.
    – ic_fl2
    Commented Jan 17, 2020 at 13:04
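
For those asking about a pure-Python option in the comments above, os.scandir (Python 3.5+) is one candidate worth timing. According to the Python documentation, its DirEntry objects cache file type and stat results, and on Windows most of that data comes with the directory listing itself. The sketch below makes no speed claim, and latest_file_scandir is a made-up name.

```python
import os


def latest_file_scandir(folder):
    """Return the path of the newest regular file in folder, or None."""
    with os.scandir(folder) as entries:
        # Skip sub-folders, like the /a-d switch of dir.
        files = (entry for entry in entries if entry.is_file())
        newest = max(files, key=lambda entry: entry.stat().st_ctime, default=None)
    return newest.path if newest else None
```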
1

(Edited to improve answer)

First define a function get_latest_file

def get_latest_file(path, *paths):
    fullpath = os.path.join(path, *paths)
    ...
get_latest_file('example', 'files','randomtext011.*.txt')

You may also add a docstring:

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)

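You can also use iglob instead, but note that it returns an iterator, which is always truthy, so an "if not files" check will not detect an empty result.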

Complete code to return the name of latest file:

import glob
import os

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    files = glob.glob(fullpath)  # a list, so the emptiness check below works
    if not files:                # I prefer using the negation
        return None              # because it behaves like a shortcut
    latest_file = max(files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename
2
  • Where did you get the JuniperAccessLog-standalone-FCL_VPN part from?
    – glglgl
    Commented Sep 5, 2016 at 9:10
  • This fails on 0 length files under Windows 10. Commented Dec 21, 2019 at 19:55
1

I tried the suggestions above and my program crashed. Then I figured out that the file I was trying to identify was in use, and calling os.path.getctime on it crashed. What finally worked for me was:

    files_before = glob.glob(os.path.join(my_path, '*'))
    # ... code where the new file is created ...
    new_file = set(files_before).symmetric_difference(set(glob.glob(os.path.join(my_path, '*'))))

This code gets the entries that differ between the two file listings. It's not the most elegant, and if multiple files are created at the same time it probably won't be reliable.
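
The before/after comparison can be wrapped in two small helpers, sketched here with made-up names. This version uses a plain set difference rather than symmetric_difference, so a file deleted in the meantime is not reported as new.

```python
import glob
import os


def snapshot(folder):
    """Return the set of paths currently in folder."""
    return set(glob.glob(os.path.join(folder, "*")))


def new_entries(before, after):
    """Return the paths present in `after` but not in `before`."""
    return after - before
```

Usage would be: take before = snapshot(my_path), run the code that creates the file, then call new_entries(before, snapshot(my_path)). The result is a set, so it may hold zero, one, or several paths.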

0

On Linux you can also call shell tools from Python.

subprocess.run requires Python 3.5+; the capture_output and text arguments require Python 3.7+.

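import shlex
import subprocess
import sys

def find_latest_files(target_dir, count):
    # Quote the directory so spaces and shell metacharacters are safe.
    cmd = f"ls -t {shlex.quote(str(target_dir))} | head -n{int(count)}"

    try:
        # check=True is needed for CalledProcessError to be raised at all.
        # In a plain sh pipeline the exit status is that of head, so a
        # failing ls may still go unnoticed here.
        output = subprocess.run(cmd, shell=True, text=True, capture_output=True, check=True)
    except subprocess.CalledProcessError as err:
        sys.exit(f"Error: finding last modified file: {err.stderr}")

    # returns a list of names, newest first
    return output.stdout.splitlines()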
