188

Say I have two absolute paths. I need to check whether the location referred to by one of the paths is a descendant of the other. If so, I need to find the relative path of the descendant from the ancestor. What's a good way to implement this in Python? Is there a library I can benefit from?

6 Answers

196

os.path.commonprefix() and os.path.relpath() are your friends:

>>> os.path.commonprefix(['/usr/var/log', '/usr/var/security'])
'/usr/var/'
>>> os.path.commonprefix(['/tmp', '/usr/var'])  # No common directory: the root is the common prefix
'/'

You can thus test whether the common prefix is one of the paths, i.e. if one of the paths is a common ancestor:

paths = […, …, …]
common_prefix = os.path.commonprefix(paths)
if common_prefix in paths:
    …

You can then find the relative paths:

relative_paths = [os.path.relpath(path, common_prefix) for path in paths]

You can even handle more than two paths with this method, and test whether all the paths are below one of them.

PS: depending on what your paths look like, you might want to perform some normalization first (this is useful when you do not know whether they always end with '/' or not, or whether some of the paths are relative). Relevant functions include os.path.abspath() and os.path.normpath().
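A minimal sketch of the normalization mentioned in the PS. The helper name normalize is made up for illustration; it relies only on os.path.abspath(), which also applies os.path.normpath() to its result.

```python
import os.path

def normalize(path):
    """Return an absolute path with trailing separators and '.'/'..' segments collapsed."""
    # abspath() resolves relative paths against the current directory
    # and then normalizes the result, so no separate normpath() call is needed.
    return os.path.abspath(path)

# Both spellings refer to the same location, so they normalize to the same string.
print(normalize('/usr/var/log/') == normalize('/usr/var/../var/log'))
```

Note that this is purely textual: symbolic links are not resolved (os.path.realpath() does that).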

PPS: as Peter Briggs mentioned in the comments, the simple approach described above can fail:

>>> os.path.commonprefix(['/usr/var', '/usr/var2/log'])
'/usr/var'

even though /usr/var is not a common ancestor directory of the paths (it is only a common string prefix). Forcing all paths to end with '/' before calling commonprefix() solves this (specific) problem.
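A small sketch of the trailing-separator workaround, for the membership test only. The helper with_trailing_sep is a hypothetical name; it uses os.path.join(path, ''), which appends a separator only when one is missing.

```python
import os.path

def with_trailing_sep(path):
    """Return path ending with exactly one trailing separator."""
    # Joining with an empty component adds a separator only if needed.
    return os.path.join(path, '')

def common_ancestor_in(paths):
    """Return True if the common prefix of the paths is itself one of the paths."""
    slashed = [with_trailing_sep(p) for p in paths]
    return os.path.commonprefix(slashed) in slashed

print(common_ancestor_in(['/usr/var', '/usr/var2/log']))  # False
print(common_ancestor_in(['/usr/var', '/usr/var/log']))   # True
```

The prefix returned by commonprefix() can still be a partial directory name, so this only makes the "is one path an ancestor of the others" check reliable, not the prefix itself.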

PPPS: as bluenote10 mentioned, adding a slash does not solve the general problem. Here is his followup question: How to circumvent the fallacy of Python's os.path.commonprefix?

PPPPS: starting with Python 3.4, we have pathlib, a module that provides a saner path manipulation environment. I guess that the common ancestor of a set of paths can be obtained by getting all the ancestors of each path (with the PurePath.parents property), taking the intersection of all these ancestor sets, and selecting the longest path in the intersection.
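The pathlib idea from the PPPPS could be sketched roughly as follows. The function name common_ancestor is an assumption for illustration, the input list is assumed to be non-empty, and PurePosixPath is used so that the behavior does not depend on the platform.

```python
from pathlib import PurePosixPath

def common_ancestor(paths):
    """Return the deepest path that is an ancestor of (or equal to) every given path.

    Returns None when the paths share no ancestor (e.g. absolute mixed with relative).
    """
    ancestor_sets = []
    for p in paths:
        p = PurePosixPath(p)
        # A path counts as its own ancestor here, so include it alongside its parents.
        ancestor_sets.append({p, *p.parents})
    shared = set.intersection(*ancestor_sets)
    if not shared:
        return None
    # The shared ancestors form a chain, so the one with the most parts is the deepest.
    return max(shared, key=lambda a: len(a.parts))

print(common_ancestor(['/usr/var', '/usr/var2/log']))  # /usr
```

Because whole path components are compared, /usr/var and /usr/var2 are never confused with each other.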

PPPPPS: Python 3.5 introduced a proper solution to this question: os.path.commonpath(), which returns a valid path.
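As a sketch of how commonpath() might answer the original question (requires Python 3.5+): the helper relative_if_descendant is a hypothetical name, both arguments are assumed to be absolute paths, and posixpath (the POSIX flavor of os.path) is used so the output is the same on every platform.

```python
import posixpath

def relative_if_descendant(ancestor, path):
    """Return path relative to ancestor, or None if path is not at or below ancestor."""
    ancestor = posixpath.normpath(ancestor)
    path = posixpath.normpath(path)
    # commonpath() compares whole components, so '/usr/var' vs '/usr/var2' yields '/usr'.
    if posixpath.commonpath([ancestor, path]) != ancestor:
        return None
    return posixpath.relpath(path, ancestor)

print(relative_if_descendant('/usr/var', '/usr/var/log'))   # log
print(relative_if_descendant('/usr/var', '/usr/var2/log'))  # None
```

Note that commonpath() raises ValueError when absolute and relative paths are mixed, so normalize the inputs first if that can happen.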

6
  • Exactly what I need. Thanks for your prompt answer. Will accept your answer once the time restriction is lifted. Commented Sep 2, 2011 at 18:57
  • 12
    Take care with commonprefix, as e.g. the common prefix for /usr/var/log and /usr/var2/log is returned as /usr/var - which is probably not what you'd expect. (It's also possible for it to return paths that are not valid directories.) Commented Feb 9, 2012 at 13:47
  • @PeterBriggs: Thanks, this caveat is important. I added a PPS. Commented Feb 27, 2012 at 9:23
  • 1
    @EOL: I don't really see how to fix the problem by appending a slash :(. What if we have ['/usr/var1/log/', '/usr/var2/log/']?
    – bluenote10
    Commented Feb 1, 2014 at 13:29
  • 1
    @EOL: Since I failed to find an appealing solution for this problem I thought it might be okay to discuss this sub-issue in a separate question.
    – bluenote10
    Commented Feb 1, 2014 at 14:18
131

os.path.relpath:

Return a relative filepath to path either from the current directory or from an optional start point.

>>> from os.path import relpath
>>> relpath('/usr/var/log/', '/usr/var')
'log'
>>> relpath('/usr/var/log/', '/usr/var/sad/')
'../log'

So, if the relative path starts with '..', the first path is not a descendant of the second path.
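A minimal sketch of that check, with is_descendant as a made-up helper name. It tests for os.pardir as a whole leading component rather than a bare string prefix, so a directory whose name merely begins with two dots is not misclassified.

```python
import os
import os.path

def is_descendant(path, ancestor):
    """Return True if path is located at or below ancestor."""
    rel = os.path.relpath(path, ancestor)
    # The result either is exactly '..' or starts with '../' when path lies outside ancestor.
    return rel != os.pardir and not rel.startswith(os.pardir + os.sep)

print(is_descendant('/usr/var/log', '/usr/var'))      # True
print(is_descendant('/usr/var/log', '/usr/var/sad'))  # False
```

On Windows, os.path.relpath() raises ValueError for paths on different drives, so that case would need separate handling.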

In Python3 you can use PurePath.relative_to:

Python 3.5.1 (default, Jan 22 2016, 08:54:32)
>>> from pathlib import Path

>>> Path('/usr/var/log').relative_to('/usr/var/log/')
PosixPath('.')

>>> Path('/usr/var/log').relative_to('/usr/var/')
PosixPath('log')

>>> Path('/usr/var/log').relative_to('/etc/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pathlib.py", line 851, in relative_to
    .format(str(self), str(formatted)))
ValueError: '/usr/var/log' does not start with '/etc'
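Since relative_to() signals "not a descendant" by raising ValueError, one way to use it is to catch the exception. This is only a sketch: relative_or_none is a hypothetical helper, and PurePosixPath is used so that no filesystem access or platform-specific behavior is involved.

```python
from pathlib import PurePosixPath

def relative_or_none(path, ancestor):
    """Return path relative to ancestor as a PurePosixPath, or None if it is not below it."""
    try:
        return PurePosixPath(path).relative_to(ancestor)
    except ValueError:
        # Raised when ancestor is not a leading part of path.
        return None

print(relative_or_none('/usr/var/log', '/usr/var'))  # log
print(relative_or_none('/usr/var/log', '/etc'))      # None
```

From Python 3.9 on, PurePath.is_relative_to() performs the same test without the exception.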
3
  • 3
    Checking for the presence of os.pardir is more robust than checking for .. (agreed, there are not many other conventions, though). Commented Sep 2, 2011 at 19:22
  • 24
    Am I wrong or is os.path.relpath more powerful since it handles .. and PurePath.relative_to() does not? Am I missing something?
    – Ray Salemi
    Commented May 21, 2018 at 14:30
  • 1
    @RaySalemi is correct, it should be noted that >>> Path('/usr/var').relative_to('/usr/var/log') fails with a ValueError
    – CervEd
    Commented Apr 12, 2021 at 9:03
22

A write-up of jme's suggestion, using pathlib, in Python 3.

from pathlib import Path
parent = Path(r'/a/b')
son = Path(r'/a/b/c/d')

if parent in son.parents or parent == son:
    print(son.relative_to(parent))  # relative_to() returns a Path object equivalent to 'c/d'
1
  • So dir1.relative_to(dir2) will give PosixPath('.') if they are the same. When you use if dir2 in dir1.parents then it excludes the identity case. If someone is comparing Paths and wants to run relative_to() if they are path-compatible, a better solution may be if dir2 in (dir1 / 'x').parents or if dir2 in dir1.parents or dir2 == dir1. Then all cases of path compatibility are covered.
    – ingyhere
    Commented May 4, 2020 at 6:04
17

Another option is

>>> print os.path.relpath('/usr/var/log/', '/usr/var')
log
1
  • This always returns a relative path; it does not directly indicate whether one of the paths is above the other (one can check for the presence of os.pardir in front of the two possible resulting relative paths, though). Commented Sep 2, 2011 at 19:21
4

Pure Python2 w/o dep:

import os
import sys

def relpath(cwd, path):
    """Create a relative path for path from cwd, if possible"""
    if sys.platform == "win32":
        cwd = cwd.lower()
        path = path.lower()
    _cwd = os.path.abspath(cwd).split(os.path.sep)
    _path = os.path.abspath(path).split(os.path.sep)
    eq_until_pos = None
    for i in xrange(min(len(_cwd), len(_path))):
        if _cwd[i] == _path[i]:
            eq_until_pos = i
        else:
            break
    if eq_until_pos is None:
        return path
    newpath = [".." for i in xrange(len(_cwd[eq_until_pos+1:]))]
    newpath.extend(_path[eq_until_pos+1:])
    return os.path.join(*newpath) if newpath else "."
1
  • This one looks good, but, as I stumbled upon, there is an issue when cwd and path are the same. It should first check whether the two are the same and return either "" or ".". Commented Jul 25, 2018 at 9:58
1

Edit: See jme's answer for the best way with Python 3.

Using pathlib, you have the following solution:

Let's say we want to check if son is a descendant of parent, and both are Path objects. We can get a list of the parts in the path with list(parent.parts). Then, we just check that the beginning of the son's list is equal to the list of segments of the parent.

>>> lparent = list(parent.parts)
>>> lson = list(son.parts)
>>> if lson[:len(lparent)] == lparent:
...     pass  # parent is a parent of son :)

If you want to get the remaining part, you can just do

>>> '/'.join(lson[len(lparent):])

It's a string, but you can of course pass it to the constructor of another Path object.
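Put together, the parts-based comparison might look like the sketch below. The function name remainder is an assumption for illustration, and PurePosixPath is used so that the result does not depend on the platform.

```python
from pathlib import PurePosixPath

def remainder(parent, son):
    """Return the part of son below parent, or None if son is not at or below parent."""
    parent_parts = PurePosixPath(parent).parts
    son_parts = PurePosixPath(son).parts
    # Compare whole components, so '/a/b' is not treated as an ancestor of '/a/bc'.
    if son_parts[:len(parent_parts)] != parent_parts:
        return None
    # With no remaining parts this yields PurePosixPath('.').
    return PurePosixPath(*son_parts[len(parent_parts):])

print(remainder('/a/b', '/a/b/c/d'))  # c/d
```

Building the result from the parts directly avoids having to pick a separator by hand.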

2
  • 4
    It's even easier than that: simply parent in son.parents, and if it is, getting the remainder with son.relative_to(parent).
    – jme
    Commented Dec 26, 2015 at 4:26
  • @jme Your answer is even better, why don't you post it? Commented Dec 26, 2015 at 4:39
