0

I'm using sshfs to run some process on one mashine and store the result on another one.

The command I'm using to mount is sshfs user@host:/rempte/path /local/path.

The scrip I'm running creates a file with a specific name in a folder inside the mount point if it doesn't already exist, then makes work and finally saves the content. If the file already exists it skips it and goes to the next one. After that it moves on to the next filename.

To spread the work over all CPU cores I'm running as many instances of the process as threads on the CPU.

The Python function is following:

    def create(fname):
      if os.path.exists(fname):
        return True
      else:
        open(fname, 'a').close()
        return False

What happens is that sometimes two or more instances of the script will enter the create function at the exact same time and end up doing both the same work which is pretty time wasteful.

My question is: is there any way to solve this? either by some kind of parameter for ssh or by some change in my Python code.

Making the script multithreaded is currently outside of my scope.

1
  • There are many ways of solving this. This is basic IPC.
    – Chenmunka
    Commented Feb 10, 2022 at 9:32

1 Answer 1

1

The race is between the existence check and the open() – to avoid it, the best way is to not do manual pre-checks but let the operation itself do that.

The open() function indeed has an "atomic create" mode "x" (or O_CREAT|O_EXCL in low-level libc functions, with a 1:1 translation to SFTP open flags within sshfs). The most direct rewrite of your code would be:

def create(name):
    try:
        open(fname, 'x').close()
        return False
    except FileExistsError:
        return True
try:
    with open(fname, 'x') as outf:
        do_processing(outf)
except FileExistsError:
    pass

Another approach would be to wrap your whole create() function in some cross-process synchronization (lock or mutex) so that only one process at a time can call the function. Most operating systems support a few types of "global" synchronization mechanisms; e.g. any file could act as a lock through fcntl.flock().

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .