0

Given: Folder 1 contains A.txt and B.txt, and Folder 2 also contains A.txt and B.txt.
How can I process them in pairs, so that A.txt from Folder 1 is handled together with A.txt from Folder 2, and so on?
What I have so far loops through all of the second folder's files and then through the first folder's files, which throws the pairing out of order. Some work will be done on each pair, such as merging parts of the files together (that part is already done).
My main question is: how can I run through two directories simultaneously and work on the matching files in each?
Note that there are many files in Folder 1 and Folder 2, so I need an approach that works from the directory contents rather than from hard-coded file names.
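import glob
import os

patha = '/folder1'
pathb = '/folder2'

# Nested loops: every file in folder 1 gets combined with every file in folder 2
for filename in glob.glob(os.path.join(patha, '*.txt')):
    for filenamez in glob.glob(os.path.join(pathb, '*.txt')):
        ...  # MY FUNCTION THAT DOES OTHER STUFF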
1
  • Is it guaranteed that there is a name-for-name match in both folders? What do you want to occur if Folder 1 has a filename that is not in Folder 2?
    – AirSquid
    Commented Sep 28, 2022 at 21:45

4 Answers

2

You can open files with the same name in both folders simultaneously using context managers and do whatever needs to be done from both input streams:

import os

my_folders = ['Folder1', 'Folder2']

# List each folder once, then compare the file names as sets
files_1 = set(os.listdir(my_folders[0]))
files_2 = set(os.listdir(my_folders[1]))
common_files = files_1 & files_2      # names present in both folders
non_common_files = files_1 ^ files_2  # names present in only one folder

print(f'common_files: {common_files}')
print(f'files without matches: {non_common_files}')

for f_name in common_files:
    with open(os.path.join(my_folders[0], f_name)) as src_1:
        with open(os.path.join(my_folders[1], f_name)) as src_2:
            # do the stuff on both sources... for instance print first line of each:
            print(f'first line of src_1: {src_1.readline()}')
            print(f'first line of src_2: {src_2.readline()}')

Output

common_files: {'A.txt'}
files without matches: set()
first line of src_1: some txt

first line of src_2: text in folder 2's A
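
For the merging step mentioned in the question, the loop body could be moved into a small function. The sketch below is only one possible shape: `merge_pair` and the output folder are hypothetical names, and "merging" is assumed to mean plain concatenation of the two files.

```python
import os


def merge_pair(folder_1, folder_2, f_name, out_folder):
    """Concatenate the same-named file from two folders into out_folder.

    Assumption: "merging" means folder_1's text followed by folder_2's text.
    """
    os.makedirs(out_folder, exist_ok=True)
    out_path = os.path.join(out_folder, f_name)
    # Open both inputs and the output together
    with open(os.path.join(folder_1, f_name)) as src_1, \
         open(os.path.join(folder_2, f_name)) as src_2, \
         open(out_path, 'w') as dst:
        dst.write(src_1.read())
        dst.write(src_2.read())
    return out_path
```

It could then be called once per name inside the `for f_name in common_files:` loop, for example `merge_pair('Folder1', 'Folder2', f_name, 'merged')`.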
1

Is zip what you're looking for?

import glob
import os

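path_a = "/folder1"  # folder paths taken from the question
path_b = "/folder2"

files_a = glob.glob(os.path.join(path_a, "*.txt"))
files_b = glob.glob(os.path.join(path_b, "*.txt"))
# zip pairs the two lists by position, not by file name
for file_a, file_b in zip(files_a, files_b):
    pass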
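
One thing to keep in mind: zip pairs items by position and stops at the shorter list, so the pairs only line up by name if both lists are in the same order and hold the same names. A small sketch with made-up file names (not read from the question's folders) shows the effect:

```python
# Hypothetical listings: the second folder is missing B.txt
names_a = ["A.txt", "B.txt", "C.txt"]
names_b = ["A.txt", "C.txt"]

# zip pairs by position and stops when the shorter list runs out
pairs = list(zip(names_a, names_b))
print(pairs)  # [('A.txt', 'A.txt'), ('B.txt', 'C.txt')]
```

Here B.txt ends up paired with C.txt, and the last entry of the longer list is silently dropped.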
1

You could maybe do something like this:

from threading import Thread
import os,glob

def dir_iterate(path: str):
    for filename in glob.glob(os.path.join(path, '*.txt')):
        pass  # Other stuff ..


path1 = "./directory1"
path2 = "./directory2"
Thread(target = dir_iterate, args=(path1,)).start()
Thread(target = dir_iterate, args=(path2,)).start()
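
If each unit of work needs the matching file from both folders, a variant could give every thread one matched pair rather than one whole folder. This is a sketch only: `process_pair` and `run_pairs` are hypothetical names, and returning character counts stands in for whatever the real work is.

```python
import glob
import os
from concurrent.futures import ThreadPoolExecutor


def process_pair(pair):
    """Hypothetical worker: read both files of a pair and return their lengths."""
    file_a, file_b = pair
    with open(file_a) as src_a, open(file_b) as src_b:
        return os.path.basename(file_a), len(src_a.read()), len(src_b.read())


def run_pairs(path1, path2, max_workers=4):
    # Match files by base name so each task receives both halves of a pair
    names1 = {os.path.basename(p) for p in glob.glob(os.path.join(path1, '*.txt'))}
    names2 = {os.path.basename(p) for p in glob.glob(os.path.join(path2, '*.txt'))}
    pairs = [(os.path.join(path1, name), os.path.join(path2, name))
             for name in sorted(names1 & names2)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map yields results in the same order as the input pairs
        return list(pool.map(process_pair, pairs))
```

Calling `run_pairs("./directory1", "./directory2")` would then process only the names found in both directories.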
1
  • 1
    OP is unclear, but says that they want to "merge some data" etc. which means threads are not likely independent in this context
    – AirSquid
    Commented Sep 28, 2022 at 21:53
1

This should work, provided both folders contain the same set of file names:

import glob
import os

files_a = sorted(glob.glob(os.path.join(path_a, "*.txt")))
files_b = sorted(glob.glob(os.path.join(path_b, "*.txt")))

for file_a, file_b in zip(files_a, files_b):
    pass  # Add code to concat
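
If the two folders might not hold exactly the same names, sorting and zipping can misalign the pairs. A possible variant keys each path on its base name instead; `pair_by_name` is a hypothetical helper, shown here as a sketch rather than a drop-in fix.

```python
import glob
import os


def pair_by_name(path_a, path_b, pattern="*.txt"):
    """Return (pairs, unmatched) for files matching pattern in two folders.

    pairs: sorted list of (file_in_a, file_in_b) sharing a base name.
    unmatched: sorted list of base names found in only one folder.
    """
    # Map base name -> full path for each folder
    by_name_a = {os.path.basename(p): p
                 for p in glob.glob(os.path.join(path_a, pattern))}
    by_name_b = {os.path.basename(p): p
                 for p in glob.glob(os.path.join(path_b, pattern))}

    # dict key views support set operations directly
    common = sorted(by_name_a.keys() & by_name_b.keys())
    unmatched = sorted(by_name_a.keys() ^ by_name_b.keys())
    pairs = [(by_name_a[name], by_name_b[name]) for name in common]
    return pairs, unmatched
```

The loop would then become `for file_a, file_b in pairs:`, with `unmatched` available for logging or error handling.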
1
  • Fails with non-common files.
    – greybeard
    Commented Oct 4, 2022 at 8:39
