16

Let's say I've got the following files in a Google Cloud Storage bucket:

file_A1.csv
file_B2.csv
file_C3.csv

Now I want to move a subset of these files, let's say file_A1.csv and file_B2.csv. Currently I do it like this:

gsutil mv gs://bucket/file_A1.csv gs://bucket/file_A11.csv
gsutil mv gs://bucket/file_B2.csv gs://bucket/file_B22.csv

This approach requires two calls of more or less the same command and moves each file separately. I know that if I move a complete directory I can add the -m option to accelerate the process. However, unfortunately I just want to move a subset of all files and keep the rest of the bucket untouched.
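For reference, the whole-directory move with -m that I mean looks something like this (the bucket and folder names here are just placeholders):

gsutil -m mv gs://bucket/old_folder gs://bucket/new_folder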

When moving 100 files this way I need to execute around 100 commands, which becomes quite time consuming. Is there a way to combine the 100 files into just one command, additionally with the -m option?

1
  • Do you have a rule for what the destination's name is? Is that also in a file, or is it "repeat the last letter of the existing file", or something more elaborate? Commented Apr 29, 2015 at 17:59

7 Answers

8

If you have a list of the files you want to move, you can use the -I option of the cp command, which, according to the docs, is also valid for the mv command:

cat filelist | gsutil -m mv -I gs://my-bucket
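Here filelist is just a plain text file with one source URL per line, for example (names are illustrative):

gs://bucket/file_A1.csv
gs://bucket/file_B2.csv

Such a list can also be generated on the fly, e.g. by piping the output of gsutil ls straight into the mv command.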
1
  • that's what I came here for!
    – masterxilo
    Commented Jan 4, 2022 at 19:17
7

This worked for me for moving all .txt files from gs://config to gs://config/new_folder:

gsutil mv 'gs://config/*.txt' gs://config/new_folder/

I had some problems with the wildcard * in zsh, which is the reason for the quotes around the source path.

1
  • 1
    works perfectly in the cloud console shell as well +1
    – 1252748
    Commented Jul 15, 2023 at 1:42
4

gsutil does not currently support this, but what you could do is create a number of shell scripts, each performing a portion of the moves, and run them concurrently (see the sketch below).

Note that gsutil mv is based on the syntax of the unix mv command, which also doesn't support the feature you're asking for.
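A minimal sketch of that idea, assuming the source URLs are listed one per line in to_move.txt; the file name, chunk size, and renaming rule are only placeholders:

# split the list into chunks of 25 URLs each (chunk_aa, chunk_ab, ...)
split -l 25 to_move.txt chunk_

# process each chunk in its own background subshell
for chunk in chunk_*; do
  (
    while read -r src; do
      gsutil mv "$src" "${src%.csv}_renamed.csv"   # substitute your own destination rule
    done < "$chunk"
  ) &
done
wait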

3
Yeah, I already thought about that. However, is there a limit on the number of commands that can be executed concurrently?
    – toom
    Commented Apr 29, 2015 at 15:56
  • Only normal operating system limitations would apply; the tool itself can be executed any number of times concurrently. Commented Apr 29, 2015 at 19:22
Okay, I wrote a small script that moves 100 files in parallel. The result was that just 25 files were moved and the whole process took 10 minutes. Definitely not a solution.
    – toom
    Commented Apr 30, 2015 at 14:55
4

You can achieve that in bash by iterating over the gsutil ls output, for example with:

  • source folder name: old_folder
  • new folder name: new_folder
for x in $(gsutil ls "gs://<bucket_name>/old_folder"); do
  y=$(basename -- "$x")
  gsutil mv "$x" "gs://<bucket_name>/new_folder/$y"
done

You can run it in parallel if you have a huge number of files, using:

N=8 # number of parallel workers
(
for x in $(gsutil ls "gs://<bucket_name>/old_folder"); do
   ((i=i%N)); ((i++==0)) && wait   # after every N jobs, wait for the current batch to finish
   y=$(basename -- "$x")
   gsutil mv "$x" "gs://<bucket_name>/new_folder/$y" &
done
wait   # wait for the final batch
)
3

Not widely documented, but this works every time.

To move the contents of the third folder to the root or any folder above it:

gsutil ls gs://my-bucket/first/second/third/ | gsutil -m mv -I gs://my-bucket/first/

and to copy:

gsutil ls gs://my-bucket/first/second/third/ | gsutil -m cp -I gs://my-bucket/first/
1

To do this you can run the following gsutil command:

gsutil mv gs://bucket_name/common_file_name* gs://destination_bucket_name/

In your case, common_file_name is "file_".

0

The lack of a -m flag is the real hang-up here. Facing the same issue, I originally managed this by using Python multiprocessing and os.system to call gsutil. I had 60k files and it was going to take hours. With some experimenting I found that using the Python client gave a 20x speed-up!

If you are willing to move away from gsutil, it's a better approach.

Here is a copy (or move) method. If you create a list of source keys/URIs, you can call it with multi-threading for fast results (see the usage sketch after the function).

Note: the method returns a tuple of (destination_name, exception), which you can collect into a dataframe or similar to look for failures.

import re
from google.cloud import storage  # pip install google-cloud-storage

# BUCKET_NAME, THING1 and THING2 are placeholders: your default bucket and the
# pattern/replacement that produce the new destination names.

def cp_blob(key=None, bucket=BUCKET_NAME, uri=None, delete_src=False):
    """Copy one blob (and delete the source if delete_src=True, i.e. a move).
    Returns (destination_name, error); error is None on success."""
    try:
        if uri:
            # accept a full gs:// URI instead of a separate bucket/key pair
            uri = re.sub('gs://', '', uri)
            bucket, key = uri.split('/', maxsplit=1)
        client = storage.Client()
        bucket = client.get_bucket(bucket)
        blob = bucket.blob(key)
        dest = re.sub(THING1, THING2, blob.name)  ## OR SOME OTHER WAY TO GET NEW DESTINATIONS
        out = bucket.copy_blob(blob, bucket, dest)
        if delete_src:
            blob.delete()
        return out.name, None
    except Exception as e:
        return None, str(e)
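A minimal sketch of the multi-threaded call, assuming you already have a list of gs:// URIs; the list, worker count, and delete_src flag below are only illustrative:

from concurrent.futures import ThreadPoolExecutor

uris = ["gs://my-bucket/file_A1.csv", "gs://my-bucket/file_B2.csv"]  # your own list

with ThreadPoolExecutor(max_workers=32) as pool:
    # delete_src=True turns the copy into a move
    results = list(pool.map(lambda u: cp_blob(uri=u, delete_src=True), uris))

# each entry is (destination_name, error); error is None on success
failures = [r for r in results if r[1] is not None]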
