3

Question: Can wildcards be used in GCS bucket names with gsutil?

I want to use wildcards to fetch multiple files in GCS that are split across buckets. However, I consistently run into errors when the wildcard is in the bucket name. I'm using wildcards like this:

gsutil ls gs://myBucket-abcd-*/log/data_*

I want to match all these file names (variations in bucket name AND in object name):

gs://myBucket-abcd-1234/log/data_foo.csv
gs://myBucket-abcd-1234/log/data_bar.csv
gs://myBucket-abcd-5678/log/data_foo.csv
gs://myBucket-abcd-5678/log/data_bar.csv

The documentation on Bucket Wildcards tells me I should be able to use wildcards in both the bucket name and the object name, but the command above always fails with "BadRequestException: 400 Invalid argument."

gsutil is otherwise working when I use no wildcards or use wildcards in the object name only. But adding a wildcard to the bucket name results in the error. Are there workarounds to make the wildcard work in bucket names, or am I misinterpreting the linked documentation?

2
  • Wildcards on buckets and objects do work. I have tested this with my project. You can run gsutil with the -DD flag to get more debugging information. This issue seems to be related to the ACLs set on your objects or buckets. Make sure you have permission to view these objects and buckets.
    – Faizan
    Commented Jan 6, 2016 at 18:11
  • Wildcards should work. If "gsutil -DD ls your-wildcard..." doesn't help you understand what is wrong, please email the output of the gsutil -DD ls command to [email protected] and I'll take a look. Commented Jan 6, 2016 at 19:40

2 Answers

5

Some shells (such as Zsh) try to expand * and ** themselves, so you need to put the URL inside quotation marks, like this:

gsutil ls 'gs://myBucket-abcd-*/log/data_*'

I found this in the question: gsutil returning "no matches found"
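To make the quoting point concrete, here is a minimal sketch. The `pattern` variable and the `list_logs` helper are names I made up for illustration; only `gsutil ls` itself comes from the question. The last line just prints the string that would be handed to gsutil, so you can check that the shell left the wildcards alone.

```shell
# Single quotes stop the shell from globbing; gsutil receives the literal pattern.
pattern='gs://myBucket-abcd-*/log/data_*'

# Hypothetical helper: the variable is double-quoted on use so the
# wildcards still reach gsutil unexpanded.
list_logs() {
    gsutil ls "$pattern"
}

# Show exactly what would be passed to gsutil, without calling it.
printf '%s\n' "$pattern"
```

Calling `list_logs` runs the actual listing. If the printed line still contains both `*` characters, the shell did not touch the pattern.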

3

I found that the bucket wildcard failing in this case is working as intended, and is due to differences in permission settings. Google Cloud Storage permissions can be set at both the bucket and project levels.

Though the access token used in this case can access every individual bucket, it doesn't have reader/editor/owner access to the top-level project (shared across many users of the system). Without access to the project, wildcards cannot be used on buckets.

This can be fixed by having a project owner add the user as a reader/editor/owner to the project.

In this case, for security reasons we can't give an individual token access to all buckets in the project, but it's helpful to understand why the wildcard didn't work. Thanks all for the input, and especially Travis for the contact.
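If project-level access is not an option, one possible workaround is to name the buckets explicitly and keep the wildcard in the object name only, which the question reports as working. This is a rough sketch, not a tested recipe: `build_urls` is a hypothetical helper, and the bucket names are the ones from the question. It only prints the URLs; the gsutil call is left commented out.

```shell
# Hypothetical helper: print one gs:// URL per bucket name given as an argument.
# The wildcard stays in the object name only.
build_urls() {
    for bucket in "$@"; do
        printf 'gs://%s/log/data_*\n' "$bucket"
    done
}

# Bucket names taken from the question; listed by hand because the token
# has no project-level access.
build_urls myBucket-abcd-1234 myBucket-abcd-5678

# To run the listing for real (requires gsutil and access to each bucket):
# build_urls myBucket-abcd-1234 myBucket-abcd-5678 | while read -r url; do
#     gsutil ls "$url"
# done
```

The trade-off is that the bucket list has to be maintained by hand, since the token cannot discover the buckets itself.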

2
  • Thanks for saving me all the time looking into it. There really should be a caveat in the documentation.
    – timhj
    Commented Feb 10, 2021 at 1:56
  • It turns out my issue was caused by zsh trying to expand the wildcards. In case anyone comes across the same issue, stackoverflow.com/questions/39075182/… might also help.
    – timhj
    Commented Feb 10, 2021 at 2:42
