
Is there a way to set all public links to have 'no-cache' in Google Cloud Storage?

I've seen solutions that use gsutil to set "Cache-Control" at upload time, but I'm looking for a more permanent solution.
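For reference, the per-upload approach I've seen looks like this (the bucket and file names are placeholders):

# set Cache-Control while uploading
gsutil -h "Cache-Control:no-cache" cp file.html gs://my-bucket/

# or change it on an object that's already in the bucket
gsutil setmeta -h "Cache-Control:no-cache" gs://my-bucket/file.html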

There was a conversation about providing a cache invalidation feature but I didn't quite follow the reasoning. Any explanations would be greatly appreciated!

"it would be difficult to provide a cache invalidation feature because, once served with a non-zero cache TTL, any cache on the Internet (not just those under Google's control) is allowed (per the HTTP spec) to cache the data"

Thanks!

  • What do you mean by 'more permanent'?
    – Greg
    Commented Aug 17, 2015 at 19:44
  • I was hoping to set "no-cache" account-wide once and then forget about it. Currently, I need to set "Cache-Control" to "no-cache" every time I re-upload a file.
    – Aaron Yan
    Commented Aug 17, 2015 at 20:36
  • The default behavior is no caching.
    – jterrace
    Commented Aug 17, 2015 at 20:40
  • For private objects, the default is no caching. For public objects, the default is "public, max-age=3600". There is not currently a way to change the default behavior.
    Commented Aug 17, 2015 at 20:42
  • Regarding your question about providing a cache invalidation feature: the problem is that once an object has been served with caching enabled, it can be cached anywhere on the Internet -- not just at Google-managed sites. There's no way to know all the sites where an object has been cached (imagine a cache running inside a corporate network behind a firewall), hence no way to invalidate all copies.
    Commented Aug 19, 2015 at 19:58

1 Answer


For a more permanent, one-time-effort solution with the current offerings on GCP, you can do this with Cloud Functions.

Create a new Function and set the Event type to "On (finalizing/creating) file in the selected bucket" (google.storage.object.finalize). Make sure to select the bucket you want this to apply to. In the body of the function, set the cacheControl / Cache-Control attribute of the blob; the attribute name depends on the language. Here's my version in Python, using cache_control:

main.py:
(the function name below must match the Entry point)

from google.cloud import storage

def set_file_uncached(event, context):
    file = event  # the event payload describes the uploaded object
    print(f"Processing file: {file=}")  # logging, if you want it
    storage_client = storage.Client()
    # look up the object by bucket and name; we expect exactly one
    blob = storage_client.bucket(file["bucket"]).get_blob(file["name"])
    if not blob:
        # in case the blob was deleted before this executes
        print("blob not found")
        return None
    blob.cache_control = "public, max-age=0"  # or whatever you need
    blob.patch()  # persist the metadata change

requirements.txt

google-cloud-storage

From the logs: Function execution took 1712 ms, finished with status: 'ok'. This could have been faster, but I've set minimum instances to 0, so it needs to spin up for each upload. Depending on your usage and cost constraints, you can set it to 1 or higher.
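To confirm the metadata actually changed, you can inspect the object afterwards; gs://my-bucket/file.html below is a placeholder for one of your uploads:

gsutil stat gs://my-bucket/file.html

The output should include the new Cache-Control value among the object's metadata.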

Other settings:

Retry on failure: No/False
Region: [wherever your bucket is]
Memory allocated: 128 MB (smallest available currently)
Timeout: 5 seconds (smallest available currently, function shouldn't take longer)
Minimum instances: 0
Maximum instances: 1
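If you'd rather set this up from the command line than the console, a roughly equivalent deployment might look like the sketch below. This assumes a first-generation function, a Python 3.9 runtime, and a bucket named my-bucket (the runtime, region, and bucket name are placeholders to adjust):

gcloud functions deploy set_file_uncached \
    --runtime=python39 \
    --trigger-event=google.storage.object.finalize \
    --trigger-resource=my-bucket \
    --region=us-central1 \
    --memory=128MB \
    --timeout=5s \
    --max-instances=1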
