
How long does it take for a change to a file in Google Cloud Storage to propagate?

I'm having this very frustrating problem where I change the contents of a file and re-upload it via gsutil, but the change doesn't show up for several hours. Is there a way to force a changed file to propagate everywhere immediately?

If I look at the file in the Google Cloud Storage console, I see the new file, but if I hit the public URL I get an old version, and in some cases one from two versions ago.

Is there a header that I'm not setting?

EDIT:

I tried gsutil -h "Cache-Control: no-cache" cp -a public-read MyFile and it doesn't help; maybe the old file needs to expire before the new no-cache version takes over?

I did a curl -I on the file and get this back:

HTTP/1.1 200 OK
Server: HTTP Upload Server Built on Dec 12 2012 15:53:08 (1355356388)
Expires: Fri, 21 Dec 2012 19:58:39 GMT
Date: Fri, 21 Dec 2012 18:58:39 GMT
Last-Modified: Fri, 21 Dec 2012 18:53:41 GMT
ETag: "66d820174d6de17a278b327e4c3e9b4e"
x-goog-sequence-number: 3
x-goog-generation: 1356116021512000
x-goog-metageneration: 1
Content-Type: application/octet-stream
Content-Language: en
Accept-Ranges: bytes
Content-Length: 160
Cache-Control: public, max-age=3600, no-transform
Age: 3449

This seems to indicate the cached copy will expire in an hour, despite the no-cache.
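One way to check whether this is just a stale cached copy is to append a throwaway query string, which should bypass caches and fetch the current object directly; the parameter name below is arbitrary, and my_bucket/MyFile stands in for the real bucket and file names:

curl -I "https://storage.googleapis.com/my_bucket/MyFile?nocache=1"

If that returns the new Last-Modified and ETag, the object itself is up to date and only cached copies are stale.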

1 Answer


Google Cloud Storage provides strong data consistency: once a write completes, a read from anywhere in the world will get the most recent data.

However, if you enable caching (which by default is true for any publicly readable object), reads of that object can see a version of the object as old as the Cache-Control max-age specified on the object. If, for example, you uploaded the file like this:

gsutil cp -a public-read file gs://my_bucket/file

You can see that the max-age is 1 hour (3600 seconds):

gsutil ls -L gs://my_bucket/file
gs://my_bucket/file:
    Creation time:  Fri, 21 Dec 2012 19:59:57 GMT
    Cache-Control:  public, max-age=3600, no-transform
    Content-Length: 1065
    Content-Type:   text/plain
    ETag:       eb3fb83beedf1efffe5b8e32e8d6a65a
    ...

If you want to prevent a publicly readable object from being cached, you could do:

gsutil setmeta -h Cache-Control:no-cache gs://my_bucket/file
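You can then verify that the new header is being served; this is a quick check assuming the object is publicly readable at the storage.googleapis.com URL for the same hypothetical bucket and file:

curl -I https://storage.googleapis.com/my_bucket/file

Look for Cache-Control: no-cache in the response headers. Note that any cache that already holds the old copy may keep serving it until the original max-age expires, so a cached response can still show the old header for up to an hour.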

Alternatively, you could set a shorter max-age on the object:

gsutil setmeta -h 'Cache-Control:public, max-age=600, no-transform' gs://my_bucket/file
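You can also set the Cache-Control header at upload time rather than with setmeta afterwards; this is a sketch reusing the same hypothetical bucket and file names as above:

gsutil -h "Cache-Control:no-cache" cp -a public-read file gs://my_bucket/file

Setting the header in the same command as the upload avoids a window during which the object is publicly readable with the default one-hour max-age.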

Mike Schwartz, Google Cloud Storage team

  • I updated my question to include that I tried no-cache, but I'm still seeing the max-age=3600. Does the old file need to expire before the new no-cache file takes over? Commented Dec 21, 2012 at 20:25
  • @mike it would be nice to have a feature to invalidate/flush the cache, like on a CDN. – themihai Commented Dec 19, 2013 at 20:10
  • @mihai - it would be difficult to provide a cache-invalidation feature because once an object has been served with a non-zero cache TTL, any cache on the Internet (not just those under Google's control) is allowed, per the HTTP spec, to cache the data. Commented Jun 20, 2014 at 22:11
  • After discussing with @aqquadro I figured out the issue: cloud.google.com/storage/docs/gsutil/addlhelp/… incorrectly stated that uploading a non-public object and then setting the ACL to public-read would result in a non-cacheable object. In fact the HTTP spec allows public objects to be cached by default, so to inhibit caching you need to set a Cache-Control header, for example using the command: gsutil -h Cache-Control:private cp -a public-read file.png gs://your-bucket. (I'll also fix the incorrect documentation.) Commented Jun 16, 2015 at 16:13
  • @Pier - not in one request. You'd have to do something like gsutil -m setmeta -h "Cache-Control:no-cache, no-store, must-revalidate" gs://your-bucket/**, which will make a request for every object in the bucket. Commented Jun 10, 2016 at 22:32
