5

I have a Python web application deployed on Google App Engine.

I need to grab a log file stored on Amazon S3 and load it into Google Cloud Storage. Once it is in Google Cloud Storage I may need to perform some transformations and eventually import the data into BigQuery for analysis.

I tried using gsutil as a proof of concept, since gsutil uses boto under the hood and I'd like to use boto in my project. This did not work.

I'd like to know if anyone has managed to transfer files directly between the two clouds. If possible, I'd like to see a simple example. In the end, this task has to be accomplished through code executing on GAE.

4 Answers

9

Per this thread, you can stream data from S3 to Google Cloud Storage using gsutil but every byte still has to take two hops: S3 to your local computer and then your computer to GCS. Since you're using App Engine, however, you should be able to pull from S3 and deposit into GCS. It's the same progression as above except App Engine is the intermediary, i.e. every byte travels from S3 to your app and then to GCS. You could use boto for the pull side and the Google Cloud Storage API for the push side.
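The relay described above boils down to a chunked copy between two file-like objects. Here is a minimal sketch of that loop, using in-memory buffers as stand-ins for the S3 download and the GCS upload; the name `copy_stream` and the chunk size are illustrative choices, not part of either library.

```python
import io


def copy_stream(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in fixed-size chunks and return the byte count.

    Only one chunk is held in memory at a time, so the intermediary
    never needs to buffer the whole file.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for the two clouds (assumption: the real source
# and sink expose read(n) and write(bytes) like ordinary file objects).
source = io.BytesIO(b"line 1\nline 2\n")
sink = io.BytesIO()
copied = copy_stream(source, sink, chunk_size=4)
```

On App Engine the source would presumably be a readable handle obtained from boto for the S3 object, and the sink a writable handle from whichever Google Cloud Storage client you use. Both handles are assumptions here, so check that your client versions expose file-like objects before relying on this.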

3

Google allows you to import entire buckets from S3 to the storage service:

https://cloud.google.com/storage/transfer/getting-started

You can set file filters on the source bucket to only import the file you want, or a "directory" (i.e. anything with a certain prefix).
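For a programmatic setup, the same prefix filter can be expressed in the transfer job body sent to the Storage Transfer Service API. The sketch below only builds that body as a plain dictionary. The helper name `build_transfer_job`, the project ID, and the bucket names are hypothetical, and the AWS credentials and schedule that a real job also needs are deliberately left out.

```python
def build_transfer_job(project_id, s3_bucket, gcs_bucket, prefixes):
    """Build a partial transfer job body with a prefix filter.

    Incomplete on purpose: credentials for the S3 source and a
    schedule must be added before the job can be submitted.
    """
    return {
        "description": f"Copy s3://{s3_bucket} to gs://{gcs_bucket}",
        "status": "ENABLED",
        "projectId": project_id,
        "transferSpec": {
            "awsS3DataSource": {"bucketName": s3_bucket},
            "gcsDataSink": {"bucketName": gcs_bucket},
            # Only objects whose names start with one of these
            # prefixes are copied (the "directory" filter).
            "objectConditions": {"includePrefixes": list(prefixes)},
        },
    }


# Hypothetical names, for illustration only.
job = build_transfer_job(
    "my-project", "my-s3-logs", "my-gcs-logs", ["logs/2016/"]
)
```

Verify the field names against the current transfer job reference before submitting, since the API may have changed since this answer was written.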

1
  • It's weird their GUI tool doesn't let you select multiple buckets, or upload a list of all the buckets you have on S3. Manually transferring every bucket to Google storage is time-consuming. Commented Mar 3, 2016 at 7:44
1

I'm not aware of any cloud provider that offers an API for transferring data to a competing cloud provider. Cloud providers have no incentive to help you move your data to the competition. You will almost certainly have to read the data onto an intermediate machine and then write it to Google.

1
  • 1
    Many providers do offer functionality to import data; Google could have an "import S3 bucket" option.
    – hraban
    Commented Dec 2, 2015 at 11:45
0

GCP supports transfers not only from S3 but also from any storage that exposes an S3-compatible API.

https://cloud.google.com/storage-transfer/docs/create-transfers

https://cloud.google.com/storage-transfer/docs/s3-compatible
