
I want to dump data from BigQuery (i.e. reports) into a Cloud SQL database. What is the best way to achieve this programmatically?

I realise I could do this manually by running a BigQuery query, downloading the result as a CSV, and then uploading it through the Cloud console, but I want to do this programmatically, preferably in Python/SQL.

1 Answer


If you would like to dump entire tables, you can use a combination of the BigQuery and Cloud SQL APIs to achieve this.

The BigQuery documentation has an API example in Python for extracting a BigQuery table to Cloud Storage.
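
For example, with the google-cloud-bigquery client library the extract step looks roughly like this (the project, dataset, table, and bucket names are placeholders):

    # Sketch: export a BigQuery table to Cloud Storage as CSV.
    # "my-project", "my_dataset", "my_table", and "my-bucket" are placeholders.
    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")

    # The wildcard lets BigQuery shard large exports into multiple files.
    destination_uri = "gs://my-bucket/my_table-*.csv"

    extract_job = client.extract_table(
        "my-project.my_dataset.my_table",
        destination_uri,
        location="US",  # must match the dataset's location
    )
    extract_job.result()  # waits for completion, raises on failure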

Once the data is in Cloud Storage, you can use the Cloud SQL Admin API to import the data into a MySQL table.
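
With the google-api-python-client, the import call would look something like the sketch below; the instance, database, and table names are placeholders, and the CSV is assumed to be one of the files produced by the extract job above. Note that the Cloud SQL instance's service account needs read access to the bucket.

    # Sketch: import a CSV from Cloud Storage into a Cloud SQL (MySQL) table
    # via the Cloud SQL Admin API. All names here are placeholders.
    import google.auth
    from googleapiclient import discovery

    credentials, _ = google.auth.default()
    sqladmin = discovery.build("sqladmin", "v1beta4", credentials=credentials)

    body = {
        "importContext": {
            "fileType": "CSV",
            "uri": "gs://my-bucket/my_table-000000000000.csv",
            "database": "my_database",
            "csvImportOptions": {"table": "my_table"},
        }
    }

    # The method is import_() because "import" is a Python keyword.
    operation = sqladmin.instances().import_(
        project="my-project", instance="my-instance", body=body
    ).execute()
    print(operation["name"])  # a long-running operation you can poll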

If you need more granular control, you can use the BigQuery API to perform the query, fetch the results, connect to the Cloud SQL instance, and insert the data directly. This won't perform as well if the amount of data is large.
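
A rough sketch of that approach, assuming a MySQL instance and the PyMySQL driver; the connection details, credentials, and table names are made up, and in practice you'd connect through the Cloud SQL Proxy or an authorized IP:

    # Sketch: run a BigQuery query and insert the results into Cloud SQL.
    from google.cloud import bigquery
    import pymysql

    bq = bigquery.Client(project="my-project")
    rows = bq.query(
        "SELECT name, total FROM `my-project.my_dataset.daily_report`"
    ).result()

    conn = pymysql.connect(
        host="127.0.0.1",  # e.g. via the Cloud SQL Proxy
        user="report_writer",
        password="secret",
        database="my_database",
    )
    try:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO daily_report (name, total) VALUES (%s, %s)",
                [(row.name, row.total) for row in rows],
            )
        conn.commit()
    finally:
        conn.close()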

A more complex approach is to use Dataflow to write the data you are interested in to Cloud Storage and use the Cloud SQL API to import it.
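
With the Apache Beam Python SDK, that pipeline might look roughly like this (the query, bucket, and column names are placeholders); the resulting CSV files can then be imported with the Admin API call shown earlier:

    # Sketch: a Dataflow pipeline that writes query results to Cloud Storage
    # as CSV. All project/bucket/field names are placeholders.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-project",
        region="us-central1",
        temp_location="gs://my-bucket/tmp",
    )

    with beam.Pipeline(options=options) as p:
        (
            p
            | "Read" >> beam.io.ReadFromBigQuery(
                query="SELECT name, total FROM `my-project.my_dataset.daily_report`",
                use_standard_sql=True)
            | "ToCsv" >> beam.Map(lambda row: "{},{}".format(row["name"], row["total"]))
            | "Write" >> beam.io.WriteToText(
                "gs://my-bucket/daily_report", file_name_suffix=".csv")
        )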

(For my own curiosity, can you describe the use case for wanting the data in Cloud SQL instead of BigQuery? It will help me/us understand how our customers are using our product and where we can improve.)

  • I can offer two use cases: 1) You want to use a 3rd-party tool or LOB application that requires a truly SQL-compliant database (with UPDATE, DELETE, etc.). Prep the data in BQ, then export to Cloud SQL. 2) You have processing/analysis/ETL scripts from MySQL that are complicated or expensive to port to BQ. Prep the data in BQ, do some processing in Cloud SQL, then bring it back to BQ. To eliminate this need, BQ would need ANSI-SQL compatibility and stored procedures. (I'm not recommending either... just saying.) Commented Apr 7, 2016 at 4:14
  • Thanks, our use case is: we want to do the 'number crunching' of big data in BigQuery and we want to output daily reports (i.e. much smaller data based on BQ queries) into a MySQL database so that we can easily display these through a web dashboard/API.
    – p_mcp
    Commented Apr 7, 2016 at 8:32
  • Another use case is wanting to have access to more flexible joins than simple identity, because cross joins of big tables quickly overwhelm even BigQuery's horse power.
    – oulenz
    Commented Apr 7, 2016 at 11:48
  • If this is still active - we're a customer doing this to use Cloud SQL as a base for our API, as BigQuery can't index or return queries quickly enough to power the API.
    – Mathieson
    Commented Feb 7, 2020 at 5:57
    Same for me: we have some AI bulk processes in BQ and need to load the results into MySQL for serving.
    – Thomas W.
    Commented Mar 15, 2022 at 14:41
