Google Cloud Platform architecture

Question

A simple question:

Is the data processed by Google BigQuery stored on Google Cloud Storage and just segmented for GBQ purposes, or does Google BigQuery have its own storage mechanism?

I'm trying to learn the architecture, and I see diagrams with arrows pointing back and forth between the two, but they don't say where GBQ's storage sits.

Thanks.

Elliott Brossard · Accepted Answer · 2017-08-10 20:42:46Z

From BigQuery under the hood:

Colossus - Distributed Storage

BigQuery relies on Colossus, Google’s latest generation distributed file system. Each Google datacenter has its own Colossus cluster, and each Colossus cluster has enough disks to give every BigQuery user thousands of dedicated disks at a time. Colossus also handles replication, recovery (when disks crash) and distributed management (so there is no single point of failure). Colossus is fast enough to allow BigQuery to provide similar performance to many in-memory databases, but leveraging much cheaper yet highly parallelized, scalable, durable and performant infrastructure.

BigQuery leverages the ColumnIO columnar storage format and compression algorithm to store data in Colossus in the most optimal way for reading large amounts of structured data. Colossus allows BigQuery users to scale to dozens of Petabytes in storage seamlessly, without paying the penalty of attaching much more expensive compute resources — typical with most traditional databases.

The part about ColumnIO is outdated--BigQuery uses the Capacitor format now--but the rest is still relevant.
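
To make the "columnar storage format and compression" point in the quote more concrete, here is a toy Python sketch. It is not ColumnIO or Capacitor, whose internals are not described here; the sample rows and the helper names `to_columns` and `run_length_encode` are made up for illustration. It only shows the general idea: with a column-oriented layout, a query that needs one field reads only that column, and repeated values within a column compress well.

```python
# Toy illustration only: NOT ColumnIO or Capacitor, just the general idea of
# a column-oriented layout. The rows and helper names are hypothetical.
rows = [
    {"user": "a", "country": "US", "bytes": 120},
    {"user": "b", "country": "DE", "bytes": 300},
    {"user": "c", "country": "US", "bytes": 80},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# An aggregate over one field touches only that column's values,
# not the "user" or "country" data.
total_bytes = sum(columns["bytes"])

# Sorting a low-cardinality column groups repeats, which helps simple
# encodings such as run-length encoding.
encoded_country = run_length_encode(sorted(columns["country"]))
```

A row-oriented store would have to read every field of every row to compute the same sum, which is why columnar layouts suit analytical scans over large tables.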

Is Colossus part of Google Cloud Storage, meaning both GCS and BigQuery use it? Or are GCS and Colossus separate architectures? — arcee123, Commented Aug 10, 2017 at 20:46
GCS is built on top of Colossus. Colossus provides a lower-level storage API for Google's own services. — Elliott Brossard, Commented Aug 10, 2017 at 20:49
