From Bigquery under the hood:
Colossus - Distributed Storage
BigQuery relies on Colossus, Google’s latest generation distributed
file system. Each Google datacenter has its own Colossus cluster, and
each Colossus cluster has enough disks to give every BigQuery user
thousands of dedicated disks at a time. Colossus also handles
replication, recovery (when disks crash) and distributed management
(so there is no single point of failure). Colossus is fast enough to
allow BigQuery to provide similar performance to many in-memory
databases, but leveraging much cheaper yet highly parallelized,
scalable, durable and performant infrastructure.
BigQuery leverages the ColumnIO columnar storage format and
compression algorithm to store data in Colossus in the most optimal
way for reading large amounts of structured data.Colossus allows
BigQuery users to scale to dozens of Petabytes in storage seamlessly,
without paying the penalty of attaching much more expensive compute
resources — typical with most traditional databases.
The part about ColumnIO is outdated--BigQuery uses the Capacitor format now--but the rest is still relevant.