Sao Paulo
Amazon Redshift Deep Dive
Eric Ferreira
AWS
Wanderlei Paiva
Movile
Amazon Redshift system architecture
• Leader node
– SQL endpoint
– Stores metadata
– Coordinates query execution
• Compute nodes
– Local, columnar storage
– Execute queries in parallel
– Load, backup, restore via
Amazon S3; load from
Amazon DynamoDB, Amazon EMR, or SSH
• Two hardware platforms
– Optimized for data processing
– DW1: HDD; scale from 2TB to 2PB
– DW2: SSD; scale from 160GB to 326TB
[Architecture diagram: SQL clients connect to the leader node over JDBC/ODBC; compute nodes communicate over 10 GigE (HPC networking); ingestion, backup, and restore flow through Amazon S3]
A deeper look at compute node architecture
• Each node is split into slices
– One slice per core
– DW1 – 2 slices on XL, 16 on 8XL
– DW2 – 2 slices on L, 32 on 8XL
• Each slice is allocated memory,
CPU, and disk space
• Each slice processes a piece of
the workload in parallel
Implications for ingestion
Use multiple input files to maximize
throughput
• Use the COPY command
• Each slice can load one file at
a time
• A single input file means only
one slice is ingesting data
• Instead of 100MB/s, you’re
only getting 6.25MB/s
Use multiple input files to maximize
throughput
• Use the COPY command
• You need at least as many
input files as you have slices
• With 16 input files, all slices
are working so you maximize
throughput
• Get 100MB/s per node; scale
linearly as you add nodes
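A minimal sketch of such a load, assuming a hypothetical orders table, S3 prefix, and placeholder credentials: COPY treats the FROM value as a prefix, so splitting the input into orders.txt.1 through orders.txt.16 lets every slice pull its own file.

-- Load all files that share the prefix, one file per slice at a time
COPY orders
FROM 's3://my-bucket/input/orders.txt'
CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
DELIMITER '|'
GZIP;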
Primary keys and manifest files
• Amazon Redshift doesn’t enforce primary key
constraints
– If you load data multiple times, Amazon Redshift won’t complain
– If you declare primary keys in your DDL, the optimizer will expect the data to be unique
• Use manifest files to control exactly what is loaded
and how to respond if input files are missing
– Define a JSON manifest on Amazon S3
– Ensures the cluster loads exactly what you want
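A hedged sketch of a manifest, with hypothetical bucket and file names; marking each entry mandatory makes the COPY fail instead of silently skipping a missing file.

-- orders.manifest, stored on Amazon S3
{
  "entries": [
    { "url": "s3://my-bucket/input/orders.txt.1", "mandatory": true },
    { "url": "s3://my-bucket/input/orders.txt.2", "mandatory": true }
  ]
}

-- Point COPY at the manifest instead of a prefix
COPY orders
FROM 's3://my-bucket/input/orders.manifest'
CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
DELIMITER '|'
MANIFEST;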
Analyze sort/dist key columns after every load
• Amazon Redshift’s query
optimizer relies on up-to-
date statistics
• Maximize performance by
updating stats on sort/dist
key columns after every
load
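For example (hypothetical table and column names), statistics can be refreshed on just the columns the planner leans on most:

-- Update stats for the sort and distribution key columns after a load
ANALYZE orders (order_date, customer_id);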
Automatic compression is a good thing (mostly)
• Better performance, lower costs
• COPY samples data automatically when loading into an
empty table
– Samples up to 100,000 rows and picks optimal encoding
• If you have a regular ETL process and you use temp tables or
staging tables, turn off automatic compression
– Use ANALYZE COMPRESSION to determine the right encodings
– Bake those encodings into your DDL (see the sketch below)
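A minimal sketch with a hypothetical staging table: ask Redshift for recommendations once, then hard-code the encodings and skip sampling on the recurring load.

-- One-off: run against an already-loaded copy of the table to get recommendations
ANALYZE COMPRESSION staging_orders;

-- Recurring ETL: re-create the staging table with those encodings baked in
CREATE TABLE staging_orders (
  order_id   BIGINT       ENCODE delta,
  order_date DATE         ENCODE delta32k,
  status     VARCHAR(16)  ENCODE lzo
);

-- Turn off automatic compression analysis on the recurring COPY
COPY staging_orders
FROM 's3://my-bucket/staging/orders.txt'
CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
DELIMITER '|'
COMPUPDATE OFF;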
Be careful when compressing your sort keys
• Zone maps store min/max per block
• Once we know which block(s) contain the
range, we know which row offsets to scan
• Highly compressed sort keys mean many rows per block
• You’ll scan more data blocks than you need
• If your sort keys compress significantly
more than your data columns, you may
want to skip compression
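A sketch of leaving the sort key uncompressed (hypothetical table) so each block holds fewer rows and the zone maps stay selective:

CREATE TABLE events (
  event_time TIMESTAMP     ENCODE raw,   -- sort key left uncompressed
  user_id    BIGINT        ENCODE lzo,
  payload    VARCHAR(256)  ENCODE lzo
)
SORTKEY (event_time);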
Keep your columns as narrow as possible
• During queries and ingestion,
the system allocates buffers
based on column width
• Columns that are wider than needed waste memory
• Fewer rows fit into memory;
increased likelihood of
queries spilling to disk
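For instance (hypothetical columns), size character columns to the data instead of defaulting to a generous maximum:

-- Country codes are always two characters, so don't declare VARCHAR(256)
CREATE TABLE sessions (
  session_id   BIGINT,
  country_code CHAR(2),
  user_agent   VARCHAR(512)
);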
Customer Testimony
Wanderlei Paiva
“With AWS services we were able to pace our initial investments and forecast the costs of future expansions”
• Leader in mobile commerce in Latin America
– 50 million people use Movile services every month
– We are connected to 70+ carriers across the Americas
– 50+ billion transactions per year
– 700+ employees in 11 offices (Latin America and the US)
• PlayKids
– 10M downloads / 3M active users
– Content licensed in 27 countries, with users in 102 countries (6 languages: Portuguese, English, Spanish, German, French, and Chinese)
– #1 top-grossing kids app in the Apple App Store
“Redshift allowed us to turn data into self-service information”
- Wanderlei Paiva
Database Specialist
PlayKids iFood MapLink Apontador
Rapiddo Superplayer Cinepapaya ChefTime
The Challenge
• Scalability
• Availability
• Centralized data
• Reduced costs, preferably spread out over time
Solution
Solution v2
Expanding Amazon Redshift’s query
capabilities
New SQL functions
• We add SQL functions regularly to expand Amazon Redshift’s query
capabilities
• Added 25+ window and aggregate functions since launch, including:
– APPROXIMATE_COUNT
– DROP IF EXISTS, CREATE IF NOT EXISTS
– REGEXP_SUBSTR, _COUNT, _INSTR, _REPLACE
– PERCENTILE_CONT, _DISC, MEDIAN
– PERCENT_RANK, RATIO_TO_REPORT
• We’ll continue iterating but also want to enable you to write your own
User Defined Functions
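A hedged example against a hypothetical sales table, combining a couple of these functions:

-- Approximate distinct counts trade a little accuracy for a lot of speed
SELECT region,
       APPROXIMATE COUNT(DISTINCT customer_id) AS approx_customers,
       MEDIAN(amount) AS median_amount
FROM sales
GROUP BY region;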
• We’re enabling User Defined Functions (UDFs) so you can
add your own
– Scalar and Aggregate Functions supported
• You’ll be able to write UDFs using Python 2.7
– Syntax is largely identical to PostgreSQL UDF Syntax
– System and network calls within UDFs are prohibited
• Comes with Pandas, NumPy, and SciPy pre-installed
– You’ll also be able to import your own libraries for even more flexibility
Scalar UDF example – URL parsing
CREATE FUNCTION f_hostname (url VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
import urlparse
return urlparse.urlparse(url).hostname
$$ LANGUAGE plpythonu;
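Once created, the UDF can be called like any built-in scalar function, here against a hypothetical weblog table:

SELECT f_hostname(url) AS host, COUNT(*) AS hits
FROM weblog
GROUP BY 1
ORDER BY hits DESC;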
Multidimensional indexing
with space filling curves
You’re a small Internet bookstore
• You’re interested in
how you’re doing
– Total sales
– Best customers
– Best-selling items
– Top-selling author this
month
• A row store with
indexes works well
[Diagram: a star schema with an Orders fact table joined to Product, Time, Customer, and Site dimensions]
You get a little bigger
• Your queries start taking
longer
• You move to a column store
• Now you have zone maps,
large data blocks, but no
indexes
• You have to choose which
queries you want to be fast
[Diagram: column values packed into large sorted blocks (e.g. 10–324, 375–623, 637–959), with only zone maps to locate them]
Today’s state of the art: Zone maps, sorting,
projections
• Zone maps store the min/max values for every block in memory
• Works great for sorted columns
– O(log n) access to blocks
• Doesn’t work so well for unsorted columns
– O(n) access
• Projections are multiple copies of data sorted different ways
– Optimizer decides which copy to use for responding to queries
– Loads are slower
– Gets unwieldy quickly: with 8 columns, covering every sort order takes 8 factorial (40,320) projections
Blocks are points in multidimensional space
[Diagram: two 4×4 grids with 2-bit Customers and Products axes (00–11); each data block is a point in the customer × product space]
• The 2D grids above are over-specified
• You don’t need every product or customer to be in consecutive rows
• You just need to make sure that each appears in the right sequence
Space filling curves
[Diagram: the same 4×4 Customers × Products grid, traversed by a space filling curve]
• You need a way of traversing
the space that preserves order
• And you need to touch every
point in the space
• You need a space filling curve
– There are many of these, for example the curve shown in the diagram above
• Products appear in order as
do customers and you don’t
favor one over the other
Compound Sort Keys Illustrated
• Records in Redshift
are stored in blocks.
• For this illustration,
let’s assume that
four records fill a
block
• Records with a given
cust_id are all in one
block
• However, records
with a given prod_id
are spread across
four blocks
[Diagram: with a compound sort key (cust_id, prod_id), the 4×4 grid of (cust_id, prod_id) pairs is laid out block by block; each block of four records holds every row for one cust_id, while rows for a given prod_id are spread across all four blocks]
Interleaved Sort Keys Illustrated
• Records with a given
cust_id are spread
across two blocks
• Records with a given
prod_id are also
spread across two
blocks
• Data is sorted in equal
measures for both
keys
[Diagram: with an interleaved sort key (cust_id, prod_id), each block of four records mixes cust_id and prod_id values, so the rows for any single cust_id or prod_id value span only two blocks]
How to use the feature
• New keyword ‘INTERLEAVED’ when defining sort keys
– Existing syntax will still work and behavior is unchanged
– You can choose up to 8 columns to include and can query with any or all of them
• No change needed to queries
• We’re just getting started with this feature
– Benefits are significant; load penalty is higher than we’d like and we’ll fix that quickly
• Will be available in a couple of weeks and we’d love to get your feedback
[[ COMPOUND | INTERLEAVED ] SORTKEY ( column_name [, ...] ) ]
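A sketch of both forms on a hypothetical table; only the keyword in front of SORTKEY changes:

-- Compound: best when queries filter on the leading column(s)
CREATE TABLE orders_compound (
  cust_id BIGINT,
  prod_id BIGINT,
  amount  DECIMAL(12,2)
)
COMPOUND SORTKEY (cust_id, prod_id);

-- Interleaved: gives equal weight to each sort key column
CREATE TABLE orders_interleaved (
  cust_id BIGINT,
  prod_id BIGINT,
  amount  DECIMAL(12,2)
)
INTERLEAVED SORTKEY (cust_id, prod_id);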
Migration Considerations
Forklift = BAD
Typical ETL/ELT
• One file per table, maybe a few if too big
• Many updates (“massage” the data)
• Every job clears the data, then loads
• Counts on the PK to block double loads
• High concurrency of load jobs
• Small table(s) to control the job stream
Two Questions to ask
• Why do you do what you do?
– Many times teams don’t even know
• What does the customer need?
– Many times needs don’t match practice
– You might have to add other AWS solutions
On Redshift
• Updates are delete + insert
– Deletes just mark rows for deletion
• Commits are expensive
– 4GB write on 8XL per node
– Mirrors the WHOLE dictionary
– Serialized cluster-wide
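A hedged sketch of the usual workaround, with hypothetical table names: stage the changed rows, then run the delete and insert inside one transaction so the whole batch pays for a single commit.

BEGIN;

-- Remove the rows that are about to be replaced (they are only marked deleted)
DELETE FROM orders
USING staging_orders s
WHERE orders.order_id = s.order_id;

-- Insert the new versions from the staging table
INSERT INTO orders
SELECT * FROM staging_orders;

-- One commit for the whole batch instead of one per statement
COMMIT;

-- Reclaim space from deleted rows later, off peak: VACUUM orders;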
On Redshift
• Not all aggregations are created equal
– Pre-aggregation can help
• Keep concurrency low
• Don’t connect dashboards directly to Redshift
• WLM only parcels out RAM to sessions, not priority
• Compression helps speed, not just storage
• DISTKEY, SORTKEY, and data types are important (see the sketch below)
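For example (hypothetical schema), a fact table distributed on its join key and sorted on the column most queries filter by, with types no wider than the data:

CREATE TABLE page_views (
  view_date    DATE     NOT NULL,
  customer_id  BIGINT   NOT NULL,
  page_id      INTEGER  NOT NULL,
  country_code CHAR(2)
)
DISTKEY (customer_id)
SORTKEY (view_date);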
Not all MPP/Columnar are the same
• Only Redshift’s DISTSTYLE ALL keeps one copy per node (not per slice/core)
• Some columnar databases keep a row-based version of the data on insert, or offer an option for it
• Redshift doesn’t charge millions of dollars and send people to do the work for you
Open Source Tools
• https://github.com/awslabs/amazon-redshift-utils
• Admin Scripts
– Collection of utilities for running diagnostics on your Cluster
• Admin Views
– Collection of utilities for managing your Cluster, generating Schema
DDL, etc
• Column Encoding Utility
– Gives you the ability to apply optimal Column Encoding to an
established Schema with data already loaded