Hive+Tez: A Performance Deep Dive
Jitendra Pandey
Gopal Vijayaraghavan
© Hortonworks Inc. 2014.
Stinger Project
(announced February 2013)
Batch AND Interactive SQL-IN-Hadoop
Stinger Initiative
A broad, community-based effort to drive the next generation of Hive
Goals:
Speed
Improve Hive query performance by 100X to allow for interactive query times (seconds)
Scale
The only SQL interface to Hadoop designed for queries that scale from TB to PB
SQL
Support the broadest range of SQL semantics for analytic applications running against Hadoop
…all IN Hadoop
Hive 0.11, May 2013:
• Base Optimizations
• SQL Analytic Functions
• ORCFile, Modern File Format
Hive 0.12, October 2013:
• VARCHAR, DATE Types
• ORCFile predicate pushdown
• Advanced Optimizations
• Performance Boosts via YARN
Hive 0.13, April 2014:
• Hive on Apache Tez
• Cost Based Optimizer (Optiq)
• Vectorized Processing
SPEED: Increasing Hive Performance
Key Highlights
– Tez: New execution engine
– Vectorized Query Processing
– Startup time improvement
– Statistics to accelerate query execution
– Cost Based Optimizer: Optiq
Interactive Query Times across ALL use cases
• Simple and advanced queries in seconds
• Integrates seamlessly with existing tools
• Currently a >100x improvement in just nine months
Elements of Fast SQL Execution
• Query Planner/Cost Based Optimizer w/ Statistics
• Query Startup
• Query Execution
• I/O Path
Statistics and Cost-based optimization
• Statistics:
– Hive has table and column level statistics
– Used to determine parallelism, join selection
• Optiq: Open source, Apache licensed query execution framework in Java
– Used by Apache Drill, Cascading, LucidDB
– Based on the Volcano paper
– 20 man-years of development, more than 50 optimization rules
• Goals for Hive
– Ease of use: no manual tuning for queries, make choices automatically based on cost
– View chaining/ad hoc queries involving multiple views
– Help enable BI tools front-ending Hive
– Emphasis on latency reduction
• Cost computation will be used for
– Join ordering
– Join algorithm selection
– Tez vertex boundary selection
HIVE-5775
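To make "cost computation for join ordering" concrete, here is a minimal Python sketch. It is not Optiq's algorithm: it simply enumerates every left-deep order and scores each by the sum of its intermediate result sizes. The row counts, selectivities and function names are hypothetical and chosen only for illustration.

```python
from itertools import permutations

# Hypothetical row counts and join selectivities (illustration only,
# not taken from the slides).
ROWS = {"store_sales": 1_000_000, "store_returns": 100_000, "d1": 90}
SELECTIVITY = {
    frozenset(("store_sales", "store_returns")): 1e-5,
    frozenset(("store_sales", "d1")): 1e-2,
}

def plan_cost(order):
    """Cost of a left-deep plan = sum of its intermediate result sizes."""
    joined = [order[0]]
    rows = float(ROWS[order[0]])
    cost = 0.0
    for table in order[1:]:
        selectivity = 1.0
        for other in joined:
            # Tables with no join predicate between them form a cross product.
            selectivity *= SELECTIVITY.get(frozenset((other, table)), 1.0)
        rows = rows * ROWS[table] * selectivity
        cost += rows
        joined.append(table)
    return cost

def best_order(tables):
    """Exhaustively pick the cheapest left-deep join order."""
    return min(permutations(tables), key=plan_cost)

best = best_order(list(ROWS))
print(best, plan_cost(best))
```

With these made-up numbers the cheapest orders join the fact table to the restricting date dimension first, and orders that begin with a cross product score far worse. Exhaustive enumeration only works for a handful of tables; a real planner prunes the search space.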

TPC-DS Query 17
select i_item_id
,i_item_desc
,s_state
,count(ss_quantity) as store_sales_quantitycount
,….
from store_sales ss ,store_returns sr, catalog_sales cs, date_dim d1, date_dim d2, date_dim d3, store s, item i
where d1.d_quarter_name = '2000Q1' and d1.d_date_sk = ss.ss_sold_date_sk and i.i_item_sk = ss.ss_item_sk
and s.s_store_sk = ss.ss_store_sk and ss.ss_customer_sk = sr.sr_customer_sk and ss.ss_item_sk = sr.sr_item_sk
…
group by i_item_id ,i_item_desc ,s_state
order by i_item_id ,i_item_desc, s_state
limit 100;
• Joins the Store Sales, Store Returns and Catalog Sales fact tables.
• Each of the fact tables is independently restricted by time.
• Analysis is at Item and Store grain, so these dimensions are also joined in.
• As specified, the query starts by joining the 3 fact tables.
TPC-DS Query 17
[Figure: three join trees side by side: Specified Join Tree, Non CBO Plan, CBO Plan]
TPC-DS Query 17
Plan | Run 1 | Run 2
Non CBO | 127.53 | 100.71
CBO | 50.9 | 44.52
• Fact tables partitioned by Day, bucketed by Item
• Bucketing off
• Bucketing should help CBO plan.
• SR table much smaller. Better chance of Bucket Join in place of Shuffle Join.
Orderings considered by Planner
Join Ordering | Cost Estimate
['item', [[[[[['d2', 'store_returns'], 'store_sales'], 'catalog_sales'], 'd1'], 'd3'], 'store']] | 3547898.061
…
['store_returns', 'd2'] | 19224.71
['store_sales', 'store_returns'] | 23057497.991
['d1', 'store_sales'] | 26142.943
Facts restricted to 3 months
Apache Tez (“Speed”)
• Replaces MapReduce as primitive for Pig, Hive, Cascading etc.
– Smaller latency for interactive queries
– Higher throughput for batch queries
– 22 contributors: Hortonworks (13), Facebook, Twitter, Yahoo, Microsoft
YARN ApplicationMaster to run DAG of Tez Tasks
Task with pluggable Input, Processor and Output
Tez Task - <Input, Processor, Output>
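The <Input, Processor, Output> triple can be pictured with a small Python sketch. This is not the Tez API (which is Java); the `Task` class and its three callables are hypothetical names used only to show how the pieces plug together.

```python
class Task:
    """A task assembled from three pluggable pieces (hypothetical sketch)."""

    def __init__(self, read_input, process, write_output):
        self.read_input = read_input      # () -> iterable of records
        self.process = process            # iterable -> iterable
        self.write_output = write_output  # record -> None

    def run(self):
        # Records stream from the input, through the processor, to the output.
        for record in self.process(self.read_input()):
            self.write_output(record)

# Usage: swap any piece without touching the other two.
sink = []
task = Task(
    read_input=lambda: iter([1, 2, 3, 4]),
    process=lambda records: (r * r for r in records if r % 2 == 0),
    write_output=sink.append,
)
task.run()
print(sink)
```

Because each piece is independent, the same processor could read from a file or from another task's output, which is the property the slide highlights.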
Hive-on-MR vs. Hive-on-Tez
SELECT g1.x, g1.avg, g2.cnt
FROM (SELECT a.x, AVG(a.y) AS avg FROM a GROUP BY a.x) g1
JOIN (SELECT b.x, COUNT(b.y) AS cnt FROM b GROUP BY b.x) g2
ON (g1.x = g2.x)
ORDER BY avg;
[Figure: two panels. Hive – MR: separate map (M) / reduce (R) jobs for GROUP a BY a.x, GROUP b BY b.x, JOIN (a,b) and ORDER BY, with HDFS between the jobs. Hive – Tez: the same stages as a single graph of M and R vertices.]
Tez avoids unnecessary writes to HDFS
HIVE-4660
Shuffle Join
SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand
FROM inventory inv
JOIN store_sales ss
ON (inv.inv_item_sk = ss.ss_item_sk);
[Figure: shuffle join plans, Hive – MR vs. Hive – Tez]
Broadcast Join
SELECT ss.ss_item_sk, ss.ss_quantity, avg_price, inv.inv_quantity_on_hand
FROM (select avg(ss_sold_price) as avg_price, ss_item_sk, ss_quantity from store_sales
group by ss_item_sk, ss_quantity) ss
JOIN inventory inv
ON (inv.inv_item_sk = ss.ss_item_sk);
Hive – MR:
• Store Sales scan. Group by and aggregation.
• Inventory and Store Sales (aggr.) output scan and shuffle join.
Hive – Tez:
• Store Sales scan. Group by and aggregation reduce size of this input.
• Inventory scan and join.
• Broadcast edge.
1-1 Edge
• A typical star schema join involves joins between a large number of tables
• Dimensions aren't always tiny (e.g. the Customer dimension)
• Might not be able to handle all dimensions in a single vertex as broadcast joins
• Tez allows streaming records from one processor to the next via a 1-1 Edge
– Transfer details (streaming, files, etc.) are handled transparently
– Scheduling/cluster capacity is worked out by Tez
• Allows Hive to build a pipeline of in-memory joins which we can stream records through
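A rough Python sketch of "a pipeline of in-memory joins which we can stream records through": each stage holds one dimension as a hash table and passes matching records straight to the next stage. Generators stand in for the 1-1 edge here; the function, table and column names are hypothetical.

```python
def hash_join(stream, dim_rows, key):
    """Stream records past an in-memory hash table built from one dimension."""
    table = {row[key]: row for row in dim_rows}  # build side, held in memory
    for record in stream:
        match = table.get(record[key])
        if match is not None:
            yield {**record, **match}  # hand the joined record to the next stage

# Hypothetical data: one fact stream and two dimensions.
facts = [
    {"item_sk": 1, "customer_sk": 10, "qty": 3},
    {"item_sk": 2, "customer_sk": 11, "qty": 5},
    {"item_sk": 9, "customer_sk": 10, "qty": 1},  # no matching item
]
items = [{"item_sk": 1, "item": "pen"}, {"item_sk": 2, "item": "ink"}]
customers = [{"customer_sk": 10, "name": "Ann"}, {"customer_sk": 11, "name": "Bo"}]

# Two joins chained together; no stage materializes its full output.
stage1 = hash_join(iter(facts), items, "item_sk")
stage2 = hash_join(stage1, customers, "customer_sk")
result = list(stage2)
print(result)
```

Each record flows through both joins before the next one is read, so no intermediate result is written out between the stages.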

Dynamically Partitioned Hash Join
SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand
FROM store_sales ss
JOIN inventory inv
ON (inv.inv_item_sk = ss.ss_item_sk);
Hive – MR:
• Inventory scan (runs as a single local map task)
• Store Sales scan and join (Inventory hash table read as side file)
Hive – Tez:
• Inventory scan (runs on the cluster, potentially on more than 1 mapper)
• Store Sales scan and join (custom vertex reads both inputs, no side file reads)
• Custom edge (routes outputs of the previous stage to the correct mappers of the next stage)
Dynamically Partitioned Hash Join
The plans look very similar to a map join, but the way things work changes between MR and Tez.
Hive – MR (Bucket map-join):
• Not dynamically partitioned.
• Both tables need to be bucketed by the join key.
• Local task that generates the hash table writes n files corresponding to n buckets.
• Number of mappers for the join must be the same as the number of buckets.
• Each of these mappers reads the corresponding bucket file of the local task to perform the join.
Hive – Tez:
• Only one of the sides needs to be bucketed and the other side is dynamically bucketed.
• Also works if neither side is explicitly bucketed, but another operation forced bucketing in the pipeline (traits).
• No writing to HDFS.
• There can be more mappers than number of buckets, and a bucket can be processed in parallel on multiple mappers.
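The idea of bucketing one side on the fly can be sketched in Python as follows. This is a simplified, single-process illustration, not Hive's implementation: both inputs are routed by the same function of the join key, so each partition only needs the hash table for its own slice of the small side. Integer keys and all names are assumptions.

```python
from collections import defaultdict

def partition(rows, key, n):
    """Route each row to one of n partitions by its (integer) join key."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key] % n].append(row)
    return parts

def partitioned_hash_join(small, big, small_key, big_key, n):
    small_parts = partition(small, small_key, n)
    big_parts = partition(big, big_key, n)  # the "dynamically bucketed" side
    out = []
    for p in range(n):
        # Each partition builds a hash table only for its slice of the small side.
        table = defaultdict(list)
        for row in small_parts.get(p, []):
            table[row[small_key]].append(row)
        for row in big_parts.get(p, []):
            for match in table.get(row[big_key], []):
                out.append({**match, **row})
    return out

inventory = [
    {"inv_item_sk": 1, "inv_quantity_on_hand": 5},
    {"inv_item_sk": 2, "inv_quantity_on_hand": 7},
]
store_sales = [
    {"ss_item_sk": 1, "ss_quantity": 3},
    {"ss_item_sk": 3, "ss_quantity": 9},  # no inventory row
    {"ss_item_sk": 2, "ss_quantity": 4},
]
joined = partitioned_hash_join(inventory, store_sales, "inv_item_sk", "ss_item_sk", n=2)
print(sorted(joined, key=lambda r: r["ss_item_sk"]))
```

In the Tez plan the routing is done by the custom edge, and the partitions run on different mappers instead of in a loop.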
Union all
SELECT count(*) FROM (
SELECT distinct ss_customer_sk from store_sales where ss_store_sk = 1
UNION ALL
SELECT distinct ss_customer_sk from store_sales where ss_store_sk = 2) as customers;
Hive – MR:
• Two MR jobs to do the distinct
• Both sub-queries are materialized onto HDFS
• Single map reads both sides and aggregates
Hive – Tez:
• In Tez the sub-query output is pre-aggregated and sent directly to a common final node
Multi-insert queries
FROM (SELECT * FROM store_sales, date_dim WHERE ss_sold_date_sk = d_date_sk
and d_year = 2000)
INSERT INTO TABLE t1 SELECT distinct ss_item_sk
INSERT INTO TABLE t2 SELECT distinct ss_customer_sk;
Hive – MR:
• Map join date_dim/store sales
• Materialize join on HDFS
• Two MR jobs to do the distinct
Hive – Tez:
• Broadcast Join (scan date_dim, join store sales)
• Distinct for customer + items

Execution
“A good plan violently executed now is better than a perfect plan executed next week.”
George S. Patton
Faster Query Setup
• AM per-session instead of per-query
– Reused across JDBC connections
• No more local tasks
– Except fetch aggregation
• Metastore fetches are much faster
– Metastore direct sql fast-path
– Partition filters pushed to metastore
• Use distributed cache efficiently for hive-exec.jar
– /home/$user/.hiveJars
• UDF Jars as well
– .jar.<sha1> identifier to avoid conflicts
– Multiple versions can coexist easily
– YARN localizes the jars once per node (not per query)
• Kryo instead of XML to serialize operators
– Works better on jdk7
Faster Operator Pipeline
• Previously on Hive
Operator Vectorization
• Avoid Writable objects & use primitive int/long
– Allows efficient JIT code for primitive types
• Generate per-type loops & avoid runtime type-checks
• The classes generated look like
– LongColEqualDoubleColumn
– LongColEqualLongColumn
– LongColEqualLongScalar
• Avoid duplicate operations on repeated values
– isRepeating & hasNulls
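As a rough illustration of a per-type kernel and of the isRepeating shortcut, here is a Python sketch. Hive's real classes are generated Java code working on primitive arrays; the dataclass and function below only mimic the shape of something like LongColEqualLongScalar and are assumptions, not the actual implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LongColumnVector:
    vector: List[int]
    is_repeating: bool = False  # if True, every row has the value vector[0]

def long_col_equal_long_scalar(col: LongColumnVector, size: int, scalar: int) -> List[int]:
    """Return the indexes of the rows in the batch where col == scalar."""
    if col.is_repeating:
        # One comparison decides the whole batch.
        return list(range(size)) if col.vector[0] == scalar else []
    # One loop specialized for "long column vs. long scalar":
    # no per-row type checks and no wrapper objects.
    return [i for i in range(size) if col.vector[i] == scalar]

plain = LongColumnVector([7, 3, 7, 1])
repeated = LongColumnVector([7], is_repeating=True)
print(long_col_equal_long_scalar(plain, 4, 7))
print(long_col_equal_long_scalar(repeated, 1024, 7) == list(range(1024)))
```

The repeating case does a single comparison for a 1024-row batch, which is the "avoid duplicate operations on repeated values" point above.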

Optimized Row Columnar File
• ORC Vectorized Reader
• Logical Compression helps reader
– isRepeating
• Split per-stripe
• Row-group level indexes
• Stripe level indexes
• PPD avoids a lot of IO
– Column conditions are ANDed
Faster Statistics
• ORC stripe footers aggregate stats per-column
– Min/Max/Sum/Count
• set hive.stats.autogather=true;
• ANALYZE TABLE <table> compute statistics partialscan;
– Reads only ORC footers
• Predicate computation without Tez/MR tasks
Faster Execution: Tez
• Multiple edge types
– Broadcast
– Shuffle
– One-to-One
• Multiple output types
– Sorted
– Unsorted
– Unsorted Partitioned
• Per-vertex configurations
– Instead of one configuration between M&R tasks
Tez I/O speed-ups
• Tez shuffle can use keep-alive over HTTP
• Shuffle scheduler can optimize connection count
– Can fetch all map outputs from one node via 1 connection
• Can skip fetching 0 sized partitions from a mapper
– Speeds up group-by queries with high locality
– Reducers finish shuffle faster
• Shuffle threads are re-used in container re-use
– Secure shuffle has crypto thread-local inits

Skewed Reducers: auto-parallelism
• Often queries are slow because of one slow reducer
• Skewed data is too common in real life queries
• This avoids running too many reducers with very little data
• Future
– This can be extended to group by input size
– This mechanism can actually speculate on stalling reducers better (split into 3)
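One way to read "avoids running too many reducers with very little data" is that the reducer count is chosen from the observed size of the map output rather than fixed up front. The Python sketch below shows that arithmetic only; the function name, parameters and numbers are hypothetical, and the real decision logic in Tez is more involved.

```python
import math

def pick_reducer_count(partition_bytes, target_bytes_per_reducer, max_reducers):
    """Choose how many reducers to run from observed map-output sizes."""
    total = sum(partition_bytes)
    if total == 0:
        return 1  # nothing to shuffle, one reducer is enough
    wanted = math.ceil(total / target_bytes_per_reducer)
    return max(1, min(max_reducers, wanted))

# 100 planned reducers, but the map output is tiny and mostly empty partitions.
observed = [64, 0, 0, 1, 0, 63]
print(pick_reducer_count(observed, target_bytes_per_reducer=64, max_reducers=100))
```

Here six planned partitions collapse to two reducers, each with a useful amount of data.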
A Query in motion
• 4-way Map join + map reduce reduce query
• Timeline runs left to right; each lane represents one container
Defer/Skip tasks
• No more uploading hive-exec.jar/UDFs for every query
• No more spinning up an AM for each stage
• No more computation on hive client (local task)
Concurrency of small tasks
• Hive used to run several lightweight tasks in a local VM
• LocalTask was a bottleneck
– No locality
– No parallelism
– Small VM
• Tez Broadcast edges solve that problem

Concurrent Split Generation
• Tez input initializers are run in parallel
• No more spinning up an AM for each stage
• No more computation on hive client (local task)
Split Elimination
• ORC comes with Predicate Push Down in the reader
• Queries with SARGable where clauses
– http://en.wikipedia.org/wiki/Sargable
• Run the SARGs in the AM, using ORC footer data
– Eliminate splits before task spinups, avoid container costs
• Offers a soft cache for the ORC footers
• Zero splits offers an early exit for data validity checks (e.g. price < 0)
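Split elimination from footer statistics can be sketched in a few lines of Python. The `Split` record and its min/max fields are hypothetical stand-ins for the per-column statistics kept in ORC footers; only a "price < bound" predicate is handled here.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Split:
    path: str
    min_price: float  # column minimum, as recorded in the footer
    max_price: float  # column maximum, as recorded in the footer

def might_match_less_than(split: Split, bound: float) -> bool:
    """True if some row in the split could satisfy price < bound."""
    return split.min_price < bound

def eliminate(splits: List[Split], bound: float) -> List[Split]:
    """Keep only the splits that could contain a matching row."""
    return [s for s in splits if might_match_less_than(s, bound)]

splits = [
    Split("part-0", 0.5, 90.0),
    Split("part-1", 10.0, 20.0),
    Split("part-2", -1.0, 15.0),
]
print([s.path for s in eliminate(splits, 0.0)])      # validity check: price < 0
print([s.path for s in eliminate(splits[:2], 0.0)])  # clean data: nothing to scan
```

When the list comes back empty, no task needs to be started at all, which is the early exit mentioned in the last bullet.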
Pipelining Split->Task
• The task only depends on its own input
• It starts talking to YARN immediately once its inputs are ready
• Faster generation of dimension tables
• Fact tables can optimize on this further
– Will break existing FileSplit mechanism
Filling up the pipeline
• Tez allows grouping splits dynamically
• Obsoletes CombineFileInputFormat
• Grouped according to locality
– 1.7 x available containers (or any factor actually)
• Allow query to use up 100% of queue capacity
– Without tuning mapred split size for each data-set
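The "1.7 x available containers" rule can be illustrated with a small Python sketch: aim for roughly that many groups and only combine splits that live on the same host. This is a simplification under stated assumptions (a single host per split, no rack awareness, no size limits), and the function and parameter names are made up.

```python
from collections import defaultdict

def group_splits(splits, available_containers, waves=1.7):
    """Group (split_id, host) pairs into about waves * containers groups."""
    desired_groups = max(1, int(available_containers * waves))
    per_group = max(1, -(-len(splits) // desired_groups))  # ceiling division
    by_host = defaultdict(list)
    for split_id, host in splits:
        by_host[host].append(split_id)
    groups = []
    for host in sorted(by_host):
        ids = by_host[host]
        # Only splits on the same host are combined, preserving locality.
        for start in range(0, len(ids), per_group):
            groups.append((host, ids[start:start + per_group]))
    return groups

# 20 splits spread over two hosts, 4 containers available.
splits = [(i, "host-a" if i < 10 else "host-b") for i in range(20)]
groups = group_splits(splits, available_containers=4)
for host, ids in groups:
    print(host, ids)
```

The group count follows the cluster capacity rather than a per-dataset split size, which is why no mapred split-size tuning is needed.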

ORC Split extras
• RCFile had horrible split performance
– rcfile::sync() was slow to find a sync point
• ORC Reader allows exact splits for stripes
• ORC Writer can pad a stripe to an HDFS block
– 5%-7% overhead measured on table
– 100% locality of a stripe in a block
Container reuse
• Tez specific feature
• Run an entire DAG using the same containers
• Different vertices use same container
• Saves time talking to YARN for new containers
Container reuse (II)
• Tez provides an object registry within a vertex
• This can be used to cache map-join hash-tables
• JVM JIT kicks in and optimizes better on re-use
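Caching a map-join hash table across tasks that share a container might look like the following Python sketch. The `ObjectRegistry` class and its `get_or_build` method are hypothetical and do not mirror the Tez object registry API; they only show the build-once, reuse-many pattern.

```python
class ObjectRegistry:
    """Per-container cache that outlives a single task (hypothetical sketch)."""

    def __init__(self):
        self._cache = {}

    def get_or_build(self, key, builder):
        # Build the object the first time, then hand back the cached copy.
        if key not in self._cache:
            self._cache[key] = builder()
        return self._cache[key]

builds = []

def build_item_hash_table():
    builds.append(1)  # count how often the expensive build really runs
    return {1: "pen", 2: "ink"}

registry = ObjectRegistry()
# Three tasks scheduled into the same reused container.
for _ in range(3):
    table = registry.get_or_build("item_hash_table", build_item_hash_table)

print(len(builds), table[2])
```

Only the first task pays for loading the hash table; later tasks in the same container reuse it.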
Container re-use (Session)
• Keep a container group alive between queries
• Fast query spin-up and skip YARN queue
• Even better JIT performance on >1 queries

HiveServer2 and Sessions
• HiveServer2 can keep sessions alive
– Between different JDBC queries
• New security model helps
– All secure queries run as “hive” user
• Ideal for short exploratory queries
• Uses same JARs (no download for task)
• Even better JIT performance on >1 queries
Supersize it!
• 78 vertices + 8374 tasks on 50 containers
Query overload #2
• 5000 hive query test-set
• Only 3.9k triggered compute tasks
• Rest was optimized away into fetch tasks or metadata tasks
• Gets progressively faster as the JVM JIT improves the native code
Big picture
[Bar chart: Latency. Text: 1501.895; Columnar: 1176.479; Partitioned: 631.027; Stinger: 4.872]
Roadmap
• Expand uses for CBO
– Join Algorithm selection
– Tez checkpoint selection (recovery)
• Temp Tables
– Session life-time
– Sharing of intermediate results
• Materialized views
– Pre-compute common results/aggregations
– Transparently route via CBO
• Join/Grouping w/o sort
– Tez decouples algorithm from data transfer
• Sort-merge bucket in Tez
– Leverage vertex manager
– Co-locate partitions on HDFS
• Inline sampling/range partitioning with Tez
– Sample/create histogram dynamically for skew joins and total order sort
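
The last roadmap item can be sketched as follows. This is only an illustration of sampling-based range partitioning in general, not the planned Tez or Hive implementation; the function names, sample size, and seed are assumptions made for the example.

```python
import bisect
import random


def split_points(sample, num_partitions):
    """Pick num_partitions - 1 boundaries at evenly spaced sample quantiles."""
    ordered = sorted(sample)
    return [
        ordered[(i * len(ordered)) // num_partitions]
        for i in range(1, num_partitions)
    ]


def partition_for(key, boundaries):
    """Index of the range partition that the key falls into."""
    return bisect.bisect_right(boundaries, key)


def range_partition(keys, num_partitions, sample_size, seed=0):
    """Sample the keys, derive boundaries, then route every key to a range."""
    rng = random.Random(seed)  # fixed seed only to keep the sketch repeatable
    sample = rng.sample(keys, min(sample_size, len(keys)))
    boundaries = split_points(sample, num_partitions)
    partitions = [[] for _ in range(num_partitions)]
    for key in keys:
        partitions[partition_for(key, boundaries)].append(key)
    return partitions


# Illustrative input: 1000 shuffled integer keys, 4 partitions.
keys = list(range(1000))
random.Random(1).shuffle(keys)
parts = range_partition(keys, num_partitions=4, sample_size=100)
total_order = [k for part in parts for k in sorted(part)]
```

Because every key in partition i is no larger than any key in partition i + 1, sorting each partition independently and concatenating the results gives a total order. Boundaries taken from a sample also follow the key distribution, which is what lets the same approach spread skewed keys more evenly.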
Hive+Tez: A performance deep dive

  • 1. Hive+Tez: A Performance deep dive Jitendra Pandey Gopal Vijayaraghavan
  • 2. © Hortonworks Inc. 2014. Stinger Project (announced February 2013) Batch AND Interactive SQL-IN-Hadoop Stinger Initiative A broad, community-based effort to drive the next generation of HIVE Hive 0.11, May 2013: • Base Optimizations • SQL Analytic Functions • ORCFile, Modern File Format Hive 0.12, October 2013: • VARCHAR, DATE Types • ORCFile predicate pushdown • Advanced Optimizations • Performance Boosts via YARN Hive 0.13, April 2014: • Hive on Apache Tez • Cost Based Optimizer (Optiq) • Vectorized Processing Goals: Speed Improve Hive query performance by 100X to allow for interactive query times (seconds) Scale The only SQL interface to Hadoop designed for queries that scale from TB to PB SQL Support broadest range of SQL semantics for analytic applications running against Hadoop …all IN Hadoop
  • 3. © Hortonworks Inc. 2014. SPEED: Increasing Hive Performance Key Highlights – Tez: New execution engine – Vectorized Query Processing – Startup time improvement – Statistics to accelerate query execution – Cost Based Optimizer: Optiq Interactive Query Times across ALL use cases • Simple and advanced queries in seconds • Integrates seamlessly with existing tools • Currently a >100x improvement in just nine months Elements of Fast SQL Execution • Query Planner/Cost Based Optimizer w/ Statistics • Query Startup • Query Execution • I/O Path
  • 4. © Hortonworks Inc. 2014. Statistics and Cost-based optimization • Statistics: – Hive has table and column level statistics – Used to determine parallelism, join selection • Optiq: Open source, Apache licensed query execution framework in Java – Used by Apache Drill, Apache Cascading, Lucene DB – Based on Volcano paper – 20 man years dev, more than 50 optimization rules • Goals for Hive – Ease of Use – no manual tuning for queries, make choices automatically based on cost – View Chaining/Ad hoc queries involving multiple views – Help enable BI Tools front-ending Hive – Emphasis on latency reduction • Cost computation will be used for – Join ordering – Join algorithm selection – Tez vertex boundary selection Page 4 HIVE-5775
  • 5. © Hortonworks Inc. 2014. TPC-DS Query 17 select i_item_id ,i_item_desc ,s_state ,count(ss_quantity) as store_sales_quantitycount ,…. from store_sales ss ,store_returns sr, catalog_sales cs, date_dim d1, date_dim d2, date_dim d3, store s, item i where d1.d_quarter_name = '2000Q1' and d1.d_date_sk = ss.ss_sold_date_sk and i.i_item_sk = ss.ss_item_sk and s.s_store_sk = ss.ss_store_sk and ss.ss_customer_sk = sr.sr_customer_sk and ss.ss_item_sk = sr.sr_item_sk … group by i_item_id ,i_item_desc ,s_state order by i_item_id ,i_item_desc, s_state limit 100; • Joins Store Sales, Store Returns and Catalog Sales fact tables. • Each of the fact tables is independently restricted by time. • Analysis at Item and Store grain, so these dimensions are also joined in. • As specified, the query starts by joining the 3 fact tables.
  • 6. © Hortonworks Inc. 2014. TPC-DS Query 17 Specified Join Tree Non CBO Plan CBO Plan
  • 7. © Hortonworks Inc. 2014. TPC-DS Query 17 Run 1 Run 2 Non CBO 127.53 100.71 CBO 50.9 44.52 • Fact tables • partitioned by Day, • bucketed by Item • Bucketing off • Bucketing should help CBO plan. • SR table much smaller. Better chance of Bucket Join in place of Shuffle Join. Join Ordering Cost Estimate ['item', [[[[[['d2', 'store_returns'], 'store_sales'], 'catalog_sales'], 'd1'], 'd3'], 'store']] 3547898.061 … ['store_returns', 'd2'] 19224.71 ['store_sales', 'store_returns'] 23057497.991 ['d1', 'store_sales'] 26142.943 Facts restricted to 3 months Orderings considered by Planner
  • 8. © Hortonworks Inc. 2014. Apache Tez (“Speed”) • Replaces MapReduce as primitive for Pig, Hive, Cascading etc. – Smaller latency for interactive queries – Higher throughput for batch queries – 22 contributors: Hortonworks (13), Facebook, Twitter, Yahoo, Microsoft • YARN ApplicationMaster to run DAG of Tez Tasks • Task with pluggable Input, Processor and Output: Tez Task - <Input, Processor, Output>
  • 9. © Hortonworks Inc. 2014. Hive-on-MR vs. Hive-on-Tez SELECT g1.x, g1.avg, g2.cnt FROM (SELECT a.x, AVG(a.y) AS avg FROM a GROUP BY a.x) g1 JOIN (SELECT b.x, COUNT(b.y) AS cnt FROM b GROUP BY b.x) g2 ON (g1.x = g2.x) ORDER BY avg; [Diagram] Hive – MR: GROUP a BY a.x, GROUP b BY b.x, JOIN (a,b) and ORDER BY, with HDFS writes between stages. Hive – Tez: GROUP BY a.x, GROUP BY x, JOIN (a,b) and ORDER BY in a single DAG. Tez avoids unnecessary writes to HDFS HIVE-4660
  • 10. © Hortonworks Inc. 2014. Shuffle Join SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand FROM inventory inv JOIN store_sales ss ON (inv.inv_item_sk = ss.ss_item_sk); Hive – MR Hive – Tez
  • 11. © Hortonworks Inc. 2014. Broadcast Join SELECT ss.ss_item_sk, ss.ss_quantity, avg_price, inv.inv_quantity_on_hand FROM (select avg(ss_sold_price) as avg_price, ss_item_sk, ss_quantity_sk from store_sales group by ss_item_sk) ss JOIN inventory inv ON (inv.inv_item_sk = ss.ss_item_sk); [Diagram] Hive – MR: Store Sales scan, group by and aggregation, written to HDFS; then Inventory and Store Sales (aggr.) output scan and shuffle join. Hive – Tez: Store Sales scan, where group by and aggregation reduce the size of this input; a broadcast edge feeds the Inventory scan and join.
  • 12. © Hortonworks Inc. 2014. 1-1 Edge • Typical star schema join involve join between large number of tables • Dimension aren’t always tiny (Customer dimension) • Might not be able to handle all dimensions in single vertex as broadcast joins • Tez allows streaming records from one processor to the next via a 1-1 Edge – Transfer details (streaming, files, etc) are handled transparently – Scheduling/cluster capacity is worked out by Tez • Allows hive to build a pipeline of in memory joins which we can stream records through
  • 13. © Hortonworks Inc. 2014. Dynamically Partitioned Hash Join SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand FROM store_sales ss JOIN inventory inv ON (inv.inv_item_sk = ss.ss_item_sk); [Diagram] Hive – MR: Inventory scan (runs as single local map task); Store Sales scan and join (Inventory hash table read as side file). Hive – Tez: Inventory scan (runs on cluster, potentially more than 1 mapper); custom edge (routes outputs of previous stage to the correct mappers of the next stage); Store Sales scan and join (custom vertex reads both inputs, no side file reads).
  • 14. © Hortonworks Inc. 2014. Dynamically Partitioned Hash Join Plans look very similar to map join but the way things work change between MR and Tez. Hive – MR (Bucket map-join) Hive – Tez • Not dynamically partitioned. • Both tables need to be bucketed by the join key. • Local task that generates the hash table writes n files corresponding to n buckets. • Number of mappers for the join must be same as the number of buckets. • Each of these mappers reads the corresponding bucket file of the local task to perform the join. • Only one of the sides needs to be bucketed and the other side is dynamically bucketed. • Also works if neither side is explicitly bucketed, but another operation forced bucketing in the pipeline (traits) • No writing to HDFS. • There can be more mappers than number of buckets, and a bucket can be processed in parallel on multiple mappers.
  • 15. © Hortonworks Inc. 2014. Union all SELECT count(*) FROM ( SELECT distinct ss_customer_sk from store_sales where ss_store_sk = 1 UNION ALL SELECT distinct ss_customer_sk from store_sales where ss_store_sk = 2) as customers [Diagram] Hive – MR: two MR jobs to do the distinct; both sub-queries are materialized onto HDFS; a single map reads both sides and aggregates. Hive – Tez: the sub-query output is pre-aggregated and sent directly to a common final node.
  • 16. © Hortonworks Inc. 2014. Multi-insert queries FROM (SELECT * FROM store_sales, date_dim WHERE ss_sold_date_sk = d_date_sk and d_year = 2000) INSERT INTO TABLE t1 SELECT distinct ss_item_sk INSERT INTO TABLE t2 SELECT distinct ss_customer_sk; [Diagram] Hive – MR: map join date_dim/store sales; materialize join on HDFS; two MR jobs to do the distinct. Hive – Tez: broadcast join (scan date_dim, join store sales); distinct for customer + items.
  • 17. © Hortonworks Inc. 2014. Execution “A good plan violently executed now is better than a perfect plan executed next week.” George S. Patton
  • 18. © Hortonworks Inc. 2014. Faster Query Setup • AM per-session instead of per-query – Reused across JDBC connections • No more local tasks – Except fetch aggregation • Metastore fetches are much faster – Metastore direct sql fast-path – Partition filters pushed to metastore • Use distributed cache efficiently for hive-exec.jar – /home/$user/.hiveJars • UDF Jars as well – .jar.<sha1> identifier to avoid conflicts – Multiple version compatibility easily – YARN localizes the jars once per node (not per query) • Kryo instead of XML to serialize operators – Works better on jdk7 Page 18
  • 19. © Hortonworks Inc. 2014. Faster Operator Pipeline • Previously on hive
  • 20. © Hortonworks Inc. 2014. Operator Vectorization • Avoid Writable objects & use primitive int/long – Allows efficient JIT code for primitive types • Generate per-type loops & avoid runtime type-checks • The classes generated look like – LongColEqualDoubleColumn – LongColEqualLongColumn – LongColEqualLongScalar • Avoid duplicate operations on repeated values – isRepeating & hasNulls
  • 21. © Hortonworks Inc. 2014. Optimized Row Columnar File • ORC Vectorized Reader • Logical Compression helps reader – isRepeating • Split per-stripe • Row-group level indexes • Stripe level indexes • PPD avoids a lot of IO – Column conditions are ANDed
  • 22. © Hortonworks Inc. 2014. Faster Statistics • ORC stripe footers aggregate stats per-column – Min/Max/Sum/Count • set hive.stats.autogather=true; • ANALYZE TABLE <table> compute statistics partialscan; – Reads only ORC footers • Predicate computation without Tez/MR tasks
  • 23. © Hortonworks Inc. 2014. Faster Execution: Tez • Multiple edge types – Broadcast – Shuffle – One-to-One • Multiple output types – Sorted – Unsorted – Unsorted Partitioned • Per-vertex configurations – Instead of one configuration between M&R tasks
  • 24. © Hortonworks Inc. 2014. Tez I/O speed-ups • Tez shuffle can use keep-alive over HTTP • Shuffle scheduler can optimize connection count – Can fetch all map outputs from one node via 1 connection • Can skip fetching 0 sized partitions from a mapper – Speeds up group-by queries with high locality – Reducers finish shuffle faster • Shuffle threads are re-used in container re-use – Secure shuffle has crypto thread-local inits
  • 25. © Hortonworks Inc. 2014. Skewed Reducers: auto-parallelism • Often queries are slow because of one slow reducer • Skewed data is too common in real life queries • This avoids running too many reducers with very little data • Future – This can be extended to group by input size – This mechanism can actually speculate on stalling reducers better (split into 3)
  • 26. © Hortonworks Inc. 2014. A Query in motion Page 26 • 4-way Map join + map reduce reduce query • Timeline in left to right, each lane represents one container
  • 27. © Hortonworks Inc. 2014. Defer/Skip tasks Page 27 • No more uploading hive-exec.jar/UDFs for every query • No more spinning up an AM for each stage • No more computation on hive client (local task)
  • 28. © Hortonworks Inc. 2014. Concurrency of small tasks Page 28 • Hive used to run several lightweight tasks in a local VM • LocalTask was a bottleneck – No locality – No parallelism – Small VM • Tez Broadcast edges solve that problem
  • 29. © Hortonworks Inc. 2014. Concurrent Split Generation Page 29 • Tez input initializers are run in parallel • No more spinning up an AM for each stage • No more computation on hive client (local task)
  • 30. © Hortonworks Inc. 2014. Split Elimination Page 30 • ORC comes with Predicate Push Down in the reader • Queries with SARGable where clauses – http://en.wikipedia.org/wiki/Sargable • Run the SARGs in the AM, using ORC footer data – Eliminate splits before task spinups, avoid container costs • Offers a soft cache for the ORC footers • Zero splits offer an early exit for data validity checks (e.g. price < 0)
  • 31. © Hortonworks Inc. 2014. Pipelining Split->Task Page 31 • The task only depends on its own input • It starts talking to YARN immediately once its inputs are ready • Faster generation of dimension tables • Fact tables can optimize on this further – Will break existing FileSplit mechanism
  • 32. © Hortonworks Inc. 2014. Filling up the pipeline Page 32 • Tez allows grouping splits dynamically • Obsoletes CombineFileInputFormat • Grouped according to locality – 1.7 x available containers (or any factor actually) • Allow query to use up 100% of queue capacity – Without tuning mapred split size for each data-set
  • 33. © Hortonworks Inc. 2014. ORC Split extras • RCFile had horrible split performance – rcfile::sync() was slow to find a sync point • ORC Reader allows exact splits for stripes • ORC Writer can pad a stripe to an HDFS block – 5%-7% overhead measured on table – 100% locality of a stripe in a block
  • 34. © Hortonworks Inc. 2014. Container reuse • Tez specific feature • Run an entire DAG using the same containers • Different vertices use same container • Saves time talking to YARN for new containers
  • 35. © Hortonworks Inc. 2014. Container reuse (II) • Tez provides an object registry within a vertex • This can be used to cache map-join hash-tables • JVM JIT kicks in and optimizes better on re-use
  • 36. © Hortonworks Inc. 2014. Container re-use (Session) • Keep a container group alive between queries • Fast query spin-up and skip YARN queue • Even better JIT performance on >1 queries
  • 37. © Hortonworks Inc. 2014. HiveServer2 and Sessions • HiveServer2 can keep sessions alive –Between different JDBC queries • New security model helps –All secure queries run as “hive” user • Ideal for short exploratory queries • Uses same JARs (no download for task) • Even better JIT performance on >1 queries
  • 38. © Hortonworks Inc. 2014. Supersize it! • 78 vertex + 8374 tasks on 50 containers Page 38
  • 39. © Hortonworks Inc. 2014. Query overload #2 • 5000 hive query test-set • Only 3.9k triggered compute tasks • Rest was optimized away into fetch tasks or metadata tasks • Gets progressively faster as the JVM JIT improves the native code Page 39
  • 40. © Hortonworks Inc. 2014. Big picture [Bar chart of latency: Text 1501.895, Columnar 1176.479, Partitioned 631.027, Stinger 4.872]
  • 41. © Hortonworks Inc. 2014. Roadmap • Expand uses for CBO – Join Algorithm selection – Tez checkpoint selection (recovery) • Temp Tables – Session life-time – Sharing of intermediate results • Materialized views – Pre-compute common results/aggregations – Transparently route via CBO • Join/Grouping w/o sort – Tez decouples algorithm from data transfer • Sort-merge bucket in Tez – Leverage vertex manager – Co-locate partitions on HDFS • Inline sampling/range partitioning with Tez – Sample/create histogram dynamically for skew joins and total order sort Page 41
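
The split elimination idea from slide 30 can be illustrated with a minimal sketch. It assumes each split carries only a per-column min/max for a single `price` column; `StripeStats`, `may_contain_less_than` and `eliminate_splits` are hypothetical names for illustration, not Hive or ORC APIs, and real ORC footers and SARG evaluation cover more statistics and predicate types.

```python
from dataclasses import dataclass


@dataclass
class StripeStats:
    """Hypothetical stand-in for the min/max column stats kept in an ORC footer."""
    path: str
    min_price: float
    max_price: float


def may_contain_less_than(stats: StripeStats, bound: float) -> bool:
    """Return True if some row in the stripe could satisfy `price < bound`."""
    return stats.min_price < bound


def eliminate_splits(stripes, bound):
    """Keep only stripes whose statistics cannot rule out the predicate."""
    return [s for s in stripes if may_contain_less_than(s, bound)]


# Illustrative statistics only, not measured data.
stripes = [
    StripeStats("part-0", 0.5, 20.0),
    StripeStats("part-1", 10.0, 99.0),
    StripeStats("part-2", 150.0, 900.0),
]

# WHERE price < 15: the third stripe is dropped before any task is started.
survivors = eliminate_splits(stripes, 15.0)
print([s.path for s in survivors])

# WHERE price < 0: no stripe can match, so the query can exit with zero splits.
print(eliminate_splits(stripes, 0.0))
```

Because the check reads only footer statistics, it can run before any container is requested, which is the saving the slide describes.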

Editor's Notes

  1. base optimizations: Star join, MMR->MR, Multiple map joins grouped to single mapper. Which analytic functions? Windowing functions, over clause Advanced optimizations Predicate push down only eliminates the orc stripes? Performance boosts via YARN Improvements in shuffle
  2. Tools? BI tools, Tableau, MicroStrategy Hive-0.13 is 100x faster. Startup time improvements: - Pre-launch the App master, keep containers around, what are the elements of query startup. - Faster metastore lookup. Using statistics other than Optiq: - Metadata queries - Estimating number of reducers - Map join conversion Optiq: Join reordering
  3. What is Optiq 50 optimization rules, examples - Join reordering rules, filter push down, column pruning. Should we mention we generate AST? Ad hoc queries involving multiple views: Currently supported to create views, the query on a view is executed by replacing the view with the subquery. What is tez vertex boundary?
  4. What is shuffle+map? Why is d1 not joined with ss before first shuffle?
  5. Why is Run2 slower for Non-CBO ? What is bucketing off?
  6. Why higher throughput? How many contributors now?
  7. No unnecessary writes to HDFS. Number of processes reduced. The edges between M and R can be generalized.
  8. On MR: each mapper sorts partitions of both tables. In Tez a mapper sorts only one table, the operators don’t have to switch between data sources.
  9. Inventory is the bigger table in this case. Similar to map-join w/o the need to build a hash table on the client Will work with any level of sub-query nesting Uses stats to determine if applicable How it works: Broadcast result set is computed in parallel on the cluster Join processor are spun up in parallel Broadcast set is streamed to join processor Join processors build hash table Other relation is joined with hashtable Tez handles: Best parallelism Best data transfer of the hashed relation Best scheduling to avoid latencies Why broadcast join is better than the map join? -- Multiple hashes can be generated in parallel -- hashtable in memory can be more compact than the serialized one in local task -- subqueries were always on streaming side and were joined with shuffle join Parallelism: Splits of a dimension table processed in parallel across mappers Data transfer - No hdfs write in between Schedule - read from rack local replica of the dimensional table
  10. Comparing the bucketed map join in MR vs Tez Inventory table is already bucketed. In MR, The hash map for each bucket is built in a single mapper in sequence, loaded in hdfs, then joined with store sales where the hash table is read as a side file. In Tez, The inventory scan is run in parallel in multiple mappers that process buckets. ------ Kicks in when large table is bucketed Bucketed table Dynamic as part of query processing Uses custom edge to match the partitioning on the smaller table Allows hash-join in cases where broadcast would be too large Tez gives us the option of building custom edges and vertex managers Fine grained control over how the data is replicated and partitioned Scheduling and actual data transfer is handled by Tez
  11. Common operation in decision support queries Caused additional no-op stages in MR plans Last stage spins up multi-input mapper to write result Intermediate unions have to be materialized before additional processing Tez has union that handles these cases transparently w/o any intermediate steps
  12. Allows the same input to be split and written to different tables or partitions Avoids duplicate scans/processing Useful for ETL Similar to “Splits” in PIG In MR a “split” in the operator pipeline has to be written to HDFS and processed by multiple additional MR jobs Tez allows sending the multiple outputs directly to downstream processors
  13. checkcast
  14. Tpch query 1 and query 6. Before:
  15. 1 TB of TPC-H data compresses to 200 GB of ORC data. 30 TB of TPC-DS data compresses to approximately 6 TB of ORC data.
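
The broadcast join steps listed in note 9 (the join processor builds a hash table from the broadcast set, then the other relation is joined against it) can be sketched in a few lines. This is a single-process illustration under simplifying assumptions, not Hive's or Tez's implementation: `build_hash_table` and `probe` are hypothetical helpers, the rows are made-up sample data, and the column names are borrowed from the query on slide 11.

```python
from collections import defaultdict


def build_hash_table(rows, key):
    """Build an in-memory hash table on the small (broadcast) relation."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table


def probe(stream_rows, table, key):
    """Stream the large relation past the hash table, emitting joined rows."""
    for row in stream_rows:
        for match in table.get(row[key], []):
            yield {**match, **row}


# Illustrative rows only. The aggregated store_sales output is the small side.
aggregated_sales = [
    {"ss_item_sk": 1, "avg_price": 9.5},
    {"ss_item_sk": 2, "avg_price": 4.0},
]
# Inventory is the larger relation and is streamed, never hashed.
inventory = [
    {"inv_item_sk": 1, "inv_quantity_on_hand": 30},
    {"inv_item_sk": 3, "inv_quantity_on_hand": 7},
    {"inv_item_sk": 1, "inv_quantity_on_hand": 12},
]

table = build_hash_table(aggregated_sales, "ss_item_sk")
joined = list(probe(inventory, table, "inv_item_sk"))
for row in joined:
    print(row)
```

In the Tez plan described in the notes, each join processor would receive the broadcast set over a broadcast edge and run the probe over its own portion of the large table; the sketch collapses that into one process.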