Hive
Dirty/Beautiful Hacks
in Treasure Data
"Rejected"
Hadoop Conference Japan 2016
Feb 13, 2016
Satoshi "Moris" Tagomori (@tagomoris)
Satoshi "Moris" Tagomori
(@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
Hive dirty/beautiful hacks in TD
http://www.treasuredata.com/
What I'll talk about today
• Hive query execution & deployment in TD
• Query runner
• UDF & Schema management
• InputFormat
• Controlling maps/reduces
• Time index pushdown
• Implementing INSERT INTO
• Optimizing INSERT INTO
Treasure Data Architecture: Overview
[architecture diagram: Console, API, and EventCollector in front of PlazmaDB; Worker and Scheduler drive the Hadoop cluster and Presto cluster; data comes in from USERS (TD SDKs), SERVERS, and CUSTOMER's SYSTEMS via DataConnector; labeled rates: 50k/day, 200k/day, 12M/day (138/sec)]
Hive query execution in TD
[diagram: the Worker launches a Hive CLI (with its own metastore) that runs MR Apps on the Hadoop cluster, reading from PlazmaDB. Annotations: "※ all modified code is here" (on the Hive CLI side) and "※ almost non-modified hadoop cluster"]
Hive deployment
[diagram sequence: the Worker launches a new Hive CLI next to the running one, against the same Hadoop cluster and PlazmaDB; new jobs use the new Hive CLI, and the old one is removed once its MR Apps finish]
Blue-green deployment
for Hadoop clusters
[diagram sequence: a second Hadoop cluster is brought up beside the running one; the Worker starts sending new jobs (Hive CLI + MR App) to the new cluster while jobs on the old cluster drain; once the old cluster is idle, it is shut down, leaving a single cluster against PlazmaDB]
Hive query execution in TD
• Hive CLI
• Worker code (Ruby) builds the command-line options with Java properties
• using an in-memory disposable metastore (a sketch follows below)
• http://www.slideshare.net/lewuathe/maintainable-cloud-architectureofhadoop by @Lewuathe
• PlazmaDB
• time-indexed database (hourly partitions)
• mpc1 columnar format files + schema-on-read
• http://www.slideshare.net/treasure-data/td-techplazma by @frsyuki
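A minimal sketch of what an in-memory disposable metastore configuration can look like, assuming an embedded Derby database; the property names are standard Hive/JDO settings, while the actual values TD uses are not shown in this deck:

import org.apache.hadoop.hive.conf.HiveConf;

public class DisposableMetastoreConf {
    // Assumption: embedded, in-memory Derby. The metastore vanishes with the
    // process, so every Hive CLI invocation starts from a clean slate.
    public static HiveConf inMemoryMetastore() {
        HiveConf conf = new HiveConf();
        conf.set("javax.jdo.option.ConnectionURL",
                 "jdbc:derby:memory:metastore_db;create=true");
        conf.set("javax.jdo.option.ConnectionDriverName",
                 "org.apache.derby.jdbc.EmbeddedDriver");
        conf.set("hive.metastore.uris", ""); // embedded, no remote metastore service
        return conf;
    }
}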
Query Runner
• QueryRunner extends hive.cli.CliDriver (a sketch follows below)
• forces the use of MessagePackSerDe
• injects a QueryPlanCheck hook to prohibit SCRIPT operators
• replaces the stdout/stderr/query_result writers
• adds hooks to report query statistics
• The entry point for executing Hive queries in TD
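Only "extends hive.cli.CliDriver" and the responsibilities above come from the deck; the helper methods below are hypothetical names, so treat this as a minimal sketch of the shape, not the real TD code:

import org.apache.hadoop.hive.cli.CliDriver;

public class QueryRunner extends CliDriver {
    public static void main(String[] args) throws Exception {
        // Force MessagePackSerDe, register the QueryPlanCheck hook that
        // rejects SCRIPT operators, and swap the stdout/stderr/result
        // writers before running the query.
        installTreasureDataHooks();       // hypothetical helper
        int ret = new QueryRunner().run(args);
        reportQueryStatistics(ret);       // hypothetical helper
        System.exit(ret);
    }

    private static void installTreasureDataHooks() { /* ... */ }
    private static void reportQueryStatistics(int ret) { /* ... */ }
}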
Example: Hive job
env HADOOP_CLASSPATH=test.jar:td-hadoop-1.0.jar \
HADOOP_OPTS="-Xmx738m -Duser.name=221" \
hive --service jar td-hadoop-1.0.jar \
com.treasure_data.hadoop.hive.runner.QueryRunner \
-hiveconf td.jar.version= \
-hiveconf plazma.metadb.config={} \
-hiveconf plazma.storage.config={} \
-hiveconf td.worker.database.config={} \
-hiveconf mapreduce.job.priority=HIGH \
-hiveconf mapreduce.job.queuename=root.q221.high \
-hiveconf mapreduce.job.name=HiveJob379515 \
-hiveconf td.query.mergeThreshold=1333382400 \
-hiveconf td.query.apikey=12345 \
-hiveconf td.scheduled.time=1342449253 \
-hiveconf td.outdir=./jobs/379515 \
-hiveconf hive.metastore.warehouse.dir=/user/hive/221/warehouse \
-hiveconf hive.auto.convert.join.noconditionaltask=false \
-hiveconf hive.mapjoin.localtask.max.memory.usage=0.7 \
-hiveconf hive.mapjoin.smalltable.filesize=25000000 \
-hiveconf hive.resultset.use.unique.column.names=false \
-hiveconf hive.auto.convert.join=false \
-hiveconf hive.optimize.sort.dynamic.partition=false \
-hiveconf mapreduce.job.reduces=-1 \
-hiveconf hive.vectorized.execution.enabled=false \
-hiveconf mapreduce.job.ubertask.enable=true \
-hiveconf yarn.app.mapreduce.am.resource.mb=2048 \
-hiveconf mapreduce.job.ubertask.maxmaps=1 \
-hiveconf mapreduce.job.ubertask.maxreduces=1 \
-hiveconf mapreduce.job.ubertask.maxbytes=536870912 \
-hiveconf td.hive.insertInto.dynamic.partitioning=false \
-outdir ./jobs/379515
Schema & UDF Management
• UDF management:
• enables Treasure Data UDFs dynamically
• executes CREATE TEMPORARY FUNCTION statements before queries
• Schema on read:
• databases/tables come from PlazmaDB metadata
• schema definitions come from PlazmaDB metadata
• executes CREATE DATABASE/TABLE statements before queries
Example: Hive job (cont)
ADD JAR 'td-hadoop-1.0.jar';
CREATE DATABASE IF NOT EXISTS `db`;
USE `db`;
CREATE TABLE tagomoris (`v` MAP<STRING,STRING>, `time` INT)
STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler'
WITH SERDEPROPERTIES ('msgpack.columns.mapping'='*,time')
TBLPROPERTIES (
'td.storage.user'='221',
'td.storage.database'='dfc',
'td.storage.table'='users_20100604_080812_ce9203d0',
'td.storage.path'='221/dfc/users_20100604_080812_ce9203d0',
'td.table_id'='2',
'td.modifiable'='true',
'plazma.data_set.name'='221/dfc/users_20100604_080812_ce9203d0'
);
CREATE TABLE tbl1 (
`uid` INT,
`key` STRING,
`time` INT
)
STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler'
WITH SERDEPROPERTIES ('msgpack.columns.mapping'='uid,key,time')
TBLPROPERTIES (
'td.storage.user'='221',
'td.storage.database'='dfc',
'td.storage.table'='contests_20100606_120720_96abe81a',
'td.storage.path'='221/dfc/contests_20100606_120720_96abe81a',
'td.table_id'='4',
'td.modifiable'='true',
'plazma.data_set.name'='221/dfc/contests_20100606_120720_96abe81a'
);
USE `db`;
CREATE TEMPORARY FUNCTION MSGPACK_SERIALIZE AS
'com.treasure_data.hadoop.hive.udf.MessagePackSerialize';
CREATE TEMPORARY FUNCTION TD_TIME_RANGE AS
'com.treasure_data.hadoop.hive.udf.GenericUDFTimeRange';
CREATE TEMPORARY FUNCTION TD_TIME_ADD AS
'com.treasure_data.hadoop.hive.udf.UDFTimeAdd';
CREATE TEMPORARY FUNCTION TD_TIME_FORMAT AS
'com.treasure_data.hadoop.hive.udf.UDFTimeFormat';
CREATE TEMPORARY FUNCTION TD_TIME_PARSE AS
'com.treasure_data.hadoop.hive.udf.UDFTimeParse';
CREATE TEMPORARY FUNCTION TD_SCHEDULED_TIME AS
'com.treasure_data.hadoop.hive.udf.GenericUDFScheduledTime';
CREATE TEMPORARY FUNCTION TD_X_RANK AS
'com.treasure_data.hadoop.hive.udf.Rank';
CREATE TEMPORARY FUNCTION TD_FIRST AS
'com.treasure_data.hadoop.hive.udf.GenericUDAFFirst';
CREATE TEMPORARY FUNCTION TD_LAST AS
'com.treasure_data.hadoop.hive.udf.GenericUDAFLast';
CREATE TEMPORARY FUNCTION TD_SESSIONIZE AS
'com.treasure_data.hadoop.hive.udf.UDFSessionize';
CREATE TEMPORARY FUNCTION TD_PARSE_USER_AGENT AS
'com.treasure_data.hadoop.hive.udf.GenericUDFParseUserAgent';
CREATE TEMPORARY FUNCTION TD_HEX2NUM AS
'com.treasure_data.hadoop.hive.udf.UDFHex2num';
CREATE TEMPORARY FUNCTION TD_MD5 AS
'com.treasure_data.hadoop.hive.udf.UDFmd5';
CREATE TEMPORARY FUNCTION TD_RANK_SEQUENCE AS
'com.treasure_data.hadoop.hive.udf.UDFRankSequence';
CREATE TEMPORARY FUNCTION TD_STRING_EXPLODER AS
'com.treasure_data.hadoop.hive.udf.GenericUDTFStringExploder';
CREATE TEMPORARY FUNCTION TD_URL_DECODE AS
'com.treasure_data.hadoop.hive.udf.UDFUrlDecode';
CREATE TEMPORARY FUNCTION TD_DATE_TRUNC AS
'com.treasure_data.hadoop.hive.udf.UDFDateTrunc';
CREATE TEMPORARY FUNCTION TD_LAT_LONG_TO_COUNTRY AS
'com.treasure_data.hadoop.hive.udf.UDFLatLongToCountry';
CREATE TEMPORARY FUNCTION TD_SUBSTRING_INENCODING AS
'com.treasure_data.hadoop.hive.udf.GenericUDFSubstringInEncoding';
CREATE TEMPORARY FUNCTION TD_DIVIDE AS
'com.treasure_data.hadoop.hive.udf.GenericUDFDivide';
CREATE TEMPORARY FUNCTION TD_SUMIF AS
'com.treasure_data.hadoop.hive.udf.GenericUDAFSumIf';
CREATE TEMPORARY FUNCTION TD_AVGIF AS
'com.treasure_data.hadoop.hive.udf.GenericUDAFAvgIf';
CREATE TEMPORARY FUNCTION hivemall_version AS
'hivemall.HivemallVersionUDF';
CREATE TEMPORARY FUNCTION perceptron AS
'hivemall.classifier.PerceptronUDTF';
CREATE TEMPORARY FUNCTION train_perceptron AS
'hivemall.classifier.PerceptronUDTF';
CREATE TEMPORARY FUNCTION train_pa AS
'hivemall.classifier.PassiveAggressiveUDTF';
CREATE TEMPORARY FUNCTION train_pa1 AS
'hivemall.classifier.PassiveAggressiveUDTF';
CREATE TEMPORARY FUNCTION train_pa2 AS
'hivemall.classifier.PassiveAggressiveUDTF';
CREATE TEMPORARY FUNCTION train_cw AS
'hivemall.classifier.ConfidenceWeightedUDTF';
CREATE TEMPORARY FUNCTION train_arow AS
'hivemall.classifier.AROWClassifierUDTF';
CREATE TEMPORARY FUNCTION train_arowh AS
'hivemall.classifier.AROWClassifierUDTF';
CREATE TEMPORARY FUNCTION train_scw AS
'hivemall.classifier.SoftConfideceWeightedUDTF';
CREATE TEMPORARY FUNCTION train_scw2 AS
'hivemall.classifier.SoftConfideceWeightedUDTF';
CREATE TEMPORARY FUNCTION adagrad_rda AS
'hivemall.classifier.AdaGradRDAUDTF';
CREATE TEMPORARY FUNCTION train_adagrad_rda AS
'hivemall.classifier.AdaGradRDAUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_perceptron AS
'hivemall.classifier.multiclass.MulticlassPerceptronUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_pa AS
'hivemall.classifier.multiclass.MulticlassPassiveAggressiveUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_pa1 AS
'hivemall.classifier.multiclass.MulticlassPassiveAggressiveUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_pa2 AS
'hivemall.classifier.multiclass.MulticlassPassiveAggressiveUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_cw AS
'hivemall.classifier.multiclass.MulticlassConfidenceWeightedUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_arow AS
'hivemall.classifier.multiclass.MulticlassAROWClassifierUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_scw AS
'hivemall.classifier.multiclass.MulticlassSoftConfidenceWeightedUDTF';
CREATE TEMPORARY FUNCTION train_multiclass_scw2 AS
'hivemall.classifier.multiclass.MulticlassSoftConfidenceWeightedUDTF';
CREATE TEMPORARY FUNCTION cosine_similarity AS
'hivemall.knn.similarity.CosineSimilarityUDF';
CREATE TEMPORARY FUNCTION cosine_sim AS
'hivemall.knn.similarity.CosineSimilarityUDF';
CREATE TEMPORARY FUNCTION jaccard AS
'hivemall.knn.similarity.JaccardIndexUDF';
CREATE TEMPORARY FUNCTION jaccard_similarity AS
'hivemall.knn.similarity.JaccardIndexUDF';
CREATE TEMPORARY FUNCTION angular_similarity AS
'hivemall.knn.similarity.AngularSimilarityUDF';
CREATE TEMPORARY FUNCTION euclid_similarity AS
'hivemall.knn.similarity.EuclidSimilarity';
CREATE TEMPORARY FUNCTION distance2similarity AS
'hivemall.knn.similarity.Distance2SimilarityUDF';
CREATE TEMPORARY FUNCTION hamming_distance AS
'hivemall.knn.distance.HammingDistanceUDF';
CREATE TEMPORARY FUNCTION popcnt AS 'hivemall.knn.distance.PopcountUDF';
CREATE TEMPORARY FUNCTION kld AS 'hivemall.knn.distance.KLDivergenceUDF';
CREATE TEMPORARY FUNCTION euclid_distance AS
'hivemall.knn.distance.EuclidDistanceUDF';
CREATE TEMPORARY FUNCTION cosine_distance AS
'hivemall.knn.distance.CosineDistanceUDF';
CREATE TEMPORARY FUNCTION angular_distance AS
'hivemall.knn.distance.AngularDistanceUDF';
CREATE TEMPORARY FUNCTION jaccard_distance AS
'hivemall.knn.distance.JaccardDistanceUDF';
CREATE TEMPORARY FUNCTION manhattan_distance AS
'hivemall.knn.distance.ManhattanDistanceUDF';
CREATE TEMPORARY FUNCTION minkowski_distance AS
'hivemall.knn.distance.MinkowskiDistanceUDF';
CREATE TEMPORARY FUNCTION minhashes AS 'hivemall.knn.lsh.MinHashesUDF';
CREATE TEMPORARY FUNCTION minhash AS 'hivemall.knn.lsh.MinHashUDTF';
CREATE TEMPORARY FUNCTION bbit_minhash AS
'hivemall.knn.lsh.bBitMinHashUDF';
CREATE TEMPORARY FUNCTION voted_avg AS
'hivemall.ensemble.bagging.VotedAvgUDAF';
CREATE TEMPORARY FUNCTION weight_voted_avg AS
'hivemall.ensemble.bagging.WeightVotedAvgUDAF';
CREATE TEMPORARY FUNCTION wvoted_avg AS
'hivemall.ensemble.bagging.WeightVotedAvgUDAF';
CREATE TEMPORARY FUNCTION max_label AS
'hivemall.ensemble.MaxValueLabelUDAF';
CREATE TEMPORARY FUNCTION maxrow AS 'hivemall.ensemble.MaxRowUDAF';
CREATE TEMPORARY FUNCTION argmin_kld AS
'hivemall.ensemble.ArgminKLDistanceUDAF';
CREATE TEMPORARY FUNCTION mhash AS
'hivemall.ftvec.hashing.MurmurHash3UDF';
CREATE TEMPORARY FUNCTION sha1 AS 'hivemall.ftvec.hashing.Sha1UDF';
CREATE TEMPORARY FUNCTION array_hash_values AS
'hivemall.ftvec.hashing.ArrayHashValuesUDF';
CREATE TEMPORARY FUNCTION prefixed_hash_values AS
'hivemall.ftvec.hashing.ArrayPrefixedHashValuesUDF';
CREATE TEMPORARY FUNCTION polynomial_features AS
'hivemall.ftvec.pairing.PolynomialFeaturesUDF';
CREATE TEMPORARY FUNCTION powered_features AS
'hivemall.ftvec.pairing.PoweredFeaturesUDF';
CREATE TEMPORARY FUNCTION rescale AS 'hivemall.ftvec.scaling.RescaleUDF';
CREATE TEMPORARY FUNCTION rescale_fv AS
'hivemall.ftvec.scaling.RescaleUDF';
CREATE TEMPORARY FUNCTION zscore AS 'hivemall.ftvec.scaling.ZScoreUDF';
CREATE TEMPORARY FUNCTION normalize AS
'hivemall.ftvec.scaling.L2NormalizationUDF';
CREATE TEMPORARY FUNCTION conv2dense AS
'hivemall.ftvec.conv.ConvertToDenseModelUDAF';
CREATE TEMPORARY FUNCTION to_dense_features AS
'hivemall.ftvec.conv.ToDenseFeaturesUDF';
CREATE TEMPORARY FUNCTION to_dense AS
'hivemall.ftvec.conv.ToDenseFeaturesUDF';
CREATE TEMPORARY FUNCTION to_sparse_features AS
'hivemall.ftvec.conv.ToSparseFeaturesUDF';
CREATE TEMPORARY FUNCTION to_sparse AS
'hivemall.ftvec.conv.ToSparseFeaturesUDF';
CREATE TEMPORARY FUNCTION quantify AS
'hivemall.ftvec.conv.QuantifyColumnsUDTF';
CREATE TEMPORARY FUNCTION vectorize_features AS
'hivemall.ftvec.trans.VectorizeFeaturesUDF';
CREATE TEMPORARY FUNCTION categorical_features AS
'hivemall.ftvec.trans.CategoricalFeaturesUDF';
CREATE TEMPORARY FUNCTION indexed_features AS
'hivemall.ftvec.trans.IndexedFeatures';
CREATE TEMPORARY FUNCTION quantified_features AS
'hivemall.ftvec.trans.QuantifiedFeaturesUDTF';
CREATE TEMPORARY FUNCTION quantitative_features AS
'hivemall.ftvec.trans.QuantitativeFeaturesUDF';
CREATE TEMPORARY FUNCTION amplify AS
'hivemall.ftvec.amplify.AmplifierUDTF';
CREATE TEMPORARY FUNCTION rand_amplify AS
'hivemall.ftvec.amplify.RandomAmplifierUDTF';
CREATE TEMPORARY FUNCTION addBias AS 'hivemall.ftvec.AddBiasUDF';
CREATE TEMPORARY FUNCTION add_bias AS 'hivemall.ftvec.AddBiasUDF';
CREATE TEMPORARY FUNCTION sortByFeature AS
'hivemall.ftvec.SortByFeatureUDF';
CREATE TEMPORARY FUNCTION sort_by_feature AS
'hivemall.ftvec.SortByFeatureUDF';
CREATE TEMPORARY FUNCTION extract_feature AS
'hivemall.ftvec.ExtractFeatureUDF';
CREATE TEMPORARY FUNCTION extract_weight AS
'hivemall.ftvec.ExtractWeightUDF';
CREATE TEMPORARY FUNCTION add_feature_index AS
'hivemall.ftvec.AddFeatureIndexUDF';
CREATE TEMPORARY FUNCTION feature AS 'hivemall.ftvec.FeatureUDF';
CREATE TEMPORARY FUNCTION feature_index AS
'hivemall.ftvec.FeatureIndexUDF';
CREATE TEMPORARY FUNCTION tf AS 'hivemall.ftvec.text.TermFrequencyUDAF';
CREATE TEMPORARY FUNCTION train_logregr AS
'hivemall.regression.LogressUDTF';
CREATE TEMPORARY FUNCTION train_pa1_regr AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION train_pa1a_regr AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION train_pa2_regr AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION train_pa2a_regr AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION train_arow_regr AS
'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION train_arowe_regr AS
'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION train_arowe2_regr AS
'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION train_adagrad_regr AS
'hivemall.regression.AdaGradUDTF';
CREATE TEMPORARY FUNCTION train_adadelta_regr AS
'hivemall.regression.AdaDeltaUDTF';
CREATE TEMPORARY FUNCTION train_adagrad AS
'hivemall.regression.AdaGradUDTF';
CREATE TEMPORARY FUNCTION train_adadelta AS
'hivemall.regression.AdaDeltaUDTF';
CREATE TEMPORARY FUNCTION logress AS 'hivemall.regression.LogressUDTF';
CREATE TEMPORARY FUNCTION pa1_regress AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION pa1a_regress AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION pa2_regress AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION pa2a_regress AS
'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION arow_regress AS
'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION arowe_regress AS
'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION arowe2_regress AS
'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION adagrad AS 'hivemall.regression.AdaGradUDTF';
CREATE TEMPORARY FUNCTION adadelta AS 'hivemall.regression.AdaDeltaUDTF';
CREATE TEMPORARY FUNCTION float_array AS
'hivemall.tools.array.AllocFloatArrayUDF';
CREATE TEMPORARY FUNCTION array_remove AS
'hivemall.tools.array.ArrayRemoveUDF';
CREATE TEMPORARY FUNCTION sort_and_uniq_array AS
'hivemall.tools.array.SortAndUniqArrayUDF';
CREATE TEMPORARY FUNCTION subarray_endwith AS
'hivemall.tools.array.SubarrayEndWithUDF';
CREATE TEMPORARY FUNCTION subarray_startwith AS
'hivemall.tools.array.SubarrayStartWithUDF';
CREATE TEMPORARY FUNCTION collect_all AS
'hivemall.tools.array.CollectAllUDAF';
CREATE TEMPORARY FUNCTION concat_array AS
'hivemall.tools.array.ConcatArrayUDF';
CREATE TEMPORARY FUNCTION subarray AS 'hivemall.tools.array.SubarrayUDF';
CREATE TEMPORARY FUNCTION array_avg AS
'hivemall.tools.array.ArrayAvgGenericUDAF';
CREATE TEMPORARY FUNCTION array_sum AS
'hivemall.tools.array.ArraySumUDAF';
CREATE TEMPORARY FUNCTION to_string_array AS
'hivemall.tools.array.ToStringArrayUDF';
CREATE TEMPORARY FUNCTION map_get_sum AS
'hivemall.tools.map.MapGetSumUDF';
CREATE TEMPORARY FUNCTION map_tail_n AS 'hivemall.tools.map.MapTailNUDF';
CREATE TEMPORARY FUNCTION to_map AS 'hivemall.tools.map.UDAFToMap';
CREATE TEMPORARY FUNCTION to_ordered_map AS
'hivemall.tools.map.UDAFToOrderedMap';
CREATE TEMPORARY FUNCTION sigmoid AS
'hivemall.tools.math.SigmoidGenericUDF';
CREATE TEMPORARY FUNCTION taskid AS 'hivemall.tools.mapred.TaskIdUDF';
CREATE TEMPORARY FUNCTION jobid AS 'hivemall.tools.mapred.JobIdUDF';
CREATE TEMPORARY FUNCTION rowid AS 'hivemall.tools.mapred.RowIdUDF';
CREATE TEMPORARY FUNCTION generate_series AS
'hivemall.tools.GenerateSeriesUDTF';
CREATE TEMPORARY FUNCTION convert_label AS
'hivemall.tools.ConvertLabelUDF';
CREATE TEMPORARY FUNCTION x_rank AS 'hivemall.tools.RankSequenceUDF';
CREATE TEMPORARY FUNCTION each_top_k AS 'hivemall.tools.EachTopKUDTF';
CREATE TEMPORARY FUNCTION tokenize AS 'hivemall.tools.text.TokenizeUDF';
CREATE TEMPORARY FUNCTION is_stopword AS
'hivemall.tools.text.StopwordUDF';
CREATE TEMPORARY FUNCTION split_words AS
'hivemall.tools.text.SplitWordsUDF';
CREATE TEMPORARY FUNCTION normalize_unicode AS
'hivemall.tools.text.NormalizeUnicodeUDF';
CREATE TEMPORARY FUNCTION lr_datagen AS
'hivemall.dataset.LogisticRegressionDataGeneratorUDTF';
CREATE TEMPORARY FUNCTION f1score AS 'hivemall.evaluation.FMeasureUDAF';
CREATE TEMPORARY FUNCTION mae AS
'hivemall.evaluation.MeanAbsoluteErrorUDAF';
CREATE TEMPORARY FUNCTION mse AS
'hivemall.evaluation.MeanSquaredErrorUDAF';
CREATE TEMPORARY FUNCTION rmse AS
'hivemall.evaluation.RootMeanSquaredErrorUDAF';
CREATE TEMPORARY FUNCTION mf_predict AS 'hivemall.mf.MFPredictionUDF';
CREATE TEMPORARY FUNCTION train_mf_sgd AS
'hivemall.mf.MatrixFactorizationSGDUDTF';
CREATE TEMPORARY FUNCTION train_mf_adagrad AS
'hivemall.mf.MatrixFactorizationAdaGradUDTF';
CREATE TEMPORARY FUNCTION fm_predict AS
'hivemall.fm.FMPredictGenericUDAF';
CREATE TEMPORARY FUNCTION train_fm AS
'hivemall.fm.FactorizationMachineUDTF';
CREATE TEMPORARY FUNCTION train_randomforest_classifier AS
'hivemall.smile.classification.RandomForestClassifierUDTF';
CREATE TEMPORARY FUNCTION train_rf_classifier AS
'hivemall.smile.classification.RandomForestClassifierUDTF';
CREATE TEMPORARY FUNCTION train_randomforest_regr AS
'hivemall.smile.regression.RandomForestRegressionUDTF';
CREATE TEMPORARY FUNCTION train_rf_regr AS
'hivemall.smile.regression.RandomForestRegressionUDTF';
CREATE TEMPORARY FUNCTION tree_predict AS
'hivemall.smile.tools.TreePredictByStackMachineUDF';
CREATE TEMPORARY FUNCTION vm_tree_predict AS
'hivemall.smile.tools.TreePredictByStackMachineUDF';
CREATE TEMPORARY FUNCTION rf_ensemble AS
'hivemall.smile.tools.RandomForestEnsembleUDAF';
CREATE TEMPORARY FUNCTION train_gradient_boosting_classifier AS
'hivemall.smile.classification.GradientTreeBoostingClassifierUDTF';
CREATE TEMPORARY FUNCTION guess_attribute_types AS
'hivemall.smile.tools.GuessAttributesUDF';
CREATE TEMPORARY FUNCTION tokenize_ja AS
'hivemall.nlp.tokenizer.KuromojiUDF';
CREATE TEMPORARY MACRO max2(x DOUBLE, y DOUBLE) if(x>y,x,y);
CREATE TEMPORARY MACRO min2(x DOUBLE, y DOUBLE) if(x<y,x,y);
CREATE TEMPORARY MACRO rand_gid(k INT) floor(rand()*k);
CREATE TEMPORARY MACRO rand_gid2(k INT, seed INT) floor(rand(seed)*k);
CREATE TEMPORARY MACRO idf(df_t DOUBLE, n_docs DOUBLE) log(10, n_docs /
max2(1,df_t)) + 1.0;
CREATE TEMPORARY MACRO tfidf(tf FLOAT, df_t DOUBLE, n_docs DOUBLE) tf *
(log(10, n_docs / max2(1,df_t)) + 1.0);
SELECT time, COUNT(1) AS cnt FROM tbl1
WHERE TD_TIME_RANGE(time, '2015-12-11', '2015-12-12', 'JST');
After improvement :)
ADD JAR test.jar;
ADD JAR td-hadoop-1.0.jar;
CREATE DATABASE IF NOT EXISTS `dfc`;
USE `dfc`;
CREATE TABLE `contests` (`uid` INT, `key` STRING, `time` INT)
STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler'
WITH SERDEPROPERTIES ("msgpack.columns.mapping"="uid,key,time")
TBLPROPERTIES (
"td.storage.user"="221",
"td.storage.database"="dfc",
"td.storage.table"="contests_20100606_120720_96abe81a",
"td.storage.path"="221/dfc/contests_20100606_120720_96abe81a",
"td.table_id"="4",
"td.modifiable"="true",
"plazma.data_set.name"="221/dfc/contests_20100606_120720_96abe81a"
);
USE `dfc`;
CREATE TEMPORARY FUNCTION TD_TIME_RANGE AS 'com.treasure_data.hadoop.hive.udf.GenericUDFTimeRange';
CREATE TEMPORARY FUNCTION TD_TIME_ADD AS 'com.treasure_data.hadoop.hive.udf.UDFTimeAdd';
CREATE TEMPORARY FUNCTION TD_TIME_FORMAT AS 'com.treasure_data.hadoop.hive.udf.UDFTimeFormat';
CREATE TEMPORARY FUNCTION TD_TIME_PARSE AS 'com.treasure_data.hadoop.hive.udf.UDFTimeParse';
CREATE TEMPORARY FUNCTION TD_SCHEDULED_TIME AS 'com.treasure_data.hadoop.hive.udf.GenericUDFScheduledTime';
CREATE TEMPORARY MACRO max2(x DOUBLE, y DOUBLE) if(x>y,x,y);
CREATE TEMPORARY MACRO min2(x DOUBLE, y DOUBLE) if(x<y,x,y);
CREATE TEMPORARY MACRO rand_gid(k INT) floor(rand()*k);
CREATE TEMPORARY MACRO rand_gid2(k INT, seed INT) floor(rand(seed)*k);
CREATE TEMPORARY MACRO idf(df_t DOUBLE, n_docs DOUBLE) log(10, n_docs / max2(1,df_t)) + 1.0;
CREATE TEMPORARY MACRO tfidf(tf FLOAT, df_t DOUBLE, n_docs DOUBLE) tf * (log(10, n_docs / max2(1,df_t)) + 1.0);
SELECT `key`, COUNT(1) FROM contests WHERE `key` IS NOT NULL GROUP BY `key`;
Appendix: disabling unsafe UDFs
• An unusual case where we patch Hive itself for our own purposes
• java_method(), reflect()
• ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java (a sketch of the patch follows below)
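The patch itself is simple: drop the registrations of those functions. Roughly like this, though the exact registration lines vary across Hive versions, so treat it as illustrative:

// In ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java,
// comment out (or delete) the registrations of the unsafe UDFs:
//
//   registerGenericUDF("reflect", GenericUDFReflect.class);
//   registerGenericUDF("java_method", GenericUDFReflect.class);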
Logic flow in Hive processing
• CliDriver -> StorageHandler -> InputFormat -> SerDe
-> SemanticAnalyzer -> OutputFormat
• -> MapReduce Application (
• -> Mapper ( SerDe -> RecordReader -> .... )
• -> Shuffler
• -> Reducer ( ... -> RecordWriter )
• )
Hive -> MapReduce
[diagram: on the Hive side, StorageHandler, InputFormat, and SerDe; on the MapReduce side, SerDe and OutputFormat]
TDInputFormat
• It's just a Hadoop InputFormat
• TDStorageHandler specifies TDInputFormat as the InputFormat
• gets/builds splits
• provides the RecordReader
• overrides FS access to read data from PlazmaDB instead of HDFS (a structural sketch follows below)
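A structural sketch under the old mapred API; the key/value types and everything about PlazmaDB access are assumptions, with the bodies left as comments:

import java.io.IOException;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// Sketch only: key/value types and all PlazmaDB details are assumptions.
public class TDInputFormatSketch implements InputFormat<Text, MapWritable> {
    @Override
    public InputSplit[] getSplits(JobConf conf, int numSplits) throws IOException {
        // Build splits from PlazmaDB chunk metadata (by megabytes), not from
        // HDFS files. This is also where the numbers of maps/reduces get
        // computed and overwritten (see "Controlling Maps/Reduces").
        throw new UnsupportedOperationException("build splits from PlazmaDB metadata");
    }

    @Override
    public RecordReader<Text, MapWritable> getRecordReader(
            InputSplit split, JobConf conf, Reporter reporter) throws IOException {
        // Return a reader over mpc1 columnar chunks in PlazmaDB instead of HDFS.
        throw new UnsupportedOperationException("read mpc1 chunks from PlazmaDB");
    }
}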
Controlling Maps/Reduces
• We need to control maps/reduces with our own logic
• according to customers' price plans
• to optimize performance / cluster utilization
• Maps come from splits
• Hive: # of files (or splittable parts of files) on HDFS
• TD: # of megabytes, built from chunks in PlazmaDB
• calculated and overwritten in TDInputFormat
• Reduces
• from total input data size & other factors
• calculated and overwritten in TDInputFormat (a toy sizing sketch follows below)
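A toy sketch of the kind of sizing rule this implies, assuming a simple bytes-per-split / bytes-per-reducer model; the actual TD logic, including price-plan factors, is not in this deck:

// Toy example: size-based splits instead of file-based splits,
// and reduces derived from total input size.
public class SplitSizing {
    static int numMaps(long totalInputBytes, long splitSizeBytes) {
        // Splits are built by megabytes of PlazmaDB chunks, so maps are
        // roughly total input size / target split size (rounded up).
        return (int) Math.max(1, (totalInputBytes + splitSizeBytes - 1) / splitSizeBytes);
    }

    static int numReduces(long totalInputBytes, long bytesPerReducer, int maxReduces) {
        // Reduces scale with total input size, capped by plan/cluster limits.
        int n = (int) Math.max(1, totalInputBytes / bytesPerReducer);
        return Math.min(n, maxReduces);
    }

    public static void main(String[] args) {
        System.out.println(numMaps(10L << 30, 256L << 20));       // 40 maps for 10 GiB / 256 MiB
        System.out.println(numReduces(10L << 30, 1L << 30, 32));  // 10 reduces
    }
}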
Appendix: How to overwrite
configuration values dynamically
// from org.apache.hadoop.conf.Configuration:
@InterfaceAudience.Public
@InterfaceStability.Stable
public class Configuration implements Iterable<Map.Entry<String,String>>, Writable {
    /** Configuration objects */
    private static final WeakHashMap<Configuration,Object> REGISTRY =
        new WeakHashMap<Configuration,Object>();
    // ...
}

// from org.apache.hadoop.mapred.JobConf:
public void setNumReduceTasks(int n) { setInt(JobContext.NUM_REDUCES, n); }
The Configuration you get may be a copy of the original... and may not actually be used to build the MapReduce job.
Especially in InputFormat :-(
TDInputFormatUtils.trySetNumReduceTasks(conf, num);
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.WeakHashMap;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

@SuppressWarnings("unchecked")
public static List<JobConf> tryGetOriginalJobConfs(JobConf conf) {
    try {
        ArrayList<JobConf> list = new ArrayList<JobConf>();
        // Configuration.REGISTRY contains all copies of JobConf instances in the process.
        // Scan all copies and try to find the original ones to update.
        Field f = Configuration.class.getDeclaredField("REGISTRY");
        f.setAccessible(true);
        WeakHashMap<Configuration,Object> reg = (WeakHashMap<Configuration,Object>) f.get(null);
        for (Configuration c : reg.keySet()) {
            if (c instanceof JobConf) {
                JobConf jc = (JobConf) c;
                if (jc.getCredentials() == conf.getCredentials()) {
                    // sharing the same Credentials object means
                    // it is a cloned configuration
                    list.add(jc);
                }
            }
        }
        return list;
    } catch (Exception ex) {
        // ignore errors
    }
    return Collections.emptyList();
}

public static void trySetNumReduceTasks(JobConf conf, int num) {
    for (JobConf jc : tryGetOriginalJobConfs(conf)) {
        jc.setNumReduceTasks(num);
    }
}
Get all copies of Configuration by reflection, and
overwrite all copies with specified values :-)
Time Index Pushdown
• Scan only the needed data in a table
• faster processing, fewer computing resources
• Pushing the scan range down
• from the SemanticAnalyzer to the StorageHandler
• by injecting an IndexAnalyzer over the InputFormat
SELECT col1 FROM tablename
WHERE time > TD_SCHEDULED_TIME() - 86400 AND time < TD_SCHEDULED_TIME()
OR TD_TIME_RANGE(time, '2016-02-13 00:00:00 JST', '2016-02-14 00:00:00 JST')
IndexAnalyzer
• Called from TDInputFormat
• InputFormat can do everything :-)
• Analyzes the operator tree
• to create time ranges for each table
} else if (udf instanceof GenericUDFOPLessThan) {
    ExprNodeDesc left = node.getChildren().get(0);
    ExprNodeDesc right = node.getChildren().get(1);
    if (isTimeColumn(right)) {
        Long v = getLongConstant(left);
        if (v != null) { // VALUE < time
            return new TimeRange[] { new TimeRange(v + 1) };
        }
    } else if (isTimeColumn(left)) {
        Long v = getLongConstant(right);
        if (v != null) { // time < VALUE
            return new TimeRange[] { new TimeRange(0, v - 1) };
        }
    }
    return ALL_RANGES;
} else if (udf instanceof GenericUDFTimeRange) {
    // statically evaluate TD_TIME_RANGE(time, start[, end[, timezone]])
    if (node.getChildren().size() < 2 || node.getChildren().size() > 4) {
        return ALL_RANGES;
    }
    ExprNodeDesc arg0 = node.getChildren().get(0);
    ExprNodeDesc arg1 = node.getChildren().get(1);
    ExprNodeDesc arg2 = null;
    ExprNodeDesc arg3 = null;
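For reference, a minimal TimeRange consistent with how the snippet uses it (new TimeRange(v + 1) for a lower bound only, new TimeRange(0, v - 1) for a closed interval); the real class is TD-internal, so this is inferred:

public class TimeRange {
    // Inclusive unixtime bounds; Long.MAX_VALUE stands in for "no upper bound".
    final long start;
    final long end;

    TimeRange(long start) {           // time > VALUE  =>  [VALUE + 1, +inf)
        this(start, Long.MAX_VALUE);
    }

    TimeRange(long start, long end) { // time < VALUE  =>  [0, VALUE - 1]
        this.start = start;
        this.end = end;
    }
}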
Implementing INSERT INTO
• Hive wants to write data into HDFS
• for a normal INSERT INTO
• the FileFormat/serialization can be overridden, but the filesystem can't be
• Our tables live in PlazmaDB
• INSERT INTO queries must write data to Plazma
• and must handle the write-and-commit transaction (a sketch of that shape follows below)
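A sketch of the write-and-commit shape; the PlazmaClient interface and all of its methods are hypothetical names standing in for the real PlazmaDB client:

import java.io.IOException;

public class PlazmaCommitSketch {
    // Hypothetical stand-in for the real PlazmaDB client API.
    interface PlazmaClient {
        String beginUpload(String dataSet) throws IOException;
        void upload(String txn, byte[] mpc1Chunk) throws IOException;
        void commit(String txn) throws IOException;
        void abort(String txn) throws IOException;
    }

    static void insertPartition(PlazmaClient plazma, String dataSet,
                                byte[] mpc1Chunk) throws IOException {
        // Stage the data first and make it visible only on commit,
        // so a failed job leaves the table unchanged.
        String txn = plazma.beginUpload(dataSet);
        try {
            plazma.upload(txn, mpc1Chunk);
            plazma.commit(txn);
        } catch (IOException e) {
            plazma.abort(txn);
            throw e;
        }
    }
}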
TDHiveOutputFormat
• TDStorageHandler specifies it as the OutputFormat
• Replaces the ReduceSinkOperator in the operator tree
• to override inserts so data is written into PlazmaDB
• Called from TDInputFormat
[diagram: the original operator tree for INSERT INTO next to the modified one; the ReduceSink operator for INSERT INTO is replaced]
InputFormat can do everything!
<3
ReduceSinkOp w/ One-Hour Partitioning
• All data in a partition must be read together when needed
• Rows within a partition don't need to be sorted
• partitioned by "time % 3600"
• 1 reducer per partition
ExprNodeDesc hashExpr;
if (conf.getBoolean(TDConstants.TD_CK_HIVE_INSERTINTO_DYNAMIC_PARTITIONING, false)) {
    hashExpr = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo,
        new GenericPlazmaUnixtimeDataSetDynamicHashUDF(),
        args);
} else {
    hashExpr = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo,
        new GenericPlazmaUnixtimeDataSetKeyHashUDF(),
        args);
}
partnCols.add(hashExpr);

// add another MR job to the query plan to sort data by the hashExpr
Operator op = genReduceSinkPlanForSortingBucketing(analyzer,
    table, input, sortCols, sortOrders, partnCols, -1);
preventFileReduceSinkOptimizationHack(conf, analyzer, table, "time", 1);
return op;
Appendix: hack NOT to sort rows
• Rows in a partition do NOT have to be sorted
• All rows in a partition should be read at the same time
• There's no standard way NOT to sort rows
private static void preventFileReduceSinkOptimizationHack(Configuration conf,
        SemanticAnalyzer analyzer, Table dest_tab, String fakeSortCol, int fakeSortOrder)
        throws NoSuchFieldException, IllegalAccessException, HiveException {
    dest_tab.setSortCols(Arrays.asList(new Order[] {
        new Order(fakeSortCol, fakeSortOrder)
    }));
    conf.setBoolean("hive.enforce.sorting", true);
    // prevent BucketingSortingReduceSinkOptimizer from optimizing out the ReduceSinkOperator:
    // pretend this ReduceSinkOperator is a regular operator
    // so that BucketingSortingReduceSinkOptimizer.process doesn't optimize it out
    Field field = analyzer.getClass()
        .getDeclaredField("reduceSinkOperatorsAddedByEnforceBucketingSorting");
    field.setAccessible(true);
    @SuppressWarnings("unchecked")
    List<ReduceSinkOperator> list = (List<ReduceSinkOperator>) field.get(analyzer);
    list.clear();
}
Optimizing INSERT INTO
• the 1-reducer-per-partition model
• works well in many cases :-)
• but doesn't work well for massively large data within a single hour
INSERT INTO TABLE destination
SELECT * FROM sourcetable
WHERE TD_TIME_RANGE(time,'2016-02-13 15:00:00','2016-02-13 16:00:00','JST')
• one reducer takes a very long time in such cases
• while the other reducers finish immediately
• We want to distribute!
Basics of Shuffle/Reduce
• Shuffle globally sorts rows before the reducers
• Reducer operators assume:
• all rows arrive sorted
• an out-of-order row marks a partition boundary
[diagram: rows from several maps go through shuffle (global sort) into many partitions of sorted rows; the order of the partitions themselves is not sorted]
INSERT INTO w/ few partitions, massive data
• We know these at planning time:
• time range (# of partitions): IndexAnalyzer
• data size: InputFormat.getSplits
• # of reducers: InputFormat.getSplits
[diagram: rows from several maps go through shuffle (global sort) into a single partition of non-sorted rows]
Distribute (virtual) partitions dynamically
• PlazmaDB-level partitions are managed by the StorageHandler (and the PlazmaDB client)
• MR-level partitioning doesn't need to match PlazmaDB partitioning
• How do we distribute one PlazmaDB partition across many reducers?
/*
 * Calculate the partitioning size (== 3600 / F), where F is the largest number such that:
 *  - F is a factor of 3600 (3600 % F == 0; e.g. 1,2,3,4,5,6,8,10,12,15,18,20,24,30,36,40, ...)
 *  - F * H <= reduces
 */
static int calculateFactor(int reduces, long hours) {
    if (reduces <= hours) {
        return 1;
    }
    long factor = reduces / hours;
    while (factor >= 2) {
        if (3600 % factor == 0)
            break;
        factor -= 1;
    }
    return (int) factor;
}

static int calculatePartitioningSize(int reduces, long hours) {
    return 3600 / calculateFactor(reduces, hours);
}
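A worked example of this rule: with 24 reducers over a 3-hour range, F = 24 / 3 = 8 (and 3600 % 8 == 0), so each 1-hour partition is split into 8 sub-partitions of 450 seconds. A standalone copy for demonstration:

public class FactorExample {
    // Same logic as above, copied here so the example runs on its own.
    static int calculateFactor(int reduces, long hours) {
        if (reduces <= hours) {
            return 1;
        }
        long factor = reduces / hours;
        while (factor >= 2) {
            if (3600 % factor == 0) break;
            factor -= 1;
        }
        return (int) factor;
    }

    static int calculatePartitioningSize(int reduces, long hours) {
        return 3600 / calculateFactor(reduces, hours);
    }

    public static void main(String[] args) {
        // 24 reducers over 3 hours: F = 8, each hour splits into 450-second parts.
        System.out.println(calculateFactor(24, 3));            // 8
        System.out.println(calculatePartitioningSize(24, 3));  // 450
        // 10 reducers over 3 hours: F = 3 (10 / 3 = 3, and 3600 % 3 == 0).
        System.out.println(calculatePartitioningSize(10, 3));  // 1200
        // Fewer reducers than hours: no extra distribution.
        System.out.println(calculatePartitioningSize(7, 10));  // 3600
    }
}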
[diagram: rows from maps go through shuffle w/ dynamic partitioning into several hive partitions, one reducer each, all belonging to a single PlazmaDB 1-hour partition]
public void configure(MapredContext context) {
    JobConf conf = context.getJobConf();
    this.partitioningSize = DEFAULT_PARTITIONING_SIZE;
    int reduces = conf.getInt(MRJobConfig.NUM_REDUCES, 0);
    if (reduces < 1) {
        // dynamic partitioning requires the number of reduces, because reducers
        // generate too many files if 2 or more partitions arrive at a reduce task
        return;
    }
    int distributionFactor = conf.getInt(TDConstants.TD_CK_HIVE_INSERTINTO_DISTRIBUTION_FACTOR, 0);
    if (distributionFactor > 0) {
        if (distributionFactor <= reduces && 3600 % distributionFactor == 0) {
            this.partitioningSize = 3600 / distributionFactor;
            return;
        }
        // a distribution factor larger than reduces, or not a factor of 3600,
        // splits output into too many/small files;
        // such values are ignored, and the default rule is used instead
    }
    long splits = conf.getLong(TDConstants.TD_CK_QUERY_SPLIT_NUMBER, 0);
    long hours = conf.getLong(TDConstants.TD_CK_QUERY_TIME_RANGE_HOURS, 0);
    if (splits < 1 || hours < 1) {
        return; // use the default size if TDInputFormat fails to set these values
    }
    if (splits < MIN_SPLITS_TO_DISTRIBUTE || hours > MAX_HOURS_TO_DISTRIBUTE) {
        // input data is too small, or already has enough time partitions
        // to distribute under the default rule
        return;
    }
    if (reduces < hours * 2) {
        // not enough reduces to distribute time-based partitions
        return;
    }
    this.partitioningSize = calculatePartitioningSize(reduces, hours);
}
What's important is that
IT WORKS!
One day, @frsyuki said:
We'll improve our code step by step,
along with improvements of OSS and its developer
community <3
Thanks!
More Related Content

Hive dirty/beautiful hacks in TD

  • 1. Hive Dirty/Beautiful Hacks in Treasure Data "Rejected" Hadoop Conference Japan 2016 Feb 13, 2016 Satoshi "Moris" Tagomori (@tagomoris)
  • 2. Satoshi "Moris" Tagomori (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
  • 5. What I'll talk about today • Hive query execution & deployment in TD • Query runner • UDF & Schema management • InputFormat • Controlling maps/reduces • Time index pushdown • Implementing INSERT INTO • Optimizing INSERT INTO
  • 7. Hive query execution in TD PlazmaDB Worker Hadoop Cluster Hive CLI MR App MR App MR App metastore ※ all modified code is here ※ almost non-modified hadoop cluster
  • 9. PlazmaDB Worker Hadoop Cluster Hive CLI MR App MR App MR App Hive CLI Hive deployment
  • 10. PlazmaDB Worker Hadoop Cluster MR App MR App MR App Hive CLI Hive deployment
  • 11. Blue-green deployment for Hadoop clusters PlazmaDB Worker Hadoop Cluster MR App Hive CLI
  • 12. PlazmaDB Worker Hadoop Cluster MR App Hive CLI Hadoop Cluster Blue-green deployment for Hadoop clusters
  • 13. PlazmaDB Worker Hadoop Cluster MR App Hive CLI Hadoop Cluster Hive CLI MR App Blue-green deployment for Hadoop clusters
  • 14. PlazmaDB Worker Hadoop Cluster Hadoop Cluster Hive CLI MR App Blue-green deployment for Hadoop clusters
  • 15. PlazmaDB Worker Hadoop Cluster Hive CLI MR App Blue-green deployment for Hadoop clusters
  • 16. • Hive CLI • Worker code (ruby) build command line options with java properties • Using in-memory disposable metastore • http://www.slideshare.net/lewuathe/maintainable-cloud-architectureofhadoop by @Lewuathe • PlazmaDB • Time indexed database (hourly partition) • mpc1 columnar format files + schema-on-read • http://www.slideshare.net/treasure-data/td-techplazma by @frsyuki Hive query execution in TD
  • 17. Query Runner • QueryRunner extends hive.cli.CliDriver • specify to use MessagePackSerDe forcedly • inject hook to QueryPlanCheck to prohibit SCRIPT operators • replace stdout/stderr/query_result writers • add hooks to report query statistics • Entry point to execute hive queries in TD
  • 18. Example: Hive job env HADOOP_CLASSPATH=test.jar:td-hadoop-1.0.jar HADOOP_OPTS="-Xmx738m -Duser.name=221" hive --service jar td-hadoop-1.0.jar com.treasure_data.hadoop.hive.runner.QueryRunner -hiveconf td.jar.version= -hiveconf plazma.metadb.config={} -hiveconf plazma.storage.config={} -hiveconf td.worker.database.config={} -hiveconf mapreduce.job.priority=HIGH -hiveconf mapreduce.job.queuename=root.q221.high -hiveconf mapreduce.job.name=HiveJob379515 -hiveconf td.query.mergeThreshold=1333382400 -hiveconf td.query.apikey=12345 -hiveconf td.scheduled.time=1342449253 -hiveconf td.outdir=./jobs/379515 -hiveconf hive.metastore.warehouse.dir=/user/hive/221/warehouse -hiveconf hive.auto.convert.join.noconditionaltask=false -hiveconf hive.mapjoin.localtask.max.memory.usage=0.7 -hiveconf hive.mapjoin.smalltable.filesize=25000000 -hiveconf hive.resultset.use.unique.column.names=false -hiveconf hive.auto.convert.join=false -hiveconf hive.optimize.sort.dynamic.partition=false -hiveconf mapreduce.job.reduces=-1 -hiveconf hive.vectorized.execution.enabled=false -hiveconf mapreduce.job.ubertask.enable=true -hiveconf yarn.app.mapreduce.am.resource.mb=2048
  • 19. env HADOOP_CLASSPATH=test.jar:td-hadoop-1.0.jar HADOOP_OPTS="-Xmx738m -Duser.name=221" hive --service jar td-hadoop-1.0.jar com.treasure_data.hadoop.hive.runner.QueryRunner -hiveconf td.jar.version= -hiveconf plazma.metadb.config={} -hiveconf plazma.storage.config={} -hiveconf td.worker.database.config={} -hiveconf mapreduce.job.priority=HIGH -hiveconf mapreduce.job.queuename=root.q221.high -hiveconf mapreduce.job.name=HiveJob379515 -hiveconf td.query.mergeThreshold=1333382400 -hiveconf td.query.apikey=12345 -hiveconf td.scheduled.time=1342449253 -hiveconf td.outdir=./jobs/379515 -hiveconf hive.metastore.warehouse.dir=/user/hive/221/warehouse -hiveconf hive.auto.convert.join.noconditionaltask=false -hiveconf hive.mapjoin.localtask.max.memory.usage=0.7 -hiveconf hive.mapjoin.smalltable.filesize=25000000 -hiveconf hive.resultset.use.unique.column.names=false -hiveconf hive.auto.convert.join=false -hiveconf hive.optimize.sort.dynamic.partition=false -hiveconf mapreduce.job.reduces=-1 -hiveconf hive.vectorized.execution.enabled=false -hiveconf mapreduce.job.ubertask.enable=true -hiveconf yarn.app.mapreduce.am.resource.mb=2048 -hiveconf mapreduce.job.ubertask.maxmaps=1 -hiveconf mapreduce.job.ubertask.maxreduces=1 -hiveconf mapreduce.job.ubertask.maxbytes=536870912 -hiveconf td.hive.insertInto.dynamic.partitioning=false -outdir ./jobs/379515
  • 20. Schema & UDF Management • UDF management: • enable Treasure Data UDFs dynamically • execute CREATE TEMPORARY FUNCTION before queries • Schema on read: • databases/tables from Plazmadb metadata • schema definition from Plazmadb metadata • execute CREATE DATABASE/TABLE before queries
  • 21. Example: Hive job (cont) ADD JAR 'td-hadoop-1.0.jar'; CREATE DATABASE IF NOT EXISTS `db`; USE `db`; CREATE TABLE tagomoris (`v` MAP<STRING,STRING>, `time` INT) STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler' WITH SERDEPROPERTIES ('msgpack.columns.mapping'='*,time') TBLPROPERTIES ( 'td.storage.user'='221', 'td.storage.database'='dfc', 'td.storage.table'='users_20100604_080812_ce9203d0', 'td.storage.path'='221/dfc/users_20100604_080812_ce9203d0', 'td.table_id'='2', 'td.modifiable'='true', 'plazma.data_set.name'='221/dfc/users_20100604_080812_ce9203d0' ); CREATE TABLE tbl1 ( `uid` INT, `key` STRING, `time` INT ) STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler' WITH SERDEPROPERTIES ('msgpack.columns.mapping'='uid,key,time') TBLPROPERTIES ( 'td.storage.user'='221', 'td.storage.database'='dfc',
  • 22. ADD JAR 'td-hadoop-1.0.jar'; CREATE DATABASE IF NOT EXISTS `db`; USE `db`; CREATE TABLE tagomoris (`v` MAP<STRING,STRING>, `time` INT) STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler' WITH SERDEPROPERTIES ('msgpack.columns.mapping'='*,time') TBLPROPERTIES ( 'td.storage.user'='221', 'td.storage.database'='dfc', 'td.storage.table'='users_20100604_080812_ce9203d0', 'td.storage.path'='221/dfc/users_20100604_080812_ce9203d0', 'td.table_id'='2', 'td.modifiable'='true', 'plazma.data_set.name'='221/dfc/users_20100604_080812_ce9203d0' ); CREATE TABLE tbl1 ( `uid` INT, `key` STRING, `time` INT ) STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler' WITH SERDEPROPERTIES ('msgpack.columns.mapping'='uid,key,time') TBLPROPERTIES ( 'td.storage.user'='221', 'td.storage.database'='dfc', 'td.storage.table'='contests_20100606_120720_96abe81a', 'td.storage.path'='221/dfc/contests_20100606_120720_96abe81a', 'td.table_id'='4', 'td.modifiable'='true', 'plazma.data_set.name'='221/dfc/contests_20100606_120720_96abe81a' ); USE `db`;
  • 23. USE `db`; CREATE TEMPORARY FUNCTION MSGPACK_SERIALIZE AS 'com.treasure_data.hadoop.hive.udf.MessagePackSerialize'; CREATE TEMPORARY FUNCTION TD_TIME_RANGE AS 'com.treasure_data.hadoop.hive.udf.GenericUDFTimeRange'; CREATE TEMPORARY FUNCTION TD_TIME_ADD AS 'com.treasure_data.hadoop.hive.udf.UDFTimeAdd'; CREATE TEMPORARY FUNCTION TD_TIME_FORMAT AS 'com.treasure_data.hadoop.hive.udf.UDFTimeFormat'; CREATE TEMPORARY FUNCTION TD_TIME_PARSE AS 'com.treasure_data.hadoop.hive.udf.UDFTimeParse'; CREATE TEMPORARY FUNCTION TD_SCHEDULED_TIME AS 'com.treasure_data.hadoop.hive.udf.GenericUDFScheduledTime'; CREATE TEMPORARY FUNCTION TD_X_RANK AS 'com.treasure_data.hadoop.hive.udf.Rank'; CREATE TEMPORARY FUNCTION TD_FIRST AS 'com.treasure_data.hadoop.hive.udf.GenericUDAFFirst'; CREATE TEMPORARY FUNCTION TD_LAST AS 'com.treasure_data.hadoop.hive.udf.GenericUDAFLast'; CREATE TEMPORARY FUNCTION TD_SESSIONIZE AS 'com.treasure_data.hadoop.hive.udf.UDFSessionize'; CREATE TEMPORARY FUNCTION TD_PARSE_USER_AGENT AS 'com.treasure_data.hadoop.hive.udf.GenericUDFParseUserAgent'; CREATE TEMPORARY FUNCTION TD_HEX2NUM AS 'com.treasure_data.hadoop.hive.udf.UDFHex2num'; CREATE TEMPORARY FUNCTION TD_MD5 AS 'com.treasure_data.hadoop.hive.udf.UDFmd5'; CREATE TEMPORARY FUNCTION TD_RANK_SEQUENCE AS 'com.treasure_data.hadoop.hive.udf.UDFRankSequence'; CREATE TEMPORARY FUNCTION TD_STRING_EXPLODER AS 'com.treasure_data.hadoop.hive.udf.GenericUDTFStringExploder'; CREATE TEMPORARY FUNCTION TD_URL_DECODE AS
  • 24. CREATE TEMPORARY FUNCTION TD_URL_DECODE AS 'com.treasure_data.hadoop.hive.udf.UDFUrlDecode'; CREATE TEMPORARY FUNCTION TD_DATE_TRUNC AS 'com.treasure_data.hadoop.hive.udf.UDFDateTrunc'; CREATE TEMPORARY FUNCTION TD_LAT_LONG_TO_COUNTRY AS 'com.treasure_data.hadoop.hive.udf.UDFLatLongToCountry'; CREATE TEMPORARY FUNCTION TD_SUBSTRING_INENCODING AS 'com.treasure_data.hadoop.hive.udf.GenericUDFSubstringInEncoding'; CREATE TEMPORARY FUNCTION TD_DIVIDE AS 'com.treasure_data.hadoop.hive.udf.GenericUDFDivide'; CREATE TEMPORARY FUNCTION TD_SUMIF AS 'com.treasure_data.hadoop.hive.udf.GenericUDAFSumIf'; CREATE TEMPORARY FUNCTION TD_AVGIF AS 'com.treasure_data.hadoop.hive.udf.GenericUDAFAvgIf'; CREATE TEMPORARY FUNCTION hivemall_version AS 'hivemall.HivemallVersionUDF'; CREATE TEMPORARY FUNCTION perceptron AS 'hivemall.classifier.PerceptronUDTF'; CREATE TEMPORARY FUNCTION train_perceptron AS 'hivemall.classifier.PerceptronUDTF'; CREATE TEMPORARY FUNCTION train_pa AS 'hivemall.classifier.PassiveAggressiveUDTF'; CREATE TEMPORARY FUNCTION train_pa1 AS 'hivemall.classifier.PassiveAggressiveUDTF'; CREATE TEMPORARY FUNCTION train_pa2 AS 'hivemall.classifier.PassiveAggressiveUDTF'; CREATE TEMPORARY FUNCTION train_cw AS 'hivemall.classifier.ConfidenceWeightedUDTF'; CREATE TEMPORARY FUNCTION train_arow AS 'hivemall.classifier.AROWClassifierUDTF'; CREATE TEMPORARY FUNCTION train_arowh AS 'hivemall.classifier.AROWClassifierUDTF';
  • 25. CREATE TEMPORARY FUNCTION train_arowh AS 'hivemall.classifier.AROWClassifierUDTF'; CREATE TEMPORARY FUNCTION train_scw AS 'hivemall.classifier.SoftConfideceWeightedUDTF'; CREATE TEMPORARY FUNCTION train_scw2 AS 'hivemall.classifier.SoftConfideceWeightedUDTF'; CREATE TEMPORARY FUNCTION adagrad_rda AS 'hivemall.classifier.AdaGradRDAUDTF'; CREATE TEMPORARY FUNCTION train_adagrad_rda AS 'hivemall.classifier.AdaGradRDAUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_perceptron AS 'hivemall.classifier.multiclass.MulticlassPerceptronUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_pa AS 'hivemall.classifier.multiclass.MulticlassPassiveAggressiveUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_pa1 AS 'hivemall.classifier.multiclass.MulticlassPassiveAggressiveUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_pa2 AS 'hivemall.classifier.multiclass.MulticlassPassiveAggressiveUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_cw AS 'hivemall.classifier.multiclass.MulticlassConfidenceWeightedUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_arow AS 'hivemall.classifier.multiclass.MulticlassAROWClassifierUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_scw AS 'hivemall.classifier.multiclass.MulticlassSoftConfidenceWeightedUDTF'; CREATE TEMPORARY FUNCTION train_multiclass_scw2 AS 'hivemall.classifier.multiclass.MulticlassSoftConfidenceWeightedUDTF'; CREATE TEMPORARY FUNCTION cosine_similarity AS 'hivemall.knn.similarity.CosineSimilarityUDF'; CREATE TEMPORARY FUNCTION cosine_sim AS 'hivemall.knn.similarity.CosineSimilarityUDF'; CREATE TEMPORARY FUNCTION jaccard AS 'hivemall.knn.similarity.JaccardIndexUDF';
  • 26. CREATE TEMPORARY FUNCTION jaccard AS 'hivemall.knn.similarity.JaccardIndexUDF'; CREATE TEMPORARY FUNCTION jaccard_similarity AS 'hivemall.knn.similarity.JaccardIndexUDF'; CREATE TEMPORARY FUNCTION angular_similarity AS 'hivemall.knn.similarity.AngularSimilarityUDF'; CREATE TEMPORARY FUNCTION euclid_similarity AS 'hivemall.knn.similarity.EuclidSimilarity'; CREATE TEMPORARY FUNCTION distance2similarity AS 'hivemall.knn.similarity.Distance2SimilarityUDF'; CREATE TEMPORARY FUNCTION hamming_distance AS 'hivemall.knn.distance.HammingDistanceUDF'; CREATE TEMPORARY FUNCTION popcnt AS 'hivemall.knn.distance.PopcountUDF'; CREATE TEMPORARY FUNCTION kld AS 'hivemall.knn.distance.KLDivergenceUDF'; CREATE TEMPORARY FUNCTION euclid_distance AS 'hivemall.knn.distance.EuclidDistanceUDF'; CREATE TEMPORARY FUNCTION cosine_distance AS 'hivemall.knn.distance.CosineDistanceUDF'; CREATE TEMPORARY FUNCTION angular_distance AS 'hivemall.knn.distance.AngularDistanceUDF'; CREATE TEMPORARY FUNCTION jaccard_distance AS 'hivemall.knn.distance.JaccardDistanceUDF'; CREATE TEMPORARY FUNCTION manhattan_distance AS 'hivemall.knn.distance.ManhattanDistanceUDF'; CREATE TEMPORARY FUNCTION minkowski_distance AS 'hivemall.knn.distance.MinkowskiDistanceUDF'; CREATE TEMPORARY FUNCTION minhashes AS 'hivemall.knn.lsh.MinHashesUDF'; CREATE TEMPORARY FUNCTION minhash AS 'hivemall.knn.lsh.MinHashUDTF'; CREATE TEMPORARY FUNCTION bbit_minhash AS 'hivemall.knn.lsh.bBitMinHashUDF'; CREATE TEMPORARY FUNCTION voted_avg AS 'hivemall.ensemble.bagging.VotedAvgUDAF';
  • 27. CREATE TEMPORARY FUNCTION voted_avg AS 'hivemall.ensemble.bagging.VotedAvgUDAF'; CREATE TEMPORARY FUNCTION weight_voted_avg AS 'hivemall.ensemble.bagging.WeightVotedAvgUDAF'; CREATE TEMPORARY FUNCTION wvoted_avg AS 'hivemall.ensemble.bagging.WeightVotedAvgUDAF'; CREATE TEMPORARY FUNCTION max_label AS 'hivemall.ensemble.MaxValueLabelUDAF'; CREATE TEMPORARY FUNCTION maxrow AS 'hivemall.ensemble.MaxRowUDAF'; CREATE TEMPORARY FUNCTION argmin_kld AS 'hivemall.ensemble.ArgminKLDistanceUDAF'; CREATE TEMPORARY FUNCTION mhash AS 'hivemall.ftvec.hashing.MurmurHash3UDF'; CREATE TEMPORARY FUNCTION sha1 AS 'hivemall.ftvec.hashing.Sha1UDF'; CREATE TEMPORARY FUNCTION array_hash_values AS 'hivemall.ftvec.hashing.ArrayHashValuesUDF'; CREATE TEMPORARY FUNCTION prefixed_hash_values AS 'hivemall.ftvec.hashing.ArrayPrefixedHashValuesUDF'; CREATE TEMPORARY FUNCTION polynomial_features AS 'hivemall.ftvec.pairing.PolynomialFeaturesUDF'; CREATE TEMPORARY FUNCTION powered_features AS 'hivemall.ftvec.pairing.PoweredFeaturesUDF'; CREATE TEMPORARY FUNCTION rescale AS 'hivemall.ftvec.scaling.RescaleUDF'; CREATE TEMPORARY FUNCTION rescale_fv AS 'hivemall.ftvec.scaling.RescaleUDF'; CREATE TEMPORARY FUNCTION zscore AS 'hivemall.ftvec.scaling.ZScoreUDF'; CREATE TEMPORARY FUNCTION normalize AS 'hivemall.ftvec.scaling.L2NormalizationUDF'; CREATE TEMPORARY FUNCTION conv2dense AS 'hivemall.ftvec.conv.ConvertToDenseModelUDAF'; CREATE TEMPORARY FUNCTION to_dense_features AS 'hivemall.ftvec.conv.ToDenseFeaturesUDF';
  • 28. CREATE TEMPORARY FUNCTION to_dense_features AS 'hivemall.ftvec.conv.ToDenseFeaturesUDF'; CREATE TEMPORARY FUNCTION to_dense AS 'hivemall.ftvec.conv.ToDenseFeaturesUDF'; CREATE TEMPORARY FUNCTION to_sparse_features AS 'hivemall.ftvec.conv.ToSparseFeaturesUDF'; CREATE TEMPORARY FUNCTION to_sparse AS 'hivemall.ftvec.conv.ToSparseFeaturesUDF'; CREATE TEMPORARY FUNCTION quantify AS 'hivemall.ftvec.conv.QuantifyColumnsUDTF'; CREATE TEMPORARY FUNCTION vectorize_features AS 'hivemall.ftvec.trans.VectorizeFeaturesUDF'; CREATE TEMPORARY FUNCTION categorical_features AS 'hivemall.ftvec.trans.CategoricalFeaturesUDF'; CREATE TEMPORARY FUNCTION indexed_features AS 'hivemall.ftvec.trans.IndexedFeatures'; CREATE TEMPORARY FUNCTION quantified_features AS 'hivemall.ftvec.trans.QuantifiedFeaturesUDTF'; CREATE TEMPORARY FUNCTION quantitative_features AS 'hivemall.ftvec.trans.QuantitativeFeaturesUDF'; CREATE TEMPORARY FUNCTION amplify AS 'hivemall.ftvec.amplify.AmplifierUDTF'; CREATE TEMPORARY FUNCTION rand_amplify AS 'hivemall.ftvec.amplify.RandomAmplifierUDTF'; CREATE TEMPORARY FUNCTION addBias AS 'hivemall.ftvec.AddBiasUDF'; CREATE TEMPORARY FUNCTION add_bias AS 'hivemall.ftvec.AddBiasUDF'; CREATE TEMPORARY FUNCTION sortByFeature AS 'hivemall.ftvec.SortByFeatureUDF'; CREATE TEMPORARY FUNCTION sort_by_feature AS 'hivemall.ftvec.SortByFeatureUDF'; CREATE TEMPORARY FUNCTION extract_feature AS 'hivemall.ftvec.ExtractFeatureUDF';
  • 29. CREATE TEMPORARY FUNCTION extract_feature AS 'hivemall.ftvec.ExtractFeatureUDF'; CREATE TEMPORARY FUNCTION extract_weight AS 'hivemall.ftvec.ExtractWeightUDF'; CREATE TEMPORARY FUNCTION add_feature_index AS 'hivemall.ftvec.AddFeatureIndexUDF'; CREATE TEMPORARY FUNCTION feature AS 'hivemall.ftvec.FeatureUDF'; CREATE TEMPORARY FUNCTION feature_index AS 'hivemall.ftvec.FeatureIndexUDF'; CREATE TEMPORARY FUNCTION tf AS 'hivemall.ftvec.text.TermFrequencyUDAF'; CREATE TEMPORARY FUNCTION train_logregr AS 'hivemall.regression.LogressUDTF'; CREATE TEMPORARY FUNCTION train_pa1_regr AS 'hivemall.regression.PassiveAggressiveRegressionUDTF'; CREATE TEMPORARY FUNCTION train_pa1a_regr AS 'hivemall.regression.PassiveAggressiveRegressionUDTF'; CREATE TEMPORARY FUNCTION train_pa2_regr AS 'hivemall.regression.PassiveAggressiveRegressionUDTF'; CREATE TEMPORARY FUNCTION train_pa2a_regr AS 'hivemall.regression.PassiveAggressiveRegressionUDTF'; CREATE TEMPORARY FUNCTION train_arow_regr AS 'hivemall.regression.AROWRegressionUDTF'; CREATE TEMPORARY FUNCTION train_arowe_regr AS 'hivemall.regression.AROWRegressionUDTF'; CREATE TEMPORARY FUNCTION train_arowe2_regr AS 'hivemall.regression.AROWRegressionUDTF'; CREATE TEMPORARY FUNCTION train_adagrad_regr AS 'hivemall.regression.AdaGradUDTF'; CREATE TEMPORARY FUNCTION train_adadelta_regr AS 'hivemall.regression.AdaDeltaUDTF'; CREATE TEMPORARY FUNCTION train_adagrad AS 'hivemall.regression.AdaGradUDTF';
• 30. CREATE TEMPORARY FUNCTION train_adagrad AS 'hivemall.regression.AdaGradUDTF';
CREATE TEMPORARY FUNCTION train_adadelta AS 'hivemall.regression.AdaDeltaUDTF';
CREATE TEMPORARY FUNCTION logress AS 'hivemall.regression.LogressUDTF';
CREATE TEMPORARY FUNCTION pa1_regress AS 'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION pa1a_regress AS 'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION pa2_regress AS 'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION pa2a_regress AS 'hivemall.regression.PassiveAggressiveRegressionUDTF';
CREATE TEMPORARY FUNCTION arow_regress AS 'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION arowe_regress AS 'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION arowe2_regress AS 'hivemall.regression.AROWRegressionUDTF';
CREATE TEMPORARY FUNCTION adagrad AS 'hivemall.regression.AdaGradUDTF';
CREATE TEMPORARY FUNCTION adadelta AS 'hivemall.regression.AdaDeltaUDTF';
CREATE TEMPORARY FUNCTION float_array AS 'hivemall.tools.array.AllocFloatArrayUDF';
CREATE TEMPORARY FUNCTION array_remove AS 'hivemall.tools.array.ArrayRemoveUDF';
CREATE TEMPORARY FUNCTION sort_and_uniq_array AS 'hivemall.tools.array.SortAndUniqArrayUDF';
CREATE TEMPORARY FUNCTION subarray_endwith AS 'hivemall.tools.array.SubarrayEndWithUDF';
CREATE TEMPORARY FUNCTION subarray_startwith AS 'hivemall.tools.array.SubarrayStartWithUDF';
• 31. CREATE TEMPORARY FUNCTION collect_all AS 'hivemall.tools.array.CollectAllUDAF';
CREATE TEMPORARY FUNCTION concat_array AS 'hivemall.tools.array.ConcatArrayUDF';
CREATE TEMPORARY FUNCTION subarray AS 'hivemall.tools.array.SubarrayUDF';
CREATE TEMPORARY FUNCTION array_avg AS 'hivemall.tools.array.ArrayAvgGenericUDAF';
CREATE TEMPORARY FUNCTION array_sum AS 'hivemall.tools.array.ArraySumUDAF';
CREATE TEMPORARY FUNCTION to_string_array AS 'hivemall.tools.array.ToStringArrayUDF';
CREATE TEMPORARY FUNCTION map_get_sum AS 'hivemall.tools.map.MapGetSumUDF';
CREATE TEMPORARY FUNCTION map_tail_n AS 'hivemall.tools.map.MapTailNUDF';
CREATE TEMPORARY FUNCTION to_map AS 'hivemall.tools.map.UDAFToMap';
CREATE TEMPORARY FUNCTION to_ordered_map AS 'hivemall.tools.map.UDAFToOrderedMap';
CREATE TEMPORARY FUNCTION sigmoid AS 'hivemall.tools.math.SigmoidGenericUDF';
CREATE TEMPORARY FUNCTION taskid AS 'hivemall.tools.mapred.TaskIdUDF';
CREATE TEMPORARY FUNCTION jobid AS 'hivemall.tools.mapred.JobIdUDF';
CREATE TEMPORARY FUNCTION rowid AS 'hivemall.tools.mapred.RowIdUDF';
CREATE TEMPORARY FUNCTION generate_series AS 'hivemall.tools.GenerateSeriesUDTF';
CREATE TEMPORARY FUNCTION convert_label AS 'hivemall.tools.ConvertLabelUDF';
CREATE TEMPORARY FUNCTION x_rank AS 'hivemall.tools.RankSequenceUDF';
CREATE TEMPORARY FUNCTION each_top_k AS 'hivemall.tools.EachTopKUDTF';
CREATE TEMPORARY FUNCTION tokenize AS 'hivemall.tools.text.TokenizeUDF';
CREATE TEMPORARY FUNCTION is_stopword AS 'hivemall.tools.text.StopwordUDF';
• 32. CREATE TEMPORARY FUNCTION split_words AS 'hivemall.tools.text.SplitWordsUDF';
CREATE TEMPORARY FUNCTION normalize_unicode AS 'hivemall.tools.text.NormalizeUnicodeUDF';
CREATE TEMPORARY FUNCTION lr_datagen AS 'hivemall.dataset.LogisticRegressionDataGeneratorUDTF';
CREATE TEMPORARY FUNCTION f1score AS 'hivemall.evaluation.FMeasureUDAF';
CREATE TEMPORARY FUNCTION mae AS 'hivemall.evaluation.MeanAbsoluteErrorUDAF';
CREATE TEMPORARY FUNCTION mse AS 'hivemall.evaluation.MeanSquaredErrorUDAF';
CREATE TEMPORARY FUNCTION rmse AS 'hivemall.evaluation.RootMeanSquaredErrorUDAF';
CREATE TEMPORARY FUNCTION mf_predict AS 'hivemall.mf.MFPredictionUDF';
CREATE TEMPORARY FUNCTION train_mf_sgd AS 'hivemall.mf.MatrixFactorizationSGDUDTF';
CREATE TEMPORARY FUNCTION train_mf_adagrad AS 'hivemall.mf.MatrixFactorizationAdaGradUDTF';
CREATE TEMPORARY FUNCTION fm_predict AS 'hivemall.fm.FMPredictGenericUDAF';
CREATE TEMPORARY FUNCTION train_fm AS 'hivemall.fm.FactorizationMachineUDTF';
CREATE TEMPORARY FUNCTION train_randomforest_classifier AS 'hivemall.smile.classification.RandomForestClassifierUDTF';
CREATE TEMPORARY FUNCTION train_rf_classifier AS 'hivemall.smile.classification.RandomForestClassifierUDTF';
CREATE TEMPORARY FUNCTION train_randomforest_regr AS 'hivemall.smile.regression.RandomForestRegressionUDTF';
CREATE TEMPORARY FUNCTION train_rf_regr AS 'hivemall.smile.regression.RandomForestRegressionUDTF';
• 33. CREATE TEMPORARY FUNCTION tree_predict AS 'hivemall.smile.tools.TreePredictByStackMachineUDF';
CREATE TEMPORARY FUNCTION vm_tree_predict AS 'hivemall.smile.tools.TreePredictByStackMachineUDF';
CREATE TEMPORARY FUNCTION rf_ensemble AS 'hivemall.smile.tools.RandomForestEnsembleUDAF';
CREATE TEMPORARY FUNCTION train_gradient_boosting_classifier AS 'hivemall.smile.classification.GradientTreeBoostingClassifierUDTF';
CREATE TEMPORARY FUNCTION guess_attribute_types AS 'hivemall.smile.tools.GuessAttributesUDF';
CREATE TEMPORARY FUNCTION tokenize_ja AS 'hivemall.nlp.tokenizer.KuromojiUDF';
CREATE TEMPORARY MACRO max2(x DOUBLE, y DOUBLE) if(x>y,x,y);
CREATE TEMPORARY MACRO min2(x DOUBLE, y DOUBLE) if(x<y,x,y);
CREATE TEMPORARY MACRO rand_gid(k INT) floor(rand()*k);
CREATE TEMPORARY MACRO rand_gid2(k INT, seed INT) floor(rand(seed)*k);
CREATE TEMPORARY MACRO idf(df_t DOUBLE, n_docs DOUBLE) log(10, n_docs / max2(1,df_t)) + 1.0;
CREATE TEMPORARY MACRO tfidf(tf FLOAT, df_t DOUBLE, n_docs DOUBLE) tf * (log(10, n_docs / max2(1,df_t)) + 1.0);
SELECT time, COUNT(1) AS cnt FROM tbl1 WHERE TD_TIME_RANGE(time, '2015-12-11', '2015-12-12', 'JST');
• 34. After improvement :)
ADD JAR test.jar;
ADD JAR td-hadoop-1.0.jar;
CREATE DATABASE IF NOT EXISTS `dfc`;
USE `dfc`;
CREATE TABLE `contests` (`uid` INT, `key` STRING, `time` INT)
STORED BY 'com.treasure_data.hadoop.hive.mapred.TDStorageHandler'
WITH SERDEPROPERTIES ("msgpack.columns.mapping"="uid,key,time")
TBLPROPERTIES (
  "td.storage.user"="221",
  "td.storage.database"="dfc",
  "td.storage.table"="contests_20100606_120720_96abe81a",
  "td.storage.path"="221/dfc/contests_20100606_120720_96abe81a",
  "td.table_id"="4",
  "td.modifiable"="true",
  "plazma.data_set.name"="221/dfc/contests_20100606_120720_96abe81a"
);
USE `dfc`;
CREATE TEMPORARY FUNCTION TD_TIME_RANGE AS 'com.treasure_data.hadoop.hive.udf.GenericUDFTimeRange';
CREATE TEMPORARY FUNCTION TD_TIME_ADD AS 'com.treasure_data.hadoop.hive.udf.UDFTimeAdd';
CREATE TEMPORARY FUNCTION TD_TIME_FORMAT AS 'com.treasure_data.hadoop.hive.udf.UDFTimeFormat';
CREATE TEMPORARY FUNCTION TD_TIME_PARSE AS 'com.treasure_data.hadoop.hive.udf.UDFTimeParse';
CREATE TEMPORARY FUNCTION TD_SCHEDULED_TIME AS 'com.treasure_data.hadoop.hive.udf.GenericUDFScheduledTime';
CREATE TEMPORARY MACRO max2(x DOUBLE, y DOUBLE) if(x>y,x,y);
CREATE TEMPORARY MACRO min2(x DOUBLE, y DOUBLE) if(x<y,x,y);
CREATE TEMPORARY MACRO rand_gid(k INT) floor(rand()*k);
CREATE TEMPORARY MACRO rand_gid2(k INT, seed INT) floor(rand(seed)*k);
CREATE TEMPORARY MACRO idf(df_t DOUBLE, n_docs DOUBLE) log(10, n_docs / max2(1,df_t)) + 1.0;
CREATE TEMPORARY MACRO tfidf(tf FLOAT, df_t DOUBLE, n_docs DOUBLE) tf * (log(10, n_docs / max2(1,df_t)) + 1.0);
SELECT `key`, COUNT(1) FROM contests WHERE `key` IS NOT NULL GROUP BY `key`;
• 35. Appendix: disabling unsafe UDFs • An unusual case where we patch Hive itself for our own purposes • java_method(), reflect() • ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
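For illustration, here is a self-contained guard in the same spirit (hypothetical code, not TD's actual approach, which is the FunctionRegistry patch named above): rejecting query text that appears to call the unsafe UDFs before submission. A regex check like this is crude and bypassable, which is exactly why patching the registry itself is more robust.

import java.util.regex.Pattern;

public class UnsafeUdfGuard {
    // reflect(), reflect2() and java_method() can invoke arbitrary Java methods
    private static final Pattern UNSAFE =
            Pattern.compile("\\b(java_method|reflect2?)\\s*\\(", Pattern.CASE_INSENSITIVE);

    // throws if the query text appears to call an unsafe UDF
    public static void check(String query) {
        if (UNSAFE.matcher(query).find()) {
            throw new IllegalArgumentException("unsafe UDF is prohibited in this query");
        }
    }

    public static void main(String[] args) {
        try {
            check("SELECT java_method('java.lang.System', 'exit', 0) FROM tbl1");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}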
  • 36. Logic flow in Hive processing • CliDriver -> StorageHandler -> InputFormat -> SerDe -> SemanticAnalyzer -> OutputFormat • -> MapReduce Application ( • -> Mapper ( SerDe -> RecordReader -> .... ) • -> Shuffler • -> Reducer ( ... -> RecordWriter ) • )
• 38. TDInputFormat • It's just a normal Hadoop InputFormat • TDStorageHandler specifies TDInputFormat as the InputFormat • gets/builds splits • provides a RecordReader • overrides FS access to read data not from HDFS but from PlazmaDB (see the sketch below)
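A minimal sketch of that idea, with hypothetical names (not TD's actual code): an InputFormat whose splits describe ranges in a remote store rather than HDFS block offsets, so the framework never reads table data from HDFS. The RecordReader below is a stub that returns no rows; a real one would stream columnar chunks from the store.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class RemoteRangeInputFormat implements InputFormat<NullWritable, Text> {

    // a split naming a remote data-set time range instead of an HDFS file offset
    public static class RemoteRangeSplit implements InputSplit {
        String dataSet = "";
        long fromSec, toSec; // hourly time range in the remote store

        public RemoteRangeSplit() {} // required for Writable deserialization
        RemoteRangeSplit(String dataSet, long fromSec, long toSec) {
            this.dataSet = dataSet; this.fromSec = fromSec; this.toSec = toSec;
        }
        @Override public long getLength() { return toSec - fromSec; } // rough size hint
        @Override public String[] getLocations() { return new String[0]; } // no HDFS locality
        @Override public void write(DataOutput out) throws IOException {
            out.writeUTF(dataSet); out.writeLong(fromSec); out.writeLong(toSec);
        }
        @Override public void readFields(DataInput in) throws IOException {
            dataSet = in.readUTF(); fromSec = in.readLong(); toSec = in.readLong();
        }
    }

    @Override
    public InputSplit[] getSplits(JobConf conf, int numSplits) {
        // a real implementation would list chunks from the storage metadata DB
        // and group them into splits of roughly equal megabytes
        return new InputSplit[] { new RemoteRangeSplit("221/dfc/contests", 0L, 3600L) };
    }

    @Override
    public RecordReader<NullWritable, Text> getRecordReader(
            InputSplit split, JobConf conf, Reporter reporter) {
        // stub: a real reader would fetch and decode chunks for this split's range
        return new RecordReader<NullWritable, Text>() {
            @Override public boolean next(NullWritable key, Text value) { return false; }
            @Override public NullWritable createKey() { return NullWritable.get(); }
            @Override public Text createValue() { return new Text(); }
            @Override public long getPos() { return 0L; }
            @Override public void close() {}
            @Override public float getProgress() { return 1.0f; }
        };
    }
}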
• 39. Controlling Maps/Reduces • We need to control Maps/Reduces with our own logic • according to customers' price plans • to optimize performance / cluster utilization • Maps come from splits • Hive: # of files (or splittable parts of files) on HDFS • TD: # of megabytes, built from chunks in PlazmaDB • calculated and overwritten in TDInputFormat • Reduces • from total input data size & other factors • calculated and overwritten in TDInputFormat (see the sizing sketch below)
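A sketch of the arithmetic only, with assumed constants and policy (TD's actual per-plan rules are not shown in this deck): sizing tasks from input bytes instead of from HDFS file counts.

public class TaskCountPolicy {
    // one map task per mbPerMap megabytes of input, rounded up
    static int mapsForInput(long totalBytes, long mbPerMap) {
        long mb = (totalBytes + (1L << 20) - 1) / (1L << 20);
        return (int) Math.max(1, (mb + mbPerMap - 1) / mbPerMap);
    }

    // reducers proportional to input size, capped by a per-plan maximum
    static int reducesForInput(long totalBytes, long bytesPerReducer, int maxReduces) {
        long n = Math.max(1, totalBytes / bytesPerReducer);
        return (int) Math.min(n, maxReduces);
    }

    public static void main(String[] args) {
        System.out.println(mapsForInput(10L << 30, 256));             // 10 GiB at 256 MB/map -> 40
        System.out.println(reducesForInput(10L << 30, 1L << 30, 16)); // 10 GiB at 1 GiB/reducer -> 10
    }
}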
• 40. Appendix: How to overwrite configuration values dynamically

// from org.apache.hadoop.conf.Configuration
@InterfaceAudience.Public
@InterfaceStability.Stable
public class Configuration implements Iterable<Map.Entry<String,String>>, Writable {
    /** Configuration objects */
    private static final WeakHashMap<Configuration,Object> REGISTRY =
        new WeakHashMap<Configuration,Object>();
    ...
}

// from org.apache.hadoop.mapred.JobConf
public void setNumReduceTasks(int n) { setInt(JobContext.NUM_REDUCES, n); }

The Configuration you get may be a copy of the original, and may not be the one actually used to build the MapReduce job. Especially in InputFormat :-(
• 41. TDInputFormatUtils.trySetNumReduceTasks(conf, num);

@SuppressWarnings("unchecked")
public static List<JobConf> tryGetOriginalJobConfs(JobConf conf) {
    try {
        ArrayList<JobConf> list = new ArrayList<JobConf>();
        // Configuration.REGISTRY contains all copies of JobConf instances in the process.
        // This method scans all copies and tries to find the original one to update.
        Field f = Configuration.class.getDeclaredField("REGISTRY");
        f.setAccessible(true);
        WeakHashMap<Configuration,Object> reg =
            (WeakHashMap<Configuration,Object>) f.get(null);
        for (Configuration c : reg.keySet()) {
            if (c instanceof JobConf) {
                JobConf jc = (JobConf) c;
                if (jc.getCredentials() == conf.getCredentials()) {
                    // sharing the same credentials object means
                    // this is a cloned configuration
                    list.add(jc);
                }
            }
        }
        return list;
    } catch (Exception ex) {
        // ignore errors
    }
    return Collections.emptyList();
}

public static void trySetNumReduceTasks(JobConf conf, int num) {
    List<JobConf> jcs = tryGetOriginalJobConfs(conf);
    for (JobConf jc : jcs) {
        jc.setNumReduceTasks(num);
    }
}

Get all copies of Configuration by reflection, and overwrite all of them with the specified values :-)
• 42. Time Index Pushdown • Scan only the data needed from a table • faster processing, less computing resources • Pushing down the scan range • from SemanticAnalyzer to StorageHandler • by injecting IndexAnalyzer over InputFormat

SELECT col1 FROM tablename
WHERE time > TD_SCHEDULED_TIME() - 86400 AND time < TD_SCHEDULED_TIME()
   OR TD_TIME_RANGE(time, '2016-02-13 00:00:00 JST', '2016-02-14 00:00:00 JST')
• 43. IndexAnalyzer • Called from TDInputFormat • InputFormat can do everything :-) • Analyzes the operator tree • to create time ranges for each table (see the TimeRange sketch after the snippet)

} else if (udf instanceof GenericUDFOPLessThan) {
    ExprNodeDesc left = node.getChildren().get(0);
    ExprNodeDesc right = node.getChildren().get(1);
    if (isTimeColumn(right)) {
        Long v = getLongConstant(left);
        if (v != null) {
            // VALUE < key
            return new TimeRange[] { new TimeRange(v + 1) };
        }
    } else if (isTimeColumn(left)) {
        Long v = getLongConstant(right);
        if (v != null) {
            // key < VALUE
            return new TimeRange[] { new TimeRange(0, v - 1) };
        }
    }
    return ALL_RANGES;
} else if (udf instanceof GenericUDFTimeRange) {
    // statically evaluate TD_TIME_RANGE(time, start[, end[, timezone]])
    if (node.getChildren().size() < 2 || node.getChildren().size() > 4) {
        return ALL_RANGES;
    }
    ExprNodeDesc arg0 = node.getChildren().get(0);
    ExprNodeDesc arg1 = node.getChildren().get(1);
    ExprNodeDesc arg2 = null;
    ExprNodeDesc arg3 = null;
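The TimeRange values returned above could look like the following minimal sketch, whose shape is inferred from the constructor calls in the snippet (TD's actual class may differ): a closed interval of unixtime seconds, where the one-argument form means "from this second onwards".

public final class TimeRange {
    public final long from; // inclusive unixtime seconds
    public final long to;   // inclusive unixtime seconds

    // one-argument form: everything at or after `from` (e.g. for VALUE < time)
    public TimeRange(long from) { this(from, Long.MAX_VALUE); }

    public TimeRange(long from, long to) {
        this.from = from;
        this.to = to;
    }

    // whether two ranges share at least one second; useful when merging
    // predicates combined with OR into one list of scan ranges
    public boolean overlaps(TimeRange other) {
        return this.from <= other.to && other.from <= this.to;
    }
}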
• 44. Implementing INSERT INTO • Hive wants to write data into HDFS • for a normal INSERT INTO • FileFormat/serialization can be overridden, but the filesystem can't be • Our tables are on PlazmaDB • INSERT INTO queries must write data to PlazmaDB • and must handle the write-and-commit transaction (sketched below)
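A minimal sketch of such a write-and-commit transaction, against a hypothetical client API (PlazmaDB's actual API is not shown here): parts are uploaded first and become visible to readers only after one atomic commit, and are discarded on failure.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

interface PartitionedStoreClient {
    String uploadPart(String dataSet, byte[] bytes) throws IOException;   // returns a part id
    void commit(String dataSet, List<String> partIds) throws IOException; // atomic publish
    void abort(String dataSet, List<String> partIds) throws IOException;  // discard uploads
}

class InsertIntoWriter {
    static void writeAll(PartitionedStoreClient client, String dataSet,
                         List<byte[]> chunks) throws IOException {
        List<String> partIds = new ArrayList<String>();
        try {
            for (byte[] chunk : chunks) {
                partIds.add(client.uploadPart(dataSet, chunk));
            }
            client.commit(dataSet, partIds); // readers see all parts or none
        } catch (IOException e) {
            client.abort(dataSet, partIds);  // leave no partial output behind
            throw e;
        }
    }
}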
• 45. TDHiveOutputFormat • TDStorageHandler specifies it as the OutputFormat • Replaces the ReduceSinkOperator in the operator tree • to override insertion so that data is written into PlazmaDB • Called from TDInputFormat
(diagram: the original operator tree for INSERT INTO, with its ReduceSink operator, shown next to the modified tree in which that operator has been replaced)
  • 46. InputFormat can do everything! <3
• 47. ReduceSinkOp w/ One Hour Partitioning • All data in a partition must be read together when it's needed • Rows in partitions don't need to be sorted • partitioned by "time % 3600" • 1 reducer for 1 partition

ExprNodeDesc hashExpr;
if (conf.getBoolean(TDConstants.TD_CK_HIVE_INSERTINTO_DYNAMIC_PARTITIONING, false)) {
    hashExpr = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo,
        new GenericPlazmaUnixtimeDataSetDynamicHashUDF(), args);
} else {
    hashExpr = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo,
        new GenericPlazmaUnixtimeDataSetKeyHashUDF(), args);
}
partnCols.add(hashExpr);

// add another MR job to the query plan to sort data by the hashExpr
Operator op = genReduceSinkPlanForSortingBucketing(analyzer, table, input,
    sortCols, sortOrders, partnCols, -1);
preventFileReduceSinkOptimizationHack(conf, analyzer, table, "time", 1);
return op;
• 48. Appendix: hack NOT to sort rows • Rows in a partition do NOT have to be sorted • All rows in a partition should be read at the same time • There's no standard way NOT to sort rows

private static void preventFileReduceSinkOptimizationHack(Configuration conf,
        SemanticAnalyzer analyzer, Table dest_tab, String fakeSortCol, int fakeSortOrder)
        throws NoSuchFieldException, IllegalAccessException, HiveException {
    dest_tab.setSortCols(Arrays.asList(new Order[] { new Order("time", 1) }));
    conf.setBoolean("hive.enforce.sorting", true);
    // prevent BucketingSortingReduceSinkOptimizer from optimizing out the ReduceSinkOperator:
    // pretend this ReduceSinkOperator is a regular operator
    // so that BucketingSortingReduceSinkOptimizer.process doesn't optimize it out
    Field field = analyzer.getClass()
        .getDeclaredField("reduceSinkOperatorsAddedByEnforceBucketingSorting");
    field.setAccessible(true);
    List<ReduceSinkOperator> list = (List<ReduceSinkOperator>) field.get(analyzer);
    list.clear();
}
• 49. Optimizing INSERT INTO • the 1-reducer-per-partition model • works well in many cases :-) • doesn't work well for massively large data within a single hour
INSERT INTO TABLE destination
SELECT * FROM sourcetable
WHERE TD_TIME_RANGE(time, '2016-02-13 15:00:00', '2016-02-13 16:00:00', 'JST')
• 1 reducer takes a very long time in such cases • while many other reducers finish immediately • We wanna distribute!
• 50. Basics of Shuffle/Reduce • Shuffle globally sorts rows before the reducers • Reducer operators assume: • all rows are sorted • a break in the sort order marks a partition boundary (see the toy illustration below)
(diagram: rows from four maps go through shuffle (global sort) into partitions of sorted rows; the order of the partitions themselves is not sorted)
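A toy illustration of that assumption (not Hive's code): in a globally sorted stream, a change in the key is the only signal a reduce-side operator needs to detect a partition boundary.

import java.util.Arrays;
import java.util.List;

public class BoundaryDetection {
    public static void main(String[] args) {
        // keys as they arrive at a reducer after the shuffle's global sort
        List<String> keys = Arrays.asList("a", "a", "b", "b", "b", "c");
        String prev = null;
        for (String k : keys) {
            if (!k.equals(prev)) {
                System.out.println("new partition starts at key " + k);
            }
            prev = k;
        }
    }
}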
• 51. INSERT INTO w/ few partitions, massive data • We already know these at planning time: • time range (# of partitions): IndexAnalyzer • data size: InputFormat.getSplits • # of reducers: InputFormat.getSplits
(diagram: rows from four maps go through shuffle (global sort) into a single partition of non-sorted rows)
• 52. Distribute (virtual) partitions dynamically • PlazmaDB-level partitions are managed by the StorageHandler (and the PlazmaDB client) • MR-level partitioning doesn't need to match PlazmaDB partitioning • How to distribute one PlazmaDB partition across many reducers? (see the hash sketch below)

ExprNodeDesc hashExpr;
if (conf.getBoolean(TDConstants.TD_CK_HIVE_INSERTINTO_DYNAMIC_PARTITIONING, false)) {
    hashExpr = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo,
        new GenericPlazmaUnixtimeDataSetDynamicHashUDF(), args);
} else {
    hashExpr = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo,
        new GenericPlazmaUnixtimeDataSetKeyHashUDF(), args);
}
partnCols.add(hashExpr);
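An assumed shape for the dynamic hash (illustrative only; the body of GenericPlazmaUnixtimeDataSetDynamicHashUDF is not shown in this deck): rows from one hour fan out over 3600/partitioningSize buckets, so several reducers can share a single PlazmaDB 1-hour partition while each bucket still lands on exactly one reducer.

public class DynamicBucketHash {
    static int bucket(long time, int partitioningSize) {
        long hourIndex = time / 3600;                      // which 1-hour partition
        long subBucket = (time % 3600) / partitioningSize; // slot within that hour
        return (int) (hourIndex * (3600 / partitioningSize) + subBucket);
    }

    public static void main(String[] args) {
        // with partitioningSize = 300, one hour fans out to 12 buckets/reducers
        System.out.println(bucket(1455346800L, 300)); // first 5 minutes of the hour
        System.out.println(bucket(1455347100L, 300)); // next 5 minutes -> next bucket
    }
}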
• 53.
/*
 * calculate Size (== 3600 / F), where F is the max number such that:
 *   * F is a factor of 3600 (3600 % F == 0; like 1,2,3,4,5,6,8,10,12,15,18,20,24,30,36,40, ...)
 *   * F * H <= reduces
 */
static int calculateFactor(int reduces, long hours) {
    if (reduces <= hours) {
        return 1;
    }
    long factor = reduces / hours;
    while (factor >= 2) {
        if (3600 % factor == 0) break;
        factor -= 1;
    }
    return (int) factor;
}

static int calculatePartitioningSize(int reduces, long hours) {
    return 3600 / calculateFactor(reduces, hours);
}

(diagram: rows from four maps go through shuffle (global sort) w/ dynamic partitioning into seven hive partitions, each handled by its own reducer; one PlazmaDB 1-hour partition spans several of those hive partitions)
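A worked example of the sizing rule, with calculateFactor and calculatePartitioningSize copied verbatim from the slide above: 24 reducers writing a 2-hour range give F = 12, so each 1-hour PlazmaDB partition fans out into 12 buckets of 300 seconds.

public class PartitioningSizeExample {
    // copied from the slide above
    static int calculateFactor(int reduces, long hours) {
        if (reduces <= hours) {
            return 1;
        }
        long factor = reduces / hours;
        while (factor >= 2) {
            if (3600 % factor == 0) break;
            factor -= 1;
        }
        return (int) factor;
    }

    static int calculatePartitioningSize(int reduces, long hours) {
        return 3600 / calculateFactor(reduces, hours);
    }

    public static void main(String[] args) {
        System.out.println(calculateFactor(24, 2));           // 12
        System.out.println(calculatePartitioningSize(24, 2)); // 300
        // 25 reducers over 7 hours: 25/7 = 3, and 3600 % 3 == 0 -> 1200-second buckets
        System.out.println(calculatePartitioningSize(25, 7)); // 1200
    }
}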
• 54.
public void configure(MapredContext context) {
    JobConf conf = context.getJobConf();
    this.partitioningSize = DEFAULT_PARTITIONING_SIZE;
    int reduces = conf.getInt(MRJobConfig.NUM_REDUCES, 0);
    if (reduces < 1) {
        // dynamic partitioning requires the number of reduces, because reducers
        // generate too many files if 2 or more partitions arrive in one reduce task
        return;
    }
    int distributionFactor = conf.getInt(TDConstants.TD_CK_HIVE_INSERTINTO_DISTRIBUTION_FACTOR, 0);
    if (distributionFactor > 0) {
        if (distributionFactor <= reduces && 3600 % distributionFactor == 0) {
            this.partitioningSize = 3600 / distributionFactor;
            return;
        }
        // a distribution factor larger than reduces, or not a factor of 3600,
        // splits output into too many / too small files;
        // such values are ignored and the default rule is used
    }
    long splits = conf.getLong(TDConstants.TD_CK_QUERY_SPLIT_NUMBER, 0);
    long hours = conf.getLong(TDConstants.TD_CK_QUERY_TIME_RANGE_HOURS, 0);
    if (splits < 1 || hours < 1) {
        return; // use the default size if TDInputFormat fails to set these values
    }
    if (splits < MIN_SPLITS_TO_DISTRIBUTE || hours > MAX_HOURS_TO_DISTRIBUTE) {
        // input data is too small, or already has enough time partitions
        // to distribute under the default rule
        return;
    }
    if (reduces < hours * 2) {
        // not enough reduces to distribute time-based partitions
        return;
    }
    this.partitioningSize = calculatePartitioningSize(reduces, hours);
}
• 55. "What's important is that IT WORKS!" (as @frsyuki said one day)
• 56. We'll improve our code step by step, along with improvements in OSS and its developer community <3 Thanks!