H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI World London

© 2018 KNIME AG. All rights reserved.
Leveraging H2O Machine Learning
with KNIME Analytics Platform
Christian Dietz
KNIME

H2O Distributed Machine Learning Algorithms
Supervised Learning
Statistical Analysis
• Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie
• Naïve Bayes
Ensembles
• Distributed Random Forest: Classification or regression models
• Gradient Boosting Machine: Produces an ensemble of decision trees with increasingly refined approximations
Deep Neural Networks
• Deep learning: Creates multi-layer feed-forward neural networks, starting with an input layer followed by multiple layers of nonlinear transformations
Unsupervised Learning
Clustering
• K-means: Partitions observations into k clusters/groups of the same spatial size. Automatically detects optimal k
Dimensionality Reduction
• Principal Component Analysis: Linearly transforms correlated variables to independent components
• Generalized Low Rank Models: Extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data
Anomaly Detection
• Autoencoders: Find outliers using nonlinear dimensionality reduction with deep learning
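
The Gradient Boosting Machine entry in the list above describes an ensemble of trees with increasingly refined approximations. The idea can be sketched in plain Python, assuming squared-error loss, a single numeric feature, and one-split trees (stumps). This is an illustration only, not H2O's distributed implementation; the names `fit_stump`, `fit_gbm`, `predict_gbm`, and `mse` are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Find the single threshold split on x that minimizes squared error."""
    best = None
    # Every distinct value except the largest is a candidate threshold,
    # so both sides of the split are always non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = mean(left), mean(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    return t, lv, rv


def predict_stump(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv


def fit_gbm(xs, ys, n_trees=20, learning_rate=0.5):
    """Each new stump is fitted to the residuals left by the ensemble so far."""
    base = mean(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_gbm(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return mean((y - predict_gbm(model, x)) ** 2 for x, y in zip(xs, ys))


# Toy data (made up for illustration): y = x^2 on ten points.
xs = list(range(10))
ys = [x * x for x in xs]
for n in (0, 1, 20):
    print(n, "trees, training MSE:", mse(fit_gbm(xs, ys, n_trees=n), xs, ys))
```

The training error shrinks as trees are added, which is the "increasingly refined approximation" the slide refers to.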

Platforms with H2O Integration
H2O + KNIME Talk
at KNIME Summit
March 2018

KNIME®
• KNIME AG founded in 2008
• Offices in Zurich (HQ), Konstanz, Berlin, and Austin
• Maintainer of the Open Source KNIME Analytics Platform
– comprehensive platform for data loading, processing, analysis, and modeling
– visual frontend
– open to all sorts of data, other tools (R, Python, etc.), and various user personas
– 20+ open source releases since 2006
– open source.
• KNIME Server
– 14 commercial product releases since 2008
• KNIME cloud offerings

KNIME® Software

KNIME® Analytics Platform

• Analysis & Mining: Statistics; Data Mining; Machine Learning; Deep Learning; Web Analytics; Text Mining; Network Analysis; Social Media Analysis; R, Weka, Python, H2O; Community / 3rd
• Data Access: MySQL, Oracle, ...; SAS, SPSS, ...; Excel, Flat, ...; Hive, Impala, ...; XML, JSON, PMML; Text, Doc, Image, ...; Web Crawlers; Industry Specific; Community / 3rd
• Transformation: Row, Column, Matrix; Text, Image; Time Series; Java; Python; Community / 3rd
• Visualization: R; Python; JavaScript; Community / 3rd
• Deployment: via BIRT; PMML; XML, JSON; Databases; Excel, Flat, etc.; Text, Doc, Image; Industry Specific; Community / 3rd
Over 2000 Native and Embedded Nodes Included

KNIME H2O Machine Learning Integration
• Offer our users high-performance machine learning algorithms from H2O in KNIME
• Allow users to mix and match H2O with other KNIME functionality
– Data wrangling functionality of KNIME Analytics Platform
– KNIME Big Data Connectors
– Text Mining, Image Processing, Cheminformatics, …
– and more!

The Data

Date        Store ID          Visitors
2016-01-01  ba937bf13d40fb24  28
…           …                 …
2017-04-22  324f7c39a8410e7c  216

Date        Store ID          Visitors
2017-04-23  e8ed9335d0c38333  ?
…           …                 …
2017-05-31  8f13ef0f5e8c64dd  ?

Provided data:
• Number of visitors
• Reservations
• Store information
• Calendar date info
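
As a rough illustration of how the tables above turn into a modeling problem, the sketch below derives simple calendar features from the date column and separates rows with known visitor counts from the rows to be forecast. It uses only the four rows shown on the slide. The feature names and the helpers `calendar_features` and `prepare` are assumptions for illustration; in the talk this step is done with KNIME nodes, not Python.

```python
from datetime import date

# Rows copied from the slide; None stands for the "?" that has to be predicted.
rows = [
    ("2016-01-01", "ba937bf13d40fb24", 28),
    ("2017-04-22", "324f7c39a8410e7c", 216),
    ("2017-04-23", "e8ed9335d0c38333", None),
    ("2017-05-31", "8f13ef0f5e8c64dd", None),
]


def calendar_features(iso_date):
    """Derive calendar features from an ISO date string (hypothetical feature set)."""
    d = date.fromisoformat(iso_date)
    return {
        "year": d.year,
        "month": d.month,
        "day_of_week": d.weekday(),      # Monday = 0 ... Sunday = 6
        "is_weekend": d.weekday() >= 5,
    }


def prepare(rows):
    """Split rows into training records (visitors known) and records to forecast."""
    train, to_forecast = [], []
    for iso_date, store_id, visitors in rows:
        record = {"store_id": store_id, **calendar_features(iso_date)}
        if visitors is None:
            to_forecast.append(record)
        else:
            train.append({**record, "visitors": visitors})
    return train, to_forecast


train, to_forecast = prepare(rows)
print(len(train), "training rows,", len(to_forecast), "rows to forecast")
```

Reservations and store information would be joined onto these records by store ID and date in the same way; they are left out here because the slide does not show their columns.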

Visitor Forecasting
• Data preparation
• Model training
• Model optimization
• Model evaluation
• Deployment
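
The slides do not state which error metric the model evaluation stage uses. For visitor counts, root mean squared logarithmic error (RMSLE) is a common choice, and it is assumed here purely for illustration. The function name `rmsle` is hypothetical.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared logarithmic error between two equal-length sequences.

    Assumed metric, not taken from the slides. log1p keeps zero counts valid.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# The two known visitor counts from the data slide, scored against
# made-up predictions.
print(rmsle([28, 216], [30, 200]))
```

Because the metric works on log counts, a fixed relative error costs about the same for a quiet store as for a busy one.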

Data Preparation with KNIME Nodes

Modeling with the H2O Nodes

Blend H2O with…
• Python, Java and R Scripting
• Image Processing
• Deep Learning
• Text Processing
• Databases
• a growing Big Data Integration

H2O Sparkling Water in KNIME

Scoring with H2O MOJOs on Apache Spark

Thank You!
www.knime.com

The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by
KNIME AG under license from KNIME GmbH, and are registered in the United States.
KNIME® is also registered in Germany.