H2O AutoML Roadmap 2016.10
Raymond Peck
Director of Product Engineering, H2O.ai
rpeck@h2o.ai
© H2O.ai, 2016 1
What Will We Cover?
• What is AutoML?
• What is the roadmap for H2O AutoML?
What is AutoML?
H2O AutoML automates parts of data preparation and model
training in order to help both Machine Learning / Data Science
experts and complete novices.
Other AutoML projects concentrate on novices.
Outside AutoML Projects
• auto-sklearn
• AutoCompete
• TPOT
• DataRobot
• Automatic Statistician
• BigML
• et al...
Who is the Target Audience?
• "Big green button" for novice users such as software
developers and business analysts;
• Iterative, interactive use and controls for expert users:
  • Machine Learning experts
  • Descriptive Data Scientists
What Are the Pieces?
• data cleaning
• feature engineering / feature generation
• feature selection
  • for both the original and generated features
• model hyperparameter tuning
• automatic smart ensemble generation
Prior Work @ H2O
• ensembles (stacking), from Erin LeDell
• random hyperparameter search with automatic stopping,
from Raymond Peck
• some dataset characterization and feature engineering,
from Spencer Aiello
• hyperopt Bayesian hyperparameter optimization, from
Abhishek Malali
Current Work
• random hyperparameter search with parameter values
based on open datasets
• moving ensembles into the back end
• working on basic metalearning for hyperparameter vectors,
starting with 140 OpenML datasets
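The basic metalearning idea above can be sketched as a nearest-neighbour lookup over dataset metadata. This is only an illustration, not H2O's implementation: the metafeature names (log_rows, log_cols, frac_categorical), the library entries, and the warm_start function are all hypothetical.

```python
import math

# Hypothetical library: metafeatures of previously tuned datasets plus the
# hyperparameters that worked best on each of them.
LIBRARY = [
    {"name": "a",
     "meta": {"log_rows": 3.0, "log_cols": 1.0, "frac_categorical": 0.0},
     "best_params": {"max_depth": 3}},
    {"name": "b",
     "meta": {"log_rows": 5.0, "log_cols": 2.0, "frac_categorical": 0.5},
     "best_params": {"max_depth": 8}},
    {"name": "c",
     "meta": {"log_rows": 6.0, "log_cols": 2.0, "frac_categorical": 0.4},
     "best_params": {"max_depth": 10}},
]


def distance(a, b):
    """Euclidean distance between two metafeature dicts with the same keys."""
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def warm_start(target_meta, library, k=2):
    """Return the best hyperparameters of the k most similar known datasets."""
    ranked = sorted(library, key=lambda entry: distance(target_meta, entry["meta"]))
    return [entry["best_params"] for entry in ranked[:k]]


new_dataset = {"log_rows": 5.2, "log_cols": 2.0, "frac_categorical": 0.5}
starting_points = warm_start(new_dataset, LIBRARY, k=2)
```

The returned hyperparameter vectors would seed the search on the new dataset instead of starting from arbitrary values. In practice the metafeatures would need to be scaled so that no single one dominates the distance.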
Future Work
• feature selection
• feature engineering for IID data
• Bayesian hyperparameter search with warm start
• feature engineering for non-IID data, e.g. time series
• iterate with larger datasets that are typical of our customers
• distribution guesser for regression
How Do We Evaluate Our Work?
• public datasets from:
  • OpenML
  • ChaLearn AutoML challenge
  • Kaggle
• our own Data Scientists' work with customer datasets
• customer feedback (soon)
Data Cleaning
• outlier analysis (with user feedback)
• sentinel value detection
  • as a side-effect of outlier analysis
  • type-based heuristics (e.g., 999999, 1970.01.01)
• identifier detection (e.g., customer ID)
• smart imputation
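Two of the checks listed above, type-based sentinel detection and identifier detection, can be sketched in a few lines. This is only an illustration under stated assumptions: the suspect-value lists, the thresholds, and the function names are hypothetical, not part of H2O.

```python
from collections import Counter

# Hypothetical values that often stand in for "missing" in real data.
SUSPECT_NUMERIC = {-9999, -999, -1, 999, 9999, 99999, 999999}
SUSPECT_DATES = {"1970.01.01", "1970-01-01", "1900-01-01"}


def find_sentinels(column, min_share=0.01):
    """Return suspect values that make up at least min_share of the column."""
    if not column:
        return []
    counts = Counter(column)
    suspects = SUSPECT_NUMERIC | SUSPECT_DATES
    found = [value for value, count in counts.items()
             if value in suspects and count / len(column) >= min_share]
    return sorted(found, key=str)


def looks_like_identifier(column, min_unique_ratio=0.95):
    """Flag columns whose values are almost all distinct (e.g. customer ID)."""
    if not column:
        return False
    return len(set(column)) / len(column) >= min_unique_ratio


incomes = [52000, 61000, 999999, 48000, 999999, 75000]
customer_ids = list(range(1000, 1100))
```

A flagged sentinel would typically be converted to a missing value before imputation, and a flagged identifier column would be excluded from training.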
Feature Generation
We will be using several techniques including:
• type-based heuristics
• date/time expansion
• log and other transforms of numerics
• interactions (product, ratio, etc.)
• feature generation with Deep Learning deepfeatures()
• clustering
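Three of the techniques above (date/time expansion, log transforms, and pairwise interactions) are simple enough to show directly. The sketch below uses only the Python standard library; the function names and the choice of derived fields are assumptions made for illustration.

```python
import math
from datetime import datetime


def expand_datetime(ts):
    """Split one timestamp into several numeric features."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "weekday": ts.weekday(),  # Monday is 0
        "hour": ts.hour,
    }


def log_transform(x):
    """log(1 + x), which is defined at zero; assumes x >= 0."""
    return math.log1p(x)


def interactions(a, b):
    """Product and ratio of two numeric features (ratio is None when b is 0)."""
    return {"product": a * b, "ratio": a / b if b != 0 else None}


date_features = expand_datetime(datetime(2016, 10, 3, 14, 30))
pair_features = interactions(6.0, 3.0)
```

Generated features like these multiply the column count quickly, which is why feature selection is applied to both the original and the generated features.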
Feature Selection
We will be evaluating several techniques including:
• Mutual Information (non-linear correlation)
• variable importance from GBM and Deep Learning
• PCA
• GLM with Elastic Net / LASSO
We may use different selectors for the original data and for transforms / interactions,
to trade off speed against the detection of non-linear relationships.
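As an illustration of the first technique, mutual information between a discrete feature and a discrete target can be computed from co-occurrence counts. This is a minimal sketch, not the selector H2O uses: it assumes both variables are already discrete (continuous features would need binning first), and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), joint in count_xy.items():
        p_xy = joint / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


target = [0, 0, 1, 1]
informative = [0, 0, 1, 1]    # determines the target exactly
uninformative = [0, 1, 0, 1]  # independent of the target
```

A selector would rank features by this score and keep the top ones. Unlike linear correlation, the score does not assume any particular functional form for the relationship.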
Hyperparameter Tuning
• currently do random hyperparameter search with metric-based
smart stopping
• hyperparameter values taken from hand-tuning 140 OpenML
datasets
• soon adding simple "nearest neighbors" warm start (basic
metalearning)
• then adding Bayesian hyperparameter optimization
• possibly integrating hyperopt into the back end
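Random search with metric-based stopping can be sketched as follows. This is a generic illustration, not H2O's code: the search stops once the best score has failed to improve by more than a tolerance for a given number of consecutive models. The function name, the parameter names (patience, tolerance), and the toy scoring function are all assumptions.

```python
import random


def random_search(score_fn, space, max_models=50, patience=10,
                  tolerance=1e-3, seed=42):
    """Sample hyperparameters at random; stop early when the score stalls."""
    rng = random.Random(seed)
    best_score, best_params, stale = float("-inf"), None, 0
    history = []
    for _ in range(max_models):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        history.append((params, score))
        if score > best_score + tolerance:
            best_score, best_params, stale = score, params, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no meaningful improvement for `patience` models
    return best_params, best_score, history


# Toy stand-in for "train a model and return its validation metric".
def toy_score(params):
    return -abs(params["max_depth"] - 5)


space = {"max_depth": list(range(1, 10)), "ntrees": [50, 100]}
best_params, best_score, history = random_search(toy_score, space)
```

A warm start would replace the first few random draws with hyperparameters that worked well on similar datasets.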
Automatic Smart Ensemble Generation
• currently adding Erin LeDell's stacking / SuperLearner into the back end
• initially, ensemble top N models from hyperparameter searches
• optional "use original features"
• smarter ensemble generation for faster scoring, less overfitting:
  • greedy ensemble creation
  • ensemble models with uncorrelated residuals
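The slides do not say how greedy ensemble creation would work. One common reading is forward selection: repeatedly add the model whose predictions, averaged with those already chosen, most reduce the holdout error. The sketch below assumes that reading, uses mean squared error, and allows a model to be picked more than once; all names and data are hypothetical. It is a simple alternative to stacking, not the SuperLearner algorithm.

```python
def mse(pred, actual):
    """Mean squared error between two equal-length sequences."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def average(prediction_lists):
    """Row-wise mean of several models' predictions."""
    return [sum(row) / len(row) for row in zip(*prediction_lists)]


def greedy_ensemble(model_preds, actual, max_size=5):
    """Forward selection: add the model that most lowers holdout error."""
    chosen, best_err = [], float("inf")
    for _ in range(max_size):
        candidate, candidate_err = None, best_err
        for name, preds in model_preds.items():
            trial = average([model_preds[n] for n in chosen] + [preds])
            err = mse(trial, actual)
            if err < candidate_err:
                candidate, candidate_err = name, err
        if candidate is None:
            break  # no model improves the ensemble any further
        chosen.append(candidate)
        best_err = candidate_err
    return chosen, best_err


holdout_actual = [1.0, 2.0, 3.0]
holdout_preds = {
    "high": [1.5, 2.5, 3.5],  # errors are all +0.5
    "low": [0.5, 1.5, 2.5],   # errors are all -0.5
    "bad": [3.0, 3.0, 3.0],
}
members, error = greedy_ensemble(holdout_preds, holdout_actual)
```

In this toy data the two models with opposite residuals cancel each other out, which is also the intuition behind ensembling models with uncorrelated residuals. Keeping the ensemble small is what gives faster scoring.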
Possible Futures
• try to predict accuracy from dataset metadata
• training time prediction
• scoring time prediction
• multiple concurrent H2O clusters for speed
• freeze/thaw model training
• outlier analysis with user feedback
• residuals analysis with user feedback
• composite models using pre-clustering step
© H2O.ai, 2016 16

Recommended for you

How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcp

Do you know how to use StreamSets Data Collector with Google Cloud Platform (GCP)? In this session we'll explain how YaloChat designed and implemented a streaming architecture that is sustainable, operable and scalable. Discover how we deployed Data Collector to integrate GCP components such as Pub / Sub and BigQuery to achieve DataOps in the cloud

big dataanalyticsgcp
Enterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaEnterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, Cloudera

This document discusses an approach to enterprise metadata integration using a multilayer metadata model. Key points include: - Status dashboards provide facts from technical, operational, application, and quality metadata layers - A graph database allows for context exploration across the entire cluster - The integration of metadata from multiple sources provides a more holistic view of business knowledge

neo4jmetadatagraphconnect europe 2017
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learning

The document discusses real-time machine learning using the Lambda architecture. It describes the need for models that can learn incrementally from streaming data and remain accurate over time. The Lambda architecture is introduced as having a speed layer for real-time processing, a serving layer to query current and batch views, and a batch layer for immutable datasets. Mahout is described as an Apache library for scalable machine learning like recommendation, clustering, and classification using Hadoop. Basic recommendation algorithms are covered along with use cases like e-commerce personalization, fraud detection, and media metadata generation.

More Related Content

What's hot

Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
Sri Ambati
 
Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...
Sri Ambati
 
H2O Driverless AI Workshop
H2O Driverless AI WorkshopH2O Driverless AI Workshop
H2O Driverless AI Workshop
Sri Ambati
 
Driverless AI Hands-on Focused on Machine Learning Interpretability - H2O.ai
Driverless AI Hands-on Focused on Machine Learning Interpretability - H2O.aiDriverless AI Hands-on Focused on Machine Learning Interpretability - H2O.ai
Driverless AI Hands-on Focused on Machine Learning Interpretability - H2O.ai
Sri Ambati
 
Driverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on LabDriverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on Lab
Sri Ambati
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
Trieu Nguyen
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
Trieu Nguyen
 
Quantifying Genuine User Experience in Virtual Desktop Ecosystems
Quantifying Genuine User Experience in Virtual Desktop EcosystemsQuantifying Genuine User Experience in Virtual Desktop Ecosystems
Quantifying Genuine User Experience in Virtual Desktop Ecosystems
Data Con LA
 
Forget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer insteadForget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer instead
Data Con LA
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Sri Ambati
 
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Sri Ambati
 
Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...
Bas Geerdink
 
How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcp
Joseph Arriola
 
Enterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaEnterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, Cloudera
Neo4j
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learning
Vinoth Kannan
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize Seattle
Sri Ambati
 
Productionizing H2O Models with Apache Spark
Productionizing H2O Models with Apache SparkProductionizing H2O Models with Apache Spark
Productionizing H2O Models with Apache Spark
Sri Ambati
 
Spark Summit Europe 2016 Keynote - Databricks CEO
Spark Summit Europe 2016 Keynote  - Databricks CEO Spark Summit Europe 2016 Keynote  - Databricks CEO
Spark Summit Europe 2016 Keynote - Databricks CEO
Databricks
 
platform for Machine Learning
 platform for Machine Learning platform for Machine Learning
platform for Machine Learning
SivapriyaS12
 
Machine Learning with H2O
Machine Learning with H2OMachine Learning with H2O
Machine Learning with H2O
Sri Ambati
 

What's hot (20)

Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
 
Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...
 
H2O Driverless AI Workshop
H2O Driverless AI WorkshopH2O Driverless AI Workshop
H2O Driverless AI Workshop
 
Driverless AI Hands-on Focused on Machine Learning Interpretability - H2O.ai
Driverless AI Hands-on Focused on Machine Learning Interpretability - H2O.aiDriverless AI Hands-on Focused on Machine Learning Interpretability - H2O.ai
Driverless AI Hands-on Focused on Machine Learning Interpretability - H2O.ai
 
Driverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on LabDriverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on Lab
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Quantifying Genuine User Experience in Virtual Desktop Ecosystems
Quantifying Genuine User Experience in Virtual Desktop EcosystemsQuantifying Genuine User Experience in Virtual Desktop Ecosystems
Quantifying Genuine User Experience in Virtual Desktop Ecosystems
 
Forget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer insteadForget becoming a Data Scientist, become a Machine Learning Engineer instead
Forget becoming a Data Scientist, become a Machine Learning Engineer instead
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
 
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
 
Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...Fast Data at ING – the why, what and how of the streaming analytics platform ...
Fast Data at ING – the why, what and how of the streaming analytics platform ...
 
How to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcpHow to design and implement a data ops architecture with sdc and gcp
How to design and implement a data ops architecture with sdc and gcp
 
Enterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaEnterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, Cloudera
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learning
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize Seattle
 
Productionizing H2O Models with Apache Spark
Productionizing H2O Models with Apache SparkProductionizing H2O Models with Apache Spark
Productionizing H2O Models with Apache Spark
 
Spark Summit Europe 2016 Keynote - Databricks CEO
Spark Summit Europe 2016 Keynote  - Databricks CEO Spark Summit Europe 2016 Keynote  - Databricks CEO
Spark Summit Europe 2016 Keynote - Databricks CEO
 
platform for Machine Learning
 platform for Machine Learning platform for Machine Learning
platform for Machine Learning
 
Machine Learning with H2O
Machine Learning with H2OMachine Learning with H2O
Machine Learning with H2O
 

Viewers also liked

Nvidia Deep Learning Solutions - Alex Sabatier
Nvidia Deep Learning Solutions - Alex SabatierNvidia Deep Learning Solutions - Alex Sabatier
Nvidia Deep Learning Solutions - Alex Sabatier
Sri Ambati
 
Deep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno CandelDeep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno Candel
Sri Ambati
 
Stacked Ensembles in H2O
Stacked Ensembles in H2OStacked Ensembles in H2O
Stacked Ensembles in H2O
Sri Ambati
 
Deep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry LarkoDeep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry Larko
Sri Ambati
 
Sparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal MalohlavaSparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal Malohlava
Sri Ambati
 
Transformation, H2O Open Dallas 2016, Keynote by Sri Ambati,
Transformation, H2O Open Dallas 2016, Keynote by Sri Ambati, Transformation, H2O Open Dallas 2016, Keynote by Sri Ambati,
Transformation, H2O Open Dallas 2016, Keynote by Sri Ambati,
Sri Ambati
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
Sri Ambati
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling Water
Sri Ambati
 
Top 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner PitfallsTop 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner Pitfalls
Sri Ambati
 
Cybersecurity with AI - Ashrith Barthur
Cybersecurity with AI - Ashrith BarthurCybersecurity with AI - Ashrith Barthur
Cybersecurity with AI - Ashrith Barthur
Sri Ambati
 
ISAX
ISAXISAX
Using Machine Learning For Solving Time Series Probelms
Using Machine Learning For Solving Time Series ProbelmsUsing Machine Learning For Solving Time Series Probelms
Using Machine Learning For Solving Time Series Probelms
Sri Ambati
 
Skutil - H2O meets Sklearn - Taylor Smith
Skutil - H2O meets Sklearn - Taylor SmithSkutil - H2O meets Sklearn - Taylor Smith
Skutil - H2O meets Sklearn - Taylor Smith
Sri Ambati
 
Machine Learning with H2O, Spark, and Python at Strata 2015
Machine Learning with H2O, Spark, and Python at Strata 2015Machine Learning with H2O, Spark, and Python at Strata 2015
Machine Learning with H2O, Spark, and Python at Strata 2015
Sri Ambati
 
PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014
Sri Ambati
 
H2O Deep Learning at Next.ML
H2O Deep Learning at Next.MLH2O Deep Learning at Next.ML
H2O Deep Learning at Next.ML
Sri Ambati
 
H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614
Sri Ambati
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine Learning
Sri Ambati
 
Applying Machine Learning using H2O
Applying Machine Learning using H2OApplying Machine Learning using H2O
Applying Machine Learning using H2O
Sri Ambati
 
Strata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2OStrata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2O
Sri Ambati
 

H2O AutoML roadmap - Ray Peck

  • 1. H2O AutoML Roadmap 2016.10
      Raymond Peck
      Director of Product Engineering, H2O.ai
      rpeck@h2o.ai
      © H2O.ai, 2016
  • 2. What Will We Cover?
      • What is AutoML?
      • What is the roadmap for H2O AutoML?
  • 3. What is AutoML?
      H2O AutoML automates parts of data preparation and model training in order to help both Machine Learning / Data Science experts and complete novices.
      Other AutoML projects concentrate on novices.
  • 4. Outside AutoML Projects
      • auto-sklearn
      • AutoCompete
      • TPOT
      • DataRobot
      • Automatic Statistician
      • BigML
      • et al...
  • 5. Who is the Target Audience?
      • "Big green button" for novice users such as software developers and business analysts;
      • Iterative, interactive use and controls for expert users:
          • Machine Learning experts
          • Descriptive Data Scientists
  • 6. What Are the Pieces?
      • data cleaning
      • feature engineering / feature generation
      • feature selection
          • for both the original and generated features
      • model hyperparameter tuning
      • automatic smart ensemble generation
  • 7. Prior Work @ H2O
      • ensembles (stacking), from Erin LeDell
      • random hyperparameter search with automatic stopping, from Raymond Peck
      • some dataset characterization and feature engineering, from Spencer Aiello
      • hyperopt Bayesian hyperparameter optimization, from Abhishek Malali
  • 8. Current Work
      • random hyperparameter search with parameter values based on open datasets
      • moving ensembles into the back end
      • working on basic metalearning for hyperparameter vectors, starting with 140 OpenML datasets
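One way to picture metalearning for hyperparameter vectors is the "nearest neighbors" warm start mentioned under Hyperparameter Tuning later in the deck: describe each known dataset by a few meta-features, then start the search from the settings that worked on the most similar datasets. The sketch below is only an illustration of that idea, not H2O's implementation. The meta-features, dataset names, and hyperparameter values are all made up for the example.

```python
import math

# Hypothetical meta-features (log10 rows, log10 columns, fraction categorical)
# paired with the hyperparameters that worked best on each known dataset.
# Every value here is invented for illustration.
KNOWN_DATASETS = {
    "small_numeric": ((3.0, 1.0, 0.0), {"max_depth": 4, "learn_rate": 0.1}),
    "wide_categorical": ((4.0, 2.5, 0.8), {"max_depth": 8, "learn_rate": 0.05}),
    "large_mixed": ((6.0, 1.5, 0.4), {"max_depth": 10, "learn_rate": 0.03}),
}


def warm_start(meta_features, k=2):
    """Return the hyperparameters of the k most similar known datasets, nearest first."""
    ranked = sorted(
        KNOWN_DATASETS.values(),
        key=lambda entry: math.dist(meta_features, entry[0]),  # Euclidean distance
    )
    return [params for _, params in ranked[:k]]


# A new dataset with about 300k rows, 25 columns, half of them categorical.
print(warm_start((5.5, 1.4, 0.5)))
```

The returned vectors would be tried first, before falling back to random or Bayesian search.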
  • 9. Future Work
      • feature selection
      • feature engineering for IID data
      • Bayesian hyperparameter search with warm start
      • feature engineering for non-IID data, e.g. time series
      • iterate w/ larger datasets that are typical for our customers
      • distribution guesser for regression
  • 10. How Do We Evaluate Our Work?
      • public datasets from
          • OpenML
          • ChaLearn AutoML challenge
          • Kaggle
      • our own Data Scientists' work with customer datasets
      • customer feedback (soon)
  • 11. Data Cleaning
      • outlier analysis (with user feedback)
      • sentinel value detection
          • as a side-effect of outlier analysis
          • type-based heuristics (e.g., 999999, 1970.01.01)
      • identifier detection (e.g., customer ID)
      • smart imputation
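The type-based heuristic for sentinel values can be sketched as a lookup against placeholder values that are typical for a column's type. This is a minimal illustration only: the candidate lists, the `min_share` threshold, and the function name are assumptions made for the example, not anything taken from H2O.

```python
from collections import Counter
from datetime import date

# Hypothetical lists of suspicious placeholder values; a real system would tune these.
NUMERIC_SENTINELS = {-1, -999, 999, 9999, 99999, 999999}
DATE_SENTINELS = {date(1970, 1, 1), date(1900, 1, 1)}


def find_sentinels(values, min_share=0.01):
    """Return known placeholder values that make up at least min_share of a column."""
    present = [v for v in values if v is not None]
    if not present:
        return {}
    counts = Counter(present)
    # Choose the candidate list from the column's type (the "type-based" part).
    candidates = DATE_SENTINELS if isinstance(present[0], date) else NUMERIC_SENTINELS
    return {
        v: n for v, n in counts.items()
        if v in candidates and n / len(present) >= min_share
    }


incomes = [52000, 61000, 999999, 48000, 999999, 75000, None, 58000]
signup_dates = [date(2015, 3, 2), date(1970, 1, 1), date(2016, 7, 9), date(1970, 1, 1)]

print(find_sentinels(incomes))
print(find_sentinels(signup_dates))
```

Flagged values would then be treated as missing and handed to the imputation step rather than fed to a model as real measurements.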
  • 12. Feature Generation
      We will be using several techniques including:
      • type-based heuristics
      • date/time expansion
      • log and other transforms of numerics
      • interactions (product, ratio, etc)
      • feature generation with Deep Learning deepfeatures()
      • clustering
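Three of the techniques listed above (date/time expansion, log transforms, and pairwise interactions) are simple enough to sketch in plain Python. The column names and the particular expanded fields below are assumptions chosen for the example. The deck does not say which fields H2O AutoML would generate.

```python
import math
from datetime import datetime


def expand_datetime(ts):
    """Date/time expansion: split one timestamp into several numeric features."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day_of_week": ts.weekday(),  # Monday = 0
        "hour": ts.hour,
        "is_weekend": int(ts.weekday() >= 5),
    }


def log_transform(x):
    """log(1 + x), a common transform for non-negative, right-skewed numerics."""
    return math.log1p(x)


def interactions(a, b):
    """Pairwise product and ratio; the ratio is None when the denominator is zero."""
    return {"product": a * b, "ratio": a / b if b != 0 else None}


# Hypothetical input row.
row = {"ts": datetime(2016, 10, 15, 14, 30), "price": 120.0, "quantity": 4}

features = {
    **expand_datetime(row["ts"]),
    "log_price": log_transform(row["price"]),
    **{
        f"price_quantity_{name}": value
        for name, value in interactions(row["price"], row["quantity"]).items()
    },
}
print(features)
```

Generated features multiply quickly, which is why the deck pairs feature generation with a feature selection step.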
  • 13. Feature Selection
      We will be evaluating several techniques including:
      • Mutual Information (non-linear correlation)
      • variable importance from GBM and Deep Learning
      • PCA
      • GLM with Elastic Net / LASSO
      Perhaps different selectors for initial data and transforms / interactions to trade off speed and the detection of non-linear relationships.
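To show why mutual information is described as a non-linear correlation measure, the sketch below estimates it for discrete columns from raw counts. It is a minimal textbook estimator, not the selector H2O would ship, and the toy columns are invented. The target is the square of one feature, a relationship that linear correlation misses when the feature is symmetric around zero.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum of p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# Toy data: the target depends on `signal` only through its square.
signal = [(i % 5) - 2 for i in range(20)]   # -2, -1, 0, 1, 2, repeated
noise = [i % 4 for i in range(20)]          # unrelated to the target
target = [x * x for x in signal]

scores = {
    "signal": mutual_information(signal, target),
    "noise": mutual_information(noise, target),
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

Continuous columns would need binning or a different estimator before this calculation applies.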
  • 14. Hyperparameter Tuning
      • currently do random hyperparameter search with metric-based smart stopping
      • hyperparameter values taken from hand-tuning 140 OpenML datasets
      • soon adding simple "nearest neighbors" warm start (basic metalearning)
      • then adding Bayesian hyperparameter optimization
      • possibly integrating hyperopt into the back end
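Random search with metric-based stopping can be sketched as a loop that samples hyperparameter combinations and gives up once the best validation metric has not improved for a set number of models. Everything below is an assumption for illustration: the scoring function is a toy stand-in for training a model, and the parameter names `stopping_rounds` and `tolerance` belong to this sketch rather than to any confirmed H2O signature.

```python
import random


def toy_validation_error(params):
    """Stand-in for training a model and scoring it on validation data (hypothetical)."""
    return (params["learn_rate"] - 0.1) ** 2 + 0.001 * abs(params["max_depth"] - 6)


def random_search(space, score, max_models=200, stopping_rounds=20,
                  tolerance=1e-4, seed=42):
    """Sample random combinations; stop when the best score stops improving."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    since_improvement, tried = 0, 0
    for _ in range(max_models):
        params = {name: rng.choice(values) for name, values in space.items()}
        tried += 1
        current = score(params)
        if current < best_score - tolerance:
            # A meaningful improvement resets the patience counter.
            best_params, best_score, since_improvement = params, current, 0
        else:
            since_improvement += 1
        if since_improvement >= stopping_rounds:
            break
    return best_params, best_score, tried


space = {"learn_rate": [0.01, 0.03, 0.1, 0.3], "max_depth": [2, 4, 6, 8, 10]}
best_params, best_score, tried = random_search(space, toy_validation_error)
print(best_params, best_score, tried)
```

A warm start would change only the first few iterations, replacing random draws with settings borrowed from similar datasets.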
  • 15. Automatic Smart Ensemble Generation
      • currently adding Erin LeDell's stacking / SuperLearner into the back end
      • initially, ensemble top N models from hyperparameter searches
      • optional "use original features"
      • smarter ensemble generation for faster scoring, less overfitting:
          • greedy ensemble creation
          • ensemble models with uncorrelated residuals
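The deck does not spell out how greedy ensemble creation would work. One common reading is forward selection: start from the best single model and keep adding whichever model most reduces the validation error of the averaged prediction. The sketch below follows that reading with simple averaging instead of a stacked metalearner, so it is an illustration of the idea rather than the SuperLearner approach named above. Model names and predictions are invented.

```python
def mse(pred, actual):
    """Mean squared error between two equal-length sequences."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def average(preds):
    """Element-wise mean of several prediction vectors."""
    return [sum(col) / len(col) for col in zip(*preds)]


def greedy_ensemble(model_preds, actual, max_size=5):
    """Forward selection: repeatedly add the model that most reduces validation MSE."""
    chosen, best_err = [], float("inf")
    for _ in range(max_size):
        candidates = {
            name: mse(average([model_preds[m] for m in chosen] + [preds]), actual)
            for name, preds in model_preds.items()
            if name not in chosen
        }
        if not candidates:
            break
        name, err = min(candidates.items(), key=lambda kv: kv[1])
        if err >= best_err:
            break  # no remaining model improves the ensemble
        chosen.append(name)
        best_err = err
    return chosen, best_err


# Hypothetical validation predictions: model_a and model_c err in the same
# direction, while model_b errs in the opposite direction.
actual = [1.0, 2.0, 3.0, 4.0]
model_preds = {
    "model_a": [1.5, 2.5, 3.5, 4.5],
    "model_b": [0.5, 1.5, 2.5, 3.5],
    "model_c": [1.4, 2.4, 3.4, 4.4],
}
chosen, err = greedy_ensemble(model_preds, actual)
print(chosen, err)
```

In this toy case the second pick is the model whose errors offset the first, which is the intuition behind preferring models with uncorrelated residuals. A smaller ensemble also means fewer models to score at prediction time.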
  • 16. Possible Futures
      • try to predict accuracy from dataset metadata
      • training time prediction
      • scoring time prediction
      • multiple concurrent H2O clusters for speed
      • freeze/thaw model training
      • outlier analysis with user feedback
      • residuals analysis with user feedback
      • composite models using pre-clustering step