Confidential
Head of Applied AI, Beans.AI
Building LLM Solutions using Open Source and Closed Source Solutions
in Coherent Manner
Ecosystem of Open Source LLM Tools
The Significance of Blending Open and Closed Source LLMs
Open Source LLMs: Features and Benefits
Closed Source LLMs: Advantages and Use-Cases
Integration Strategies: Methodologies
Case Study Highlights and Learning from the Industry
Dataset Enrichment
Low to No Code Fine-Tuning Techniques
About Me
• Head of Applied AI/Computer Vision, Beans AI
• Beans AI is based in Palo Alto, CA
• We are a location intelligence platform.
• Hyper-accurate maps, much more accurate than Google, Apple and Bing for apartments.
• Computer vision and image-based synthesis are an inherent part of innovation at Beans Maps
• Day to day, I work with satellite imagery, location data and convexity optimization
• Holds a Master's from Georgia Tech.

Ecosystem of LLMs
[Figure: landscape of LLMs grouped into Proprietary/Closed Source and Open Source; labels recoverable from the slide include Wu Dao 2.0 and Vicuna 33B.]
Open Source LLMs: Benefits
● Enhanced data security and privacy: Self-hosted deployment.
● Cost savings: No licensing or subscription fees and no per-call API charges.
● No external dependency: No reliance on a select few vendors, which avoids lock-in.
● Code transparency and constructive collaboration/validation: Underlying code and methodologies are vetted for functionality by the community.
● Language Model Customization: Domain Adaptation is more manageable with open-source LLMs by Fine-
● Active Community Support: Often thriving communities, quicker issue resolution, access to resources and
● Fosters innovation: Open-source LLMs encourage innovation by enabling organizations to experiment and
build upon existing models.
● Boon for Startups: Leverage models as a foundation for creative and unique applications.
Closed Source LLMs: Advantages
● Support and Reliability: Vendor Support, Professional assistance, Maintenance, Troubleshooting, SLA Requirements
● Customization for Specific Business Needs: Accommodate Unique requirements of a business
● Security and Data Privacy: May offer more robust security features and privacy assurances critical for industries with
sensitive data.
● Performance: Regular optimizations and enhancements for better performance for specific tasks or industry
● Integration with Proprietary Systems: Dedicated tooling support to use existing proprietary software stacks within an
organization to avoid extensive re-engineering.
● Compliance and Liability: Greater assurance of compliance for regulated industries.
● Guardrails Ownership: Responsibility for compliance often falls on the vendor, reducing the legal and financial risks for
the user.
● Continuous Development and Updates: Dedicated teams to keep up-to-date cutting edge
● Commercial Viability: Better viability for businesses with limited resource and investments, Enable quicker feature
Best of Both Worlds
● A number of options are available.
● More than one solution can be exploited.
● Compare performance in-house on “your” task, not just on a benchmark.
● Different models give different baselines for a particular domain adaptation.
● The amount of fine-tuning needed is not the same, even for similar tasks.
● A specific LLM solution can be used for a specific task in the pipeline.
● These LLM solutions can be combined into hyper-ensembles.
● LLMs can be picked and swapped without affecting the other LLMs in play.
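
The point about comparing candidates in-house on “your” task can be sketched roughly as below. This is only an illustration: `closed_model` and `open_model` are hypothetical stand-ins for real model clients, their canned answers are made up, and the keyword-based score is an assumed placeholder for whatever metric actually fits the task.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real model clients; the answers are invented.
def closed_model(prompt: str) -> str:
    return "Open the stop, tap Status, choose Not Deliverable."

def open_model(prompt: str) -> str:
    return "Tap Status and choose Not Deliverable."

def keyword_score(answer: str, required: List[str]) -> float:
    """Fraction of required keywords found in the answer (case-insensitive)."""
    text = answer.lower()
    hits = sum(1 for kw in required if kw.lower() in text)
    return hits / len(required) if required else 0.0

def compare(models: Dict[str, Callable[[str], str]],
            examples: List[Tuple[str, List[str]]]) -> Dict[str, float]:
    """Average keyword score per model over your own task examples."""
    results = {}
    for name, model in models.items():
        scores = [keyword_score(model(question), kws) for question, kws in examples]
        results[name] = sum(scores) / len(scores)
    return results

# Your own evaluation set: (question, keywords a good answer should mention).
examples = [
    ("How do I mark an address not deliverable in the app?",
     ["open the stop", "status", "not deliverable"]),
]

scores = compare({"closed": closed_model, "open": open_model}, examples)
best = max(scores, key=scores.get)
print(scores, "->", best)
```

Because each model is just a callable, a candidate can be added or dropped without touching the others, which mirrors the pick-and-choose point in the list above.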

Integration Strategies
● Granularize the task at hand:
Break the LLM “initiative” into LLM “tasks”.
● Categorize the tasks by stochasticity tolerance and criticality:
Different LLM solutions show varying degrees of temperature sensitivity.
● Less tolerant tasks are candidates for proprietary off-the-shelf solutions.
● More tolerant tasks are candidates for open-source models with little or no fine-tuning.
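
One way to picture the categorization step is a small routing function. The sketch below is an assumption-laden illustration, not the presenter's implementation: the task names, the 0-to-1 tolerance scale, the 0.5 threshold, and the choice to send every critical task to the proprietary side are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    stochasticity_tolerance: float  # assumed scale: 0.0 = must be stable, 1.0 = variation is fine
    critical: bool

def route(task: Task, tolerance_threshold: float = 0.5) -> str:
    """Send less tolerant or critical tasks to a proprietary model, the rest to open source."""
    if task.critical or task.stochasticity_tolerance < tolerance_threshold:
        return "proprietary"
    return "open_source"

# One LLM "initiative" broken into LLM "tasks" (names are made up).
initiative = [
    Task("extract_order_fields", 0.1, True),
    Task("summarize_dashboard", 0.7, False),
    Task("draft_support_reply", 0.6, False),
]

plan = {task.name: route(task) for task in initiative}
print(plan)
```

In practice the threshold would be set from in-house measurements of how much each candidate model's output varies on the task.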
Case Study Highlights
● At Beans.AI, we use a combination of approaches:
Some tasks are handled with a prompt engineering/RAG-based approach.
Some tasks are handled with limited to moderate fine-tuning.
● Both closed-source and open-source LLMs are used.
● Responses from closed-source LLMs are consumed by open-source LLMs, and vice versa, within the pipeline.
● Used for automated support, insights from dashboards, automated email orders, etc.
Dataset Enrichment
● No, NOT THAT data enrichment!
● Most of the time:
For “your” purpose, you need “your” data.
● “Your” data is limited by:
Quantity, Quality and Variety
● LLMs are used to overcome:
Quantity: By creating more samples of data
Quality: By working in a human-in-the-loop setup
Variety: By revising and rewriting intents in many different possible ways.
Dataset Enrichment (contd.)
● Task: Question-answering bot for your particular app, say a delivery support app.
● Interaction: A delivery driver asks a question in the app and expects a “how-to” type response.
Question: How do I mark an address not deliverable in the app?
Candidate Answer: Explains the steps to do the same.
● Current Training Data: Set of questions and answers in knowledge articles.
Enrichment Step:
Use a prompt-engineered app to create variations of your domain-specific questions, e.g.:
“Ask the above question in 20 different ways”
Each of these 20 new ways of asking the “same” question becomes a new training example for you.

Dataset Enrichment (contd.)
All of these questions below ask the EXACT same thing!
● What's the process for labeling an address as undeliverable within the application?
● Can you guide me through the steps to indicate that an address is non-deliverable in the app?
● How can I flag an address as undeliverable when using the app?
● What is the method for setting an address to 'not deliverable' status in the application?
● Is there a way to mark an address as 'cannot be delivered to' in the app interface?
● Could you explain how to designate an address as not deliverable on the app?
● I'm looking to mark an address as non-deliverable in the app; how do I do that?
● How does one go about indicating that an address is not serviceable in the app?
● In the app, what are the steps to mark an address as one that can't be delivered to?
● What’s the procedure to flag an address as 'not deliverable' in the app's system?
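
The enrichment step can be sketched as a small function that turns one knowledge-article pair into many training rows. This is a minimal sketch under stated assumptions: `fake_llm` is a hypothetical stub standing in for a real model call, the "one per line" instruction and the de-duplication are additions for illustration, and the answer text is a placeholder.

```python
from typing import Callable, Dict, List

def build_paraphrase_prompt(question: str, n: int = 20) -> str:
    """Compose the enrichment prompt for one knowledge-article question."""
    return f"{question}\n\nAsk the above question in {n} different ways. Return one per line."

def enrich(question: str, answer: str,
           llm: Callable[[str], str], n: int = 20) -> List[Dict[str, str]]:
    """Pair the original question and each unique paraphrase with the same answer."""
    raw = llm(build_paraphrase_prompt(question, n))
    seen, rows = set(), []
    for candidate in [question] + raw.splitlines():
        text = candidate.strip().lstrip("●•-* ").strip()  # drop bullet markers
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            rows.append({"question": text, "answer": answer})
    return rows

def fake_llm(prompt: str) -> str:
    # Hypothetical stub: returns two paraphrases, one of them duplicated.
    return (
        "How can I flag an address as undeliverable when using the app?\n"
        "● Could you explain how to designate an address as not deliverable on the app?\n"
        "How can I flag an address as undeliverable when using the app?\n"
    )

rows = enrich(
    "How do I mark an address not deliverable in the app?",
    "<steps from the knowledge article>",  # placeholder answer text
    fake_llm,
)
for row in rows:
    print(row["question"])
```

Generated rows would normally pass through the human-in-the-loop review mentioned earlier before being used for fine-tuning.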
Case Optimization
● At Beans.AI, we use LLMs to analyze, on the fly, which pipeline should be used.
● E.g.
Superset of tasks for the jobs:
Task 1, Task 2, Task 3, Task 4, Task 5, Task 6
Set of tasks actually needed for a “job” instance:
Task 2, Task 4 and Task 6 only.
● A proprietary LLM with stronger reasoning and guardrails is used to find the tasks that need to be run.
● The short-listed tasks can then be run on the actual sensitive data using a locally deployed open-source LLM.
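
The two-stage flow above might look roughly like the following. Everything named here is hypothetical: `stub_planner` stands in for the proprietary model, `local_runners` for locally deployed open-source models, and the design choice that the planner sees only a job description and never the sensitive payload is an assumption drawn from the last bullet.

```python
from typing import Callable, Dict, List

ALL_TASKS = [f"Task {i}" for i in range(1, 7)]  # superset: Task 1 .. Task 6

def select_tasks(job_description: str, planner: Callable[[str], str]) -> List[str]:
    """Ask the planner which tasks a job needs; keep only known task names."""
    prompt = (
        "Available tasks: " + ", ".join(ALL_TASKS) + "\n"
        "Job: " + job_description + "\n"
        "Reply with a comma-separated list of the tasks this job needs."
    )
    chosen = {part.strip() for part in planner(prompt).split(",")}
    return [task for task in ALL_TASKS if task in chosen]

def run_job(job_description: str, sensitive_payload: str,
            planner: Callable[[str], str],
            runners: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Plan with the remote model, then run only the selected tasks locally."""
    results = {}
    for task in select_tasks(job_description, planner):
        results[task] = runners[task](sensitive_payload)  # payload stays local
    return results

planner_prompts: List[str] = []  # records what the remote planner was shown

def stub_planner(prompt: str) -> str:
    # Hypothetical proprietary model; "Task 9" is unknown and gets filtered out.
    planner_prompts.append(prompt)
    return "Task 2, Task 4, Task 6, Task 9"

local_runners = {
    task: (lambda payload, task=task: f"{task} done on {len(payload)} chars")
    for task in ALL_TASKS
}

results = run_job("route exception report", "SECRET customer address",
                  stub_planner, local_runners)
print(results)
```

Filtering the planner's reply against the known superset guards against the planner naming a task that does not exist.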
Low to No Code Fine-Tuning Techniques
● H2O LLM Studio: The equivalent of Stable Diffusion’s Automatic1111 or ComfyUI.
● Fine-tune open-source LLMs without any coding, while remaining extensible with code.
● A GUI built specifically for LLMs.
● Support for hyperparameters specific to fine-tuning LLMs.
● Supports Low-Rank Adaptation (LoRA) and lower-precision quantization to achieve a lean memory footprint.
● Model performance tracking in the UI.
● Try out the fine-tuned model right away to get instant feedback.
● Most important enabler: Almost touch-less export to Hugging Face Hub.
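
To see why LoRA and lower-precision weights give a lean memory footprint, a little arithmetic helps. The figures below are illustrative assumptions (a 4096 x 4096 weight matrix, rank 8, a 7-billion-parameter model), not settings taken from LLM Studio, and the memory estimate covers weights only, ignoring activations, optimizer state and quantization overhead.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when a d_out x d_in weight matrix is fully fine-tuned."""
    return d_out * d_in

def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A, with B: d_out x r and A: r x d_in."""
    return rank * (d_out + d_in)

def weight_gb(n_params: int, bits: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes) at a given bit width."""
    return n_params * bits / 8 / 1e9

full = full_params(4096, 4096)      # 16,777,216
lora = lora_params(4096, 4096, 8)   # 65,536
print(f"LoRA trains {lora / full:.2%} of this matrix's parameters")

n = 7_000_000_000  # assumed model size
print(f"16-bit weights: {weight_gb(n, 16)} GB, 4-bit weights: {weight_gb(n, 4)} GB")
```

With these assumed numbers the LoRA adapter is 1/256 the size of the matrix it adapts, and 4-bit weights take a quarter of the space of 16-bit weights.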
Low to No Code Fine-Tuning Techniques (contd.)
My first fine-tuning run using LLM Studio took almost the same time as this presentation!

RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
Matthew Sinclair
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Bert Blevins
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
Yevgen Sysoyev
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
20240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 202420240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 2024
Matthew Sinclair
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
Lidia A.
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
Stephanie Beckett
Cookies program to display the information though cookie creation
Cookies program to display the information though cookie creationCookies program to display the information though cookie creation
Cookies program to display the information though cookie creation
Recent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS InfrastructureRecent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS Infrastructure
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
Adam Dunkels

Recently uploaded (20)

Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
Choose our Linux Web Hosting for a seamless and successful online presence
Choose our Linux Web Hosting for a seamless and successful online presenceChoose our Linux Web Hosting for a seamless and successful online presence
Choose our Linux Web Hosting for a seamless and successful online presence
Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
Best Programming Language for Civil Engineers
Best Programming Language for Civil EngineersBest Programming Language for Civil Engineers
Best Programming Language for Civil Engineers
Comparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdfComparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdf
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
20240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 202420240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 2024
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdfWhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
WhatsApp Image 2024-03-27 at 08.19.52_bfd93109.pdf
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
Cookies program to display the information though cookie creation
Cookies program to display the information though cookie creationCookies program to display the information though cookie creation
Cookies program to display the information though cookie creation
Recent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS InfrastructureRecent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS Infrastructure
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx

Building LLM Solutions using Open Source and Closed Source Solutions in Coherent Manner

  • 1. Confidential SANDEEP SINGH Head of Applied AI, Beans.AI
  • 2. Building LLM Solutions using Open Source and Closed Source Solutions in Coherent Manner
  • 3. Agenda
      ● Introduction
      ● Ecosystem of Open Source LLM Tools
      ● The Significance of Blending Open and Closed Source LLMs
      ● Open Source LLMs: Features and Benefits
      ● Closed Source LLMs: Advantages and Use-Cases
      ● Integration Strategies: Methodologies
      ● Case Study Highlights and Learning from the Industry
      ● Dataset Enrichment
      ● Low to No Code Fine-Tuning Techniques
      ● Conclusion
  • 4. About Me
      ● Head of Applied AI/Computer Vision, Beans AI
      ● Beans AI is based in Palo Alto, CA
      ● We are a Location Intelligence Platform.
      ● Hyper-accurate maps, much more accurate than Google, Apple and Bing for apartments.
      ● Computer vision and image-based synthesis are an inherent part of innovation at Beans Maps
      ● I deal with satellite imagery, location data and convexity optimization domains in my day-to-day job.
      ● Holds a Master's from Georgia Tech.
  • 5. Ecosystem of LLMs
      ● Proprietary/Closed Source: GPT-4, PaLM, SageMaker Neo, IBM Watson, Salesforce Einstein, Wu Dao 2.0, Clarifai, Cohere, Anthropic Claude, MT-NLG
      ● Open Source: LLaMA 2, Falcon-40B/180B, Vicuna 33B, MPT-30B, GPT-NeoX-20B, CodeGen, GPT-J, OPT-175B, BLOOM, Baichuan-13B
  • 6. Open Source LLMs: Benefits
      ● Enhanced data security and privacy: Self-hosted deployment.
      ● Cost savings: No licensing/subscription fees and no spending on API calls.
      ● No external dependency: No reliance on a select few vendors, which avoids lock-in.
      ● Code transparency and constructive collaboration/validation: Underlying code and methodologies are vetted for functionality by the community.
      ● Language model customization: Domain adaptation is more manageable with open-source LLMs through fine-tuning.
      ● Active community support: Often thriving communities, quicker issue resolution, access to resources and collaboration.
      ● Fosters innovation: Open-source LLMs encourage innovation by enabling organizations to experiment and build upon existing models.
      ● Boon for startups: Leverage models as a foundation for creative and unique applications.
  • 7. Closed Source LLMs: Advantages
      ● Support and reliability: Vendor support, professional assistance, maintenance, troubleshooting, SLA requirements.
      ● Customization for specific business needs: Accommodates the unique requirements of a business.
      ● Security and data privacy: May offer more robust security features and privacy assurances, critical for industries with sensitive data.
      ● Performance: Regular optimizations and enhancements for better performance on specific tasks or industries.
      ● Integration with proprietary systems: Dedicated tooling support for the existing proprietary software stacks within an organization, avoiding extensive re-engineering.
      ● Compliance and liability: Greater assurance of compliance for regulated industries.
      ● Guardrails ownership: Responsibility for compliance often falls on the vendor, reducing the legal and financial risks for the user.
      ● Continuous development and updates: Dedicated teams keep the models up to date and at the cutting edge.
      ● Commercial viability: Better viability for businesses with limited resources and investment; enables quicker feature development.
  • 8. Best of Both Worlds
      ● A number of options are available.
      ● Possibility of exploiting more than one solution.
      ● In-house performance comparison for “your” task, not just a benchmark.
      ● Different baselines for a particular domain adaptation.
      ● The amount of fine-tuning needed is not the same even for similar tasks.
      ● Possibility of using a specific LLM solution for a specific task in the pipeline.
      ● Combinations are available to hyper-ensemble these LLM solutions.
      ● Ability to pick and choose LLMs without affecting the other LLMs in play.
  • 9. Integration Strategies
      ● Granularize the task at hand: Break the LLM “initiative” into LLM “tasks”.
      ● Categorize the tasks by stochasticity tolerance and criticality: Different LLM solutions show varying degrees of temperature sensitivity.
      ● Less tolerant tasks are candidates for proprietary off-the-shelf solutions.
      ● More tolerant tasks are candidates for Open Source LLMs with little or no fine-tuning.
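
The categorization on this slide can be sketched as a small routing function. This is only an illustration, not the pipeline used at Beans.AI: the `LLMTask` fields, the 0.0 to 1.0 tolerance scale, the 0.5 threshold, the rule that critical tasks also go to the proprietary side, and the example task names are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class LLMTask:
    """One LLM "task" split out of a larger LLM "initiative" (hypothetical shape)."""
    name: str
    stochasticity_tolerance: float  # assumed scale: 0.0 = must be stable, 1.0 = variation is fine
    critical: bool                  # assumed flag for business-critical tasks


def choose_backend(task: LLMTask, tolerance_threshold: float = 0.5) -> str:
    """Route less tolerant (or critical) tasks to a proprietary model, the rest to open source."""
    if task.critical or task.stochasticity_tolerance < tolerance_threshold:
        return "proprietary"
    return "open_source"


# Hypothetical tasks, loosely inspired by the use cases named in the talk.
tasks = [
    LLMTask("automated_email_order", stochasticity_tolerance=0.2, critical=True),
    LLMTask("dashboard_insights", stochasticity_tolerance=0.7, critical=False),
    LLMTask("question_paraphrasing", stochasticity_tolerance=0.9, critical=False),
]

routing = {task.name: choose_backend(task) for task in tasks}

for name, backend in routing.items():
    print(f"{name} -> {backend}")
```

Keeping the decision in one function means a task can be moved between backends by changing its tolerance or the threshold, without touching the other tasks in the pipeline.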
  • 10. Case Study Highlights
      ● At Beans.AI, we use a combination of approaches:
          A few tasks are achieved using a Prompt Engineering/RAG based approach.
          A few tasks are achieved using limited to moderate fine-tuning.
      ● Both Closed Source and Open Source LLMs are used.
      ● Responses from Closed Source LLMs are used by Open Source LLMs, and vice versa, in the pipeline.
      ● Used for automated support, insights from dashboards, automated email order, etc.
  • 11. Dataset Enrichment
      ● No, NOT THAT data enrichment!
      ● Most of the time: for “your” purpose, you need “your” data.
      ● “Your” data is limited by: Quantity, Quality and Variety
      ● LLMs are used to overcome:
          Quantity: by creating more samples of data
          Quality: by working in a human-in-the-loop setup
          Variety: by revising and rewriting intents in many different possible ways.
  • 12. Dataset Enrichment (contd.)
      Example:
      ● Task: Question Answering Bot for your particular app, say a Delivery Support App.
      ● Interaction: A delivery driver asks a question in the app and expects a “how-to” type response.
          Question: How do I mark an address not deliverable in the app?
          Candidate Answer: Explains the steps to do the same.
      ● Current Training Data: Set of Questions and Answers in knowledge articles.
          Enrichment Step: Use a prompt-engineered app to create variations of your domain-specific questions, e.g. “Ask the above question in 20 different ways”.
          Each of these 20 new ways of asking the “same” question becomes a new training example for you.
  • 13. Dataset Enrichment (contd.)
      All of these questions below ask the EXACT same thing!
      ● What's the process for labeling an address as undeliverable within the application?
      ● Can you guide me through the steps to indicate that an address is non-deliverable in the app?
      ● How can I flag an address as undeliverable when using the app?
      ● What is the method for setting an address to 'not deliverable' status in the application?
      ● Is there a way to mark an address as 'cannot be delivered to' in the app interface?
      ● Could you explain how to designate an address as not deliverable on the app?
      ● I'm looking to mark an address as non-deliverable in the app; how do I do that?
      ● How does one go about indicating that an address is not serviceable in the app?
      ● In the app, what are the steps to mark an address as one that can't be delivered to?
      ● What’s the procedure to flag an address as 'not deliverable' in the app's system?
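
A minimal sketch of the enrichment step described in these slides, assuming the LLM is reachable through a plain `generate(prompt) -> str` callable. That callable, the helper names `build_paraphrase_prompt` and `enrich`, the one-variation-per-line reply format, and the placeholder answer text are all hypothetical; `fake_generate` is a stand-in so the example runs without any model or API.

```python
from typing import Callable, List, Tuple

QAPair = Tuple[str, str]


def build_paraphrase_prompt(question: str, n: int = 20) -> str:
    """Build the enrichment prompt from the slide around one domain question."""
    return (
        f"Question: {question}\n"
        f"Ask the above question in {n} different ways. "
        "Return one variation per line."
    )


def enrich(qa_pairs: List[QAPair], generate: Callable[[str], str], n: int = 20) -> List[QAPair]:
    """Pair every paraphrase of a question with that question's original answer."""
    enriched: List[QAPair] = []
    for question, answer in qa_pairs:
        enriched.append((question, answer))
        seen = {question.strip().lower()}
        reply = generate(build_paraphrase_prompt(question, n))
        for line in reply.splitlines():
            # Drop list markers and surrounding whitespace the model may add.
            variant = line.strip().lstrip("-•● ").strip()
            if variant and variant.lower() not in seen:
                seen.add(variant.lower())
                enriched.append((variant, answer))
    return enriched


def fake_generate(prompt: str) -> str:
    """Stand-in for a real LLM call; returns two variations, one of them duplicated."""
    return (
        "- How can I flag an address as undeliverable when using the app?\n"
        "- How can I flag an address as undeliverable when using the app?\n"
        "\n"
        "● Could you explain how to designate an address as not deliverable on the app?"
    )


seed = [("How do I mark an address not deliverable in the app?",
         "PLACEHOLDER: steps from the knowledge article")]
training_examples = enrich(seed, fake_generate, n=20)

for q, a in training_examples:
    print(q, "=>", a)
```

Deduplicating the variations and keeping a human review pass over the generated questions fits the quality point made earlier: the LLM supplies quantity and variety, while people in the loop guard quality.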
  • 14. Case Optimization
      ● We at Beans.AI use LLMs to analyze the pipeline to be used on the fly.
      ● E.g.
          Superset of tasks for the jobs: Task 1, Task 2, Task 3, Task 4, Task 5, Task 6
          Set of tasks actually needed for a “job” instance: Task 2, Task 4 and Task 6 only.
      ● Proprietary LLMs with higher reasoning and guardrails are used to find the tasks that need to be run.
      ● The short-listed tasks are then run on the actual sensitive data using a locally deployed Open Source LLM.
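
The two-stage idea on this slide can be sketched as follows. This is an assumption-laden illustration rather than the Beans.AI implementation: `planner` stands in for a call to a proprietary LLM, each entry in `tasks` stands in for a locally deployed open source model, and the choice to show the planner only a non-sensitive job description is an assumption of the sketch. All function names are hypothetical.

```python
from typing import Callable, Dict, List

TaskFn = Callable[[str], str]


def plan_tasks(job_description: str, available: List[str],
               planner: Callable[[str], str]) -> List[str]:
    """Ask the planner which of the available tasks this job instance needs."""
    prompt = (
        "Available tasks: " + ", ".join(available) + "\n"
        "Job: " + job_description + "\n"
        "Reply with the comma-separated names of the tasks this job needs."
    )
    requested = {name.strip() for name in planner(prompt).split(",")}
    # Keep only known tasks, in their original pipeline order.
    return [name for name in available if name in requested]


def run_job(job_description: str, sensitive_payload: str,
            tasks: Dict[str, TaskFn], planner: Callable[[str], str]) -> Dict[str, str]:
    """Plan with the (remote) planner, then run the selected tasks locally on the payload."""
    selected = plan_tasks(job_description, list(tasks), planner)
    # The sensitive payload is passed only to the local task functions.
    return {name: tasks[name](sensitive_payload) for name in selected}


# Stand-ins so the sketch runs without any model: six local tasks and a canned planner reply.
local_tasks: Dict[str, TaskFn] = {
    f"Task {i}": (lambda payload, i=i: f"task {i} handled {len(payload)} characters locally")
    for i in range(1, 7)
}


def fake_planner(prompt: str) -> str:
    # "Task 9" is not a known task and is ignored by plan_tasks.
    return "Task 2, Task 4, Task 6, Task 9"


results = run_job("hypothetical job description", "hypothetical sensitive record",
                  local_tasks, fake_planner)
print(list(results))
```

Filtering the planner's reply against the known task names is a cheap guard against the planner naming a task that does not exist.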
  • 15. Low to No Code Fine-Tuning Techniques
      ● H2O LLM Studio: The equivalent of Stable Diffusion’s Automatic1111 or ComfyUI.
      ● Fine-tune Open Source LLMs without any coding, while remaining extensible with code.
      ● A GUI built specifically for LLMs.
      ● Support for hyperparameters specific to fine-tuning LLMs.
      ● Supports Low-Rank Adaptation (LoRA) and lower-precision quantization to achieve a lean memory footprint.
      ● Model performance tracking in the UI.
      ● Test the fine-tuned model right away to get instant feedback.
      ● Most important enabler: Almost touch-less export to the Hugging Face Hub.
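
Why LoRA and quantization lead to a lean memory footprint can be shown with a little arithmetic. LoRA freezes a weight matrix of shape `d_out × d_in` and trains two small matrices of shapes `d_out × r` and `r × d_in` instead, so the trainable parameter count drops from `d_out * d_in` to `r * (d_out + d_in)`. The layer size 4096 and rank 8 below are illustrative assumptions, not H2O LLM Studio defaults.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for the two low-rank factors (d_out x r and r x d_in)."""
    return rank * (d_out + d_in)


def weight_memory_bytes(n_params: int, bits_per_param: int) -> int:
    """Memory needed to store n_params weights at the given precision."""
    return n_params * bits_per_param // 8


d_out, d_in, rank = 4096, 4096, 8  # illustrative values only

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"full fine-tuning: {full:,} trainable parameters")   # 16,777,216
print(f"LoRA (r={rank}):  {lora:,} trainable parameters")   # 65,536
print(f"reduction factor: {full // lora}x")                 # 256x

# Storing the frozen base weights at lower precision shrinks them further.
print(f"16-bit weights: {weight_memory_bytes(full, 16):,} bytes")  # 33,554,432
print(f"4-bit weights:  {weight_memory_bytes(full, 4):,} bytes")   # 8,388,608
```

The numbers cover a single matrix; a full model repeats this across many layers, and actual memory use also depends on optimizer state and activations, which this sketch leaves out.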
  • 16. Low to No Code Fine-Tuning Techniques (contd.)
      My first fine-tuning run using LLM Studio took almost the same time as this presentation!