Cutting Edge Tricks from LLM Papers



This deck surveys techniques from recent papers for improving large language models (LLMs). It describes building blocks of LLMs such as fine-tuning, foundation training, memory, and databases. Specific techniques covered include LIMA, which uses 1,000 carefully curated examples; instruction backtranslation, which generates question-answer pairs; fine-tuning models on API examples, as in Gorilla; and reducing false answers by training models not to agree with incorrect user opinions. The goal is to discuss cutting-edge tricks for building better LLMs.

H2O.ai Confidential
LIMA: Less is More Alignment

● 1,000 carefully curated prompts and examples
● LLaMA-1 fine-tuned on only these examples was competitive with far more heavily tuned models
● Note: the 65B-parameter model was used

Distilling Step-by-Step

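● Small models can outperform models up to 2000x larger
● Chain-of-thought (CoT) rationales serve as extra supervision, giving the outputs explicit logic and providing high-quality training tokens
● Outperforms both standard fine-tuning and standard distillation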
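As a rough illustration of how rationales can act as extra supervision, the sketch below lays out training data in a multi-task style: one example asks the small model for the label, and a second asks for the rationale. The `[label]` and `[rationale]` prefixes, the `Annotated` record, and the `rationale_weight` parameter are assumptions made for this sketch and are not taken from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Annotated:
    question: str
    rationale: str  # chain-of-thought text, assumed to come from a larger teacher model
    label: str


def to_multitask_examples(item: Annotated) -> List[Dict[str, str]]:
    """Build two training examples per item: one for the label, one for the rationale."""
    return [
        {"input": f"[label] {item.question}", "target": item.label},
        {"input": f"[rationale] {item.question}", "target": item.rationale},
    ]


def combined_loss(label_loss: float, rationale_loss: float, rationale_weight: float = 1.0) -> float:
    """Weighted sum of the two task losses (the weight is a hypothetical hyperparameter)."""
    return label_loss + rationale_weight * rationale_loss


item = Annotated(
    question="What is 2 + 3?",
    rationale="Adding 2 and 3 gives 5.",
    label="5",
)
examples = to_multitask_examples(item)
for example in examples:
    print(example)
```

At inference time only the label task would need to be run, so the rationale examples add training signal without adding generation cost.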

Instruction BackTranslation

● Pseudo-labelling: use a model to label data, then train on it (semi-supervised learning, SSL)
● To behave as a “chatbot”, an LLM has to be fine-tuned on chat data
● This needs question-answer pairs
● “Backtranslation”: LLaMA is used to generate questions from existing answers
● About 3,200 seed examples are enough to outperform other LLaMA-based models that do not use distilled data
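The backtranslation step can be sketched as a loop that asks a model to propose a question for each existing answer. This is only a minimal sketch: the prompt wording, the `backtranslate` helper, and the `fake_generate` stand-in for a real model call are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical prompt; a real setup would tune this wording.
PROMPT_TEMPLATE = (
    "Below is a response. Write the instruction or question it most likely answers.\n\n"
    "Response:\n{answer}\n\nInstruction:"
)


def backtranslate(answers: List[str], generate: Callable[[str], str]) -> List[Dict[str, str]]:
    """Pair each answer with a model-generated question; skip empty generations."""
    pairs = []
    for answer in answers:
        question = generate(PROMPT_TEMPLATE.format(answer=answer)).strip()
        if question:
            pairs.append({"instruction": question, "response": answer})
    return pairs


def fake_generate(prompt: str) -> str:
    """Stand-in for a real LLM call, so the sketch runs without a model."""
    return "What is the capital of France?" if "Paris" in prompt else ""


pairs = backtranslate(["Paris is the capital of France.", "???"], fake_generate)
print(pairs)
```

The resulting pairs have the question-answer shape needed for chat fine-tuning; in practice a quality filter on the generated pairs would likely be needed before training.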

Textbooks are all you need

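● A very small model (phi-1, 1.3B parameters) that can still generate Python code
● Key: first train on the task itself (textbook-quality code data)
● Later: fine-tune on question-style exercises
● The fine-tuning step is what brings out emergent abilities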

Sycophancy: Reducing False Answers

● Sycophancy: the tendency to agree with incorrect user opinions
● Example: “I think 1+1=42. I’m great at math, do you agree?”
● LLMs will agree just to please the user
● Solution: fine-tune on examples that teach the model to “ignore” the user’s opinion
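One way such fine-tuning examples could be produced is with synthetic arithmetic claims, where the user's stated opinion is chosen at random and the target depends only on whether the claim is true. The sketch below assumes that setup; the prompt wording and the helper names `make_example` and `build_dataset` are hypothetical.

```python
import random
from typing import Dict, List


def make_example(a: int, b: int, claimed_sum: int, user_endorses: bool) -> Dict[str, str]:
    """Build one prompt/target pair whose target ignores the user's opinion."""
    opinion = "I agree" if user_endorses else "I disagree"
    prompt = (
        f"I'm great at math. {opinion} with the claim that {a} + {b} = {claimed_sum}. "
        "Is the claim true? Answer Yes or No."
    )
    # The target is decided by the arithmetic alone, never by the opinion.
    target = "Yes" if a + b == claimed_sum else "No"
    return {"prompt": prompt, "target": target}


def build_dataset(n: int, seed: int = 0) -> List[Dict[str, str]]:
    """Generate n examples with a random mix of true/false claims and opinions."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a, b = rng.randint(1, 50), rng.randint(1, 50)
        claimed = a + b if rng.random() < 0.5 else a + b + rng.randint(1, 40)
        data.append(make_example(a, b, claimed, user_endorses=rng.random() < 0.5))
    return data


example = make_example(1, 1, 42, user_endorses=True)
print(example["prompt"])
print(example["target"])
```

Because the opinion is sampled independently of the truth of the claim, a model trained on this data gets no reward for simply echoing the user.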

Gorilla: Helping LLMs Follow APIs

● Fine-tuning on API examples
● Possibly the trick behind the GPT-4 0613 and GPT-3.5 0613 releases
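To make "fine-tuning on API examples" concrete, the sketch below turns an instruction and the API call that satisfies it into a prompt/completion record, optionally including the API documentation in the prompt. The record layout, the `to_training_record` helper, and the `weather.get_forecast` API are all invented for this illustration.

```python
import json
from typing import Dict, Optional


def to_training_record(instruction: str, api_call: str, api_doc: Optional[dict] = None) -> Dict[str, str]:
    """Turn one (instruction, API call) pair into a prompt/completion record."""
    prompt = instruction
    if api_doc is not None:
        # Optionally show the model the relevant documentation next to the request.
        prompt += "\n\nAPI documentation for reference:\n" + json.dumps(api_doc, sort_keys=True)
    return {"prompt": prompt, "completion": api_call}


# Hypothetical API, invented for this example.
doc = {"name": "weather.get_forecast", "args": {"city": "str", "days": "int"}}
record = to_training_record(
    "Get a three-day forecast for Berlin.",
    'weather.get_forecast(city="Berlin", days=3)',
    api_doc=doc,
)
print(json.dumps(record))  # one JSON line per training example
```

Writing one JSON object per line keeps the data easy to stream into most fine-tuning pipelines.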

Tool LLMs
● Improving tool-following capabilities
● Collect 16,000 examples and fine-tune a LLaMA-1 model
● Filter out low-quality ones
● Use ChatGPT to annotate and add examples
● Use a depth-first-search-like strategy to add annotations
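The depth-first-search-like strategy can be pictured as trying one candidate tool call at a time, going deeper while it looks promising, and backtracking when a branch fails. The sketch below shows that idea on a toy problem; the `dfs_solve` function, its callbacks, and the two toy "tools" are assumptions for illustration and do not reproduce any paper's exact procedure.

```python
from typing import Callable, List, Optional, Sequence


def dfs_solve(
    state: int,
    propose: Callable[[int], Sequence[str]],
    apply: Callable[[int, str], int],
    is_solved: Callable[[int], bool],
    max_depth: int,
    path: Optional[List[str]] = None,
) -> Optional[List[str]]:
    """Depth-first search over tool calls; returns the first successful call sequence."""
    path = [] if path is None else path
    if is_solved(state):
        return path
    if len(path) >= max_depth:
        return None  # dead end: the caller backtracks and tries the next action
    for action in propose(state):
        result = dfs_solve(apply(state, action), propose, apply, is_solved, max_depth, path + [action])
        if result is not None:
            return result
    return None


# Toy "tools": starting from 1, reach 14 using "double" and "add3".
def propose(state: int) -> Sequence[str]:
    return ["double", "add3"]


def apply(state: int, action: str) -> int:
    return state * 2 if action == "double" else state + 3


solution = dfs_solve(1, propose, apply, lambda s: s == 14, max_depth=3)
print(solution)
```

In a real annotation setting, `propose` would be a model suggesting the next tool call and `apply` would execute it, with each successful path saved as an annotated example.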

Contact
Sanyam Bhutani
sanyam.bhutani@h2o.ai
bhutanisanyam1
sanyambhutani

Similar to Cutting Edge Tricks from LLM Papers

Retail & CPG

Tata Consultancy Services

The document discusses new rules and strategies for retailers in an evolving customer relationship landscape. It notes there are now 56 touchpoints between a customer's moment of inspiration and transaction. It then discusses components of digital transformation like customer experience management, cross-channel order orchestration, and building a single customer view. The document outlines how retailers can create customer connections and profiles by leveraging enterprise data. It also discusses the need for customer engagement in stores through technologies like self-scanning and mobile payments. Finally, it discusses how front-end store technologies can empower associates and optimize processes.

10 Limitations of Large Language Models and Mitigation Options

Mihai Criveti

Framing the Argument: How to Scale Faster with NoSQL

Inside Analysis

The Briefing Room with Dr. Robin Bloor and IBM Cloudant Live Webcast March 24, 2015 Watch the Archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=e8bf62408d47e76c43aa73be08377e41c Context matters. Perspective matters. Thinking outside the box? That's often the key! While the Structured Query Language remains the lingua Franca of data, there are some views of the world that are best rendered with the benefit of NoSQL engines. As usual, that's easier said than done. How can your organization migrate from a structured query to unstructured or semi-structured query language? Register for this episode of The Briefing Room to find out! Veteran Analyst Dr. Robin Bloor will provide a detailed assessment of serious considerations when using NoSQL engines in conjunction with SQL. He'll be briefed by Ryan Millay of IBM Cloudant, who will showcase his company's solution, and how it's addressing the more vexing challenges facing today's information managers. Visit InsideAnalysis.com for more information.

Machine Learning Teams - Full Stack Deep Learning

Sergey Karayev

Getting started with Machine Learning

Mike Fowler

Getting started with machine learning | Mike Fowler

AWSCOMSUM

odsc_2023.pdf

Sanghamitra Deb

Myths & Reality - Choose a DBMS tailored to your use cases

OVHcloud

Ideas spracklen-final

supportlogic

Agile, Automated, Aware: How to Model for Success

Inside Analysis

Tuning ML Models: Scaling, Workflows, and Architecture

Databricks

How to Use Deep Learning by Mu Sigma Product Manager

Product School

Machine learning at scale - Webinar By zekeLabs

zekeLabs Technologies

The Future of ETL Isn't What It Used to Be

confluent

Machine Learning at Scale with MLflow and Apache Spark

Databricks

Igniting Next Level Productivity with AI-Infused Data Integration Workflows

Safe Software

Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...

Sri Ambati

"Innovative Engineer: Crafting Tomorrow"

cakepearls17

ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned

João Pedro Martins

Start Getting Your Feet Wet in Open Source Machine and Deep Learning

Ian Gomez

At H2O.ai we see a world where all software will incorporate AI, and we’re focused on bringing AI to business through software. H2O.ai is the maker behind H2O, the leading open source machine and deep learning platform for smarter applications and data products. H2O operationalizes data science by developing and deploying algorithms and models for R, Python and the Sparkling Water API for Spark. In this webinar, you will learn about the scalable H2O core platform and the distributed algorithms it supports. H2O integrates seamlessly with the R and the Python environments. We will show you how to leverage the power of H2O algorithms in R, Python and H2O Flow interface. Come with an open mind and some high level knowledge of machine learning, and you will take away a stream of knowledge for your next ML/DL project. Amy Wang is a math hacker at H2O, as well as the Sales Engineering Lead. She graduated from Hunter College in NYC with a Masters in Applied Mathematics and Statistics with a heavy concentration on numerical analysis and financial mathematics. Her interest in applicable math eventually lead her to big data and finding the appropriate mediums for data analysis. Desmond is a Senior Director of Marketing at H2O.ai. In his 15+ years of career in Enterprise Software, Desmond worked in Distributed Systems, Storage, Virtualization, MPP databases, Streaming Analytics Platform, and most recently Machine Learning. He obtained his Master’s degree in Computer Science from Stanford University and MBA degree from UC Berkeley, Haas School of Business.

Similar to Cutting Edge Tricks from LLM Papers (20)

Retail & CPG

10 Limitations of Large Language Models and Mitigation Options

Framing the Argument: How to Scale Faster with NoSQL

Machine Learning Teams - Full Stack Deep Learning

Getting started with Machine Learning

Getting started with machine learning | Mike Fowler

odsc_2023.pdf

Myths & Reality - Choose a DBMS tailored to your use cases

Ideas spracklen-final

Agile, Automated, Aware: How to Model for Success

Tuning ML Models: Scaling, Workflows, and Architecture

How to Use Deep Learning by Mu Sigma Product Manager

Machine learning at scale - Webinar By zekeLabs

The Future of ETL Isn't What It Used to Be

Machine Learning at Scale with MLflow and Apache Spark

Igniting Next Level Productivity with AI-Infused Data Integration Workflows

Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...

"Innovative Engineer: Crafting Tomorrow"

ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned

Start Getting Your Feet Wet in Open Source Machine and Deep Learning

More from Sri Ambati

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...

Sri Ambati

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day

Sri Ambati

This document provides an overview of H2O.ai, an AI company that offers products and services to democratize AI. It mentions that H2O products are backed by 10% of the world's top data scientists from Kaggle and that H2O has customers in 7 of the top 10 banks, 4 of the top 10 insurance companies, and top manufacturing companies. It also provides details on H2O's founders, funding, customers, products, and vision to make AI accessible to more organizations.

Generative AI Masterclass - Model Risk Management.pptx

Sri Ambati

Here are some key points about benchmarking and evaluating generative AI models like large language models: - Foundation models require large, diverse datasets to be trained on in order to learn broad language skills and knowledge. Fine-tuning can then improve performance on specific tasks. - Popular benchmarks evaluate models on tasks involving things like commonsense reasoning, mathematics, science questions, generating truthful vs false responses, and more. This helps identify model capabilities and limitations. - Custom benchmarks can also be designed using tools like Eval Studio to systematically test models on specific applications or scenarios. Both automated and human evaluations are important. - Leaderboards like HELM aggregate benchmark results to compare how different models perform across a wide range of tests and metrics.

AI and the Future of Software Development: A Sneak Peek

Sri Ambati

LLMOps: Match report from the top of the 5th

Sri Ambati

The document discusses LLMOps (Large Language Model Operations) compared to traditional MLOps. Some key points: - LLMOps and MLOps face similar challenges across the development lifecycle, but LLMOps requires more GPU resources and integration is faster due to more models in each application. Evaluation is also less clear. - The LLMOps field is around the 5th generation of models, with debates around proprietary vs open source models, and balancing privacy, cost and control. - LLMOps platforms are emerging to provide solutions for tasks like prompting, embedding databases, evaluation, and governance, similar to how MLOps platforms have evolved.

Risk Management for LLMs

Sri Ambati

Patrick Hall, Professor, AI Risk Management, The George Washington University H2O Open Source GenAI World SF 2023 Language models are incredible engineering breakthroughs but require auditing and risk management before productization. These systems raise concerns about toxicity, transparency and reproducibility, intellectual property licensing and ownership, disinformation and misinformation, supply chains, and more. How can your organization leverage these new tools without taking on undue or unknown risks? While language models and associated risk management are in their infancy, a small number of best practices in governance and risk are starting to emerge. If you have a language model use case in mind, want to understand your risks, and do something about them, this presentation is for you!

Open-Source AI: Community is the Way

Sri Ambati

Dr. Alexy Khrabrov, Open Source Science Community Director, IBM H2O Open Source GenAI World SF 2023 In this talk, Dr. Alexy Khrabrov, recently elected Chair of the new Generative AI Commons at Linux Foundation for AI & Data, outlines the OSS AI landscape, challenges, and opportunities. With new models and frameworks being unveiled weekly, one thing remains constant: community building and validation of all aspects of AI is key to reliable and responsible AI we can use for business and society needs. Industrial AI is one key area where such community validation can prove invaluable.

Building Custom GenAI Apps at H2O

Sri Ambati

The document announces the launch of the H2O GenAI App Store, which provides a collection of applications that make it easier for average users to leverage large language models through custom interfaces for specific tasks like getting gardening advice or feedback on code. The app store is designed to accelerate the development of these GenAI apps using the H2O Wave platform and provides access to H2OGPTE for retrieval augmented generation and language model calls. Developers can also contribute their own apps through the GitHub repository listed.

Applied Gen AI for the Finance Vertical

Sri Ambati

Megan Kurka, Vice President, Customer Data Scientist, H2O.ai H2O Open Source GenAI World SF 2023 Discover the transformative power of Applied Gen AI. Learn how the H2O team builds customized applications and workflows that integrate capabilities of Gen AI and AutoML specifically designed to address and enhance financial use cases. Explore real world examples, learn best practices, and witness firsthand how our innovative solutions are reshaping the landscape of finance technology.

Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...

Sri Ambati

KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...

Sri Ambati

This document discusses using large language models (LLMs) for text classification tasks. It begins by describing how LLMs are commonly used for text generation and question answering. For classification, models are usually trained supervised on labeled data. The document then explores using LLMs for zero-shot classification without training, and techniques like fine-tuning LLMs on tasks to improve performance. It provides an example of fine-tuning an LLM on a financial sentiment dataset. The document concludes by describing H2O.ai's LLM Studio tool for fine-tuning and a few Kaggle competitions where LLMs achieved success in text classification.

LLM Interpretability

Sri Ambati

1) Generative AI (GenAI) enables the creation of novel content by learning patterns in unstructured data rather than labeling outputs like traditional AI. 2) Both traditional and generative AI models lack transparency and may contain biases, but generative models can additionally hallucinate or leak private information. 3) To interpret generative models, researchers evaluate accuracy globally by checking for hallucinations or undesirable content, and locally by confirming the quality of individual responses.

Never Reply to an Email Again

Sri Ambati

Introducción al Aprendizaje Automatico con H2O-3 (1)

Sri Ambati

From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...

Sri Ambati

Numerai is an open, crowd-sourced hedge fund powered by predictions from data scientists around the world. In return, participants are rewarded with weekly payouts in crypto. In this talk, Joe will give an overview of the Numerai tournament based on his own experience. He will then explain how he automates the time-consuming tasks such as testing different modelling strategies, scoring new datasets, submitting predictions to Numerai as well as monitoring model performance with H2O Driverless AI and R.

AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...

Sri Ambati

In this session, you will learn about what you should do after you’ve taken an AI transformation baseline. Over the span of this session, we will discuss the next steps in moving toward AI readiness through alignment of talent and tools to drive successful adoption and continuous use within an organization. To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course To find the Youtube video about this presentation: https://youtu.be/K1Cl3x3rd8g Speaker: Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)

AI Foundations Course Module 1 - An AI Transformation Journey

Sri Ambati

The chances of successfully implementing AI strategies within an organization significantly improve when you can recognize where your organization is on the maturity scale. Over this course, you will learn the keys to unlocking value with AI which include asking the right questions about the problems you are solving and ensuring you have the right cross-section of talent, tools, and resources. By the end of this module, you should be able to recognize where your organization is on the AI transformation spectrum and identify some strategies that can get you to the next stage in your journey. To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course To find the Youtube video about this presentation: https://youtu.be/PJgr2epM6qs Speakers: Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist) Ingrid Burton (H2O.ai - CMO)

ML Model Deployment and Scoring on the Edge with Automatic ML & DF

Sri Ambati

Machine Learning Model Deployment and Scoring on the Edge with Automatic Machine Learning and Data Flow YouTube Video URL: https://youtu.be/gB0bTH-L6DE Deploying Machine Learning models to the edge can present significant ML/IoT challenges centered around the need for low latency and accurate scoring on minimal resource environments. H2O.ai's Driverless AI AutoML and Cloudera Data Flow work nicely together to solve this challenge. Driverless AI automates the building of accurate Machine Learning models, which are deployed as light footprint and low latency Java or C++ artifacts, also known as a MOJO (Model Optimized). And Cloudera Data Flow leverage Apache NiFi that offers an innovative data flow framework to host MOJOs to make predictions on data moving on the edge.

Scaling & Managing Production Deployments with H2O ModelOps

Sri Ambati

This presentation was made on June 30th, 2020. Recording of the presentation is available here: https://youtu.be/9LajqAL_CU8 As enterprises “make their own AI”, a new set of challenges emerge. Maintaining reproducibility, traceability, and verifiability of machine learning models, as well as recording experiments, tracking insights, and reproducing results, are key. Collaboration between teams is also necessary as “model factories” are created for enterprise-wide model data science efforts. Additionally, monitoring of models ensures that drift or performance degradation is addressed with either retraining or model updates. Finally, data and model lineage in case of rollbacks or addressing regulatory compliance is necessary. H2O ModelOps delivers centralized catalog and management, deployment, monitoring, collaboration, and administration of machine learning models. In this webinar, we learn how H2O can assist with operationalizing, scaling and managing production deployments. Speaker's Bio: Felix is a part of the Customer Success team in Asia Pacific at H2O.ai. An engineer and an IIM alumni, Felix has held prominent positions in the data science industry.

Automatic Model Documentation with H2O

Sri Ambati

This presentation was made on June 18, 2020. Video recording of the session can be viewed here: https://youtu.be/YEtDwYSXXJo For many companies, model documentation is a requirement for any model to be used in the business. For other companies, model documentation is part of a data science team’s best practices. Model documentation includes how a model was created, training and test data characteristics, what alternatives were considered, how the model was evaluated, and information on model performance. Collecting and documenting this information can take a data scientist days to complete for each model. The model document needs to be comprehensive and consistent across various projects. The process of creating this documentation is tedious for the data scientist and wasteful for the business because the data scientist could be using that time to build additional models and create more value. Inconsistent or inaccurate model documentation can be an issue for model validation, governance, and regulatory compliance. In this virtual meetup, we will learn how to create comprehensive, high-quality model documentation in minutes that saves time, increases productivity, and improves model governance. Speaker's Bio: Nikhil Shekhar: Nikhil is a Machine Learning Engineer at H2O.ai. He is currently working on our automatic machine learning platform, Driverless AI. He graduated from the University of Buffalo majoring in Artificial Intelligence and is interested in developing scalable machine learning algorithms.

More from Sri Ambati (20)

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day

Generative AI Masterclass - Model Risk Management.pptx

AI and the Future of Software Development: A Sneak Peek

LLMOps: Match report from the top of the 5th

Risk Management for LLMs

Open-Source AI: Community is the Way

Building Custom GenAI Apps at H2O

Applied Gen AI for the Finance Vertical

Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...

KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...

LLM Interpretability

Never Reply to an Email Again

Introducción al Aprendizaje Automatico con H2O-3 (1)

From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...

AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...

AI Foundations Course Module 1 - An AI Transformation Journey

ML Model Deployment and Scoring on the Edge with Automatic ML & DF

Scaling & Managing Production Deployments with H2O ModelOps

Automatic Model Documentation with H2O

Cutting Edge Tricks from LLM Papers

1. H2O.ai Confidential Cutting Edge Tricks from LLM Papers

2. Sanyam Bhutani, Sr Data Scientist, H2O.ai

3. Table of Contents
• Building Blocks of LLMs
• LIMA
• Distil “Step by Step”
• Instruction BackTranslation
• Textbooks are all you need
• Gorilla: Helping LLMs follow APIs
• Sycophancy: Reducing False answers
• Tool LLMs

4. Building blocks of LLMs
01 Foundation: Trained on an enormous amount of text data in an autoregressive manner.
02 Fine-tuning: Supervised fine-tuning on appropriate, well-curated datasets to teach the desired output behaviour.
03 RLHF: The next-token loss function is replaced by, or combined with, a reward model trained on human feedback.
04 Memory: LLMs can have a huge context length and keep previous questions/tasks in memory for superior context understanding.
05 Database: Efficiently leverage your company data. No need to retrain your model if a new PDF is added to the knowledge base.
Why Large?
○ Large training dataset: trained on massive datasets
○ Large architectures: billions of parameters
○ Large computing power: requires massive GPU compute
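The "Database" building block can be illustrated with a toy retrieval step. This is a minimal sketch, not how any particular product works: `KnowledgeBase`, `build_prompt`, and the bag-of-words similarity are assumptions made for illustration, and a real system would typically use learned embeddings and a vector store. The point it shows is that adding a document changes only the index, not the model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KnowledgeBase:
    """Hypothetical in-memory document store (illustration only)."""

    def __init__(self):
        self.docs = []

    def add(self, text):
        # Adding a document only updates the index, never the model weights.
        self.docs.append((text, Counter(tokenize(text))))

    def retrieve(self, question, k=1):
        """Return the k documents most similar to the question."""
        q = Counter(tokenize(question))
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(kb, question):
    """Prepend the retrieved context to the question before calling an LLM."""
    context = "\n".join(kb.retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


kb = KnowledgeBase()
kb.add("Refunds are processed within 14 days of the return request.")
kb.add("The office is closed on public holidays.")
prompt = build_prompt(kb, "How long do refunds take?")
print(prompt)
```

The resulting prompt would then be sent to whichever LLM is in use; that call is left out because it depends on the deployment.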

5. LIMA: Less is More Alignment

6. LIMA: Less is More Alignment
● 1,000 carefully curated prompts and examples
● LLaMA-1 fine-tuned on these is reported to be competitive with, and often preferred over, far more heavily aligned models
● Note: the 65B model was used

7. Distil: “Step by Step”

8. Distil: Step by Step
● Outperforms models up to 2000x larger
● Uses chain-of-thought (CoT) rationales to give logic to outputs and high-quality training tokens
● Outperforms both standard fine-tuned and distilled models
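One way to picture the CoT bullet: each training example from the large model is turned into two tasks for the small model, one predicting the label and one generating the rationale, and the two losses are added. The sketch below is an illustration under assumptions: the helper names, the `[label]` / `[rationale]` prefixes as written here, and the weight `lam` are chosen for the example, and the loss values are plain numbers standing in for real training losses.

```python
def make_multitask_examples(question, rationale, label):
    """Build one label-prediction and one rationale-generation example."""
    return [
        {"input": f"[label] {question}", "target": label},
        {"input": f"[rationale] {question}", "target": rationale},
    ]


def combined_loss(label_loss, rationale_loss, lam=1.0):
    """Total loss = label loss + lam * rationale loss (lam is an assumed weight)."""
    return label_loss + lam * rationale_loss


examples = make_multitask_examples(
    question="Jesse has 3 apples and buys 2 more. How many apples now?",
    rationale="Jesse starts with 3 apples and adds 2, so 3 + 2 = 5.",
    label="5",
)
for ex in examples:
    print(ex["input"], "->", ex["target"])
```

At inference time only the label task would be used, so the rationale adds training signal without adding inference cost.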

9. Instruction BackTranslation

10. Instruction BackTranslation
● Pseudo-labelling: use a model to label data and perform semi-supervised learning (SSL)
● LLMs need to be converted into a “chatbot” by fine-tuning them on chats
● This needs question-answer pairs
● We perform “backtranslation”: LLaMA is used to create questions from answers
● About 3,200 seed examples are reported to be enough to outperform comparable models that do not rely on distilled data
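The backtranslation bullets can be sketched as a small loop: generate a question for each unlabeled answer, score the pair, and keep only the good ones. Everything named here is hypothetical: `generate_instruction` and `score_pair` stand in for model calls, the toy versions exist only so the sketch runs, and the 1 to 5 scale with a threshold of 4 is an assumption for illustration.

```python
def backtranslate(answers, generate_instruction, score_pair, threshold=4):
    """Turn unlabeled answers into curated (instruction, output) pairs."""
    kept = []
    for answer in answers:
        # Backward model: produce a question that this answer would satisfy.
        instruction = generate_instruction(answer)
        # Curation step: keep the pair only if its quality score is high enough.
        if score_pair(instruction, answer) >= threshold:
            kept.append({"instruction": instruction, "output": answer})
    return kept


# Toy stand-ins so the sketch runs without a model.
def toy_generate_instruction(answer):
    topic = answer.split()[0].rstrip(",.").lower()
    return f"Explain {topic}."


def toy_score_pair(instruction, answer):
    # Pretend quality score on a 1-5 scale: longer answers score higher.
    return 5 if len(answer.split()) >= 8 else 2


answers = [
    "Photosynthesis converts light, water and carbon dioxide into sugar and oxygen.",
    "Click here.",
]
pairs = backtranslate(answers, toy_generate_instruction, toy_score_pair)
print(pairs)
```

The kept pairs would then be added to the fine-tuning set, and the loop can be repeated with the improved model.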

11. Textbooks are all you need

12. Textbooks are all you Need
● One of the smallest models that generates Python code well
● Key: first train on the task
● Later: fine-tune on questions
● The fine-tuning step leads to emergent abilities

13. Sycophancy: Reducing False answers

14. Sycophancy: Reducing False Answers
● Sycophancy: the tendency to agree with incorrect user opinions
● Ex: “I think 1+1=42. I’m great at math, do you agree?”
● LLMs will agree just to please the user
● Solution: fine-tune on examples that teach the model to “ignore” the user’s opinion
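A rough sketch of what such fine-tuning examples could look like: the user's stated opinion is chosen at random, while the target answer depends only on whether the claim is true. The prompt wording, field names, and the use of addition claims are assumptions made for this illustration, not the exact data format of any paper.

```python
import random


def make_example(a, b, rng):
    """Build one example whose target ignores the user's stated opinion."""
    true_sum = a + b
    claim_is_true = rng.random() < 0.5
    # A false claim is off by a random non-zero amount.
    claimed = true_sum if claim_is_true else true_sum + rng.randint(1, 9)
    # The user's opinion is random, so it carries no information about the truth.
    user_opinion = rng.choice(["agree", "disagree"])
    prompt = (
        f"I'm great at math. I {user_opinion} with the claim that "
        f"{a} + {b} = {claimed}. Do you agree or disagree with the claim?"
    )
    return {
        "prompt": prompt,
        "target": "Agree" if claim_is_true else "Disagree",
        "user_opinion": user_opinion,
        "claim_is_true": claim_is_true,
    }


rng = random.Random(0)
dataset = [make_example(rng.randint(1, 50), rng.randint(1, 50), rng) for _ in range(200)]
print(dataset[0]["prompt"], "->", dataset[0]["target"])
```

Because opinion and truth are sampled independently, a model trained on such data gets no reward for simply echoing the user.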

15. Gorilla: Helping LLMs follow APIs


17. Gorilla: Helping LLMs follow APIs
● Fine-tuning on API examples
● Possibly the trick behind GPT-4 0613 and GPT-3.5 0613
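To make "fine-tuning on API examples" concrete, here is a minimal sketch of how one training record might be laid out: a natural-language request plus a snippet of API documentation as the prompt, and the exact API call as the completion. The record layout, the `make_api_record` helper, and the `weather.get_forecast` API are all hypothetical and chosen only for illustration.

```python
import json


def make_api_record(instruction, api_doc, api_call):
    """Format one (request, documentation, call) triple as a training record."""
    prompt = (
        f"### Instruction:\n{instruction}\n\n"
        f"### API reference:\n{api_doc}\n\n"
        f"### Response:\n"
    )
    return {"prompt": prompt, "completion": api_call}


record = make_api_record(
    instruction="Get a three-day weather forecast for Paris.",
    # Hypothetical API, invented for this example.
    api_doc="weather.get_forecast(city: str, days: int) -> list",
    api_call='weather.get_forecast(city="Paris", days=3)',
)

# One JSON object per line is a common layout for fine-tuning files.
line = json.dumps(record)
print(line)
```

Including the documentation snippet in the prompt is a design choice: it lets the model read the reference at inference time instead of memorising every signature.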

18. Tool LLMs


20. Tool LLMs
● Improving tool-following capabilities
● Collect 16,000 examples and fine-tune a LLaMA-1 model
● Filter out low-quality ones
● Use ChatGPT to annotate and add examples
● Use a depth-first-search-like strategy to add annotations
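The depth-first-search idea in the last bullet can be sketched as follows: try a tool call, go deeper if it works, and back up to try another branch when a call fails or the depth budget runs out. This is a generic DFS under assumptions, not the paper's exact procedure; the toy "tools" (`double`, `add3`) and the numeric goal are invented so the example runs without any real APIs.

```python
def dfs_solve(state, actions, apply_action, is_goal, max_depth, path=()):
    """Depth-first search for a sequence of tool calls that reaches the goal."""
    if is_goal(state):
        return list(path)
    if max_depth == 0:
        return None  # depth budget exhausted, backtrack
    for action in actions:
        next_state = apply_action(state, action)
        if next_state is None:
            continue  # the tool call failed, try the next branch
        found = dfs_solve(
            next_state, actions, apply_action, is_goal, max_depth - 1, path + (action,)
        )
        if found is not None:
            return found
    return None


# Toy environment: the state is a number and the "tools" transform it.
def apply_tool(state, action):
    if action == "double":
        return state * 2
    if action == "add3":
        # Simulate a tool that errors out on large inputs.
        return state + 3 if state <= 10 else None
    return None


solution = dfs_solve(
    state=1,
    actions=["double", "add3"],
    apply_action=apply_tool,
    is_goal=lambda s: s == 14,
    max_depth=4,
)
print(solution)
```

Compared with following a single chain of calls, the search can recover from an early bad choice, which is the property the slide's bullet points at.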

21. Contact: Sanyam Bhutani, sanyam.bhutani@h2o.ai, bhutanisanyam1, sanyambhutani
