Sravan Bodapati's Post


Principal Scientist and Sr. Manager of Applied Science - Amazon AGI Foundation Models

AI at Meta has released its most capable openly available models yet: LLAMA3, in 8B and 70B sizes with an 8K context length! They highlight improvements in four areas as the key differentiators: a) model architecture, b) pretraining data, c) scaling up pretraining, and d) instruction fine-tuning.

💥 Pretraining & Scaling up:
✈️ A 128K-token vocabulary and grouped-query attention (GQA) in both the 8B and 70B models, with an 8K native context length (a minimal GQA sketch follows the post)
✈️ Pretrained on over 15T tokens, with 4x more code than LLAMA2
✈️ Multilingual: over 5% of the pretraining data is high-quality non-English text covering 30+ languages, though don't expect English-level performance in those languages
✈️ LLAMA2 was used to generate the training data for the text-quality classifiers that filter LLAMA3's pretraining corpus, in the spirit of Self-Instruct (a hedged reconstruction is sketched below)
✈️ Both the 8B and 70B keep improving log-linearly even after training on two orders of magnitude more data than the compute-optimal (Chinchilla) amount; performance was still climbing at 15T tokens
✈️ Training ran on two custom-built 24K-GPU clusters with an effective training time of over 95%
✈️ Overall, training was 3x more efficient than for LLAMA2

💥 Instruction Fine-tuning:
✈️ A combination of SFT, rejection sampling, PPO, and DPO (the standard DPO loss is sketched after the post)
✈️ The quality of the prompts used in SFT and of the preference rankings used in PPO and DPO has an outsized impact on model performance
✈️ Careful data curation and multiple rounds of QA on the human annotations
✈️ Learning from preference rankings greatly improved performance on coding and reasoning tasks

💥 LLAMA3 Guardrails:
✈️ Updated and new safety tools: Llama Guard 2 and CyberSecEval 2
✈️ CodeShield: an inference-time guardrail that filters insecure code suggestions
✈️ The instruction-fine-tuned model was red-teamed for safety by generating adversarial prompts that try to elicit problematic responses
✈️ Llama Guard is foundational for prompt and response safety and can easily be fine-tuned to a new taxonomy

💥 Inference & Future:
✈️ Despite LLAMA3 8B having about 1B more parameters than Llama 2 7B, the more efficient tokenizer and GQA keep inference efficiency on par with Llama 2 7B (a quick token-count comparison follows the post)
✈️ 400B+-parameter, multilingual, multimodal, longer-context LLAMA3 models are still in training; LLAMA3 will be available on AWS soon. Try it out!

#generativeAI #llm #llama3 #aws #bedrock #sota
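Since the post credits GQA for keeping inference cheap, here is a minimal PyTorch sketch of the idea: several query heads share each key/value head, so the KV cache shrinks by the ratio of query heads to KV heads. The head counts below (8 query, 2 KV) are illustrative, not Llama 3's actual configuration.

import torch
import torch.nn.functional as F

def gqa(q, k, v):
    # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    groups = q.shape[1] // k.shape[1]        # query heads served by each KV head
    k = k.repeat_interleave(groups, dim=1)   # broadcast each KV head to its query group
    v = v.repeat_interleave(groups, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

b, s, d = 1, 16, 64
q = torch.randn(b, 8, s, d)   # 8 query heads
k = torch.randn(b, 2, s, d)   # only 2 KV heads -> 4x smaller KV cache
v = torch.randn(b, 2, s, d)
out = gqa(q, k, v)            # (1, 8, 16, 64)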
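On the text-quality classifiers: Meta only says that Llama 2 generated the training data for them, so the pipeline below is a hedged reconstruction, not Meta's code. llama2_generate is a hypothetical callable wrapping a Llama 2 endpoint, and a simple TF-IDF + logistic-regression classifier stands in for whatever Meta actually trained.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

PROMPT = "Rate this document's quality for LLM pretraining. Answer GOOD or BAD.\n\n{doc}"

def llama2_label(doc, llama2_generate):
    # llama2_generate: hypothetical callable wrapping a Llama 2 endpoint
    answer = llama2_generate(PROMPT.format(doc=doc[:2000]))
    return 1 if "GOOD" in answer.upper() else 0

def train_quality_filter(docs, llama2_generate):
    labels = [llama2_label(d, llama2_generate) for d in docs]  # LLM-generated supervision
    clf = make_pipeline(TfidfVectorizer(max_features=50_000),
                        LogisticRegression(max_iter=1000))
    clf.fit(docs, labels)  # the cheap classifier then filters the full corpus
    return clf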
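For the SFT/PPO/DPO bullet, this is the standard DPO objective from Rafailov et al. (2023), not Meta's unpublished training code: increase the policy's log-probability margin between chosen and rejected responses relative to a frozen reference model's margin.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Each argument: summed log-prob of a response under the policy / frozen reference.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    # Maximize sigmoid(beta * margin gap); beta controls the implicit KL penalty.
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.1]),
                torch.tensor([-13.0]), torch.tensor([-14.2]))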
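On tokenizer efficiency: Llama 3's 128K vocabulary encodes the same text in fewer tokens than Llama 2's 32K vocabulary, which is what offsets the extra ~1B parameters at inference. A quick way to check, assuming you have been granted access to both gated Hugging Face repos:

from transformers import AutoTokenizer

text = "Grouped-query attention keeps the KV cache small at inference time."
for model_id in ("meta-llama/Llama-2-7b-hf", "meta-llama/Meta-Llama-3-8B"):
    tok = AutoTokenizer.from_pretrained(model_id)
    print(model_id, len(tok(text)["input_ids"]))  # Llama 3 should need fewer tokens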

Introducing Meta Llama 3: The most capable openly available LLM to date

ai.meta.com
