Anoop Deoras’ Post

Director, AI/ML for Foundation Models, GenAI Apps and Services for Developer Experiences (Amazon Q)

Many of us will be at ICML’24 in Vienna and look forward to connecting with you! Among the many papers Amazon is presenting at ICML (https://lnkd.in/gAQXFrqA), I want to highlight a few from my group:

Fewer Truncations Improve Language Modeling
https://lnkd.in/gbDR3F2Q
Quick summary: Introduces Best-fit Packing, a method that reduces document truncation during language model training, preserving data integrity and improving performance.

Collage: Light-Weight Low-Precision Strategy for LLM Training
https://lnkd.in/ggHi6Th6
Quick summary: Presents Collage, which uses a multi-component float representation for low-precision computations, matching the accuracy of higher-precision training while significantly reducing memory usage and computational cost.

Bifurcated Attention for Single-Context Large-Batch Sampling
https://lnkd.in/g7YkTqzt
Quick summary: Introduces bifurcated attention, which optimizes memory IO during incremental decoding at high batch sizes and long contexts, achieving significant latency reductions without increasing compute and benefiting real-time applications such as parallel answer generation and ranking.

Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
https://lnkd.in/gpey_tUY
Quick summary: Proposes Memory-Efficient Zeroth-Order Stochastic Variance-Reduced Gradient (MeZO-SVRG), which improves stability and convergence when fine-tuning language models while reducing memory and computational costs.

Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
https://lnkd.in/g_D_gCge
Quick summary: Presents a method for evaluating the accuracy of retrieval-augmented generation (RAG) systems by automatically generating task-specific exams.
Repoformer: Selective Retrieval for Repository-Level Code Completion
https://lnkd.in/gneyRdF7
Quick summary: A selective retrieval-augmented generation (RAG) framework that improves repository-level code completion performance and efficiency by retrieving cross-file context only when it is beneficial.

Explaining Probabilistic Models with Distributional Values
https://lnkd.in/gaEAmKgr
Quick summary: Critiques game-theoretic explainability methods such as SHAP for their misalignment with the desired explanation targets and proposes a new framework based on distributional values.

--

We work on Amazon Q Developer. It's your assistant for the entire software development lifecycle (SDLC). If this all sounds interesting, feel free to drop by the Amazon booth or reach out to me on LinkedIn and we can set up a time to chat.

#amazonscience #genai #ml #ai #icml24 #deeplearning #aws
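To make the Best-fit Packing idea concrete, here is a minimal sketch of packing document chunks into fixed-length training sequences with a best-fit-decreasing heuristic, so documents are only split when they exceed the sequence length. This is my own illustration, not the paper's implementation; `best_fit_pack` and its behavior are assumptions.

```python
def best_fit_pack(doc_lens, max_len=2048):
    """Pack document chunks into fixed-size training sequences with
    Best-Fit Decreasing, so no document is needlessly truncated."""
    # Only documents longer than max_len are split, into chunks that fit.
    chunks = []
    for n in doc_lens:
        while n > max_len:
            chunks.append(max_len)
            n -= max_len
        if n:
            chunks.append(n)
    bins = []  # each bin is [remaining_space, list_of_chunk_lens]
    for c in sorted(chunks, reverse=True):
        # Best fit: the bin whose leftover space is smallest but still fits.
        best = min((b for b in bins if b[0] >= c),
                   key=lambda b: b[0], default=None)
        if best is None:
            bins.append([max_len - c, [c]])
        else:
            best[0] -= c
            best[1].append(c)
    return [b[1] for b in bins]
```

Compared with the common concatenate-then-chop pipeline, every sequence here contains only whole documents (or whole max-length chunks), which is the data-integrity property the paper targets.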
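For Collage, the core primitive is a multi-component float: a value carried as a (head, tail) pair of low-precision numbers, with rounding error captured by an error-free transformation instead of being lost. A rough sketch of that idea in float32 (the paper targets lower precisions; `two_sum` and `mcf_add` are my naming, not the paper's API):

```python
import numpy as np

def two_sum(a, b):
    # Error-free transformation (Knuth): s + e equals a + b exactly.
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

def mcf_add(head, tail, x):
    """Accumulate x into a multi-component float (head, tail):
    head carries the leading bits, tail the accumulated rounding error."""
    s, e = two_sum(head, x)
    tail = tail + e
    # Renormalize so head again holds the leading bits.
    return two_sum(s, tail)
```

Accumulating many small updates into a large value shows the effect: a plain float32 sum silently drops them, while the (head, tail) pair preserves the exact total.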
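The bifurcated attention trick is to exploit that the whole batch shares one context: attention at each decode step splits into a prefix part computed against a single shared KV copy and a per-sample part over the decoded tokens, joined by one softmax. A small NumPy sketch under those assumptions (shapes and names are illustrative, not the paper's code):

```python
import numpy as np

def bifurcated_attention(q, k_ctx, v_ctx, k_dec, v_dec):
    """Decode-step attention for a batch sharing one context.
    q:     (B, d)    one query per sample at the current step
    k_ctx: (L, d)    shared-context keys, stored ONCE (not per sample)
    v_ctx: (L, d)
    k_dec: (B, T, d) per-sample keys/values produced during decoding
    v_dec: (B, T, d)
    """
    d = q.shape[-1]
    # Scores against the shared prefix: one GEMM over the single KV copy.
    s_ctx = q @ k_ctx.T / np.sqrt(d)                         # (B, L)
    # Scores against each sample's own decoded tokens.
    s_dec = np.einsum('bd,btd->bt', q, k_dec) / np.sqrt(d)   # (B, T)
    # Joint softmax over both segments.
    s = np.concatenate([s_ctx, s_dec], axis=-1)
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    L = k_ctx.shape[0]
    return w[:, :L] @ v_ctx + np.einsum('bt,btd->bd', w[:, L:], v_dec)
```

The output matches ordinary attention with the context KV replicated per sample, but the shared prefix is read from memory once instead of B times, which is where the IO savings at large batch and long context come from.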
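For the zeroth-order paper, the building block is an SPSA-style gradient estimate from two forward passes (no backprop, and the perturbation can be regenerated from a seed instead of stored), with an SVRG-style control variate layered on top. A loose sketch of that combination; the function names and the exact update are my simplification, not the paper's algorithm verbatim:

```python
import numpy as np

def spsa_grad(loss_fn, theta, eps=1e-3, seed=0):
    """Zeroth-order (SPSA) gradient estimate: two forward passes only.
    The perturbation z is regenerated from a seed, never stored."""
    z = np.random.default_rng(seed).standard_normal(theta.shape)
    g = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return g * z

def mezo_svrg_step(loss_batch, theta, anchor, g_anchor, lr=1e-2, seed=0):
    """One variance-reduced step in the MeZO-SVRG spirit: correct the
    minibatch estimate with its value at a full-batch anchor point.
    g_anchor is a (full-batch) zeroth-order estimate taken at `anchor`."""
    g = (spsa_grad(loss_batch, theta, seed=seed)
         - spsa_grad(loss_batch, anchor, seed=seed)
         + g_anchor)
    return theta - lr * g
```

Sharing the same seed for both evaluations is what makes the control variate cancel the minibatch noise, which is the source of the improved stability the summary mentions.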

ICML 2024

amazon.science
