New research posted! At Neural Magic we've collaborated with Cerebras Systems to release the first highly sparse foundational large language models (LLMs). We removed up to 70% of the connections (nearly 5 billion weights, roughly 10 gigabytes!) without affecting accuracy on popular tasks such as chat, code generation, and summarization. The result is cheaper, faster, and more energy-efficient models, with nearly 9X savings for LLM deployments.

This is just the beginning of our push toward establishing efficient LLMs as the default pathway. This will save immense amounts of energy and money as enterprises and the open-source community continue to fine-tune and deploy these models for their revolutionary applications. So, watch for more efficient versions of the latest architectures as we roll out better results over the next few weeks and months.

Feel free to ask any questions you have about this latest research, which is linked below. This includes everything from high-level questions about the practicality of this research to deep dives into why/how pruning and distillation work!

Paper: https://lnkd.in/entBB_AK
Models: https://lnkd.in/eDPAtf8p
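For readers curious what "removing 70% of the connections" means mechanically: pruning zeroes out a chosen fraction of a weight matrix's entries. The sketch below is a minimal, illustrative magnitude-pruning example in NumPy (keep the largest-magnitude weights, zero the rest). It is not the method from the paper, which uses more sophisticated one-shot pruning plus sparse pretraining; the function name and setup here are hypothetical.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so that roughly
    `sparsity` fraction of the weights become zero."""
    k = int(weights.size * sparsity)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

# Hypothetical example: prune a random 100x100 weight matrix to 70% sparsity
rng = np.random.default_rng(0)
w = rng.normal(size=(100, 100))
sparse_w = magnitude_prune(w, 0.70)
print(f"sparsity: {np.mean(sparse_w == 0.0):.2f}")  # → sparsity: 0.70
```

Zeroed weights need not be stored or multiplied, which is where the storage and compute savings come from when the runtime exploits sparsity.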