Neural Magic’s Post

Neural Magic reposted this

Mark Kurtz

CTO @ Neural Magic

New research posted! At Neural Magic, we've collaborated with Cerebras Systems to release the first highly sparse, foundational large language models (LLMs). We removed up to 70% of the connections (nearly 5 billion weights, or nearly 10 gigabytes!) without affecting accuracy on popular tasks such as chat, code generation, and summarization. The result is cheaper, faster, and more energy-efficient models, with nearly 9X savings for LLM deployments.

This is just the beginning of our push to make efficient LLMs the default pathway. Doing so will save immense amounts of energy and money as enterprises and the open-source community continue to fine-tune and deploy these models for their revolutionary applications. So watch for more efficient versions of the latest architectures as we roll out better results over the next few weeks and months.

Feel free to ask any questions you have about this latest research, which is linked below, from high-level questions about its practicality to deep dives into why and how pruning and distillation work!

Paper: https://lnkd.in/entBB_AK
Models: https://lnkd.in/eDPAtf8p
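For readers who want to sanity-check the "nearly 5 billion / 10 gigabytes" figure, here is a rough back-of-the-envelope sketch. It assumes a round 7 billion parameters for Llama 2 7B and 2 bytes per weight (16-bit storage); neither number is stated in the post, and the function name is purely illustrative. It also ignores any index overhead a real sparse storage format would add.

```python
def pruned_weight_stats(total_params, sparsity, bytes_per_weight):
    """Estimate how many weights pruning removes and the dense storage they occupied."""
    removed = total_params * sparsity            # number of weights set to zero
    saved_gb = removed * bytes_per_weight / 1e9  # decimal gigabytes, no index overhead
    return removed, saved_gb


# Assumed inputs: 7e9 parameters, 70% sparsity, 2 bytes per weight (FP16/BF16).
removed, saved_gb = pruned_weight_stats(7e9, 0.70, 2)
print(f"{removed / 1e9:.1f} billion weights removed, roughly {saved_gb:.1f} GB")
```

Under those assumptions the estimate lands just under 5 billion weights and just under 10 GB, which is consistent with the numbers quoted above.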

  • Llama 2 7B sparsity vs baseline accuracy recovery for a chat fine-tuning task
  • Llama 2 7B sparsity vs baseline accuracy recovery for a code generation fine-tuning task
  • Llama 2 7B sparsity vs baseline accuracy recovery for an instruction-following fine-tuning task
  • Llama 2 7B prefill performance for various sparsity levels across FP32 baseline and INT8 quantized precisions on an 8-core CPU
  • Llama 2 7B decode performance for various sparsity levels across FP32 baseline and INT8 quantized precisions on an 8-core CPU
Anish Mukherjee

Generative AI Architect @ Nvidia

1mo
Vipul Kumar

Director - Head of Engineering

1mo

Great

Nicky Clarke 🎶

Visionary technologist and lateral thinker driving market value in regulated, complex ecosystems.

1mo

Really great job!
