arXiv
Trackbacks
Trackbacks indicate external web sites that link to articles in arXiv.org. Trackbacks do not reflect the opinion of arXiv.org and may not reflect the opinions of that article's authors.
Trackback guide
By sending a trackback, you can notify arXiv.org that you have created a web page that references a paper. Popular blogging software supports trackback: you can send us a trackback about this paper by giving your software the following trackback URL:
https://arxiv.org/trackback/{arXiv_id}
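For readers whose software does not send trackbacks automatically, here is a minimal sketch of how such a ping could be put together by hand. It assumes the endpoint accepts the conventional TrackBack ping, which is a form-encoded POST with `url`, `title`, `excerpt`, and `blog_name` fields. The page above does not confirm that, so treat the field names as an assumption. The helper names and the example.com address are hypothetical. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL template as given on this page; {arXiv_id} is the paper identifier.
TRACKBACK_TEMPLATE = "https://arxiv.org/trackback/{arXiv_id}"


def build_trackback_url(arxiv_id: str) -> str:
    """Fill the paper identifier into the trackback URL template."""
    return TRACKBACK_TEMPLATE.format(arXiv_id=arxiv_id)


def build_trackback_request(arxiv_id, page_url, title="", excerpt="", blog_name=""):
    """Build (but do not send) a form-encoded POST request.

    Assumption: the field names follow the conventional TrackBack ping
    format; they are not taken from arXiv's own documentation.
    """
    fields = {
        "url": page_url,        # the page that references the paper
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }
    # Drop empty optional fields before encoding.
    body = urlencode({k: v for k, v in fields.items() if v}).encode("utf-8")
    return Request(
        build_trackback_url(arxiv_id),
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
        method="POST",
    )


# Hypothetical referencing page, for illustration only.
req = build_trackback_request(
    "2208.07339",
    "https://example.com/my-post",
    title="Notes on LLM.int8()",
)
print(req.full_url)
print(req.get_method())
```

Passing the resulting object to `urllib.request.urlopen` would actually send the ping. Check the trackback help page first, since it describes what the server accepts.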
Some blogging software supports trackback autodiscovery. In this case, your software will automatically send a trackback as soon as you create a link to our abstract page. See our trackback help page for more information.
Trackbacks for 2208.07339
Deploying Large Language Models: vLLM and Quantization. Step by Step Guide on How to Accelerate...
[ Towards Data Science - Medium @ towardsdatascience.com/depl... ] trackback posted Tue, 16 Apr 2024 06:38:39 UTC
Introduction to Weight Quantization
[ Towards Data Science - Medium @ towardsdatascience.com/intr... ] trackback posted Fri, 7 Jul 2023 07:58:09 UTC
Metadata for 2208.07339
[Submitted on 15 Aug 2022 (v1), last revised 10 Nov 2022 (this version, v2)]
Title: LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Abstract: