Daily-Trend-Review

24/05/10: 1.58 bits, FrugalGPT

hellcat 2024. 5. 10. 23:04

Data Engineering for Scaling Language Models to 128K Context

 

Are All Large Language Models Really in 1.58 Bits?

 

Are all LLMs really 1.58 bits? Inference at 4x the speed or more?

Dive deep into changes to the Transformer architecture and learn how researchers have discovered a huge speedup in LLM inference.

learning-exhaust.hashnode.dev
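The 1.58-bit figure comes from restricting each weight to one of three values {-1, 0, +1}, so log2(3) ≈ 1.58 bits per weight; matrix multiplies then reduce to additions and subtractions, which is where the claimed speedup comes from. A minimal sketch of that idea with absmean-style per-tensor scaling (the function names are illustrative, not from the article):

```python
# Minimal sketch of the "1.58-bit" idea: quantize weights to {-1, 0, +1}
# (log2(3) ≈ 1.58 bits per weight). Absmean scaling as described for
# BitNet b1.58; names here are illustrative, not a real library API.
import numpy as np

def ternarize(w: np.ndarray, eps: float = 1e-8) -> tuple[np.ndarray, float]:
    """Quantize a weight tensor to {-1, 0, +1} with a per-tensor scale."""
    scale = np.mean(np.abs(w)) + eps           # absmean scaling factor
    w_q = np.clip(np.round(w / scale), -1, 1)  # round, then clamp to {-1, 0, +1}
    return w_q, float(scale)

def ternary_matmul(x: np.ndarray, w_q: np.ndarray, scale: float) -> np.ndarray:
    """Matmul with ternary weights: only adds/subtracts, rescaled once at the end."""
    return (x @ w_q) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(512, 512)).astype(np.float32)
    x = rng.normal(size=(4, 512)).astype(np.float32)
    w_q, s = ternarize(w)
    err = np.abs(x @ w - ternary_matmul(x, w_q, s)).mean()
    print(f"unique weight values: {np.unique(w_q)}, mean abs error: {err:.3f}")
```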

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

 


There is a rapidly growing number of large language models (LLMs) that users can query for a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT, J1-Jumbo, and find that these models have heterogeneous pricing structures...

arxiv.org
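FrugalGPT's headline technique is an LLM cascade: send a query to cheap models first and only escalate to more expensive ones when a scorer judges the answer unreliable. A rough sketch of that routing loop; the model tiers, prices, and scorer below are placeholders rather than the paper's actual components:

```python
# Sketch of a FrugalGPT-style LLM cascade: try cheaper models first and stop
# as soon as an answer scores above a confidence threshold. The model list,
# prices, and scorer are illustrative stubs, not the paper's setup.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelTier:
    name: str
    cost_per_call: float                # hypothetical price in USD
    generate: Callable[[str], str]      # wraps the provider's API call

def cascade(query: str,
            tiers: list[ModelTier],
            score: Callable[[str, str], float],
            threshold: float = 0.8) -> tuple[str, float]:
    """Return the first acceptable answer and the total cost spent."""
    spent, answer = 0.0, ""
    for tier in sorted(tiers, key=lambda t: t.cost_per_call):  # cheapest first
        answer = tier.generate(query)
        spent += tier.cost_per_call
        if score(query, answer) >= threshold:  # good enough -> stop escalating
            break
    return answer, spent

if __name__ == "__main__":
    # Stub models and a stub scorer so the sketch runs without any API keys.
    tiers = [
        ModelTier("small-model",  0.0004, lambda q: "short guess"),
        ModelTier("medium-model", 0.0030, lambda q: "better answer"),
        ModelTier("large-model",  0.0300, lambda q: "careful answer"),
    ]
    score = lambda q, a: 0.9 if "answer" in a else 0.2
    print(cascade("What does FrugalGPT optimize?", tiers, score))
```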

4-bit LLM Quantization with GPTQ

 


 

mlabonne.github.io
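For trying GPTQ in practice, the usual route is a library rather than reimplementing the algorithm. A minimal sketch using Hugging Face transformers' GPTQ integration; the model ID, calibration dataset, and required packages are assumptions here, and the linked post may use a different toolchain:

```python
# Sketch of 4-bit GPTQ quantization via transformers' GPTQ integration.
# Assumes: pip install transformers optimum auto-gptq accelerate
# The small model ID and "c4" calibration set are placeholders for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # small placeholder model, not from the post
tokenizer = AutoTokenizer.from_pretrained(model_id)

# GPTQ needs a small calibration dataset to estimate quantization error.
gptq_config = GPTQConfig(bits=4, group_size=128, dataset="c4", tokenizer=tokenizer)

# Quantization runs layer by layer while the model loads; expect it to take a while.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=gptq_config,
    device_map="auto",
)

model.save_pretrained("opt-125m-gptq-4bit")      # weights stored 4-bit packed
tokenizer.save_pretrained("opt-125m-gptq-4bit")
```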

Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey

 

OpenELM: An Efficient Language Model Family with Open Training and Inference Framework

 

Will infinite context windows kill LLM fine-tuning and RAG?

 


LLMs with infinite context windows are making it easier to create proof-of-concepts and prototypes. But scale still requires careful engineering.

bdtechtalks.com

Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference

 


Transformers have emerged as the backbone of large language models (LLMs). However, generation remains inefficient due to the need to store in memory a cache of key-value representations for past tokens, whose size scales linearly with the input sequence length...

arxiv.org
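The bottleneck DMC targets is easy to quantify: the KV cache stores keys and values for every layer, head, and past token, so its size grows linearly with sequence length. A back-of-the-envelope helper (the example shapes are roughly Llama-2-7B-like and are my assumption, not from the paper):

```python
# Back-of-the-envelope KV cache size: 2 (K and V) x layers x kv_heads x head_dim
# x sequence_length x batch x bytes_per_element. This is the linear-in-length
# memory cost that Dynamic Memory Compression tries to shrink at inference time.
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int = 1, bytes_per_elem: int = 2) -> int:
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem

if __name__ == "__main__":
    # Roughly Llama-2-7B-like shapes (32 layers, 32 KV heads, head_dim 128), fp16.
    for seq_len in (4_096, 32_768, 128_000):
        gib = kv_cache_bytes(32, 32, 128, seq_len) / 2**30
        print(f"seq_len={seq_len:>7,}: ~{gib:.1f} GiB of KV cache per sequence")
```

With those shapes the cache is about 2 GiB per sequence at 4K tokens but over 60 GiB at 128K, which is why compressing or evicting cache entries matters for long-context inference.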

 
