Data Engineering for Scaling Language Models to 128K Context
Are All Large Language Models Really in 1.58 Bits?
Dive deep into changes to the Transformer architecture and learn how researchers have found a 4x-or-greater speedup in LLM inference.
learning-exhaust.hashnode.dev
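The "1.58 bits" refers to each weight taking one of three values, {-1, 0, +1}, since log2(3) ≈ 1.58. As a rough illustration, here is a minimal PyTorch sketch of the absmean ternary quantization described in the BitNet b1.58 paper (my own sketch, not the authors' code):

```python
import torch

def absmean_ternary(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Map a weight tensor to {-1, 0, +1} via absmean scaling,
    following the scheme described in the BitNet b1.58 paper."""
    scale = w.abs().mean().clamp(min=eps)    # per-tensor scale: mean |W|
    w_q = (w / scale).round().clamp(-1, 1)   # round, then clip to ternary
    return w_q * scale                       # simulated-quantized weights
```

With ternary weights, the multiplications inside each linear layer reduce to additions and subtractions, which is where the claimed inference speedup comes from.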
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance
There is a rapidly growing number of large language models (LLMs) that users can query for a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT, J1-Jumbo, and find that these models have heterogeneous pricing structures…
arxiv.org
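One of FrugalGPT's core strategies is the LLM cascade: query cheap models first and escalate to expensive ones only when a scorer is not confident in the answer. A minimal sketch of that control flow, assuming placeholder `models`, `score`, and `threshold` (the paper trains a small scorer and tunes per-model thresholds):

```python
from typing import Callable

def llm_cascade(
    prompt: str,
    models: list[Callable[[str], str]],  # ordered cheapest to most expensive
    score: Callable[[str, str], float],  # confidence in (prompt, answer)
    threshold: float = 0.8,
) -> str:
    """Return the first answer whose confidence clears the threshold,
    paying for pricier models only when the cheap ones fall short."""
    answer = ""
    for model in models:
        answer = model(prompt)
        if score(prompt, answer) >= threshold:
            return answer
    return answer  # worst case: the most expensive model's answer
```

The cost savings come from most queries never reaching the expensive API.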
4-bit LLM Quantization with GPTQ
mlabonne.github.io
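For context, the naive baseline that GPTQ improves on is round-to-nearest (RTN) quantization with one scale and zero-point per output channel; GPTQ additionally uses second-order (Hessian) information to compensate for rounding error column by column. A simplified RTN sketch (my own, not GPTQ itself):

```python
import torch

def rtn_quantize_4bit(w: torch.Tensor):
    """Asymmetric 4-bit round-to-nearest with one scale and
    zero-point per output channel (row) of the weight matrix."""
    qmin, qmax = 0, 15                                 # 4-bit unsigned range
    w_min = w.min(dim=1, keepdim=True).values
    w_max = w.max(dim=1, keepdim=True).values
    scale = (w_max - w_min).clamp(min=1e-8) / (qmax - qmin)
    zero = (-w_min / scale).round()
    q = (w / scale + zero).round().clamp(qmin, qmax)   # integer codes
    return q, scale, zero

def rtn_dequantize(q, scale, zero):
    return (q - zero) * scale                          # back to float
```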
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
Will infinite context windows kill LLM fine-tuning and RAG?
LLMs with infinite context windows are making it easier to create proof-of-concepts and prototypes. But scale still requires careful engineering.
bdtechtalks.com
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Transformers have emerged as the backbone of large language models (LLMs). However, generation remains inefficient due to the need to store in memory a cache of key-value representations for past tokens, whose size scales linearly with the input sequence length…
arxiv.org
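The linear growth is easy to verify with a back-of-the-envelope formula: the cache holds one key and one value vector per token, per layer, per head. A quick calculation, assuming a Llama-2-7B-like configuration (32 layers, 32 KV heads, head dimension 128, fp16):

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int = 1, dtype_bytes: int = 2) -> int:
    """KV cache size = 2 (K and V) x layers x heads x head_dim
    x tokens x batch x bytes per element."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

size = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=4096)
print(f"{size / 2**30:.1f} GiB")  # 2.0 GiB at 4k tokens, 4.0 GiB at 8k
```

Compressing this per-token cache during decoding is exactly the term DMC attacks.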