From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
NExT-GPT: Any-to-Any Multimodal LLM
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Optimizing your LLM in production
Do Machine Learning Models Memorize or Generalize?
By Adam Pearce, Asma Ghandeharioun, Nada Hussein, Nithum Thain, Martin Wattenberg and Lucas Dixon, August 2023. In 2021, researchers made a striking discovery while training a series of tiny models on toy tasks. They found a set of models that suddenly flip…
pair.withgoogle.com
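The "toy tasks" in the PAIR write-up are small algorithmic problems such as modular arithmetic. As a rough, hypothetical sketch of that setup (the modulus, split ratio, and seed here are illustrative choices, not the study's exact values):

```python
# Hypothetical sketch of the kind of toy task used in grokking studies:
# a modular-addition table split into train/test. Training a small model
# on the train split for a very long time is where the sudden
# memorize-to-generalize flip has been reported.
import random

P = 97  # modulus; small enough that the full table fits in memory

# Every (a, b) pair labeled with (a + b) mod P.
pairs = [((a, b), (a + b) % P) for a in range(P) for b in range(P)]

random.seed(0)
random.shuffle(pairs)
split = int(0.3 * len(pairs))  # train on 30% of the table, hold out the rest
train, test = pairs[:split], pairs[split:]

print(f"{len(train)} train examples, {len(test)} held-out examples")
```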
In the long (context) run | Harm de Vries
It's not the quadratic attention; it's the lack of long pre-training data
www.harmdevries.com
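De Vries's point invites a simple sanity check: before blaming attention's quadratic cost, measure how many pre-training documents are actually long. A hedged sketch of such a check follows; the whitespace "tokenizer", bucket boundaries, and file layout are placeholder assumptions, not the post's methodology:

```python
# Sketch: histogram of document lengths in a text corpus, to see how much
# genuinely long pre-training data exists at each context length.
from pathlib import Path

def length_histogram(paths, buckets=(512, 2048, 8192, 32768)):
    counts = {b: 0 for b in buckets}
    counts["longer"] = 0
    for p in paths:
        # Crude token count: whitespace split (a real check would use the
        # model's tokenizer).
        n = len(Path(p).read_text(errors="ignore").split())
        for b in buckets:
            if n <= b:
                counts[b] += 1
                break
        else:
            counts["longer"] += 1
    return counts

# Example (assumed layout): print(length_histogram(Path("corpus/").glob("*.txt")))
```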
Building RAG-based LLM Applications for Production (Part 1)
In this guide, we will learn how to develop and productionize a retrieval augmented generation (RAG) based LLM application, with a focus on scale, evaluation and routing.
www.anyscale.com
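For orientation, the core of any such RAG application is a retrieve-then-generate loop. The following is a minimal self-contained sketch of that loop, not the Anyscale implementation: the bag-of-words scoring stands in for a real embedding index, and the assembled prompt would be sent to whatever LLM endpoint you use.

```python
# Minimal sketch of the retrieve-then-generate loop behind a RAG app.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": token counts. Real systems use a dense embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Stuff retrieved context into the prompt for the generator LLM.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Ray Serve lets you scale Python services across a cluster.",
    "YaRN extends a model's context window with modified RoPE scaling.",
    "GQA shares key/value heads across groups of query heads.",
]
print(build_prompt("How does GQA reduce KV cache size?", docs))
```

A production version swaps `embed` for a dense embedding model and a vector store, and adds the scaling, evaluation, and routing layers the guide focuses on.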
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
YaRN: Efficient Context Window Extension of Large Language Models
Other posts in the 'Daily-Trend-Review' category
2023/10/06: long context LLMs
2023/09/27: Speed up Inference
2023/09/21: language modeling = compression
2023/09/18: Textbooks Are All You Need, etc.
2023/09/10: LLM economics