Daily-Trend-Review

2023/09/21: Language Modeling = Compression

hellcat 2023. 9. 21. 08:48

Language Modeling Is Compression

Building RAG-based LLM Applications for Production (Part 1)

In this guide, we will learn how to develop and productionize a retrieval augmented generation (RAG) based LLM application, with a focus on scale, evaluation and routing.

www.anyscale.com

10 Ways to Improve the Performance of Retrieval Augmented Generation Systems

Building a Scalable Pipeline for Large Language Models and RAG : an Overview

Large language models (LLMs) have shown immense potential for generating human-like text. However, their knowledge is still limited to…

ai.plainenglish.io

Memory bandwidth constraints imply economies of scale in AI inference

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints