Daily-Trend-Review

2023/05/07: LLM inference on a single GPU, efficient Transformers, and more

hellcat 2023. 5. 7. 19:50

1. High-throughput Generative Inference of Large Language Models with a Single GPU

source: https://arxiv.org/pdf/2303.06865.pdf
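
The motivation behind this paper (the FlexGen work) is mostly memory arithmetic: a 175B model's FP16 weights alone are ~326 GiB, and the KV cache grows with batch size and sequence length, which is why the system pages tensors across GPU, CPU, and disk. A rough sketch of that arithmetic, using OPT-175B's published shape (96 layers, d_model 12288) and FP16 sizes; the batch and sequence numbers are just example assumptions:

```python
GB = 1024 ** 3

def weight_bytes(n_params: int, bytes_per_param: int = 2) -> int:
    """FP16 weights: 2 bytes per parameter."""
    return n_params * bytes_per_param

def kv_cache_bytes(n_layers: int, d_model: int, seq_len: int,
                   batch: int, bytes_per_val: int = 2) -> int:
    """K and V caches: 2 tensors x layers x seq x d_model per sequence."""
    return 2 * n_layers * seq_len * d_model * batch * bytes_per_val

w = weight_bytes(175_000_000_000)                # OPT-175B weights
kv = kv_cache_bytes(n_layers=96, d_model=12288,  # OPT-175B config
                    seq_len=2048, batch=8)       # example workload
print(f"weights : {w / GB:6.1f} GiB")            # ~326 GiB
print(f"kv cache: {kv / GB:6.1f} GiB")           # ~72 GiB
```

Both numbers dwarf any single GPU's memory, so throughput-oriented serving on one GPU has to stream weights and cache through CPU RAM and disk.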

2. Deploying Large NLP Models: Infrastructure Cost Optimization

source: https://neptune.ai/blog/nlp-models-infrastructure-cost-optimization

3. What Are Transformer Models and How Do They Work?

source: https://txt.cohere.com/what-are-transformer-models/
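
The piece of this explainer most worth pinning down in code is the attention operation, softmax(QKᵀ/√d_k)V: each output row is a mix of value rows, weighted by query/key similarity. A minimal NumPy sketch (not the article's own code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)
```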

4. Efficient Transformers: A Survey

source: https://arxiv.org/pdf/2009.06732.pdf
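
The survey taxonomizes ways around vanilla attention's O(n²) cost in sequence length. One family it covers, kernel-based linear attention, replaces the softmax with a feature map φ so that φ(Q)(φ(K)ᵀV) can be computed in O(n). A hedged NumPy sketch using the elu+1 feature map from the "linear transformers" line of work:

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """phi(Q) (phi(K)^T V): contracting over the sequence dimension first
    makes the cost linear in sequence length, at some quality cost."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                     # (d, d_v): O(n d d_v), not O(n^2)
    z = Qf @ Kf.sum(axis=0) + eps     # per-query normalizer
    return (Qf @ kv) / z[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(1024, 64)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (1024, 64); no 1024x1024 matrix built
```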

5. HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

source: https://arxiv.org/pdf/2303.17580.pdf
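
The paper's system is a four-stage loop with the LLM as controller: task planning, model selection, task execution, and response generation. A control-flow sketch of those stages; `llm` and `call_hf_model` are hypothetical stand-ins, not the paper's actual interfaces:

```python
def hugging_gpt(user_request: str, llm, call_hf_model) -> str:
    # 1. Task planning: the LLM parses the request into subtasks
    #    with dependencies.
    tasks = llm(f"Parse into a dependency-ordered task list: {user_request}")
    results = {}
    for task in tasks:
        # 2. Model selection: the LLM picks a Hugging Face model for the
        #    task based on Hub model descriptions.
        model_id = llm(f"Choose a model for: {task}")
        # 3. Task execution: run the chosen expert, feeding in any
        #    results this task depends on.
        results[task["id"]] = call_hf_model(model_id, task, results)
    # 4. Response generation: the LLM composes a final answer from
    #    all intermediate results.
    return llm(f"Summarize these results for the user: {results}")
```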

6. Andrej Karpathy on Twitter - thoughts on the current open-source LLM ecosystem

source: https://twitter.com/karpathy/status/1654892810590650376

7. Why we should train smaller LLMs on more tokens

source: https://www.harmdevries.com/post/model-size-vs-compute-overhead/
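
The post's argument is quantitative: using the Chinchilla fitted loss L(N, D) = E + A/N^α + B/D^β (Hoffmann et al., 2022) and the C ≈ 6ND training-FLOPs rule, it asks how much extra compute you pay to reach a compute-optimal model's loss with a smaller model trained on more tokens. A sketch of that calculation; the constants are the published Chinchilla fits, the budget is an example:

```python
import numpy as np

E, A, B, a, b = 1.69, 406.4, 410.7, 0.34, 0.28   # Chinchilla fitted values

def loss(N, D):
    """Parametric training loss for N params on D tokens."""
    return E + A / N**a + B / D**b

def tokens_for_loss(N, target):
    """Invert L(N, D) = target for D at fixed model size N."""
    gap = target - E - A / N**a   # must be > 0, or size N can't reach target
    return (B / gap) ** (1 / b)

C = 6 * 70e9 * 1.4e12             # example budget: Chinchilla-70B scale
Ns = np.logspace(9, 12, 2000)     # scan 1B .. 1000B parameters
losses = loss(Ns, C / (6 * Ns))   # spend the whole budget at each size
N_opt, target = Ns[np.argmin(losses)], losses.min()

N_small = N_opt / 2                          # half the compute-optimal size...
D_small = tokens_for_loss(N_small, target)   # ...trained to the same loss
overhead = 6 * N_small * D_small / C
print(f"N_opt ~ {N_opt/1e9:.0f}B; half size needs {D_small/1e12:.1f}T tokens "
      f"= {overhead:.2f}x the compute")
```

The overhead comes out modest (tens of percent, not multiples), which is the post's point: the smaller model costs somewhat more to train to the same loss but is far cheaper to serve.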

8. LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

source: https://arxiv.org/pdf/2303.16199.pdf
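
The "zero-init attention" in the title is the key mechanism: learnable prompt tokens are injected into attention, but their contribution is scaled by a gate initialized to zero, so at step 0 the frozen LLaMA behaves exactly as before and the adapter fades in during training. A simplified PyTorch sketch (the paper applies the gate to the adapter tokens' scores inside the softmax; this version gates the adapter branch's output, but the zero-init principle is the same):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZeroInitGatedPrompt(nn.Module):
    """Learnable prompt tokens whose attention output is gated by a
    scalar initialized to zero (simplified LLaMA-Adapter-style gating)."""
    def __init__(self, n_prompt: int, d_model: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)
        self.gate = nn.Parameter(torch.zeros(1))       # the zero-init gate

    def forward(self, q, k, v):
        base = F.scaled_dot_product_attention(q, k, v)   # frozen-model path
        p = self.prompt.expand(q.size(0), -1, -1)        # (B, n_prompt, d)
        adapter = F.scaled_dot_product_attention(q, p, p)
        return base + torch.tanh(self.gate) * adapter    # exactly base at init

m = ZeroInitGatedPrompt(n_prompt=10, d_model=64)
q = k = v = torch.randn(2, 16, 64)
out = m(q, k, v)
assert torch.allclose(out, F.scaled_dot_product_attention(q, k, v))  # gate == 0
```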

9. Understanding Stable Diffusion with Illustrations (in Korean)

source: https://medium.com/@aldente0630/%EA%B7%B8%EB%A6%BC%EC%9C%BC%EB%A1%9C-%EC%9D%B4%ED%95%B4%ED%95%98%EB%8A%94-%EC%8A%A4%ED%85%8C%EC%9D%B4%EB%B8%94-%EB%94%94%ED%93%A8%EC%A0%84-61f8ec9d5bf
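
The article illustrates the standard Stable Diffusion pipeline component by component: a CLIP text encoder, a UNet that iteratively denoises in latent space under a scheduler, and a VAE decoder back to pixels. Those are the same components a Hugging Face `diffusers` pipeline bundles; a minimal usage sketch, with the model id, prompt, and step count as example assumptions:

```python
import torch
from diffusers import StableDiffusionPipeline

# Text encoder + UNet + scheduler + VAE, packaged as one pipeline.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=30,   # denoising steps in latent space
    guidance_scale=7.5,       # classifier-free guidance strength
).images[0]
image.save("lighthouse.png")
```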