Fine-Tune Your Own Llama 2 Model in a Colab Notebook (ML Blog, mlabonne.github.io)
Decoding Strategies in Large Language Models (ML Blog, mlabonne.github.io)
Introduction to Weight Quantization (ML Blog, mlabonne.github.io)
LLM Inference Series: 4. KV caching, a deeper look (medium.com)
In this post, we look at how big the KV cache, a common optimization for LLM inference, can grow, and at common mitigation strategies.
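The KV-cache post above is about how large the cache grows with context length. As a back-of-envelope companion, here is a minimal sketch of the standard size estimate (2 tensors, K and V, per layer per token); the Llama-2-7B-like dimensions below are illustrative assumptions, not taken from the post.

```python
def kv_cache_bytes(n_layers, n_heads, head_dim, seq_len, batch=1, bytes_per_elem=2):
    """Total KV cache size in bytes.

    The factor 2 accounts for one key and one value tensor per layer;
    each token stores n_heads * head_dim elements per tensor per layer.
    """
    return 2 * n_layers * n_heads * head_dim * bytes_per_elem * batch * seq_len

# Illustrative Llama-2-7B-like config: 32 layers, 32 heads, head_dim 128, fp16
size = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128, seq_len=4096)
print(f"{size / 2**30:.1f} GiB")  # 2.0 GiB for a single 4096-token sequence
```

At batch size 8 the same cache would already need 16 GiB, which is why mitigations like grouped-query attention and quantized caches matter.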
LLM Inference Series: 5. Dissecting model performance (medium.com)
In this post, we look deeper into the different types of bottlenecks that affect model latency and explain what arithmetic intensity is.
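Arithmetic intensity, the subject of the post above, is FLOPs performed per byte of memory traffic; it tells you whether a kernel is compute-bound or memory-bound. A minimal sketch for a matmul, assuming fp16 and the idealized lower bound where A and B are each read once and C is written once:

```python
def arithmetic_intensity(m, k, n, bytes_per_elem=2):
    """FLOPs per byte for C[m,n] = A[m,k] @ B[k,n] (idealized traffic)."""
    flops = 2 * m * k * n                          # one multiply + one add per MAC
    traffic = bytes_per_elem * (m * k + k * n + m * n)  # read A and B, write C
    return flops / traffic

# Large square matmul (prefill-like): high intensity, compute-bound
print(f"{arithmetic_intensity(4096, 4096, 4096):.0f} FLOPs/byte")  # ~1365
# Matrix-vector product (decoding one token): low intensity, memory-bound
print(f"{arithmetic_intensity(1, 4096, 4096):.2f} FLOPs/byte")     # ~1.00
```

The contrast is the key intuition: prefill can saturate the GPU's compute units, while token-by-token decoding is limited by memory bandwidth, which is also why the KV cache is worth optimizing.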
How GPT models work: for data scientists and ML engineers (Bea Stollnitz, bea.stollnitz.com)
The Transformer architecture of GPT models (Bea Stollnitz, bea.stollnitz.com)
Some intuitions about large language models (Jason Wei, www.jasonwei.net)
An open question these days is why large language models work so well. In this blog post I will discuss six basic intuitions about large language models, many of them inspired by manually examining data.