Daily-Trend-Review

24/02/04: Fine-tune your own Llama 2 model in a Colab notebook

hellcat 2024. 2. 4. 22:07

Fine-Tune Your Own Llama 2 Model in a Colab Notebook

 

ML Blog - Fine-Tune Your Own Llama 2 Model in a Colab Notebook

 

mlabonne.github.io

 

Decoding Strategies in Large Language Models

 

ML Blog - Decoding Strategies in Large Language Models

 

mlabonne.github.io

 

Introduction to Weight Quantization

 

ML Blog - Introduction to Weight Quantization

 

mlabonne.github.io

 

LLM Inference Series: 4. KV caching, a deeper look

In this post, we will look at how big the KV cache, a common optimization for LLM inference, can grow and at common mitigation strategies.

medium.com

 

LLM Inference Series: 5. Dissecting model performance

In this post, we look deeper into the different types of bottleneck that affect model latency and explain what arithmetic intensity is.

medium.com

 

How GPT models work: for data scientists and ML engineers

 

Bea Stollnitz - How GPT models work: for data scientists and ML engineers

Learn Azure ML and machine learning with Bea Stollnitz.

bea.stollnitz.com

 

The Transformer architecture of GPT models

 

Bea Stollnitz - The Transformer architecture of GPT models

bea.stollnitz.com

 

Some intuitions about large language models

 

Some intuitions about large language models — Jason Wei

An open question these days is why large language models work so well. In this blog post I will discuss six basic intuitions about large language models. Many of them are inspired by manually examining data, which is an exercise that I’ve found helpful a

www.jasonwei.net