Daily-Trend-Review

2023/10/27: transformer-math

hellcat 2023. 10. 27. 18:01

Transformer Math 101 (blog.eleuther.ai)

"We present basic math related to computation and memory usage for transformers."
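The rules of thumb that post covers can be sketched numerically. A minimal illustration, assuming the widely cited approximations (training compute ≈ 6·N·D FLOPs for N parameters and D tokens; fp16/bf16 weights at ~2 bytes per parameter); the function names and the 7B/1T example figures are mine, not from the post:

```python
# Back-of-the-envelope transformer estimates (illustrative, not exact):
# training compute ~= 6 * N * D FLOPs (2ND forward + 4ND backward),
# fp16/bf16 weight memory ~= 2 bytes per parameter.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for n_params over n_tokens."""
    return 6.0 * n_params * n_tokens

def fp16_weight_bytes(n_params: float) -> float:
    """Memory for the model weights alone at 2 bytes/parameter."""
    return 2.0 * n_params

# Example: a 7B-parameter model trained on 1T tokens.
flops = training_flops(7e9, 1e12)       # ~4.2e22 FLOPs
mem_gb = fp16_weight_bytes(7e9) / 1e9   # ~14 GB of weights
print(f"{flops:.2e} FLOPs, {mem_gb:.1f} GB weights")
```

Note this counts only weights; optimizer states and activations add a large multiple on top during training, which is the post's main point about memory.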

MemGPT: Towards LLMs As Operating Systems

Understanding the Performance of Transformer Inference (dspace.mit.edu)

Abstract: State-of-the-art results in natural language processing tasks have been obtained by scaling up transformer-based machine learning models, which can have more than a hundred billion parameters. Training and deploying these models can be difficult.

Efficient Memory Management for Large Language Model Serving with PagedAttention
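PagedAttention targets the KV cache, which grows linearly with sequence length and can rival the weights themselves in memory. A rough sizing sketch of that cache; the model shape below (32 layers, 32 KV heads, head dim 128, fp16) is an assumed 7B-class configuration for illustration, not taken from the paper:

```python
# Rough KV-cache size for a decoder-only transformer:
# per-token bytes = 2 (K and V) * n_layers * n_kv_heads * head_dim * bytes/elem.
# The default shape is an assumed 7B-class model, fp16 (2 bytes/elem).

def kv_cache_bytes(seq_len: int, n_layers: int = 32, n_kv_heads: int = 32,
                   head_dim: int = 128, bytes_per_elem: int = 2) -> int:
    """Total KV-cache bytes for one sequence of length seq_len."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return seq_len * per_token

# One 4096-token sequence: 524288 bytes/token -> 2 GiB for a single request.
print(kv_cache_bytes(4096) / 2**30, "GiB")
```

At ~2 GiB per 4K-token request under these assumptions, fragmentation and over-reservation of contiguous cache are costly, which is the waste PagedAttention's block-based allocation is designed to avoid.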

 
