Transformer Math 101
We present basic math related to computation and memory usage for transformers.
blog.eleuther.ai
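The post's best-known rules of thumb are C ≈ 6PD training FLOPs for a P-parameter model trained on D tokens, and roughly 2 bytes per parameter to hold fp16/bf16 weights at inference. A minimal sketch of that arithmetic (the 70B/2T figures below are illustrative choices, not taken from the post):

```python
# Back-of-the-envelope transformer math, in the spirit of the linked post.
# Assumptions: C ~= 6 * P * D training FLOPs, ~2 bytes/parameter for
# fp16/bf16 weights. The 70B / 2T figures are illustrative only.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute: forward (2PD) + backward (4PD)."""
    return 6.0 * params * tokens

def fp16_weight_bytes(params: float) -> float:
    """Approximate memory to hold fp16/bf16 weights: 2 bytes per parameter."""
    return 2.0 * params

if __name__ == "__main__":
    P, D = 70e9, 2e12  # hypothetical 70B-parameter model, 2T training tokens
    print(f"training compute ~ {training_flops(P, D):.2e} FLOPs")       # ~8.40e+23
    print(f"fp16 weights     ~ {fp16_weight_bytes(P) / 2**30:.0f} GiB")  # ~130 GiB
```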
MemGPT: Towards LLMs As Operating Systems
Understanding the Performance of Transformer Inference
Abstract: The state-of-the-art results in natural language processing tasks have been obtained by scaling up transformer-based machine learning models, which can have more than a hundred billion parameters. Training and deploying these models can be difficult…
dspace.mit.edu
Efficient Memory Management for Large Language Model Serving with PagedAttention
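The core idea in the PagedAttention paper is to store each sequence's KV cache in fixed-size blocks and map logical positions to physical blocks through a per-sequence block table, analogous to virtual-memory paging in an operating system. A minimal sketch of that bookkeeping (block size and all class/function names here are hypothetical, not the vLLM API):

```python
# Minimal sketch of PagedAttention-style KV-cache bookkeeping: fixed-size
# blocks plus a per-sequence block table, analogous to OS paging.
# BLOCK_SIZE and all names are hypothetical, not the actual vLLM API.

BLOCK_SIZE = 16  # tokens of K/V stored per physical block

class BlockAllocator:
    def __init__(self, num_physical_blocks: int):
        self.free = list(range(num_physical_blocks))  # free physical block IDs

    def allocate(self) -> int:
        return self.free.pop()  # take any free block; raises when exhausted

class Sequence:
    def __init__(self, allocator: BlockAllocator):
        self.allocator = allocator
        self.block_table: list[int] = []  # logical block index -> physical block ID
        self.num_tokens = 0

    def append_token(self) -> tuple[int, int]:
        """Reserve a KV slot; return (physical_block, offset) for the new token."""
        if self.num_tokens % BLOCK_SIZE == 0:  # current block is full (or none yet)
            self.block_table.append(self.allocator.allocate())
        slot = (self.block_table[-1], self.num_tokens % BLOCK_SIZE)
        self.num_tokens += 1
        return slot

# Blocks are allocated on demand, so waste is bounded by one partially filled
# block per sequence rather than a preallocated maximum-length buffer.
alloc = BlockAllocator(num_physical_blocks=1024)
seq = Sequence(alloc)
for _ in range(20):
    block, offset = seq.append_token()
print(seq.block_table)  # two physical blocks cover 20 tokens
```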