S-LORA: SERVING THOUSANDS OF CONCURRENT LORA ADAPTERS
SIMPLIFYING TRANSFORMER BLOCKS
Alternating updates for efficient transformers
Posted by Xin Wang, Software Engineer, and Nishanth Dikkala, Research Scientist, Google Research. Contemporary deep learning models have been remarkably successful in many domains, ranging from natural language to computer vision. Transformer neural network…
GPT4All: An Ecosystem of Open Source Compressed Language Models