S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Simplifying Transformer Blocks
Alternating Updates for Efficient Transformers
GPT4All: An Ecosystem of Open Source Compressed Language Models