1. How does GPT-3 spend its 175B parameters?
source: https://www.lesswrong.com/posts/3duR8CrvcHywrnhLo/how-does-gpt-3-spend-its-175b-parameters