- LLM Inference - HW/SW Optimizations
- Optimizing your LLM in production
- Reducing Activation Recomputation in Large Transformer Models
- Cornell ECE 5545: ML HW & Systems, Lecture 7: Quantization
- Nvidia Unveils Most Powerful GPU: Blackwell B200 Unleashes AI Performance Speed
- State-space LLMs: Do we need Attention?
- Open Source AI is AI we can Trust — with Soumith Chintala of Meta AI
- A little guide to building Large Language Models in 2024
- FP8-LM: Training FP8 Large Language Models