1. AI Canon
source: https://a16z.com/2023/05/25/ai-canon/
2. State of GPT
source: https://build.microsoft.com/en-US/sessions/db3f4859-cd30-4445-a0cd-553c3304f8e2
3. VOYAGER: An Open-Ended Embodied Agent with Large Language Models
source: https://arxiv.org/pdf/2305.16291.pdf
4. LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
source: https://arxiv.org/pdf/2303.16199.pdf
5. QLoRA: Efficient Finetuning of Quantized LLMs
source: https://arxiv.org/pdf/2305.14314.pdf
6. Why we should train smaller LLMs on more tokens
source: https://www.harmdevries.com/post/model-size-vs-compute-overhead/
7. Scaling Data-Constrained Language Models
source: https://arxiv.org/pdf/2305.16264.pdf
8. Deploying Large NLP Models: Infrastructure Cost Optimization
source: https://neptune.ai/blog/nlp-models-infrastructure-cost-optimization
9. AI & Compute