Multi-Query Attention is All You Need
source: https://blog.fireworks.ai/multi-query-attention-is-all-you-need-db072e758055
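Multi-query attention keeps many query heads but shares a single key/value head across all of them, which shrinks the KV cache and speeds up decoding. A minimal NumPy sketch of the idea (shapes and names are illustrative, not taken from the post):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_query_attention(x, wq, wk, wv, n_heads):
    """MQA: n_heads query projections, but one shared K/V head."""
    seq, d_model = x.shape
    d_head = d_model // n_heads
    q = (x @ wq).reshape(seq, n_heads, d_head)  # per-head queries
    k = x @ wk                                  # single shared K: (seq, d_head)
    v = x @ wv                                  # single shared V: (seq, d_head)
    out = np.empty((seq, n_heads, d_head))
    for h in range(n_heads):
        scores = q[:, h, :] @ k.T / np.sqrt(d_head)
        out[:, h, :] = softmax(scores) @ v      # every head reads the same K/V
    return out.reshape(seq, d_model)

# toy usage
rng = np.random.default_rng(0)
seq, d_model, n_heads = 8, 64, 8
d_head = d_model // n_heads
x  = rng.normal(size=(seq, d_model))
wq = rng.normal(size=(d_model, d_model))   # n_heads * d_head query dims
wk = rng.normal(size=(d_model, d_head))    # one head's worth of K params
wv = rng.normal(size=(d_model, d_head))
print(multi_query_attention(x, wq, wk, wv, n_heads).shape)  # (8, 64)
```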
LLaMA 2: The Dawn of a New Era
source: https://betterprogramming.pub/the-dawn-of-a-new-era-llama2-b0b1a9175029
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
source: https://crfm.stanford.edu/2023/07/17/flash2.html
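FlashAttention computes exact attention tile by tile, carrying a running row max and softmax denominator so the full seq×seq score matrix is never materialized; FlashAttention-2's contribution is better parallelism and work partitioning on the GPU. A toy NumPy rendering of the online-softmax tiling (just the algebra, nothing like the fused CUDA kernel):

```python
import numpy as np

def tiled_attention(q, k, v, block=4):
    """Exact attention accumulated over K/V tiles with a running
    max (m) and normalizer (l), i.e. the online-softmax trick."""
    seq, d = q.shape
    out = np.zeros((seq, d))
    m = np.full(seq, -np.inf)    # running row max
    l = np.zeros(seq)            # running softmax denominator
    scale = 1.0 / np.sqrt(d)
    for start in range(0, seq, block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T * scale                  # (seq, block) score tile
        m_new = np.maximum(m, s.max(axis=1))
        p = np.exp(s - m_new[:, None])        # tile probabilities
        corr = np.exp(m - m_new)              # rescale old accumulators
        l = l * corr + p.sum(axis=1)
        out = out * corr[:, None] + p @ vb
        m = m_new
    return out / l[:, None]

def naive_attention(q, k, v):
    s = q @ k.T / np.sqrt(q.shape[1])
    p = np.exp(s - s.max(axis=1, keepdims=True))
    return (p / p.sum(axis=1, keepdims=True)) @ v

rng = np.random.default_rng(1)
q, k, v = (rng.normal(size=(8, 16)) for _ in range(3))
print(np.allclose(tiled_attention(q, k, v), naive_attention(q, k, v)))  # True
```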
Can Longer Sequences Help Take the Next Leap in AI?
source: https://ai.stanford.edu/blog/longer-sequences-next-leap-ai/
How does in-context learning work? A framework for understanding the differences from traditional supervised learning
source: https://ai.stanford.edu/blog/understanding-incontext/
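The key mechanic the post analyzes is that in-context learning happens entirely in the prompt: labeled demonstrations let the model infer the task without any gradient update. A minimal sketch of how demonstrations are packed into one prompt (the format and examples are illustrative, not from the article):

```python
# In-context learning: demonstrations define the task inside the prompt
# itself; no weights change. Format and examples are illustrative.
demos = [
    ("The movie was a delight.", "positive"),
    ("I want my money back.",    "negative"),
]
query = "The plot dragged, but the acting saved it."

prompt = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in demos)
prompt += f"\nReview: {query}\nSentiment:"

print(prompt)  # send this to any LLM; the demos locate the task in context
```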
Generative AI - Learn the LangChain Basics by Building a Berlin Travel Guide
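The tutorial's core pattern is a prompt template piped into an LLM through a chain. A minimal sketch using the 2023-era LangChain API (PromptTemplate + LLMChain); the tutorial's actual code may differ, and an OpenAI API key is assumed to be configured:

```python
# Minimal LangChain pattern: template -> LLM -> chain (2023-era API).
# Assumes OPENAI_API_KEY is set; the prompt text is illustrative.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["city", "days"],
    template="Write a {days}-day travel itinerary for {city}.",
)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(city="Berlin", days="3"))
```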
Augmenting Language Models with Long-Term Memory
source: https://arxiv.org/abs/2306.07174
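The paper's idea is to let the model consult a bank of cached representations from far earlier in the input. A generic retrieve-then-read sketch of that pattern; the paper's actual design (cached attention key-value pairs read by a decoupled residual side-network) is more involved, and everything below is a simplified illustration:

```python
import numpy as np

class MemoryBank:
    """Generic long-term memory: cache past (key, value) representations,
    retrieve the most similar entries for the current query. This is a
    simplified stand-in for the paper's cached-KV memory, not its design."""
    def __init__(self, dim):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))

    def write(self, k, v):
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])

    def read(self, query, top_k=4):
        # cosine similarity between the query and every cached key
        sims = self.keys @ query / (
            np.linalg.norm(self.keys, axis=1) * np.linalg.norm(query) + 1e-9
        )
        idx = np.argsort(sims)[-top_k:]
        return self.values[idx]      # retrieved memories to attend over

rng = np.random.default_rng(2)
bank = MemoryBank(dim=32)
bank.write(rng.normal(size=(128, 32)), rng.normal(size=(128, 32)))  # past context
print(bank.read(rng.normal(size=32), top_k=4).shape)  # (4, 32)
```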