https://twitter.com/virattt/status/1778828787951546382
virat (@virattt) on X
Friday is LLM battle day. I added DBRX to the financial metrics challenge. Overall, very impressed with DBRX. Main takeaways: • correctly calculated metrics • ranked top 4 fastest models • competitive pricing. DBRX was +50% cheaper and +100% faster th…
twitter.com
llm-pricing-cost-quality.ipynb
Run, share, and edit Python notebooks
colab.research.google.com
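The "+50% cheaper" comparison in the tweet reduces to simple per-token arithmetic over input and output prices. A minimal sketch of that kind of cost-per-query estimate is below; the model names and per-million-token prices are illustrative placeholders, not the actual rates used in the notebook.

```python
# Rough sketch of an LLM cost comparison per query.
# Prices are illustrative placeholders (USD per 1M tokens), not actual vendor rates.
PRICES = {
    "dbrx":  {"input": 1.20, "output": 1.20},
    "gpt-4": {"input": 10.00, "output": 30.00},
}

def query_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single query in USD."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

if __name__ == "__main__":
    for model in PRICES:
        cost = query_cost(model, input_tokens=2_000, output_tokens=500)
        print(f"{model}: ${cost:.4f} per query")
```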
https://coconut-mode.com/posts/ring-attention/
Ring Attention Explained | Coconut Mode
Near infinite context window for language models.
coconut-mode.com
https://twitter.com/bonniesjli/status/1778846068588814486
Bonnie Li (@bonniesjli) on X
How do LLMs scale to a million-token context window? Ring Attention is a nice trick to parallelize long sequences across devices and rotate them in a ring with zero-overhead scaling. In our new blog, we cover the tricks behind this magic. It looks like this (…
twitter.com
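A rough single-process sketch of the mechanism described above, assuming the blockwise formulation from the blog post: the sequence is split into blocks (one per "device"), each device keeps its query block, and key/value blocks are rotated around the ring while attention is accumulated with an online softmax, so the full attention matrix is never materialized on any one device. This is a NumPy illustration of the idea, not the actual distributed implementation.

```python
import numpy as np

def ring_attention(q, k, v, num_devices):
    """Simulate Ring Attention on one host.

    The sequence is split into `num_devices` blocks. Each "device" holds one
    query block; key/value blocks are passed around the ring, and attention is
    accumulated blockwise with an online (streaming) softmax.
    """
    seq_len, d = q.shape
    qs, ks, vs = np.split(q, num_devices), np.split(k, num_devices), np.split(v, num_devices)

    outputs = []
    for i in range(num_devices):                      # each device's query block
        q_i = qs[i]
        m = np.full((q_i.shape[0], 1), -np.inf)       # running row-wise max
        l = np.zeros((q_i.shape[0], 1))               # running softmax denominator
        acc = np.zeros_like(q_i)                      # running numerator (weighted V)
        for step in range(num_devices):               # KV blocks arrive in ring order
            j = (i + step) % num_devices
            s = q_i @ ks[j].T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=-1, keepdims=True))
            p = np.exp(s - m_new)
            scale = np.exp(m - m_new)
            acc = acc * scale + p @ vs[j]
            l = l * scale + p.sum(axis=-1, keepdims=True)
            m = m_new
        outputs.append(acc / l)
    return np.concatenate(outputs)

# Sanity check against ordinary full attention.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
s = q @ k.T / np.sqrt(8)
ref = np.exp(s - s.max(-1, keepdims=True))
ref = ref / ref.sum(-1, keepdims=True) @ v
assert np.allclose(ring_attention(q, k, v, num_devices=4), ref)
```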
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
https://arxiv.org/abs/2404.07143
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention. T…
arxiv.org
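As far as the abstract goes, Infini-attention augments ordinary per-segment attention with a fixed-size compressive memory that is updated segment by segment via a linear-attention rule and read back through the queries, which is what keeps memory and compute bounded for arbitrarily long inputs. A rough NumPy sketch of that update-and-retrieve loop is below; the ELU+1 feature map, the scalar gate `beta`, and all names are assumptions made for illustration rather than the paper's exact parameterization.

```python
import numpy as np

def elu_plus_one(x):
    # Non-negative feature map (ELU + 1), a common choice for linear attention (assumption).
    return np.where(x > 0, x + 1.0, np.exp(x))

def infini_attention_segment(q, k, v, memory, norm, beta=0.0):
    """Process one segment with local attention plus a compressive memory.

    memory: (d_key, d_value) matrix accumulated over past segments.
    norm:   (d_key,) normalization vector (sum of past key features).
    beta:   scalar gate mixing memory retrieval with local attention.
    """
    d_key = q.shape[-1]

    # 1. Retrieve from the compressive memory with the current queries.
    sigma_q = elu_plus_one(q)
    a_mem = (sigma_q @ memory) / (sigma_q @ norm[:, None] + 1e-6)

    # 2. Ordinary (causal) dot-product attention within the segment.
    scores = q @ k.T / np.sqrt(d_key)
    scores = np.where(np.tril(np.ones_like(scores)) > 0, scores, -np.inf)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    a_local = (weights / weights.sum(-1, keepdims=True)) @ v

    # 3. Gate the two read-outs together.
    g = 1.0 / (1.0 + np.exp(-beta))
    out = g * a_mem + (1.0 - g) * a_local

    # 4. Update the memory and normalizer with this segment's keys/values.
    sigma_k = elu_plus_one(k)
    memory = memory + sigma_k.T @ v
    norm = norm + sigma_k.sum(axis=0)
    return out, memory, norm

# Stream an arbitrarily long input segment by segment with bounded state.
rng = np.random.default_rng(0)
d_key, d_value, seg_len = 8, 8, 16
memory, norm = np.zeros((d_key, d_value)), np.zeros(d_key)
for _ in range(4):  # four segments of a longer sequence
    q, k, v = (rng.standard_normal((seg_len, d_key)) for _ in range(3))
    out, memory, norm = infini_attention_segment(q, k, v, memory, norm)
```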