Are all LLMs really 1.58 bits? Inference at 4x the speed or more?
Dive deep into changes to the Transformer architecture to learn how researchers have discovered a huge speedup in LLM inference.
learning-exhaust.hashnode.dev
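The "1.58 bits" in the title refers to ternary weights in {-1, 0, +1} (log2(3) ≈ 1.58 bits per weight); the speedup comes from replacing floating-point multiplies in the matmuls with additions and subtractions. As a rough illustration only, here is a minimal absmean-style ternary quantization sketch in Python; the function names are hypothetical and the article's (and the underlying 1.58-bit work's) exact recipe may differ.

```python
import numpy as np

def ternary_quantize(W: np.ndarray):
    """Quantize weights to {-1, 0, +1} with one per-tensor scale (absmean style).
    Illustrative sketch only, not the article's exact algorithm."""
    scale = float(np.mean(np.abs(W))) + 1e-8
    W_q = np.clip(np.round(W / scale), -1, 1).astype(np.int8)
    return W_q, scale

def ternary_matmul(x: np.ndarray, W_q: np.ndarray, scale: float) -> np.ndarray:
    """With ternary weights the matmul needs only adds/subtracts of activations,
    which is where the claimed inference speedup comes from."""
    return (x @ W_q) * scale

# Tiny usage example: compare against the full-precision product.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8)).astype(np.float32)
x = rng.normal(size=(2, 16)).astype(np.float32)
W_q, s = ternary_quantize(W)
print(np.abs(x @ W - ternary_matmul(x, W_q, s)).mean())  # rough approximation error
```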
Latent Space | swyx & Alessio | Substack
The AI Engineer newsletter + Top 10 US Tech podcast. Exploring AI UX, Agents, Devtools, Infra, Open Source Models. See https://latent.space/about for highlights from Chris Lattner, Andrej Karpathy, George Hotz, Simon Willison, Emad Mostaque, et al!
www.latent.space
Comparison of Models: Quality, Performance & Price Analysis
https://artificialanalysis.ai/models
Comparison of AI Models across Quality, Performance, Price | Artificial Analysis
Comparison and analysis of AI models across key metrics including quality, price, performance and speed (throughput tokens per second & latency), context window & others.
artificialanalysis.ai
https://www.perplexity.ai/hub/blog/turbocharging-llama-2-70b-with-nvidia-h100
Turbocharging Llama 2 70B with NVIDIA H100
The journey of accelerated LLM inference
www.perplexity.ai
https://medium.com/@plienhar/llm-inference-series-4-kv-caching-a-deeper-look-4ba9a77746c8
LLM Inference Series: 4. KV caching, a deeper look
In this post, we will look at how big the KV cache, a common optimization for LLM inference, can grow and at common mitigation strategies.
medium.com
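As a back-of-the-envelope companion to that post: the cache stores one K and one V tensor per layer, so its size grows linearly with context length and batch size. A minimal sketch follows; the Llama-2-70B-like shapes are assumptions for illustration, not figures taken from the post.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV-cache size: 2 tensors (K and V) per layer, each of shape
    [batch_size, num_kv_heads, seq_len, head_dim], stored at bytes_per_elem (fp16 = 2)."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem

# Assumed Llama-2-70B-like shapes: 80 layers, 8 KV heads (GQA), head_dim 128, fp16.
print(kv_cache_bytes(80, 8, 128, seq_len=4096, batch_size=1) / 2**30)  # ~1.25 GiB per sequence
```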
https://bea.stollnitz.com/blog/gpt-transformer/
Bea Stollnitz - The Transformer architecture of GPT models
Learn Azure ML and machine learning with Bea Stollnitz.
bea.stollnitz.com
Smarter, not Bigger — 1 Million token context is not all you need!
Open-source models on the test bench to verify long-context information extraction. Are they really that good?
medium.com
MEGALODON: Efficient LLM Pretraining and Inference with Unlimited Context Length
https://arxiv.org/pdf/2404.08801.pdf