https://arxiv.org/abs/1904.10509
Generating Long Sequences with Sparse Transformers
Transformers are powerful sequence models, but require time and memory that grows quadratically with the sequence length. In this paper we introduce sparse factorizations of the attention matrix which reduce this to $O(n \sqrt{n})$. We also introduce a) a variation on architecture and initialization to train deeper networks, b) the recomputation of attention matrices to save memory, and c) fast attention kernels for training.
arxiv.org
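The factorization replaces dense attention with fixed sparsity patterns. A minimal sketch of the strided pattern the abstract refers to (`strided_sparse_mask` is a hypothetical helper, not the authors' code): each query attends to a local window of the previous `stride` positions plus every stride-th "summary" column, so each row has on the order of sqrt(n) nonzeros when stride is about sqrt(n).

```python
import torch

def strided_sparse_mask(n: int, stride: int) -> torch.Tensor:
    """Illustrative strided sparsity pattern (a sketch, not the paper's code)."""
    idx = torch.arange(n)
    i, j = torch.meshgrid(idx, idx, indexing="ij")
    causal = j <= i                          # decoder-style causal masking
    local = (i - j) < stride                 # local-window head
    summary = (j % stride) == (stride - 1)   # strided "summary" head
    return causal & (local | summary)

mask = strided_sparse_mask(n=16, stride=4)
print(mask.sum().item())  # roughly n*sqrt(n) attended pairs, not n^2
```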
https://arxiv.org/abs/2004.05150v2
Longformer: The Long-Document Transformer
Transformer-based models are unable to process long sequences due to their self-attention operation, which scales quadratically with the sequence length. To address this limitation, we introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or longer.
arxiv.org
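A sketch of the windowed-plus-global pattern behind that linear scaling (`longformer_style_mask` is a hypothetical illustration, not the Longformer implementation): each token attends only to a fixed-size local window, and a few designated tokens attend globally, so attended pairs grow as O(n * window) rather than O(n^2).

```python
import torch

def longformer_style_mask(n: int, window: int, global_idx=()) -> torch.Tensor:
    """Illustrative sliding-window mask with optional global tokens."""
    idx = torch.arange(n)
    i, j = torch.meshgrid(idx, idx, indexing="ij")
    mask = (i - j).abs() <= window // 2  # local window: O(n * window) pairs
    for g in global_idx:                 # global tokens attend everywhere
        mask[g, :] = True                # and are attended by every token
        mask[:, g] = True
    return mask

mask = longformer_style_mask(n=4096, window=512, global_idx=(0,))
print(mask.sum().item())  # grows linearly in n for a fixed window
```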
GeekNews - a news service covering development, tech, and startups
A news site for people who like dev news, tech updates, startup info and know-how, and fun things from around the world. Subscribe via email newsletter, Twitter, or Slack bot.
news.hada.io
Paper Summary #8 - FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Paper: FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Link: https://arxiv.org/abs/2205.14135
Authors: Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
Code: https://github.com/HazyResearch/flash-attention
shreyansh26.github.io
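The core trick the paper's title alludes to is computing exact softmax attention over key/value blocks so the full n-by-n score matrix is never materialized in slow memory. Below is a simplified single-head reference of that streaming-softmax math; the real kernel fuses the loop into one pass over GPU SRAM, and this sketch only shows the numerics.

```python
import torch

def blockwise_attention(q, k, v, block_size=64):
    """Online softmax over K/V blocks (single-head sketch). q, k, v: (n, d)."""
    n, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((n, 1), float("-inf"))  # running row maxima
    row_sum = torch.zeros(n, 1)                  # running softmax denominators
    for start in range(0, n, block_size):
        kb = k[start:start + block_size]
        vb = v[start:start + block_size]
        scores = (q @ kb.T) * scale              # (n, block) partial scores
        new_max = torch.maximum(row_max, scores.max(dim=-1, keepdim=True).values)
        correction = torch.exp(row_max - new_max)  # rescale earlier statistics
        p = torch.exp(scores - new_max)
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        out = out * correction + p @ vb
        row_max = new_max
    return out / row_sum

q, k, v = (torch.randn(128, 32) for _ in range(3))
ref = torch.softmax((q @ k.T) * 32 ** -0.5, dim=-1) @ v
assert torch.allclose(blockwise_attention(q, k, v), ref, atol=1e-4)
```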
FlashDecoding++: Faster Large Language Model Inference on GPUs
7 Ways to Speed Up Inference of Your Hosted LLMs
TL;DR: techniques to speed up LLM inference, increasing token-generation speed and reducing memory consumption
betterprogramming.pub
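Most such speed-ups trade memory for compute or vice versa, and the KV cache is the usual place to start. A back-of-the-envelope estimate of its size (the model dimensions below are illustrative assumptions, not figures from the article):

```python
def kv_cache_bytes(n_layers, n_heads, head_dim, seq_len, batch, dtype_bytes=2):
    """KV-cache footprint: 2 tensors (K and V) per layer, each of shape
    (batch, n_heads, seq_len, head_dim), at dtype_bytes per element (2 = fp16)."""
    return 2 * n_layers * n_heads * head_dim * seq_len * batch * dtype_bytes

# e.g. a 7B-class model (32 layers, 32 heads, head_dim 128) at 4k context:
print(kv_cache_bytes(32, 32, 128, 4096, batch=1) / 2**30, "GiB")  # 2.0 GiB
```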
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Open Source Society University
GitHub - ossu/computer-science: Path to a free self-taught education in Computer Science!
github.com
Multi-Model RAG Stack