Daily-Trend-Review

2023/07/21: MQA, LLaMA 2, FlashAttention-2

hellcat 2023. 7. 21. 08:12

Multi-Query Attention is All You Need

source: https://blog.fireworks.ai/multi-query-attention-is-all-you-need-db072e758055

 

by James K Reed, Dmytro Dzhulgakov, Dmytro Ivchenko, and Lin Qiao

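The core idea behind multi-query attention is that the model keeps many query heads but shares a single key/value head across all of them, which shrinks the key/value cache that dominates serving cost. A minimal numpy sketch of that idea (shapes and names are illustrative, not the post's code):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_query_attention(x, Wq, Wk, Wv, n_heads):
    """Multi-query attention: n_heads query heads share ONE key/value head.

    x:  (seq, d_model)
    Wq: (d_model, n_heads * d_head)  -- per-head query projections
    Wk: (d_model, d_head)            -- single shared key projection
    Wv: (d_model, d_head)            -- single shared value projection
    """
    seq, _ = x.shape
    d_head = Wk.shape[1]
    q = (x @ Wq).reshape(seq, n_heads, d_head)   # (seq, heads, d_head)
    k = x @ Wk                                    # (seq, d_head), shared
    v = x @ Wv                                    # (seq, d_head), shared
    # Every query head attends over the same shared keys.
    scores = np.einsum("shd,td->hst", q, k) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)               # (heads, seq, seq)
    out = np.einsum("hst,td->shd", attn, v)       # (seq, heads, d_head)
    return out.reshape(seq, n_heads * d_head)
```

During autoregressive decoding this shrinks the cached keys/values by a factor of `n_heads` relative to multi-head attention, which is the serving-throughput win the post discusses.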

 

LLaMA 2: The Dawn of a New Era

source: https://betterprogramming.pub/the-dawn-of-a-new-era-llama2-b0b1a9175029

 

Key differences from LLaMA 1, safety & violations, Ghost Attention and model performance.


FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

source: https://crfm.stanford.edu/2023/07/17/flash2.html

 

Just within the last year, there have been several language models with much longer context than before: GPT-4 with context length 32k, MosaicML’s MPT with context length 65…

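FlashAttention computes exact attention in tiles without ever materializing the full score matrix. For contrast, here is a numpy sketch of the naive baseline it improves on, whose intermediate score matrix grows quadratically with sequence length (this is not the actual CUDA kernel):

```python
import numpy as np

def naive_attention(q, k, v):
    """Baseline attention that materializes the full (seq x seq) score
    matrix -- the quadratic memory cost FlashAttention avoids by
    computing the same result block-by-block in fast on-chip memory."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)             # (seq, seq): O(seq^2) memory
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v                              # (seq, d)
```

At the 32k–65k context lengths the post mentions, that quadratic intermediate is what makes attention the bottleneck, which is exactly the cost FlashAttention-2's tiling and work partitioning attack.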

 

Can Longer Sequences Help Take the Next Leap in AI?

source: https://ai.stanford.edu/blog/longer-sequences-next-leap-ai/

 

Deep learning has revolutionized machine learning. To a first approximation, deeper has been better. However, there is another dimension to scale these models: the size of the input. Even the world’s most impressive models can only process long-form content…


How does in-context learning work? A framework for understanding the differences from traditional supervised learning

source: https://ai.stanford.edu/blog/understanding-incontext/

 


 

Generative AI - Learn the LangChain Basics by Building a Berlin Travel Guide

source: https://medium.com/google-cloud/generative-ai-learn-the-langchain-basics-by-building-a-berlin-travel-guide-5cc0a2ce4096

 

LangChain is a framework that’s like a Swiss army knife for large language models (LLMs).


 

Augmenting Language Models with Long-Term Memory

source: https://arxiv.org/abs/2306.07174

 

Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models Augmented with Long-Term Memory…

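The general idea the abstract describes — caching past context and retrieving the most relevant pieces for the current input instead of fitting everything in a fixed-size window — can be sketched as a toy lookup. This is only an illustration of the retrieval principle; the names and dot-product similarity are my assumptions, not the paper's actual architecture:

```python
import numpy as np

def retrieve_memory(query, mem_keys, mem_values, top_k=2):
    """Toy long-term-memory lookup (illustrative, not the paper's method):
    past inputs are cached as (key, value) vector pairs, and the current
    query fetches its top_k most similar cached entries, sidestepping the
    fixed input-length limit."""
    sims = mem_keys @ query            # dot-product similarity per cached entry
    top = np.argsort(-sims)[:top_k]    # indices of the best-matching entries
    return mem_values[top]
```

The retrieved vectors would then be fused back into the model's current context, so information from arbitrarily old inputs can still influence the prediction.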