Daily-Trend-Review

24/03/31: Transformer math 101

hellcat 2024. 3. 31. 07:44

Transformer math 101

 


We present basic math related to computation and memory usage for transformers

blog.eleuther.ai
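As a quick companion to the post, the commonly quoted rules of thumb are roughly 6·P·D FLOPs for training (P parameters, D tokens), about 2·P FLOPs per token for inference, ~2 bytes/parameter for bf16 weights, and on the order of 16-20 bytes/parameter during mixed-precision Adam training before activations. A minimal back-of-the-envelope sketch with illustrative numbers only, not tied to any specific model:

```python
# Back-of-the-envelope estimates in the spirit of the Transformer Math 101 post.
# The constants (6*P*D, 2*P, bytes/param) are common rules of thumb, not exact
# values for any particular model or training setup.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute: C ~= 6 * P * D."""
    return 6 * params * tokens

def inference_flops_per_token(params: float) -> float:
    """Approximate forward-pass compute per generated token: ~2 * P FLOPs."""
    return 2 * params

def memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight-related memory only (no activations or KV cache)."""
    return params * bytes_per_param / 1e9

if __name__ == "__main__":
    P, D = 7e9, 2e12  # e.g. a 7B-parameter model trained on 2T tokens
    print(f"training FLOPs        ~ {training_flops(P, D):.2e}")
    print(f"inference FLOPs/token ~ {inference_flops_per_token(P):.2e}")
    print(f"bf16 weights          ~ {memory_gb(P, 2):.0f} GB")
    print(f"mixed-precision Adam  ~ {memory_gb(P, 16):.0f} GB (rule of thumb)")
```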

 

LLM inference - HW/SW Optimizations

 

LLM Inference - HW/SW Optimizations | Notion

The original article is being translated and reviewed with permission from its author on LinkedIn (Sharada Yeluri).

tulip-phalange-a1e.notion.site

 

Optimizing your LLM in production

 


Note: This blog post is also available as a documentation page on Transformers. Large Language Models (LLMs) such as GPT3/4, Falcon, and Llama are rapidly advancing in their ability to tackle human-centric tasks, establish…

huggingface.co
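One of the standard serving-time optimizations in this space is key-value caching: reuse the keys and values of already-decoded tokens instead of recomputing them for the whole prefix at every step. A minimal single-head sketch, with random stand-in projection weights (names and shapes are illustrative, not taken from the post):

```python
import numpy as np

# Minimal KV-cache sketch for autoregressive decoding: each step appends the
# new token's key and value to a cache and attends over the cached sequence,
# turning per-step attention cost from O(T^2) recomputation into O(T) reuse.

d = 64
W_q, W_k, W_v = (np.random.randn(d, d) * 0.02 for _ in range(3))  # stand-in weights
K_cache, V_cache = [], []

def decode_step(x_t: np.ndarray) -> np.ndarray:
    q = x_t @ W_q
    K_cache.append(x_t @ W_k)            # grow the cache by one entry
    V_cache.append(x_t @ W_v)
    K, V = np.stack(K_cache), np.stack(V_cache)
    att = np.exp(K @ q / np.sqrt(d))     # unnormalized attention over the cache
    att /= att.sum()
    return att @ V                       # attention output for the current token

for _ in range(5):
    out = decode_step(np.random.randn(d))
print("cache length:", len(K_cache), "| output shape:", out.shape)
```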

 

Reducing Activation Recomputation in Large Transformer Models
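The baseline this paper improves on is full activation checkpointing: drop a block's intermediate activations during the forward pass and recompute them in the backward pass, trading compute for memory. A minimal PyTorch sketch of that baseline only; the paper's selective recomputation and sequence parallelism are not shown:

```python
import torch
from torch.utils.checkpoint import checkpoint

# Plain activation checkpointing: the block's intermediate activations are not
# stored in the forward pass and are recomputed during backward.

class Block(torch.nn.Module):
    def __init__(self, d: int):
        super().__init__()
        self.ff = torch.nn.Sequential(
            torch.nn.Linear(d, 4 * d), torch.nn.GELU(), torch.nn.Linear(4 * d, d)
        )

    def forward(self, x):
        return x + self.ff(x)

block = Block(256)
x = torch.randn(8, 128, 256, requires_grad=True)
y = checkpoint(block, x, use_reentrant=False)  # activations recomputed in backward
y.sum().backward()
```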

 

Cornell ECE 5545: ML HW & Systems. Lecture 7: Quantization
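For reference, the simplest scheme covered in quantization material like this lecture is symmetric per-tensor int8 quantization. A minimal NumPy sketch, illustrative only and not the lecture's code; asymmetric, per-channel, and quantization-aware variants are omitted:

```python
import numpy as np

# Symmetric per-tensor int8 quantization: map the largest magnitude to 127.

def quantize_int8(x: np.ndarray):
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(x)
print("max abs error:", np.abs(x - dequantize(q, s)).max())
```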

 

NVIDIA Unveils Its Most Powerful GPU: Blackwell B200 Unleashes AI Performance

 

NVIDIA Blackwell B200: Unveiling the Most Powerful GPU for AI Performance Speed - NADDOD Blog

Discover NVIDIA's Blackwell B200, the ultimate GPU for unleashing AI performance speed. Learn about its breakthrough technology and how it enhances data center operations. Explore NADDOD's optical module technology and its seamless integration with NVIDIA…

www.naddod.com

 

State-space LLMs: Do we need Attention?

 


Mamba, StripedHyena, Based, research overload, and the exciting future of many LLM architectures all at once.

www.interconnects.ai
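The core recurrence behind these models is the linear state-space update h_t = A h_{t-1} + B x_t, y_t = C h_t; Mamba-style layers add input-dependent (selective) parameters and a hardware-aware parallel scan on top. A heavily simplified NumPy sketch of just the recurrence:

```python
import numpy as np

# Per-step linear state-space recurrence (no selectivity, no parallel scan):
# state is updated in O(T) sequential steps, with no attention over the past.

def ssm_scan(x, A, B, C):
    """x: (T, d_in), A: (d_state, d_state), B: (d_state, d_in), C: (d_out, d_state)."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B @ x_t   # h_t = A h_{t-1} + B x_t
        ys.append(C @ h)      # y_t = C h_t
    return np.stack(ys)

T, d_in, d_state, d_out = 16, 4, 8, 4
y = ssm_scan(np.random.randn(T, d_in),
             0.9 * np.eye(d_state),
             np.random.randn(d_state, d_in),
             np.random.randn(d_out, d_state))
print(y.shape)  # (16, 4)
```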

 

Open Source AI is AI we can Trust — with Soumith Chintala of Meta AI

 


Listen now | The PyTorch creator riffs on geohot's Tinygrad, Chris Lattner's Mojo, Apple's MLX, the PyTorch Mafia, the upcoming Llama 3 and MTIA ASIC, AI robotics, and what it takes for open source AI to win!

www.latent.space

 

Efficiently Serving LLMs

 

A little guide to building Large Language Models in 2024

 

 

FP8-LM: Training FP8 Large Language Models
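FP8 training recipes revolve around per-tensor scaling: choose a scale so each tensor's amax fits inside the FP8 E4M3 range (largest finite value 448) and carry the scale alongside the cast tensor. A crude NumPy simulation of that bookkeeping; real FP8 training relies on hardware FP8 kernels (e.g. Transformer Engine), and the 3-bit mantissa rounding below is only an approximation:

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite FP8 E4M3 value (4 exponent bits, 3 mantissa bits)

def simulate_e4m3(x: np.ndarray) -> np.ndarray:
    """Crude stand-in for an FP8 cast: clamp to range, keep ~3 mantissa bits."""
    x = np.clip(x, -E4M3_MAX, E4M3_MAX)
    mag = np.maximum(np.abs(x), 2.0 ** -9)        # avoid log2(0); subnormals ignored
    step = 2.0 ** (np.floor(np.log2(mag)) - 3)    # spacing for 3 mantissa bits
    return np.round(x / step) * step

def to_scaled_fp8(x: np.ndarray):
    """Per-tensor scaling: map the tensor's amax to the top of the FP8 range."""
    scale = E4M3_MAX / max(np.abs(x).max(), 1e-12)
    return simulate_e4m3(x * scale), scale

def from_scaled_fp8(x_fp8: np.ndarray, scale: float) -> np.ndarray:
    return x_fp8 / scale

w = np.random.randn(1024, 1024).astype(np.float32)
w8, s = to_scaled_fp8(w)
print(f"max reconstruction error: {np.abs(w - from_scaled_fp8(w8, s)).max():.4f}")
```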

 

 

Status of the 1bitLLM reproduction

 

1bitLLM/bitnet_b1_58-3B · Hugging Face

This is a reproduction of the BitNet b1.58 paper. The models are trained on the RedPajama dataset for 100B tokens. The hyperparameters, as well as the two-stage LR and weight decay schedule, are implemented as suggested in their follow-up paper. All models are open-source in the…

huggingface.co
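The b1.58 method quantizes weights to the ternary set {-1, 0, +1} with a per-tensor absmean scale, as described in the BitNet b1.58 paper. A minimal NumPy sketch of that weight quantizer; training details such as straight-through gradients and activation quantization are omitted:

```python
import numpy as np

# Absmean ternary quantization: scale by the mean absolute weight, then round
# each weight to the nearest value in {-1, 0, +1}.

def quantize_ternary(w: np.ndarray, eps: float = 1e-6):
    gamma = np.abs(w).mean()                         # per-tensor absmean scale
    w_q = np.clip(np.round(w / (gamma + eps)), -1, 1)
    return w_q.astype(np.int8), gamma

def dequantize(w_q: np.ndarray, gamma: float) -> np.ndarray:
    return w_q.astype(np.float32) * gamma

w = np.random.randn(512, 512).astype(np.float32)
w_q, gamma = quantize_ternary(w)
print("unique quantized values:", np.unique(w_q))    # [-1  0  1]
```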