2023/12/18: Mixtral 8x7B

hellcat 2023. 12. 18. 09:09

https://mistral.ai/news/mixtral-of-experts

Mixtral of experts: A high quality Sparse Mixture-of-Experts. (mistral.ai)
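
As a rough illustration of what "sparse" means here, the sketch below routes one token to the two highest-scoring experts out of eight and mixes their outputs. It assumes top-2 routing with a softmax over the selected router logits; the function names, the toy scalar "experts", and the logit values are all hypothetical and only stand in for real feed-forward blocks.

```python
import math


def top2_route(router_logits):
    """Return [(expert_index, weight), ...] for the two highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:2]
    # Softmax over the two selected logits only, so the weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, experts, router_logits):
    """Weighted sum of the two selected experts; the other experts never run."""
    return sum(w * experts[i](x) for i, w in top2_route(router_logits))


# Toy setup (assumption): 8 experts that simply scale a scalar input.
experts = [lambda x, k=k: (k + 1) * x for k in range(8)]
logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 1.0]

routes = top2_route(logits)
y = moe_layer(1.0, experts, logits)
print(routes, y)
```

Only two of the eight expert functions are called for this token, which is why the number of parameters used per token is much smaller than the total.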

Total Parameters: 46.7B

Only 12.9B parameters are actually active per token during generation.
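
The two figures above can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below is not from the post: it assumes the publicly released Mixtral 8x7B configuration (hidden size 4096, 32 layers, FFN size 14336, 32 query heads and 8 key/value heads of dimension 128, vocabulary 32000, 8 experts with 2 used per token) and untied input/output embeddings.

```python
# Assumed hyperparameters (from the released model configuration, not from this post).
DIM = 4096          # hidden size
N_LAYERS = 32
FFN_DIM = 14336     # inner size of each expert's feed-forward block
N_HEADS = 32
N_KV_HEADS = 8      # grouped-query attention
HEAD_DIM = 128
VOCAB = 32000
N_EXPERTS = 8
TOP_K = 2           # experts used per token

# Attention: Q and O projections are DIM x DIM; K and V are smaller because of GQA.
attn = 2 * DIM * (N_HEADS * HEAD_DIM) + 2 * DIM * (N_KV_HEADS * HEAD_DIM)
expert = 3 * DIM * FFN_DIM      # gate, up, and down projections of one expert
router = DIM * N_EXPERTS        # linear layer that scores the experts
norms = 2 * DIM                 # two RMSNorm weight vectors per layer

# Shared by every token: input embedding, output head, final norm.
shared = 2 * VOCAB * DIM + DIM

total = N_LAYERS * (attn + N_EXPERTS * expert + router + norms) + shared
active = N_LAYERS * (attn + TOP_K * expert + router + norms) + shared

print(f"total:  {total / 1e9:.1f}B")
print(f"active: {active / 1e9:.1f}B")
```

Under these assumptions the totals round to 46.7B and 12.9B. The name "8x7B" does not mean 56B parameters, because only the feed-forward blocks are replicated per expert while attention and embeddings are shared.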

Performance

According to the benchmark results, Mixtral 8x7B outperforms LLaMA2 70B and GPT-3.5.
