https://learning-exhaust.hashnode.dev/are-all-large-language-models-really-in-158-bits?ref=twitter-share
Are all LLMs really 1.58 bits? Inference at 4x the speed or more? Dive deep into changes to the Transformer architecture to learn how researchers have discovered a huge speedup in LLM inference.

https://www.latent.space/
Latent Space | swyx & Alessio
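For context on the headline number (this sketch is illustrative, not taken from the linked post): "1.58 bits" comes from constraining each weight to the three values {-1, 0, +1}, which carry log2(3) ≈ 1.58 bits of information. A minimal absmean-style ternary quantizer, in the spirit of BitNet b1.58, might look like:

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to {-1, 0, +1} with a per-tensor scale.

    Illustrative sketch: scale by the mean absolute weight (absmean),
    round, then clamp to the ternary set. Dequantize as q * scale.
    """
    scale = np.abs(w).mean() + eps           # absmean scaling factor
    q = np.clip(np.round(w / scale), -1, 1)  # round, then clamp to {-1, 0, +1}
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = ternary_quantize(w)
print(sorted(set(q.flatten().tolist())))  # a subset of [-1.0, 0.0, 1.0]
```

Because the quantized weights are ternary, matrix multiplies reduce to additions and subtractions (no floating-point multiplies per weight), which is the source of the claimed inference speedups.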