A deep learning model with billions of parameters trained on large text corpora, capable of generating, summarizing, translating, and reasoning about natural language.
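LLMs are the primary driver of AI GPU demand today. Bittensor subnets, Akash deployments, and io.net clusters are all used to host open-source LLMs (Llama, Mistral, Falcon) for inference. A model's weights need approximately 2 bytes of VRAM per parameter in float16 format, so a 7-billion-parameter model needs about 14 GB for its weights alone. The KV cache and activations need additional memory at inference time.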
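
The 2-bytes-per-parameter rule of thumb can be turned into a quick estimate. The sketch below is illustrative only: the function name `estimate_weight_vram_gb`, the dtype table, and the round parameter counts are assumptions made for this example, not figures from any specific model card. It covers weights only and ignores the KV cache, activations, and framework overhead.

```python
# Bytes per parameter for common weight formats.
# float16 = 2 bytes matches the rule of thumb in the text; the others are
# the standard sizes of those formats, included for comparison.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_weight_vram_gb(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory for model weights only, in decimal GB (1e9 bytes).

    Hypothetical helper: real deployments also need room for the KV cache,
    activations, and runtime overhead, which this estimate leaves out.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Illustrative round parameter counts (assumed, not exact model sizes).
    for label, params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
        gb = estimate_weight_vram_gb(params)
        print(f"{label}: ~{gb:.0f} GB of VRAM for float16 weights")
```

In practice, the chosen GPU should have headroom above this estimate, since the weights are a lower bound on what inference needs.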