The on-chip memory of a GPU, measured in gigabytes, which determines the maximum model size or frame buffer that can be loaded for compute or rendering tasks.
VRAM is often the binding constraint for GPU compute jobs: an 80 GB H100 can run a 70B-parameter LLM that a 16 GB RTX 4080 cannot. DePIN compute networks use VRAM as a key node attribute for job matching, and higher VRAM cards generally command higher per-hour fees.