Can I run Llama 3.1 8B (Q4_K_M, GGUF 4-bit) on an NVIDIA A100 80GB?

Perfect
Yes, you can run this model!
GPU VRAM 80.0GB
Required 4.0GB
Headroom +76.0GB

VRAM Usage

5% used (4.0GB of 80.0GB)
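
The headroom and usage figures above follow from simple arithmetic, sketched here in Python. The function names are hypothetical and not part of any tool. The weight-size estimate assumes a nominal 8 billion parameters at a flat 4 bits per weight, which reproduces the 4.0GB figure; real Q4_K_M files mix block types and store scales, so they tend to be somewhat larger, and the estimate ignores KV cache and activation memory.

```python
def approx_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough weight size in GB (1 GB = 1e9 bytes). Assumption: flat bits per weight."""
    return n_params * bits_per_weight / 8 / 1e9


def vram_summary(gpu_vram_gb: float, required_gb: float) -> dict:
    """Headroom and percent of VRAM used for a given requirement."""
    headroom_gb = gpu_vram_gb - required_gb
    percent_used = 100.0 * required_gb / gpu_vram_gb
    return {
        "headroom_gb": headroom_gb,
        "percent_used": percent_used,
        "fits": headroom_gb >= 0,
    }


# Assumed inputs: nominal 8e9 parameters, 4 bits per weight, 80 GB of VRAM.
required_gb = approx_weight_gb(8e9, 4.0)
summary = vram_summary(80.0, required_gb)

print(f"Required  {required_gb:.1f}GB")
print(f"Headroom  {summary['headroom_gb']:+.1f}GB")
print(f"Used      {summary['percent_used']:.0f}%")
```

With these inputs the sketch gives 4.0GB required, +76.0GB headroom, and 5% used, matching the values reported above.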

Performance Estimate

Tokens/sec ~93.0
Batch size 32
Context 128K tokens