
Qwen 2.5 72B (72.00B)

Parameters: 72.00B
VRAM (FP16): 144.0GB
VRAM (INT4): 36.0GB
Context: 131,072 tokens

Quantization Options

Quantization            VRAM Required   Min GPU
FP16 (Half Precision)   144.0GB         A100 / H100
INT8 (8-bit Integer)    72.0GB          A100 / H100
Q4_K_M (GGUF 4-bit)     36.0GB          A6000 / 2x 4090
Q3_K_M (GGUF 3-bit)     28.8GB          A6000 / 2x 4090
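
The VRAM figures above follow a simple weights-only rule: bytes ≈ parameters × bits per weight / 8, so 72B parameters at 16 bits is 144GB and at 4 bits is 36GB. The sketch below (Python) shows that arithmetic under stated assumptions: decimal gigabytes, weights only (no KV cache, activations, or runtime overhead), and a 3.2 bits/weight value for Q3_K_M back-solved from the 28.8GB figure rather than taken from an official spec.

# Weights-only VRAM estimate; excludes KV cache, activations, and
# framework overhead, so real usage will be somewhat higher.
def estimate_vram_gb(params_billions: float, bits_per_weight: float) -> float:
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9  # decimal GB, matching the table above

# Assumed effective bits per weight for each option in the table.
QUANT_BITS = {
    "FP16 (Half Precision)": 16,
    "INT8 (8-bit Integer)": 8,
    "Q4_K_M (GGUF 4-bit)": 4,
    "Q3_K_M (GGUF 3-bit)": 3.2,  # assumption: back-solved from 28.8GB
}

for name, bits in QUANT_BITS.items():
    print(f"{name:<24} {estimate_vram_gb(72.0, bits):6.1f} GB")

Running this reproduces the table's VRAM column for a 72.00B-parameter model; swap in a different parameter count to estimate other models in the family.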

Model Details

Family: Qwen
Category: Large Language Models
Parameters: 72.00B
Context Length: 131,072 tokens