
Gemma 2 9B (9.00B)

Parameters: 9.00B
VRAM (FP16): 18.0GB
VRAM (INT4): 4.5GB
Context: 8192 tokens

Quantization Options

Quantization            VRAM Required   Min GPU
FP16 (Half Precision)   18.0GB          RTX 4090
INT8 (8-bit Integer)    9.0GB           RTX 3060 / 4070
Q4_K_M (GGUF 4-bit)     4.5GB           RTX 3070 / 4060
Q3_K_M (GGUF 3-bit)     3.6GB           RTX 3070 / 4060
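
The VRAM figures above follow almost directly from parameter count times bits per weight. The sketch below is not part of this page's tooling; the function name and the 1 GB = 10^9 bytes convention are assumptions, and it covers weights only (no KV cache or activations):

```python
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate VRAM needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    gemma2_9b_params = 9.00  # billions of parameters
    for label, bits in [("FP16", 16), ("INT8", 8), ("INT4 / Q4_K_M nominal", 4)]:
        print(f"{label:>22}: {weight_vram_gb(gemma2_9b_params, bits):4.1f} GB")
    # Prints 18.0, 9.0, and 4.5 GB, matching the table above.
    # GGUF K-quants keep a few tensors at higher precision, which is why the
    # measured Q3_K_M figure (3.6GB) sits slightly above the nominal 3-bit
    # estimate of ~3.4 GB.
```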

Model Details

Family: Gemma
Category: Large Language Models
Parameters: 9.00B
Context Length: 8192 tokens
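
Running at the full 8192-token context also consumes VRAM for the KV cache on top of the weight sizes above. A rough upper-bound sketch, assuming the publicly reported Gemma 2 9B attention shape (42 layers, 8 KV heads, head dimension 256; these values are assumptions, not listed on this page) and ignoring Gemma 2's sliding-window layers:

```python
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                seq_len: int, dtype_bytes: int = 2) -> float:
    """Upper-bound KV cache size in GB: keys + values for every layer and position."""
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes / 1e9


# Assumed Gemma 2 9B attention configuration (not taken from this page):
print(kv_cache_gb(layers=42, kv_heads=8, head_dim=256, seq_len=8192))
# ~2.8 GB in FP16 at the full 8192-token context, added on top of the weight VRAM.
```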