Mistral Large Language Models

Mixtral 8x7B (46.70B)

Parameters: 46.70B
VRAM (FP16): 93.4GB
VRAM (INT4): 23.4GB
Context: 32,768 tokens

Quantization Options

Quantization            VRAM Required   Min GPU
FP16 (Half Precision)   93.4GB          A100 / H100
INT8 (8-bit Integer)    46.7GB          A6000 / 2x 4090
Q4_K_M (GGUF 4-bit)     23.4GB          RTX 4090
Q3_K_M (GGUF 3-bit)     18.7GB          RTX 4090
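
The VRAM figures above are straightforward arithmetic: parameter count times bytes per weight at each quantization level. Below is a minimal sketch of that estimate in Python, assuming weights-only memory; KV cache, activations, and runtime overhead add several GB on top of these numbers.

```python
# Rough VRAM estimate for model weights: parameter count x bytes per weight.
# Ignores KV cache, activations, and framework overhead.

PARAMS_B = 46.70  # Mixtral 8x7B total parameters, in billions

# Nominal bits per weight used for the table's estimates. Real GGUF K-quants
# average slightly more bits per weight; these values are approximations
# chosen to reproduce the figures above.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "INT8": 8.0,
    "Q4_K_M": 4.0,
    "Q3_K_M": 3.2,
}

def vram_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimated weight memory in GB for a given quantization level."""
    return params_billions * bits_per_weight / 8.0

for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name:>7}: ~{vram_gb(PARAMS_B, bits):.1f} GB")
```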

Model Details

Family: Mistral
Category: Large Language Models
Parameters: 46.70B
Context Length: 32,768 tokens
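
For context, a minimal sketch of loading the model in 4-bit on a single ~24GB GPU, matching the Q4/INT4 rows above. It assumes the Hugging Face repo id mistralai/Mixtral-8x7B-Instruct-v0.1 and the transformers + bitsandbytes stack; this is an illustration, not part of the spec above.

```python
# 4-bit loading sketch (assumes transformers, accelerate, and bitsandbytes
# are installed and roughly 24GB of GPU memory is available).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed repo id

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights, comparable to the INT4 figure above
    bnb_4bit_compute_dtype=torch.float16,  # run compute in FP16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers on the available GPU(s)
)

prompt = "Explain mixture-of-experts in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```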