Can I run CLIP ViT-H/14 on NVIDIA Jetson AGX Orin 32GB?

Perfect: yes, you can run this model.
GPU VRAM: 32.0GB
Required: 2.0GB
Headroom: +30.0GB

VRAM Usage

6% used, on a scale from 0GB to 32.0GB

Performance Estimate

Tokens/sec: ~90.0
Batch size: 32

Technical Analysis

The NVIDIA Jetson AGX Orin 32GB is an excellent platform for running the CLIP ViT-H/14 model. Its 32GB of LPDDR5 is unified memory shared between the CPU and GPU rather than dedicated VRAM, but it still far exceeds the 2.0GB the model requires in FP16 precision. This headroom means the model loads and runs without memory pressure, even with larger batch sizes or more complex image processing pipelines. The Ampere-architecture GPU combines CUDA cores and Tensor Cores, so both the vision and text encoders in CLIP run efficiently.
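The headroom and usage figures follow from simple arithmetic on the numbers quoted above. The sketch below reproduces them and also shows the parameter count that a 2.0GB FP16 footprint would imply if the whole footprint were weights. It assumes 1 GB = 1e9 bytes; the function names are illustrative only.

```python
# Figures taken from the summary above; GB is treated as 1e9 bytes (an assumption).
TOTAL_MEMORY_GB = 32.0
REQUIRED_FP16_GB = 2.0
BYTES_PER_FP16_PARAM = 2  # FP16 stores each parameter in 2 bytes


def memory_budget(total_gb, required_gb):
    """Return (headroom in GB, percentage of memory used)."""
    headroom_gb = total_gb - required_gb
    used_percent = 100.0 * required_gb / total_gb
    return headroom_gb, used_percent


def implied_param_count(required_gb, bytes_per_param):
    """Upper bound on parameters if the entire footprint were weights."""
    return required_gb * 1e9 / bytes_per_param


headroom, used = memory_budget(TOTAL_MEMORY_GB, REQUIRED_FP16_GB)
params = implied_param_count(REQUIRED_FP16_GB, BYTES_PER_FP16_PARAM)

print(f"headroom: {headroom:.1f} GB")
print(f"used: {used:.2f}%")
print(f"implied parameters: {params / 1e9:.1f} billion")
```

The 6.25% result is what the usage bar rounds to 6%. Activations and framework overhead are not modeled here, so real usage at batch size 32 will be somewhat higher than the weight footprint alone.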

While memory capacity is plentiful, the Jetson AGX Orin's memory bandwidth of 0.21 TB/s is a relevant factor for overall performance. Because the CPU and GPU share the same memory, bandwidth can become a bottleneck when large volumes of weights, activations, and input images move through it, particularly during large-batch inference or when preprocessing very high-resolution images. However, for CLIP ViT-H/14, the model size and computational demands are well matched to the available bandwidth, so significant stalls are unlikely. The 56 Tensor Cores accelerate matrix multiplications, a core operation in CLIP, further improving throughput.
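To put the bandwidth figure in perspective, a rough floor on memory time can be estimated from the two numbers above. This is a back-of-the-envelope sketch, assuming the 2.0GB of weights are read from memory exactly once per batch and ignoring activations, caches, and compute time; it is not a performance prediction.

```python
# Source figures: 0.21 TB/s bandwidth, 2.0 GB of FP16 weights.
BANDWIDTH_GB_PER_S = 0.21 * 1000  # 0.21 TB/s expressed in GB/s
WEIGHTS_GB = 2.0


def weight_read_time_ms(weights_gb, bandwidth_gb_per_s):
    """Time in milliseconds to stream the weights once at peak bandwidth."""
    return 1000.0 * weights_gb / bandwidth_gb_per_s


def per_item_floor_ms(weights_gb, bandwidth_gb_per_s, batch_size):
    """Weight-read time amortized over every item in one batch."""
    return weight_read_time_ms(weights_gb, bandwidth_gb_per_s) / batch_size


full_pass_ms = weight_read_time_ms(WEIGHTS_GB, BANDWIDTH_GB_PER_S)
amortized_ms = per_item_floor_ms(WEIGHTS_GB, BANDWIDTH_GB_PER_S, batch_size=32)

print(f"one pass over the weights: {full_pass_ms:.2f} ms")
print(f"amortized per item at batch 32: {amortized_ms:.3f} ms")
```

Under these assumptions, larger batches spread the cost of reading the weights across more inputs, which is one reason batching tends to help throughput on bandwidth-limited hardware.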

Recommendation

The Jetson AGX Orin 32GB is well suited to running CLIP ViT-H/14 in a range of applications, from image search to zero-shot classification. To maximize performance, consider using TensorRT for model optimization and inference; it can significantly improve throughput through GPU-specific optimizations such as layer fusion and reduced-precision kernels. Experiment with different batch sizes to find the best balance between latency and throughput, keeping in mind that the 32GB of memory leaves substantial room to scale up.
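One way to run the batch-size experiment is a small timing harness like the sketch below. `fake_encode` is a hypothetical stand-in that only sleeps; in practice it would be replaced by a call into your own CLIP or TensorRT inference code, which is not shown here.

```python
import time


def fake_encode(batch):
    """Hypothetical stand-in for a real encoder call; it only sleeps."""
    time.sleep(0.001 + 0.0002 * len(batch))
    return [0.0] * len(batch)


def sweep_batch_sizes(encode, batch_sizes, repeats=3):
    """Measure mean latency per batch and items per second for each size."""
    results = []
    for size in batch_sizes:
        batch = [None] * size
        encode(batch)  # warm-up call, excluded from timing
        start = time.perf_counter()
        for _ in range(repeats):
            encode(batch)
        elapsed = time.perf_counter() - start
        results.append({
            "batch_size": size,
            "latency_ms": 1000.0 * elapsed / repeats,
            "items_per_s": size * repeats / elapsed,
        })
    return results


results = sweep_batch_sizes(fake_encode, [1, 8, 32])
for row in results:
    print(f"batch {row['batch_size']:>2}: "
          f"{row['latency_ms']:.2f} ms/batch, {row['items_per_s']:.0f} items/s")
```

When timing real GPU work, remember that CUDA calls are asynchronous, so the encoder should synchronize with the device before the timer stops; otherwise the measured latency will be too low.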

For applications where latency is critical, consider quantizing the model to INT8. This can reduce memory footprint and improve inference speed, although it may come at a slight accuracy cost. Regularly monitor GPU utilization and memory usage to identify potential bottlenecks and fine-tune your inference pipeline accordingly.
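The trade-off behind INT8 can be illustrated with a minimal symmetric quantization sketch. This is not TensorRT's API or its calibration procedure; it is a pure-Python illustration, with made-up weight values, of why storing one byte per value instead of two halves the weight footprint while introducing a small rounding error.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate float values."""
    return [q * scale for q in quantized]


# Illustrative weights only; these are not taken from any real model.
weights = [0.50, -1.27, 0.031, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# FP16 uses 2 bytes per weight, INT8 uses 1, so weight storage halves.
fp16_weights_gb = 2.0
int8_weights_gb = fp16_weights_gb * 1 / 2

print("codes:", codes)
print(f"max rounding error: {max_error:.4f} (scale {scale:.4f})")
print(f"weights: {fp16_weights_gb} GB in FP16, about {int8_weights_gb} GB in INT8")
```

The rounding error is bounded by half the scale for in-range values, which is the source of the slight accuracy cost. Production toolchains typically use calibration data and finer-grained scales to reduce it.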

Recommended Settings

Batch size: 32
Context length: 77
Other settings: Enable CUDA graph capture; optimize image preprocessing pipeline
Inference framework: TensorRT
Quantization suggested: INT8 (optional, for latency-sensitive application…
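The batch size and context length above can be carried through a pipeline as a small settings object. The sketch below is one possible arrangement, not a required one: `InferenceSettings`, `pad_or_truncate`, `make_batches`, and the padding ID of 0 are all hypothetical names and values chosen for illustration, and a real CLIP tokenizer should be used to produce the token IDs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical container for the recommended settings listed above."""
    batch_size: int = 32
    context_length: int = 77
    framework: str = "TensorRT"
    quantization: Optional[str] = None  # set to "INT8" to opt in


def pad_or_truncate(token_ids, context_length, pad_id=0):
    """Force a token ID list to exactly context_length entries (pad_id is assumed)."""
    clipped = list(token_ids[:context_length])
    return clipped + [pad_id] * (context_length - len(clipped))


def make_batches(items, batch_size):
    """Split a list into consecutive batches; the last batch may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


settings = InferenceSettings()
short_prompt = pad_or_truncate([101, 102, 103], settings.context_length)
batches = make_batches(list(range(70)), settings.batch_size)

print(len(short_prompt))
print([len(b) for b in batches])
```

With 70 inputs and a batch size of 32, the final batch is partial. Engines built for a fixed batch shape may need that last batch padded as well.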

Frequently Asked Questions

Is CLIP ViT-H/14 compatible with NVIDIA Jetson AGX Orin 32GB?
Yes, CLIP ViT-H/14 is fully compatible with the NVIDIA Jetson AGX Orin 32GB, offering excellent performance due to ample VRAM and suitable compute capabilities.
What VRAM is needed for CLIP ViT-H/14?
CLIP ViT-H/14 requires approximately 2.0GB of VRAM when using FP16 precision.
How fast will CLIP ViT-H/14 run on NVIDIA Jetson AGX Orin 32GB?
You can expect an estimated throughput of around 90 tokens per second, but this can be significantly improved by using TensorRT for optimized inference.