
RTX 3090 Benchmarks for Deep Learning – NVIDIA RTX 3090 vs 2080 Ti vs TITAN RTX vs RTX 6000/8000

NVIDIA RTX 3090 Benchmarks for TensorFlow

For this blog article, we conducted deep learning performance benchmarks for TensorFlow on NVIDIA GeForce RTX 3090 GPUs.

Our deep learning workstation was fitted with two RTX 3090 GPUs, and we ran the standard “tf_cnn_benchmarks.py” benchmark script from the official TensorFlow GitHub repository. We tested the following networks: ResNet50, ResNet152, Inception v3, and Inception v4. We also ran the same tests in 1-, 2-, and 4-GPU configurations (the 4-GPU runs are for the 2x RTX 3090 vs 4x RTX 2080 Ti section). For each run, the batch size was the largest that fit into available GPU memory.
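This article does not detail how the largest batch size was found for each card. One common approach, sketched here purely as an illustration, is to keep doubling the batch size until a run no longer fits in GPU memory. The `fits` callback and the `toy_fits` stand-in are hypothetical: in practice `fits` would launch a short benchmark run and report whether it finished without an out-of-memory error.

```python
from typing import Callable


def largest_power_of_two_batch(fits: Callable[[int], bool], limit: int = 4096) -> int:
    """Return the largest power-of-two batch size (up to `limit`) accepted by `fits`.

    Returns 0 if even a batch size of 1 does not fit.
    """
    best = 0
    batch = 1
    while batch <= limit and fits(batch):
        best = batch  # this size worked; remember it and try twice as much
        batch *= 2
    return best


# Toy stand-in for a real memory check (hypothetical threshold, not a measurement).
def toy_fits(batch: int) -> bool:
    return batch <= 300


print(largest_power_of_two_batch(toy_fits))  # prints 256
```

Doubling keeps the number of trial runs small, and it is consistent with the batch sizes reported in this article, which are all powers of two.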

Key Points and Observations

  • The NVIDIA RTX 3090 outperformed every other GPU tested (in images/sec) on all four models.
  • A system with 2x RTX 3090 outperformed a system with 4x RTX 2080 Ti (2153.53 vs 1836.61 images/sec in our ResNet 50 test).
  • For deep learning, the RTX 3090 is the best value GPU on the market and substantially reduces the cost of an AI workstation.

RTX 3090 ResNet 50 TensorFlow Benchmark

GPU            1x GPU (images/sec)    2x GPU (images/sec)    Batch size
RTX 2080 Ti    522.52                 959.78                 128
RTX 6000       637.56                 1248.54                512
RTX 8000       604.76                 1184.52                1024
TITAN RTX      646.13                 1287.01                512
RTX 3090       1139.15                2153.53                512

 

RTX 3090 ResNet 152 TensorFlow Benchmark

GPU            1x GPU (images/sec)    2x GPU (images/sec)    Batch size
RTX 2080 Ti    209.27                 348.8                  64
RTX 6000       281.94                 519.76                 256
RTX 8000       285.85                 529.05                 512
TITAN RTX      284.87                 530.86                 256
RTX 3090       457.45                 857.14                 256

 

RTX 3090 Inception V3 TensorFlow Benchmark

GPU            1x GPU (images/sec)    2x GPU (images/sec)    Batch size
RTX 2080 Ti    310.32                 569.24                 128
RTX 6000       391.08                 737.77                 256
RTX 8000       391.3                  754.94                 512
TITAN RTX      397.09                 784.24                 256
RTX 3090       697.98                 1296.86                256

 

 

RTX 3090 Inception V4 TensorFlow Benchmark

GPU            1x GPU (images/sec)    2x GPU (images/sec)    Batch size
RTX 2080 Ti    150.59                 247.16                 64
RTX 6000       203.9                  392.14                 256
RTX 8000       203.67                 384.29                 512
TITAN RTX      207.98                 399.16                 256
RTX 3090       360                    679.61                 256
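To put the single-GPU numbers in perspective, the short sketch below computes the RTX 3090's speedup over the RTX 2080 Ti for each model. The values are copied from the four tables above; the dictionary layout and the `speedup` helper are illustrative names, not part of the benchmark script.

```python
# Single-GPU throughput (images/sec) copied from the four benchmark tables.
single_gpu = {
    "ResNet50":     {"RTX 2080 Ti": 522.52, "RTX 3090": 1139.15},
    "ResNet152":    {"RTX 2080 Ti": 209.27, "RTX 3090": 457.45},
    "Inception v3": {"RTX 2080 Ti": 310.32, "RTX 3090": 697.98},
    "Inception v4": {"RTX 2080 Ti": 150.59, "RTX 3090": 360.0},
}


def speedup(model, new="RTX 3090", old="RTX 2080 Ti"):
    """Ratio of images/sec between two cards for one model."""
    row = single_gpu[model]
    return row[new] / row[old]


for model in single_gpu:
    print(f"{model}: {speedup(model):.2f}x")
```

Keep in mind that each card ran at the largest batch size it could hold, so the ratios reflect both the faster GPU and the larger batches its 24GB of memory allows.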

 

2x NVIDIA RTX 3090 vs 4x RTX 2080 Ti – Which Config Is Better?

Many people were curious about not only the 1:1 performance matchup of the RTX 3090 vs the RTX 2080 Ti, but also how the cards compare in multi-GPU configurations. As it stands now, the RTX 3090 takes up 3 PCIe slots on a typical motherboard (vs the conventional 2 that the 2080 Ti occupies). Because of its size, a regular workstation can currently house only 2 RTX 3090s without extensive custom modification. Does the 2x RTX 3090 config match up performance-wise with the 4x 2080 Ti configuration? Simply put, yes it does. See the results for yourself below.
ResNet 50 throughput by number of GPUs:

GPU            1x GPU (images/sec)    2x GPU (images/sec)    4x GPU (images/sec)    Batch size
RTX 2080 Ti    522.52                 959.78                 1836.61                128
RTX 3090       1139.15                2153.53                N/A                    512
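The multi-GPU comparison can be made more concrete by computing how close each configuration comes to ideal linear scaling. This is a minimal sketch using the figures from the table above; `scaling_efficiency` is an illustrative helper, defined here as measured throughput divided by (number of GPUs × single-GPU throughput).

```python
def scaling_efficiency(throughput_n, throughput_1, num_gpus):
    """Fraction of ideal linear scaling achieved with num_gpus GPUs."""
    return throughput_n / (num_gpus * throughput_1)


# images/sec by GPU count, copied from the 2x RTX 3090 vs 4x RTX 2080 Ti table.
rtx_2080_ti = {1: 522.52, 2: 959.78, 4: 1836.61}
rtx_3090 = {1: 1139.15, 2: 2153.53}

for name, results in (("RTX 2080 Ti", rtx_2080_ti), ("RTX 3090", rtx_3090)):
    for n, ips in results.items():
        eff = scaling_efficiency(ips, results[1], n)
        print(f"{name} x{n}: {ips:.2f} images/sec, {eff:.1%} of linear")

# Head-to-head: two RTX 3090s against four RTX 2080 Tis.
ratio = rtx_3090[2] / rtx_2080_ti[4]
print(f"2x RTX 3090 vs 4x RTX 2080 Ti: {ratio:.2f}x")
```

An efficiency below 100% means adding GPUs costs some per-GPU throughput, which is part of why two faster cards can edge out four slower ones.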

TF CNN Benchmark Parameters

Parameter            Value
Number of Batches    100
Number of Epochs     0.01
Data Format          NCHW
Optimizer            Momentum
Variables            parameter_server
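For readers who want to try a similar run, the sketch below assembles a command line for tf_cnn_benchmarks.py from the parameters listed above. It is an approximation, not the exact invocation used for this article: `build_benchmark_command` is a hypothetical helper, the script path is assumed to be the current directory, and whether the batch sizes in the tables are per GPU or total is not stated here (the script's `--batch_size` flag is per device). The Number of Epochs value is left out of the sketch, since it is unclear how it combines with a fixed batch count.

```python
def build_benchmark_command(model, num_gpus, batch_size,
                            script="tf_cnn_benchmarks.py"):
    """Return an argv list for one benchmark run (nothing is executed here)."""
    return [
        "python", script,
        f"--model={model}",            # e.g. resnet50, resnet152, inception3, inception4
        f"--num_gpus={num_gpus}",
        f"--batch_size={batch_size}",  # assumed to be the per-GPU batch size
        "--num_batches=100",
        "--data_format=NCHW",
        "--optimizer=momentum",
        "--variable_update=parameter_server",
    ]


cmd = build_benchmark_command("resnet50", num_gpus=2, batch_size=512)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from a checkout of the benchmarks repository to launch the run.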

More About NVIDIA RTX 3090

The NVIDIA RTX 3090 has 24GB of GDDR6X memory and is built with enhanced RT Cores and Tensor Cores and new streaming multiprocessors, which together deliver a substantial performance boost over the previous generation.

Compared with the RTX 2080 Ti’s 4352 CUDA Cores, the RTX 3090 has more than twice as many, at 10496. CUDA Cores are the GPU’s parallel arithmetic units, loosely analogous to CPU cores, and are optimized for running a large number of calculations simultaneously (parallel processing). More CUDA Cores generally mean better performance on compute- and graphics-intensive workloads.
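As a rough check on the word "generally", the following arithmetic compares the CUDA Core ratio with the measured single-GPU ResNet 50 throughput ratio from the benchmark tables earlier in this article. It is only a back-of-the-envelope comparison: the two cards ran at different batch sizes, and core count is one of several factors that affect training speed.

```python
cuda_cores = {"RTX 2080 Ti": 4352, "RTX 3090": 10496}
# Single-GPU ResNet 50 throughput (images/sec) from the benchmark tables.
resnet50_1gpu = {"RTX 2080 Ti": 522.52, "RTX 3090": 1139.15}

core_ratio = cuda_cores["RTX 3090"] / cuda_cores["RTX 2080 Ti"]
throughput_ratio = resnet50_1gpu["RTX 3090"] / resnet50_1gpu["RTX 2080 Ti"]

print(f"CUDA Core ratio:  {core_ratio:.2f}x")
print(f"Throughput ratio: {throughput_ratio:.2f}x")
```

The measured gain tracks the core count fairly closely but does not match it exactly, which is what "generally" should be taken to mean.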

