ASUS Servers Set 26 Records in MLPerf Inference v2.0

ASUS has released its first MLPerf results since joining the MLCommons Association last December, immediately setting dozens of new performance records across the benchmarked tasks.

Specifically, in the latest round of MLPerf Inference 2.0, ASUS servers set 26 records in the data center Closed division across six AI-benchmark tasks, outperforming all other servers with the same GPU configurations. The achievements consist of 12 records achieved with an ASUS ESC8000A-E11 server configured with eight 80 GB NVIDIA® A100 Tensor Core GPUs; and 14 records with an ASUS ESC4000A-E11 server with four 24 GB NVIDIA A30 Tensor Core GPUs.

These breakthrough results demonstrate clearly the performance dominance of ASUS servers in the AI arena — bringing significant value to organizations seeking to deploy AI and ensuring optimal performance in data centers.

ASUS set 26 records in AI inference, and dominates results tables across six tasks

The MLPerf Inference 2.0 benchmark covers six common AI-inferencing workloads: image classification (ResNet50), object detection (SSD-ResNet34), medical image segmentation (3D-UNet), speech recognition (RNN-T), natural language processing (BERT) and recommendation (DLRM).

ESC8000A-E11 has achieved multiple leading positions for performance, including:

- Classified 298,105 images per second in ResNet50

- Completed object recognition on 7,462.06 images per second in SSD-ResNet34

- Processed 24.3 medical images per second in 3D-UNet

- Completed 26,005.7 questions and answers per second in BERT

- Completed 2,363,760 click predictions per second in DLRM

ESC8000A-E11 results table

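| Division | Task | Model | Results | Accuracy | Scenario | Units |
| --- | --- | --- | --- | --- | --- | --- |
| Data Center Closed | Image classification | ResNet50 | 210,011 | 99.00 | Server | queries/s |
| Data Center Closed | Image classification | ResNet50 | 298,105 | 99.00 | Offline | samples/s |
| Data Center Closed | Object detection (large) | SSD-ResNet34 | 7,096.10 | 99.00 | Server | queries/s |
| Data Center Closed | Object detection (large) | SSD-ResNet34 | 7,462.06 | 99.00 | Offline | samples/s |
| Data Center Closed | Medical imaging | 3D-UNet | 24.3 | 99.00 | Offline | samples/s |
| Data Center Closed | Medical imaging | 3D-UNet | 24.3 | 99.90 | Offline | samples/s |
| Data Center Closed | Speech-to-text | RNN-T | 94,996.9 | 99.00 | Server | queries/s |
| Data Center Closed | Speech-to-text | RNN-T | 102,738 | 99.00 | Offline | samples/s |
| Data Center Closed | Natural-language processing | BERT | 23,489.5 | 99.00 | Server | queries/s |
| Data Center Closed | Natural-language processing | BERT | 26,005.7 | 99.00 | Offline | samples/s |
| Data Center Closed | Natural-language processing | BERT | 11,491.3 | 99.90 | Server | queries/s |
| Data Center Closed | Natural-language processing | BERT | 13,168.2 | 99.90 | Offline | samples/s |
| Data Center Closed | Recommendation | DLRM | 1,601,300 | 99.00 | Server | queries/s |
| Data Center Closed | Recommendation | DLRM | 2,363,760 | 99.00 | Offline | samples/s |
| Data Center Closed | Recommendation | DLRM | 1,601,300 | 99.90 | Server | queries/s |
| Data Center Closed | Recommendation | DLRM | 2,363,760 | 99.90 | Offline | samples/s |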
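The ESC8000A-E11 table reports each workload under two scenarios. In MLPerf Inference, the Offline scenario makes all samples available at once, while the Server scenario issues queries at random arrival times under a latency bound, so Server throughput is normally the lower figure. As a rough illustration, the sketch below computes how close each Server result comes to its Offline counterpart, using the 99.00 accuracy rows from the table. The dictionary and function names are hypothetical, chosen only for this example, and the ratio is an informal reading aid, not an official MLPerf metric.

```python
# Throughput figures copied from the ESC8000A-E11 results table (99.00 accuracy rows).
# Each entry maps a model to (Server queries/s, Offline samples/s).
# 3D-UNet is omitted because the table lists only an Offline result for it.
ESC8000A_E11 = {
    "ResNet50": (210_011, 298_105),
    "SSD-ResNet34": (7_096.10, 7_462.06),
    "RNN-T": (94_996.9, 102_738),
    "BERT": (23_489.5, 26_005.7),
    "DLRM": (1_601_300, 2_363_760),
}


def server_to_offline_ratio(results):
    """Return {model: Server throughput divided by Offline throughput}."""
    return {model: server / offline for model, (server, offline) in results.items()}


ratios = server_to_offline_ratio(ESC8000A_E11)
for model, ratio in ratios.items():
    print(f"{model}: Server throughput is {ratio:.1%} of Offline throughput")
```

A ratio near 1 suggests the latency constraint costs little throughput for that workload on this configuration; a lower ratio suggests a larger gap between the two scenarios.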

ESC4000A-E11 has achieved multiple leading positions for performance, including: 

- Classified 73,814.5 images per second in ResNet50

- Completed object recognition on 1,957.18 images per second in SSD-ResNet34

- Processed 6.83 medical images per second in 3D-UNet

- Completed 6,896.01 questions and answers per second in BERT

- Completed 574,371 click predictions per second in DLRM

ESC4000A-E11 results table

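| Division | Task | Model | Results | Accuracy | Scenario | Units |
| --- | --- | --- | --- | --- | --- | --- |
| Data Center Closed | Image classification | ResNet50 | 68,192 | 99.00 | Server | queries/s |
| Data Center Closed | Image classification | ResNet50 | 73,814.5 | 99.00 | Offline | samples/s |
| Data Center Closed | Object detection (large) | SSD-ResNet34 | 1,886.75 | 99.00 | Server | queries/s |
| Data Center Closed | Object detection (large) | SSD-ResNet34 | 1,957.18 | 99.00 | Offline | samples/s |
| Data Center Closed | Medical imaging | 3D-UNet | 6.83 | 99.00 | Offline | samples/s |
| Data Center Closed | Medical imaging | 3D-UNet | 6.83 | 99.90 | Offline | samples/s |
| Data Center Closed | Speech-to-text | RNN-T | 17,391.4 | 99.00 | Server | queries/s |
| Data Center Closed | Speech-to-text | RNN-T | 27,299.2 | 99.00 | Offline | samples/s |
| Data Center Closed | Natural-language processing | BERT | 6,367.97 | 99.00 | Server | queries/s |
| Data Center Closed | Natural-language processing | BERT | 6,896.01 | 99.00 | Offline | samples/s |
| Data Center Closed | Natural-language processing | BERT | 2,917.66 | 99.90 | Server | queries/s |
| Data Center Closed | Natural-language processing | BERT | 3,383.03 | 99.90 | Offline | samples/s |
| Data Center Closed | Recommendation | DLRM | 560,158 | 99.00 | Server | queries/s |
| Data Center Closed | Recommendation | DLRM | 574,371 | 99.00 | Offline | samples/s |
| Data Center Closed | Recommendation | DLRM | 560,158 | 99.90 | Server | queries/s |
| Data Center Closed | Recommendation | DLRM | 574,371 | 99.90 | Offline | samples/s |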
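Because the two servers carry different GPU counts (eight A100s in the ESC8000A-E11, four A30s in the ESC4000A-E11), their raw totals are not directly comparable. One rough way to read the two tables side by side is to divide each Offline result by the number of GPUs. The sketch below does this with the 99.00 accuracy Offline figures from both tables. The names are hypothetical, and per-GPU division is an informal normalization that ignores CPU, memory and interconnect effects; it is not a metric reported by MLPerf.

```python
# Offline throughput in samples/s (99.00 accuracy rows), copied from the two results tables.
OFFLINE_SAMPLES_PER_SEC = {
    "ESC8000A-E11": {
        "ResNet50": 298_105,
        "SSD-ResNet34": 7_462.06,
        "3D-UNet": 24.3,
        "RNN-T": 102_738,
        "BERT": 26_005.7,
        "DLRM": 2_363_760,
    },
    "ESC4000A-E11": {
        "ResNet50": 73_814.5,
        "SSD-ResNet34": 1_957.18,
        "3D-UNet": 6.83,
        "RNN-T": 27_299.2,
        "BERT": 6_896.01,
        "DLRM": 574_371,
    },
}

# GPU counts as stated in the article: eight A100 GPUs and four A30 GPUs.
GPU_COUNT = {"ESC8000A-E11": 8, "ESC4000A-E11": 4}


def per_gpu_throughput(offline, gpu_count):
    """Divide each server's Offline throughput by its GPU count."""
    return {
        server: {model: value / gpu_count[server] for model, value in models.items()}
        for server, models in offline.items()
    }


per_gpu = per_gpu_throughput(OFFLINE_SAMPLES_PER_SEC, GPU_COUNT)
for model in OFFLINE_SAMPLES_PER_SEC["ESC8000A-E11"]:
    a100 = per_gpu["ESC8000A-E11"][model]
    a30 = per_gpu["ESC4000A-E11"][model]
    print(f"{model}: {a100:,.2f} per A100 vs {a30:,.2f} per A30 ({a100 / a30:.2f}x)")
```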

Continuous AI performance improvement with optimized server design 

The 12 MLPerf Inference 2.0 records set by the NVIDIA-certified, 4U ESC8000A-E11, configured with eight 80 GB NVIDIA A100 PCIe Tensor Core GPUs and two AMD EPYC 7763 CPUs, demonstrate its scalability for AI and machine learning. Its streamlined thermal design, with independent CPU and GPU airflow tunnels, brings high-efficiency cooling to air-cooled data centers.

The NVIDIA-certified ESC4000A-E11, housed in the most compact 2U footprint on the market and configured with four 24 GB NVIDIA A30 PCIe Tensor Core GPUs and two AMD EPYC 7763 CPUs, set a total of 14 MLPerf Inference 2.0 records. It supports a wide array of graphics accelerators, plus the NVIDIA NVLink high-speed GPU interconnect, to unleash maximum AI performance.

DIGITAL TERMINAL
digitalterminal.in