ASUS Servers Set 26 Records in MLPerf Inference v2.0

Published on: 06 May 2022, 12:00 am

ASUS has released its first MLPerf results since joining the MLCommons Association last December, immediately setting more than two dozen new performance records.

Specifically, in the latest round of MLPerf Inference 2.0, ASUS servers set 26 records in the Data Center Closed division across six AI benchmark tasks, outperforming all other servers with the same GPU configurations. Twelve of the records were set by an ASUS ESC8000A-E11 server configured with eight 80 GB NVIDIA® A100 Tensor Core GPUs, and 14 by an ASUS ESC4000A-E11 server with four 24 GB NVIDIA A30 Tensor Core GPUs.

These breakthrough results clearly demonstrate the performance dominance of ASUS servers in the AI arena, bringing significant value to organizations seeking to deploy AI and ensuring optimal performance in data centers.

ASUS sets 26 records in AI inference and leads results tables across six tasks

The MLPerf Inference 2.0 benchmark covers six common AI-inferencing workloads: image classification (ResNet50), object detection (SSD-ResNet34), medical image segmentation (3D-UNet), speech recognition (RNN-T), natural language processing (BERT) and recommendation (DLRM).

ESC8000A-E11 has achieved multiple leading positions for performance, including:

- Classified 298,105 images per second in ResNet50

- Completed object detection on 7,462.06 images per second in SSD-ResNet34

- Processed 24.3 medical images per second in 3D-UNet

- Completed 26,005.7 questions and answers per second in BERT

- Completed 2,363,760 click predictions per second in DLRM

ESC8000A-E11 results table

Division	Task	Model	Results	Accuracy	Scenario	Units
Data Center Closed	Image classification	ResNet50	210,011	99.00	Server	queries/s
	Image classification	ResNet50	298,105	99.00	Offline	samples/s
	Object detection (large)	SSD-ResNet34	7,096.10	99.00	Server	queries/s
	Object detection (large)	SSD-ResNet34	7,462.06	99.00	Offline	samples/s
	Medical imaging	3D-UNet	24.3	99.00	Offline	samples/s
	Medical imaging	3D-UNet	24.3	99.90	Offline	samples/s
	Speech-to-text	RNN-T	94,996.9	99.00	Server	queries/s
	Speech-to-text	RNN-T	102,738	99.00	Offline	samples/s
	Natural-language processing	BERT	23,489.5	99.00	Server	queries/s
	Natural-language processing	BERT	26,005.7	99.00	Offline	samples/s
	Natural-language processing	BERT	11,491.3	99.90	Server	queries/s
	Natural-language processing	BERT	13,168.2	99.90	Offline	samples/s
	Recommendation	DLRM	1,601,300	99.00	Server	queries/s
	Recommendation	DLRM	2,363,760	99.00	Offline	samples/s
	Recommendation	DLRM	1,601,300	99.90	Server	queries/s
	Recommendation	DLRM	2,363,760	99.90	Offline	samples/s
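
Where the table lists both a Server and an Offline result for a model, one simple way to read the pair is as a ratio of Server to Offline throughput. The following is a minimal sketch of that arithmetic, not part of the MLPerf submission: the `PAIRS` dictionary and the `server_to_offline` helper are hypothetical names, and the figures are copied from the 99.00 rows of the table above. 3D-UNet is left out because the table reports only an Offline result for it.

```python
# Server (queries/s) and Offline (samples/s) results for the ESC8000A-E11,
# copied from the 99.00 accuracy rows of the results table.
# PAIRS and server_to_offline are illustrative names, not MLPerf tooling.
PAIRS = {
    "ResNet50": (210_011.0, 298_105.0),
    "SSD-ResNet34": (7_096.10, 7_462.06),
    "RNN-T": (94_996.9, 102_738.0),
    "BERT": (23_489.5, 26_005.7),
    "DLRM": (1_601_300.0, 2_363_760.0),
}


def server_to_offline(model: str) -> float:
    """Return Server throughput as a fraction of Offline throughput."""
    server, offline = PAIRS[model]
    return server / offline


if __name__ == "__main__":
    for model in PAIRS:
        print(f"{model:<13} {server_to_offline(model):.1%}")
```

Note that the two scenarios use different units (queries/s and samples/s), so the ratio is only a rough indicator rather than a strict like-for-like comparison.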

ESC4000A-E11 has achieved multiple leading positions for performance, including:

- Classified 73,814.5 images per second in ResNet50

- Completed object detection on 1,957.18 images per second in SSD-ResNet34

- Processed 6.83 medical images per second in 3D-UNet

- Completed 6,896.01 questions and answers per second in BERT

- Completed 574,371 click predictions per second in DLRM

ESC4000A-E11 results table

Division	Task	Model	Results	Accuracy	Scenario	Units
Data Center Closed	Image classification	ResNet50	68,192	99.00	Server	queries/s
	Image classification	ResNet50	73,814.5	99.00	Offline	samples/s
	Object detection (large)	SSD-ResNet34	1,886.75	99.00	Server	queries/s
	Object detection (large)	SSD-ResNet34	1,957.18	99.00	Offline	samples/s
	Medical imaging	3D-UNet	6.83	99.00	Offline	samples/s
	Medical imaging	3D-UNet	6.83	99.90	Offline	samples/s
	Speech-to-text	RNN-T	17,391.4	99.00	Server	queries/s
	Speech-to-text	RNN-T	27,299.2	99.00	Offline	samples/s
	Natural-language processing	BERT	6,367.97	99.00	Server	queries/s
	Natural-language processing	BERT	6,896.01	99.00	Offline	samples/s
	Natural-language processing	BERT	2,917.66	99.90	Server	queries/s
	Natural-language processing	BERT	3,383.03	99.90	Offline	samples/s
	Recommendation	DLRM	560,158	99.00	Server	queries/s
	Recommendation	DLRM	574,371	99.00	Offline	samples/s
	Recommendation	DLRM	560,158	99.90	Server	queries/s
	Recommendation	DLRM	574,371	99.90	Offline	samples/s
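
Because the two servers carry different numbers of GPUs (eight A100s versus four A30s), the raw totals in the two tables are not directly comparable. A rough normalization is to divide each Offline result by the GPU count. The sketch below illustrates that arithmetic only: `GPU_COUNT`, `OFFLINE` and `per_gpu` are hypothetical names, the figures are copied from the 99.00 Offline rows of the two tables, and simple division assumes throughput splits evenly across GPUs, which the source does not state.

```python
# GPU counts as described in the article for each server.
GPU_COUNT = {"ESC8000A-E11": 8, "ESC4000A-E11": 4}

# Offline throughput (samples/s) copied from the 99.00 accuracy rows
# of the two results tables. Names here are illustrative assumptions.
OFFLINE = {
    "ESC8000A-E11": {
        "ResNet50": 298_105.0,
        "SSD-ResNet34": 7_462.06,
        "3D-UNet": 24.3,
        "RNN-T": 102_738.0,
        "BERT": 26_005.7,
        "DLRM": 2_363_760.0,
    },
    "ESC4000A-E11": {
        "ResNet50": 73_814.5,
        "SSD-ResNet34": 1_957.18,
        "3D-UNet": 6.83,
        "RNN-T": 27_299.2,
        "BERT": 6_896.01,
        "DLRM": 574_371.0,
    },
}


def per_gpu(server: str) -> dict[str, float]:
    """Divide each Offline result by the server's GPU count."""
    gpus = GPU_COUNT[server]
    return {model: value / gpus for model, value in OFFLINE[server].items()}


if __name__ == "__main__":
    for server in OFFLINE:
        for model, value in per_gpu(server).items():
            print(f"{server}  {model:<13} {value:,.2f} samples/s per GPU")
```

The per-GPU figures still compare two different GPU models, so they describe each configuration rather than rank the servers.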

Continuous AI performance improvement with optimized server design

The 12 MLPerf Inference 2.0 records set by the NVIDIA-certified, 4U ESC8000A-E11 (configured with eight 80 GB NVIDIA A100 PCIe Tensor Core GPUs and two AMD EPYC 7763 CPUs) demonstrate its supreme scalability for AI and machine learning. Its streamlined thermal design, with independent CPU and GPU airflow tunnels, brings a high-efficiency cooling solution to air-cooled data centers.

The NVIDIA-certified ESC4000A-E11, housed in the most compact 2U footprint on the market and configured with four 24 GB NVIDIA A30 PCIe Tensor Core GPUs and two AMD EPYC 7763 CPUs, set a total of 14 MLPerf Inference 2.0 records. It supports a wide array of graphics accelerators, plus the NVIDIA NVLink high-speed GPU interconnect, to unleash maximum AI performance.

ASUS India