A Deep Dive into Performance Doubling and Beyond
Phoronix, the Linux benchmarking authority, recently examined Intel's 5th Generation Xeon Scalable "Emerald Rapids" CPUs in depth, focusing on how AVX-512 instructions affect performance. With AVX-512 enabled, average performance roughly doubled, and some workloads ran more than ten times faster, without a significant increase in power consumption.
Testing Environment
Phoronix used a server with two of Intel's top-tier Xeon Platinum 8592+ 64-core CPUs, 1TB of DDR5 memory, and a 3TB SSD, built on the Intel Eagle Stream platform and running Ubuntu Linux. This setup allowed benchmarking across a wide range of workloads.
AVX-512 Unleashes Doubling in Performance
Enabling AVX-512 doubled performance on average across workloads such as Embree, OpenVKL, and Y-Cruncher. The highlight was OpenVINO, where AVX-512 delivered speedups ranging from two times to more than ten times. That acceleration can be attributed to OpenVINO's support for the AVX-512 VNNI and BF16 instructions, which are particularly advantageous for AI workloads.
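For readers who want to see which of these extensions their own Linux machine exposes, here is a minimal sketch in Python. It assumes an x86 system where the kernel lists CPU features on a `flags` line in `/proc/cpuinfo`. The function name `avx512_features` is made up for this illustration, and the script is not part of the Phoronix test methodology.

```python
def avx512_features(cpuinfo_text):
    """Return the sorted AVX-512 and VNNI flags from /proc/cpuinfo-style text."""
    # Assumption: the kernel reports features on a line starting with "flags",
    # using names such as avx512f, avx512_vnni, avx512_bf16, and avx_vnni.
    wanted_prefixes = ("avx512", "avx_vnni")
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            return sorted(f for f in flags if f.startswith(wanted_prefixes))
    return []  # no flags line found (for example, a non-x86 system)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print(avx512_features(fh.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

An empty list means the kernel reports no AVX-512 or VNNI flags, either because the CPU lacks them or because they have been disabled.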
Minimal Impact on Peak Frequency and Power Consumption
Contrary to expectations, enabling AVX-512 made little difference to clock speeds. The Xeon Platinum 8592+ saw its all-core frequency dip slightly with AVX-512 enabled, from 3.01 GHz to 2.95 GHz, while the 64-core Emerald Rapids chip consistently held its 3.9 GHz boost clock. Average power usage was essentially unchanged, although individual workloads drew up to 10% more. Peak power consumption was approximately 120 watts higher, a tradeoff not uncommon in the pursuit of higher performance.
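To put those figures in proportion, the short Python sketch below works through the arithmetic. The frequency numbers come from the results above. The performance-per-watt line is only illustrative: it assumes a workload that sees the 2x average speedup while also drawing the full 10% extra power, a pairing Phoronix did not report for any single benchmark. The helper names are hypothetical.

```python
def relative_change(before, after):
    """Fractional change from before to after (negative means a drop)."""
    return (after - before) / before


def perf_per_watt_gain(speedup, power_ratio):
    """How much more work is done per watt, given a speedup and a power ratio."""
    return speedup / power_ratio


# All-core frequency with AVX-512 disabled vs. enabled, in GHz (reported figures).
freq_change = relative_change(3.01, 2.95)

# Illustrative assumption: 2x speedup combined with the worst-case +10% power.
efficiency = perf_per_watt_gain(2.0, 1.10)

print(f"All-core frequency change: {freq_change:.1%}")
print(f"Performance per watt: {efficiency:.2f}x")
```

Under these assumptions the clock penalty is about 2%, and even the pessimistic power case still leaves roughly 1.8 times the work done per watt.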
AVX-512's Crucial Role in AI Workloads
Support for a broad range of AVX-512 instructions is a key selling point for Emerald Rapids. Although the chip falls short of AMD's 96-core 4th Generation EPYC Genoa in raw performance, AVX-512 instructions shift the balance between Intel's and AMD's server CPUs, particularly in AI. This might explain Microsoft's choice of last-generation Sapphire Rapids chips over EPYC to pair with AMD's MI300X GPUs.
Phoronix's comprehensive benchmarking of Intel's 5th Generation Xeon Emerald Rapids with AVX-512 instructions sheds light on the substantial performance gains achievable in various workloads, particularly in the AI domain. The minimal impact on power consumption and the strategic advantage in AI applications make AVX-512 a compelling feature, reaffirming Intel's position in the competitive landscape of server CPUs.