AMD EPYC Genoa CPUs Show Strong Performance Improvement With AVX-512 At The Same Power

Phoronix examined AMD’s EPYC 9004 Genoa CPUs in a variety of AVX-512 workloads, and it appears that the most recent Zen 4 parts offer a significant performance gain while using the same amount of power.

With AVX-512 enabled, the AMD EPYC 9004 “Genoa” CPUs deliver a performance boost of about 35% while using the same amount of power.

The AMD EPYC 9654 is one of several new server processors that were billed as “the fastest server CPUs on the planet,” and Michael Larabel of Phoronix has put the new fourth-gen Genoa CPUs to the test in an astonishing 130 benchmarks on Ubuntu 22.10.

These benchmarks examined performance, temperature, clock frequency, and more to see how the new EPYC CPUs respond to the addition of AVX-512 to this chip family.

The AVX-512 instruction set was first offered by Intel and included in the Intel Xeon Phi x200, Skylake-X, and Xeon Scalable CPUs. Zen 4 is the first AMD architecture to support this instruction set. AVX-512 is made up of multiple extensions that can each be implemented independently, so support varies from one processor to another. According to AMD, using AVX-512 will improve performance and data handling when processing video, running financial analyses, and running simulations for new scientific discoveries.
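Because AVX-512 is a family of extensions rather than a single feature, it can help to check which ones a given CPU actually reports. The following is a minimal sketch, assuming a Linux system where the kernel lists CPU features in the `flags` line of `/proc/cpuinfo`. The helper name `avx512_extensions` and the sample flags string are illustrative assumptions, not output from the Phoronix test system.

```python
def avx512_extensions(flags_line: str) -> list[str]:
    """Return the sorted AVX-512 feature flags found in a cpuinfo-style flags line."""
    # Drop an optional "flags :" prefix; if there is no colon, use the whole line.
    _, _, rest = flags_line.rpartition(":")
    # Linux names AVX-512 features with an "avx512" prefix (avx512f, avx512bw, ...).
    return sorted({flag for flag in rest.split() if flag.startswith("avx512")})


# Hypothetical, shortened flags line used only for illustration.
sample = "flags : fpu sse2 avx avx2 avx512f avx512dq avx512bw avx512vl avx512_vnni"
print(avx512_extensions(sample))
```

On a real machine, the same function could be fed the first line of `/proc/cpuinfo` that starts with `flags`; an empty result would suggest that AVX-512 is unsupported or disabled.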

Larabel has already tested other AMD processors with AVX-512 support, including the Ryzen 9 7950X and the EPYC 9004 series. In his earlier experiments, AVX-512 was highly beneficial to both CPUs, demonstrating improved efficiency at lower power consumption and clock frequencies, especially under heavy workloads. For his most recent test, he ran a dual-socket (2P) AMD EPYC 9654 system on Ubuntu 22.10 with the most recent Linux kernel (v6.1), once with AVX-512 turned on and once with it turned off.

His artificial intelligence benchmarks showed that performance with AVX-512 enabled was 35% faster (or even more in some cases) than with the instruction set disabled. Enabling AVX-512 added almost nothing to processor power draw in the AI tasks, and in some runs the AVX-512 configuration came out ahead because of its lower power usage.
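To make the arithmetic concrete: when power draw stays the same, a throughput gain translates directly into the same gain in performance per watt. The sketch below uses made-up throughput and power figures chosen only to mirror the 35% number quoted above; they are not measurements from the Phoronix article, and the function names are hypothetical.

```python
def relative_gain(baseline: float, candidate: float) -> float:
    """Percentage improvement of candidate over baseline."""
    return (candidate - baseline) / baseline * 100.0


def perf_per_watt(throughput: float, watts: float) -> float:
    """Throughput delivered per watt of average power draw."""
    return throughput / watts


# Hypothetical figures for illustration only (not measured values).
throughput_off = 100.0  # items/sec with AVX-512 disabled
throughput_on = 135.0   # items/sec with AVX-512 enabled
average_watts = 300.0   # assumed identical power draw in both runs

speedup = relative_gain(throughput_off, throughput_on)
efficiency_gain = relative_gain(
    perf_per_watt(throughput_off, average_watts),
    perf_per_watt(throughput_on, average_watts),
)
print(f"speedup: {speedup:.1f}%, perf-per-watt gain: {efficiency_gain:.1f}%")
```

If the AVX-512 run drew less power than the baseline, the performance-per-watt gain would exceed the raw speedup.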

In one AI-related set of benchmarks, Neural Magic DeepSparse 1.1, AVX-512 on the new AMD EPYC 9654 processor showed encouraging results, although they were not as striking as in some other machine learning workloads. Neural Magic DeepSparse is a “sparsity-aware inference runtime” that aims to deliver GPU-class performance on CPUs and provides APIs for integrating machine learning into applications.

Mobile Neural Network 2.1, another AI-based benchmark, was an “odd duck” amid the flurry of benchmarks because AVX-512 showed poorer performance in only one particular test, which used the “Inception-v3” model. Larabel suggests that the programme itself might be the cause but does not have a conclusive answer.

Tencent’s NCNN models and the cryptographic benchmarks both showed promising results. The author then moved on to Intel-developed software that emphasised AVX-512’s advantages, and AMD EPYC again shone with AVX-512 enabled. In the Intel Open Image Denoise (v1.4.0) benchmark, two cases showed a negligible performance difference, but Larabel demonstrated that power consumption was still lower with AVX-512 active.

Larabel concludes his tests for the time being by noting that AMD’s Zen 4 architecture continues to outperform the current generation of Intel Xeon Scalable processors, and that even the upcoming Sapphire Rapids Xeon chips appear likely to struggle against Genoa CPUs.

Image Source: AMD
