[ovs-dev] [v13 12/12] dpcls-avx512: Enable avx512 vector popcount instruction.
fbl at sysclose.org
Thu Jun 24 03:57:12 UTC 2021
On Thu, Jun 17, 2021 at 05:18:25PM +0100, Cian Ferriter wrote:
> From: Harry van Haaren <harry.van.haaren at intel.com>
> This commit enables the AVX512-VPOPCNTDQ Vector Popcount
> instruction. This instruction is not available on every CPU
> that supports the AVX512-F Foundation ISA, hence it is enabled
> only when the additional VPOPCNTDQ ISA check is passed.
> The vector popcount instruction is used instead of the AVX512
> popcount emulation code present in the avx512 optimized DPCLS today.
> It provides higher performance in the SIMD miniflow processing
> as that requires the popcount to calculate the miniflow block indexes.
> Signed-off-by: Harry van Haaren <harry.van.haaren at intel.com>
Acked-by: Flavio Leitner <fbl at sysclose.org>
This patch series implements low-level optimizations by hand-coding
instructions. I wonder whether gcc could achieve a meaningful level
of vectorization on its own if the code were refactored and the
appropriate compiler flags were enabled. I assume the answer is no,
but I would appreciate some enlightenment on the matter.
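
For readers following the thread, here is a minimal scalar sketch of the
popcount-based index calculation that the commit message refers to. It
assumes the usual packed-bitmap layout, where the position of a block is
the number of set bits below its bit in the map. The names popcount64()
and block_index() are hypothetical and are not taken from the patch; the
AVX512 code would perform the equivalent count on several 64-bit lanes
at once.

```c
#include <assert.h>
#include <stdint.h>

/* Portable popcount: clears the lowest set bit on each iteration.
 * Stands in for a hardware popcount instruction in this sketch. */
static inline unsigned
popcount64(uint64_t x)
{
    unsigned n = 0;

    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Hypothetical helper, not from the patch: returns the index of the
 * packed block that corresponds to 'bit' in 'map', i.e. the number of
 * set bits strictly below 'bit'. */
static inline unsigned
block_index(uint64_t map, unsigned bit)
{
    assert(bit < 64);

    uint64_t below = (UINT64_C(1) << bit) - 1;

    return popcount64(map & below);
}
```

With map = 0x2D (bits 0, 2, 3 and 5 set), block_index(map, 3) is 2,
because bits 0 and 2 are the only set bits below bit 3.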