[ovs-dev] [PATCH net-next v2 2/2] net: ovs: use CRC32 accelerated flow hash if available

Francesco Fusco ffusco at redhat.com
Fri Dec 13 14:53:48 UTC 2013


On 12/13/2013 11:01 AM, David Laight wrote:
> My thoughts exactly.
> Given this is a hash it could crc alternate words into separate
> accumulators and then combine the values at the end.
> That way you are still doing sequential accesses to the data.
> (The crc instruction might be better than an xor for the combine.)
> If the cpu has 3 execution units that can do crc, use them all.
>
> It might be that the hash function is now an insignificant cost.
> Looking at how much hashing the data twice (discarding the first
> result - assign to global volatile data) slows things down can
> help determine this.
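
The measurement suggested in the quoted text could look roughly like the sketch below. Everything named here is hypothetical and not part of the patch: flow_hash_stub stands in for the real flow hash, and hash_sink is the "global volatile data". The compiler barrier between the two calls is an addition that is not in the suggestion. It assumes GCC or Clang inline-asm syntax, and without it the compiler would be free to compute the hash once and reuse the value.

```c
#include <stddef.h>
#include <stdint.h>

/* Global volatile sink: the first hash result is stored here so the
 * compiler cannot treat it as dead and drop the store. */
static volatile uint32_t hash_sink;

/* Hypothetical stand-in for the flow hash being measured. */
static uint32_t flow_hash_stub(const uint32_t *key, size_t n, uint32_t seed)
{
	uint32_t h = seed;
	size_t i;

	for (i = 0; i < n; i++)
		h = (h ^ key[i]) * 0x9E3779B1u;
	return h;
}

/* Hash the key twice, discard the first result, return the second. */
static uint32_t flow_hash_twice(const uint32_t *key, size_t n, uint32_t seed)
{
	hash_sink = flow_hash_stub(key, n, seed);

	/* Assumption: GCC/Clang compiler barrier. The "memory" clobber
	 * forces the key to be re-read, so the second call is not folded
	 * into the first by common subexpression elimination. */
	__asm__ __volatile__("" ::: "memory");

	return flow_hash_stub(key, n, seed);
}
```

Swapping flow_hash_twice in for the single call and comparing throughput before and after gives an estimate of how much of the per-packet cost is the hash itself.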

On i7 CPUs the crc32 instruction (in both its 32-bit and 64-bit operand
forms) has a throughput of one instruction per cycle and a latency of
3 cycles [1]. This means that 1) with this code, where each crc32 depends
on the result of the previous one, we pay 3 clocks per crc32 instruction,
and 2) we could run three independent CRC computations in parallel, each
processing 1/3 of the data, with one instruction issued per clock. This
could in theory provide 3x the performance.
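
A minimal sketch of the three-accumulator idea, in C. Several things here are assumptions and not part of the patch. The names crc32c_step, hash_serial and hash_3way are hypothetical. crc32c_step is a portable bitwise stand-in for one hardware crc32 step over a 32-bit word, used so that the sketch builds without SSE4.2. The final combine uses a crc step, as suggested in the quoted text, with xor being the cheaper alternative. The result is a hash, not the CRC of the whole key, so it does not match what the serial loop returns.

```c
#include <stddef.h>
#include <stdint.h>

/* Software stand-in for one crc32 step over a 32-bit word (CRC32C,
 * reflected polynomial 0x82F63B78). Real code would issue the
 * hardware crc32 instruction here instead. */
static uint32_t crc32c_step(uint32_t crc, uint32_t word)
{
	int i;

	crc ^= word;
	for (i = 0; i < 32; i++)
		crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	return crc;
}

/* Serial version: every step waits for the previous result. */
static uint32_t hash_serial(const uint32_t *key, size_t n, uint32_t seed)
{
	uint32_t crc = seed;
	size_t i;

	for (i = 0; i < n; i++)
		crc = crc32c_step(crc, key[i]);
	return crc;
}

/* Three-way version: split the key into thirds and keep one
 * accumulator per third. The three steps in the loop body do not
 * depend on each other, so the CPU can overlap them. */
static uint32_t hash_3way(const uint32_t *key, size_t n, uint32_t seed)
{
	size_t third = n / 3, i;
	const uint32_t *a = key, *b = key + third, *c = key + 2 * third;
	uint32_t c0 = seed, c1 = seed, c2 = seed;

	for (i = 0; i < third; i++) {
		c0 = crc32c_step(c0, a[i]);
		c1 = crc32c_step(c1, b[i]);
		c2 = crc32c_step(c2, c[i]);
	}

	/* The n % 3 leftover words go into the first accumulator. */
	for (i = 3 * third; i < n; i++)
		c0 = crc32c_step(c0, key[i]);

	/* Combine with two more crc steps (xor would also work). */
	return crc32c_step(crc32c_step(c0, c1), c2);
}
```

The boundary computation, the tail loop and the two combine steps are the fixed overhead that has to be paid back by the overlapped loop, which is why the gain is uncertain for short keys.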

For short keys (~100 bytes or less) there is a chance that the theoretical
3x speedup will be wiped out by the additional code required to compute
the chunk boundaries, xor the results, etc. But as I already mentioned,
this is something to try.

[1] http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-paper.pdf


