[ovs-dev] [classifier-opt 23/28] util: New function popcount().

Ethan Jackson ethan at nicira.com
Tue Jul 31 22:18:23 UTC 2012


On Tue, Jul 31, 2012 at 3:14 PM, Ben Pfaff <blp at nicira.com> wrote:
> On Tue, Jul 31, 2012 at 10:38:21AM -0700, Ethan Jackson wrote:
>> How performance critical is this popcount implementation going to be?
>> I assume you've put all this work into testing it because the
>> classifier will be relying on it heavily?
>
> Yes, I think it's going to be at least fairly common in the
> classifier.  I didn't measure that yet, because I think that there are
> opportunities to avoid some of them.
>
>> Why do you think the gcc builtin is slow? That's surprising to me.  Is
>> it possible that newer versions of gcc (i.e. 4.7 and later) would
>> simply generate the assembly instruction?
>
> The GCC builtin is portable.  I guess it's the same code as popcount4,
> since they run at the same speed.
>
> The assembly instruction isn't portable.  It isn't an architectural
> instruction, that is, you can't rely on, say, anything newer than Core
> 2 to have it.  There is a separate CPU feature bit for it that you
> need to check before using it.  So my guess is that GCC will never
> generate it, even in the future, without some kind of specific
> compiler option that says "CPU has popcnt instruction".
>
>> If it's so performance critical, could we simply check for the
>> assembly instruction in the configure script, and use it if it
>> exists?  Of course, if it doesn't exist we would fall back to what
>> you currently have.
>
> Configure time wouldn't be good enough, because we need to know about
> the machine we're going to run on, not the one that we're building
> on.  We'd have to check at runtime instead.

Ah yes, this makes sense.  Figuring out whether or not the instruction
exists at runtime would be a mess.
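
For reference, here is a rough sketch of what a runtime check might
involve on x86 with GCC.  It is not from the patch series.  It assumes
GCC's <cpuid.h> helper __get_cpuid() and the per-function "target"
attribute are available, and the names popcount_portable(),
popcount_hw(), cpu_has_popcnt() and popcount_dispatch() are made up for
illustration.  The feature bit tested is CPUID leaf 1, ECX bit 23.

```c
#include <stdint.h>

/* Portable fallback: bit-parallel population count (hypothetical name). */
static unsigned int
popcount_portable(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0f0f0f0f;
    return (x * 0x01010101) >> 24;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Only this function is compiled with popcnt enabled, so the rest of
 * the program still runs on CPUs that lack the instruction. */
__attribute__((target("popcnt")))
static unsigned int
popcount_hw(uint32_t x)
{
    return __builtin_popcount(x);
}

/* Returns nonzero if CPUID leaf 1 reports POPCNT (ECX bit 23). */
static int
cpu_has_popcnt(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    return (ecx & (1u << 23)) != 0;
}
#else
/* Other compilers or architectures: always use the portable code. */
static unsigned int
popcount_hw(uint32_t x)
{
    return popcount_portable(x);
}

static int
cpu_has_popcnt(void)
{
    return 0;
}
#endif

/* Probes the CPU on first use, then dispatches.  The cached flag is not
 * synchronized; concurrent first calls would just probe twice. */
unsigned int
popcount_dispatch(uint32_t x)
{
    static int have_popcnt = -1;    /* -1 means "not probed yet". */

    if (have_popcnt < 0) {
        have_popcnt = cpu_has_popcnt();
    }
    return have_popcnt ? popcount_hw(x) : popcount_portable(x);
}
```

Even in this small form the branch on every call may eat into whatever
the instruction saves, so it would need measuring before being worth the
extra code.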

Ethan
