[ovs-dev] [PATCH v2 2/9] lib/util: Add ctz64() and popcount64().
Jarno Rajahalme
jrajahalme at nicira.com
Mon Nov 18 22:24:52 UTC 2013
Ben,
I forgot about that mail over the weekend, sorry. I just acked your patches.
Jarno
On Nov 18, 2013, at 1:20 PM, Ben Pfaff <blp at nicira.com> wrote:
> I guess you missed this. No matter, I sent out a couple of patches:
> http://openvswitch.org/pipermail/dev/2013-November/034035.html
> http://openvswitch.org/pipermail/dev/2013-November/034036.html
>
>
> On Fri, Nov 15, 2013 at 4:06 PM, Ben Pfaff <blp at nicira.com> wrote:
> On Fri, Nov 15, 2013 at 03:48:58PM -0800, Ben Pfaff wrote:
> > On Wed, Nov 13, 2013 at 01:32:42PM -0800, Jarno Rajahalme wrote:
> > > Add raw_ctz64(), ctz64(), and popcount64() using builtins when
> > > available.
> > >
> > > I'm not sure if the "UINT64_MAX == ~0UL" and "UINT64_MAX == ~0ULL"
> > > work in all cases as I imagine they would.
> >
> > I think you could use ULONG_MAX and ULLONG_MAX, any reason not to?
> >
> > > Signed-off-by: Jarno Rajahalme <jrajahalme at nicira.com>
>
> I had another thought.
>
> It is really convenient how rightmost_1bit() and zero_rightmost_1bit()
> work with any width integer. It would be nice if we could just make
> ctz() and raw_ctz() do the same.
>
> I was able to make rightmost_1bit() and zero_rightmost_1bit() work
> that way because when I did some tests with GCC, I found that GCC was
> smart enough to compile the code with the cheapest instructions. That
> is, if you pass in a uint32_t it didn't bother to do 64-bit
> arithmetic.
>
> I see that GCC isn't smart enough to do that with __builtin_ctzll(),
> but you can fake it with a small amount of manual effort. That is,
> the following C file:
>
> #include <stdint.h>
>
> static inline int ctz(unsigned long long int x)
> {
>     if (__builtin_constant_p(x <= UINT32_MAX) && x <= UINT32_MAX) {
>         return __builtin_ctz(x);
>     } else {
>         return __builtin_ctzll(x);
>     }
> }
>
> int ctzl(unsigned long int x)
> {
>     return ctz(x);
> }
>
> int ctzll(unsigned long long int x)
> {
>     return ctz(x);
> }
>
> compiles to this on 32-bit:
>
> 00000000 <ctzl>:
>    0:   0f bc 44 24 04          bsf    0x4(%esp),%eax
>    5:   c3                      ret
>    6:   8d 76 00                lea    0x0(%esi),%esi
>    9:   8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi
>
> 00000010 <ctzll>:
>   10:   83 ec 1c                sub    $0x1c,%esp
>   13:   8b 44 24 20             mov    0x20(%esp),%eax
>   17:   8b 54 24 24             mov    0x24(%esp),%edx
>   1b:   89 04 24                mov    %eax,(%esp)
>   1e:   89 54 24 04             mov    %edx,0x4(%esp)
>   22:   e8 fc ff ff ff          call   23 <ctzll+0x13>
>                         23: R_386_PC32  __ctzdi2
>   27:   83 c4 1c                add    $0x1c,%esp
>   2a:   c3                      ret
>
> and to this on 64-bit:
>
> 0000000000000000 <ctzl>:
>    0:   48 0f bc c7             bsf    %rdi,%rax
>    4:   c3                      retq
>    5:   66 66 2e 0f 1f 84 00    data32 nopw %cs:0x0(%rax,%rax,1)
>    c:   00 00 00 00
>
> 0000000000000010 <ctzll>:
>   10:   48 0f bc c7             bsf    %rdi,%rax
>   14:   c3                      retq
>
> which is just about perfect.
>
> What do you think?
>
> Thanks,
>
> Ben.
>
>
>
> --
> "I don't normally do acked-by's. I think it's my way of avoiding
> getting blamed when it all blows up." Andrew Morton
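
For reference, here is a minimal sketch of how the width dispatch quoted above might sit underneath a zero-safe 64-bit wrapper. The names sketch_raw_ctz() and sketch_ctz64() are hypothetical and are not taken from the patch. The convention of returning 64 for a zero argument is an assumption made for illustration. The sketch also assumes GCC (or a compiler with the same builtins) and a 32-bit unsigned int.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch.  The result is undefined for
 * x == 0, just like the GCC builtins it wraps.  When the compiler can
 * prove that 'x' fits in 32 bits (for example after inlining a call that
 * passes a 32-bit value), the cheaper 32-bit builtin is used.  Without
 * optimization the test is simply false and the 64-bit builtin runs,
 * which gives the same answer. */
static inline int sketch_raw_ctz(unsigned long long int x)
{
    if (__builtin_constant_p(x <= UINT32_MAX) && x <= UINT32_MAX) {
        return __builtin_ctz((unsigned int) x);
    } else {
        return __builtin_ctzll(x);
    }
}

/* Hypothetical zero-safe wrapper.  Returning 64 (the bit width) for a
 * zero argument is an assumed convention, chosen for illustration. */
static inline int sketch_ctz64(uint64_t x)
{
    return x ? sketch_raw_ctz(x) : 64;
}
```

With this sketch, sketch_ctz64(8) is 3 and sketch_ctz64(0) is 64. Which builtin gets picked only affects the generated instructions, not the result.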