[ovs-dev] [PATCH 5/6] Re: byte-order: Make hton128() and ntoh128() behave like their counterparts.
Joe Stringer
joestringer at nicira.com
Mon Nov 23 20:49:37 UTC 2015
On 23 November 2015 at 10:06, Ben Pfaff <blp at ovn.org> wrote:
> When Joe added these types I assumed that he used the unconventional
> prototypes for hton128() and ntoh128() because the return value
> convention was inefficient. If GCC and Clang actually optimize the use
> of a return value in some kind of sensible way then I agree that the
> usual convention is nicer.
>
> Joe, did you have another reason?
This was mostly done based on an assumption that this was more
optimal, rather than actually digging into the compiled code and
seeing that it was generated differently.
Looking now, comparing before this patch vs. after on my 64-bit system:
GCC-4.9: hton128/ntoh128 need one fewer MOV with this patch, but the
calling convention costs a net 3 extra instructions (+4 MOV, -1 LEA)
in format_u128().
Clang-3.7: hton128/ntoh128 are roughly equivalent, although with this
patch they use some MOVUPS/MOVAPS instructions for 128-bit moves.
The calling convention seems to require as many as 6-12 (!) extra MOVs,
however; details below.
Clang-3.7 in format_u128(), before:
if (verbose || (mask && !ovs_u128_is_zero(mask))) {
e99b: f6 45 e7 01 testb $0x1,-0x19(%rbp)
e99f: 0f 85 1c 00 00 00 jne e9c1 <format_u128+0x41>
e9a5: 48 83 7d e8 00 cmpq $0x0,-0x18(%rbp)
e9aa: 0f 84 8d 00 00 00 je ea3d <format_u128+0xbd>
e9b0: 48 8b 7d e8 mov -0x18(%rbp),%rdi
e9b4: e8 c7 c8 ff ff callq b280 <ovs_u128_is_zero>
e9b9: a8 01 test $0x1,%al
e9bb: 0f 85 7c 00 00 00 jne ea3d <format_u128+0xbd>
e9c1: 48 8d 75 d0 lea -0x30(%rbp),%rsi
ovs_be128 value;
hton128(key, &value);
e9c5: 48 8b 7d f0 mov -0x10(%rbp),%rdi
e9c9: e8 82 00 00 00 callq ea50 <hton128>
e9ce: b8 10 00 00 00 mov $0x10,%eax
e9d3: 89 c2 mov %eax,%edx
e9d5: 48 8d 75 d0 lea -0x30(%rbp),%rsi
ds_put_hex(ds, &value, sizeof value);
e9d9: 48 8b 7d f8 mov -0x8(%rbp),%rdi
e9dd: e8 00 00 00 00 callq e9e2 <format_u128+0x62>
Clang-3.7, after:
if (verbose || (mask && !ovs_u128_is_zero(mask))) {
e99b: f6 45 e7 01 testb $0x1,-0x19(%rbp)
e99f: 0f 85 1c 00 00 00 jne e9c1 <format_u128+0x41>
e9a5: 48 83 7d e8 00 cmpq $0x0,-0x18(%rbp)
e9aa: 0f 84 d1 00 00 00 je ea81 <format_u128+0x101>
e9b0: 48 8b 7d e8 mov -0x18(%rbp),%rdi
e9b4: e8 c7 c8 ff ff callq b280 <ovs_u128_is_zero>
e9b9: a8 01 test $0x1,%al
e9bb: 0f 85 c0 00 00 00 jne ea81 <format_u128+0x101>
ovs_be128 value;
value = hton128(*key);
e9c1: 48 8b 45 f0 mov -0x10(%rbp),%rax
e9c5: 48 8b 38 mov (%rax),%rdi
e9c8: 48 8b 70 08 mov 0x8(%rax),%rsi
e9cc: e8 bf 00 00 00 callq ea90 <hton128>
e9d1: b9 10 00 00 00 mov $0x10,%ecx
e9d6: 89 ce mov %ecx,%esi
e9d8: 48 8d 7d d0 lea -0x30(%rbp),%rdi
e9dc: 48 89 45 c0 mov %rax,-0x40(%rbp)
e9e0: 48 89 55 c8 mov %rdx,-0x38(%rbp)
e9e4: 48 8b 45 c0 mov -0x40(%rbp),%rax
e9e8: 48 89 45 d0 mov %rax,-0x30(%rbp)
e9ec: 48 8b 45 c8 mov -0x38(%rbp),%rax
e9f0: 48 89 45 d8 mov %rax,-0x28(%rbp)
ds_put_hex(ds, &value, sizeof value);
e9f4: 48 8b 45 f8 mov -0x8(%rbp),%rax
e9f8: 48 89 7d a8 mov %rdi,-0x58(%rbp)
e9fc: 48 89 c7 mov %rax,%rdi
e9ff: 48 8b 45 a8 mov -0x58(%rbp),%rax
ea03: 48 89 75 a0 mov %rsi,-0x60(%rbp)
ea07: 48 89 c6 mov %rax,%rsi
ea0a: 48 8b 55 a0 mov -0x60(%rbp),%rdx
ea0e: e8 00 00 00 00 callq ea13 <format_u128+0x93>
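
For readers without the tree handy, here is a minimal sketch of the two
prototypes being compared. It is only an illustration: the types
u128_sketch and be128_sketch, the helper to_be64(), and the names
hton128_ptr() and hton128_val() are hypothetical stand-ins, not the OVS
definitions, which may be laid out differently. On x86-64 under the
System V ABI, a 16-byte struct made of two 64-bit integers is passed in
RDI:RSI and returned in RAX:RDX, which matches the loads before the call
and the stores after it in the "after" listing.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for ovs_u128 and ovs_be128 (assumed layout). */
typedef struct { uint64_t hi, lo; } u128_sketch;
typedef struct { uint64_t hi, lo; } be128_sketch;

/* Portable host-to-big-endian conversion of one 64-bit word:
 * write the most significant byte first, then copy the bytes out. */
static uint64_t to_be64(uint64_t x)
{
    uint8_t bytes[8];
    uint64_t out;
    int i;

    for (i = 0; i < 8; i++) {
        bytes[i] = (uint8_t) (x >> (56 - 8 * i));
    }
    memcpy(&out, bytes, sizeof out);
    return out;
}

/* "Before" style: the caller supplies storage for the result. */
static void hton128_ptr(const u128_sketch *src, be128_sketch *dst)
{
    dst->hi = to_be64(src->hi);
    dst->lo = to_be64(src->lo);
}

/* "After" style: argument and result both travel by value, like
 * htonl() and friends. */
static be128_sketch hton128_val(u128_sketch src)
{
    be128_sketch dst;

    dst.hi = to_be64(src.hi);
    dst.lo = to_be64(src.lo);
    return dst;
}
```

Both forms compute the same bytes; the difference measured above is only
in how the caller has to shuffle the argument and the result.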