[ovs-dev] [PATCH 5/6] Re: byte-order: Make hton128() and ntoh128() behave like their counterparts.

Joe Stringer joestringer at nicira.com
Mon Nov 23 20:49:37 UTC 2015


On 23 November 2015 at 10:06, Ben Pfaff <blp at ovn.org> wrote:
> When Joe added these types I assumed that he used the unconventional
> prototypes for hton128() and ntoh128() because the return value
> convention was inefficient.  If GCC and Clang actually optimize the use
> of a return value in some kind of sensible way then I agree that the
> usual convention is nicer.
>
> Joe, did you have another reason?

The out-parameter prototypes were chosen mostly on the assumption that
they would be more efficient, rather than from actually digging into
the compiled code and seeing that it was generated differently.

Looking now, before this patch vs. after, on my 64-bit system:

GCC-4.9: hton128/ntoh128 require one fewer MOV with this patch, but
the calling convention costs format_u128() a net 3 extra instructions
(+4 MOV, -1 LEA).
Clang-3.7: hton128/ntoh128 are roughly equivalent, although with this
patch they use some MOVUPS/MOVAPS instructions for 128-bit moves.
The calling convention, however, seems to require as many as 6-12 (!)
extra MOVs; details below.


Clang-3.7 in format_u128(), before:

    if (verbose || (mask && !ovs_u128_is_zero(mask))) {
    e99b:       f6 45 e7 01             testb  $0x1,-0x19(%rbp)
    e99f:       0f 85 1c 00 00 00       jne    e9c1 <format_u128+0x41>
    e9a5:       48 83 7d e8 00          cmpq   $0x0,-0x18(%rbp)
    e9aa:       0f 84 8d 00 00 00       je     ea3d <format_u128+0xbd>
    e9b0:       48 8b 7d e8             mov    -0x18(%rbp),%rdi
    e9b4:       e8 c7 c8 ff ff          callq  b280 <ovs_u128_is_zero>
    e9b9:       a8 01                   test   $0x1,%al
    e9bb:       0f 85 7c 00 00 00       jne    ea3d <format_u128+0xbd>
    e9c1:       48 8d 75 d0             lea    -0x30(%rbp),%rsi
        ovs_be128 value;

        hton128(key, &value);
    e9c5:       48 8b 7d f0             mov    -0x10(%rbp),%rdi
    e9c9:       e8 82 00 00 00          callq  ea50 <hton128>
    e9ce:       b8 10 00 00 00          mov    $0x10,%eax
    e9d3:       89 c2                   mov    %eax,%edx
    e9d5:       48 8d 75 d0             lea    -0x30(%rbp),%rsi
        ds_put_hex(ds, &value, sizeof value);
    e9d9:       48 8b 7d f8             mov    -0x8(%rbp),%rdi
    e9dd:       e8 00 00 00 00          callq  e9e2 <format_u128+0x62>




Clang-3.7 in format_u128(), after:

    if (verbose || (mask && !ovs_u128_is_zero(mask))) {
    e99b:       f6 45 e7 01             testb  $0x1,-0x19(%rbp)
    e99f:       0f 85 1c 00 00 00       jne    e9c1 <format_u128+0x41>
    e9a5:       48 83 7d e8 00          cmpq   $0x0,-0x18(%rbp)
    e9aa:       0f 84 d1 00 00 00       je     ea81 <format_u128+0x101>
    e9b0:       48 8b 7d e8             mov    -0x18(%rbp),%rdi
    e9b4:       e8 c7 c8 ff ff          callq  b280 <ovs_u128_is_zero>
    e9b9:       a8 01                   test   $0x1,%al
    e9bb:       0f 85 c0 00 00 00       jne    ea81 <format_u128+0x101>
        ovs_be128 value;

        value = hton128(*key);
    e9c1:       48 8b 45 f0             mov    -0x10(%rbp),%rax
    e9c5:       48 8b 38                mov    (%rax),%rdi
    e9c8:       48 8b 70 08             mov    0x8(%rax),%rsi
    e9cc:       e8 bf 00 00 00          callq  ea90 <hton128>
    e9d1:       b9 10 00 00 00          mov    $0x10,%ecx
    e9d6:       89 ce                   mov    %ecx,%esi
    e9d8:       48 8d 7d d0             lea    -0x30(%rbp),%rdi
    e9dc:       48 89 45 c0             mov    %rax,-0x40(%rbp)
    e9e0:       48 89 55 c8             mov    %rdx,-0x38(%rbp)
    e9e4:       48 8b 45 c0             mov    -0x40(%rbp),%rax
    e9e8:       48 89 45 d0             mov    %rax,-0x30(%rbp)
    e9ec:       48 8b 45 c8             mov    -0x38(%rbp),%rax
    e9f0:       48 89 45 d8             mov    %rax,-0x28(%rbp)
        ds_put_hex(ds, &value, sizeof value);
    e9f4:       48 8b 45 f8             mov    -0x8(%rbp),%rax
    e9f8:       48 89 7d a8             mov    %rdi,-0x58(%rbp)
    e9fc:       48 89 c7                mov    %rax,%rdi
    e9ff:       48 8b 45 a8             mov    -0x58(%rbp),%rax
    ea03:       48 89 75 a0             mov    %rsi,-0x60(%rbp)
    ea07:       48 89 c6                mov    %rax,%rsi
    ea0a:       48 8b 55 a0             mov    -0x60(%rbp),%rdx
    ea0e:       e8 00 00 00 00          callq  ea13 <format_u128+0x93>
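
For reference, here is a minimal sketch of the two prototype styles
being compared. It is not the OVS code: the struct layouts and every
name ending in _sketch are hypothetical stand-ins for ovs_u128,
ovs_be128 and hton128(), and the conversion is written byte by byte so
that it does not depend on host endianness. On x86-64 with the System V
ABI, a 16-byte struct like this is passed in two integer registers and
returned in %rax:%rdx, which matches the extra loads and stores around
the call in the "after" listing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-order 128-bit value (not the OVS ovs_u128). */
struct u128_sketch {
    uint64_t hi;   /* Most significant 64 bits. */
    uint64_t lo;   /* Least significant 64 bits. */
};

/* Hypothetical network-order (big-endian) 128-bit value. */
struct be128_sketch {
    uint8_t bytes[16];
};

/* Stores 'x' into 'dst', most significant byte first. */
static void
put_be64_sketch(uint8_t dst[8], uint64_t x)
{
    for (int i = 0; i < 8; i++) {
        dst[i] = (uint8_t) (x >> (56 - 8 * i));
    }
}

/* Out-parameter style: the caller passes the address of the result. */
static void
hton128_out_sketch(const struct u128_sketch *src, struct be128_sketch *dst)
{
    put_be64_sketch(&dst->bytes[0], src->hi);
    put_be64_sketch(&dst->bytes[8], src->lo);
}

/* Return-value style, like htonl() and its relatives. */
static struct be128_sketch
hton128_ret_sketch(struct u128_sketch src)
{
    struct be128_sketch dst;

    hton128_out_sketch(&src, &dst);
    return dst;
}

/* Self-check: both styles yield identical big-endian bytes. */
static void
check_hton128_sketch(void)
{
    struct u128_sketch x = { 0x0011223344556677ULL, 0x8899aabbccddeeffULL };
    struct be128_sketch a, b;

    hton128_out_sketch(&x, &a);
    b = hton128_ret_sketch(x);
    assert(!memcmp(&a, &b, sizeof a));
    assert(a.bytes[0] == 0x00 && a.bytes[15] == 0xff);
}
```

Compiling a file like this with and without optimization and comparing
the objdump output for a caller is one way to reproduce the kind of
comparison shown above; the instruction counts will vary by compiler
and flags.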


