[ovs-discuss] ipam allows address conflicts?

Ben Pfaff blp at ovn.org
Thu Mar 7 01:44:00 UTC 2019


Thanks, I applied this.

I see a lot more test failures with a modified hash function.  James,
I'm surprised that you only see one failure on big-endian.

On Wed, Mar 06, 2019 at 09:35:03AM -0500, Mark Michelson wrote:
> https://patchwork.ozlabs.org/patch/1052369/
> 
> On 3/6/19 8:59 AM, Mark Michelson wrote:
> > OK, I think I see what the problem is now.
> > 
> > The issue is with the order that ovn_ports are visited in
> > join_logical_ports. If a logical switch port is connected to a logical
> > router port, then if the logical switch port is visited first,
> > everything works fine. If the logical router port is visited first, then
> > it can result in duplicate addresses being assigned. As you pointed out,
> > the hash function on a big endian system will be different, so ports are
> > being visited in a different order on that system. However, this could
> > just as easily occur on a little endian system as well, depending on the
> > values being computed by the hash function.
> > 
> > I will submit a fix as soon as possible.
> > Thanks,
> > Mark
> > 
> > On 3/6/19 8:45 AM, Mark Michelson wrote:
> > > Hi Ben,
> > > 
> > > Thanks for reaching out. Right now it's not obvious why big endian
> > > architecture would cause duplicate IPs to be assigned. My initial
> > > thought was that the wrong byte ordering was being used when adding
> > > IP addresses to IPAM, but it appears that host byte ordering is
> > > being used consistently when performing the calculations.
> > > 
> > > James, is that the only IPAM-related test failure you are seeing? If
> > > so, that can help narrow where I need to look to fix this. Thanks.
> > > 
> > > On 3/5/19 9:04 PM, Ben Pfaff wrote:
> > > > [adding some folks who've worked on IPAM lately]
> > > > 
> > > > On Tue, Mar 05, 2019 at 10:26:58AM +0000, James Page wrote:
> > > > > Test 2768 is consistently failing on s390x in Ubuntu development.
> > > > > 
> > > > > Note that this is the only big-endian architecture we build
> > > > > for so I would
> > > > > suspect something endian-y as the root cause.
> > > > 
> > > > Yes, it's related to big-endian, in particular related to the OVS hash
> > > > function, which produces different results depending on endianness.
> > > > I can reproduce it on amd64 by changing the hash function:
> > > > 
> > > > ----------------------------------------------------------------------
> > > > diff --git a/lib/hash.h b/lib/hash.h
> > > > index eb3776500abe..c7fc6430c71d 100644
> > > > --- a/lib/hash.h
> > > > +++ b/lib/hash.h
> > > > @@ -1,5 +1,5 @@
> > > >   /*
> > > > - * Copyright (c) 2008, 2009, 2010, 2012, 2013, 2014, 2016 Nicira, Inc.
> > > > + * Copyright (c) 2008, 2009, 2010, 2012, 2013, 2014, 2016, 2019
> > > > Nicira, Inc.
> > > >    *
> > > >    * Licensed under the Apache License, Version 2.0 (the "License");
> > > >    * you may not use this file except in compliance with the License.
> > > > @@ -87,6 +87,7 @@ static inline uint32_t mhash_finish(uint32_t hash)
> > > >       hash ^= hash >> 13;
> > > >       hash *= 0xc2b2ae35;
> > > >       hash ^= hash >> 16;
> > > > +    hash = ~hash;
> > > >       return hash;
> > > >   }
> > > > @@ -184,7 +185,7 @@ static inline uint32_t hash_finish(uint64_t
> > > > hash, uint64_t final)
> > > >       /* The finishing multiplier 0x805204f3 has been experimentally
> > > >        * derived to pass the testsuite hash tests. */
> > > >       hash = _mm_crc32_u64(hash, final) * 0x805204f3;
> > > > -    return hash ^ (uint32_t)hash >> 16; /* Increase entropy in LSBs. */
> > > > +    return ~(hash ^ (uint32_t)hash >> 16); /* Increase entropy
> > > > in LSBs. */
> > > >   }
> > > >   /* Returns the hash of the 'n' 32-bit words at 'p_', starting
> > > > from 'basis'.
> > > > ----------------------------------------------------------------------
> > > > 
> > > > The actual failure might be due to a more serious bug though.  When I
> > > > apply the following patch:
> > > > 
> > > > ----------------------------------------------------------------------
> > > > diff --git a/tests/ovn.at b/tests/ovn.at
> > > > index 2af225a678e1..a53204f544e9 100644
> > > > --- a/tests/ovn.at
> > > > +++ b/tests/ovn.at
> > > > @@ -12273,6 +12273,8 @@ for i in 2 3 4; do
> > > >       ovn-nbctl --wait=sb lsp-set-addresses swp$i
> > > > "02:00:00:00:00:0$i dynamic"
> > > >       cidr=$(ovn-nbctl get logical_switch_port swp$i
> > > > dynamic_addresses |cut -f2 -d' '|cut -f1 -d\")
> > > >       ovn-nbctl lrp-add ro$i rop$i 02:00:00:00:00:0$i $cidr/24
> > > > -- set logical_switch_port swp$i type=router
> > > > options:router-port=rop$i addresses=router;
> > > > +    printf "\nadded swp$i\n"
> > > > +    ovn-nbctl list logical_switch_port
> > > >       AT_CHECK_UNQUOTED([ovn-nbctl get logical_router_port rop$i
> > > > networks], [0], [[["192.168.1.$i/24"]]
> > > >   ])
> > > >   done
> > > > ----------------------------------------------------------------------
> > > > 
> > > > the test then reports that ports swp2 and swp3 are both being assigned
> > > > dynamic address 192.168.1.2 (instead of 192.168.1.2 and 192.168.1.3,
> > > > respectively), as you can see from the "dynamic_addresses" lines
> > > > following "added swp3":
> > > > 
> > > > ----------------------------------------------------------------------
> > > > 2772. ovn.at:12264: testing ovn -- ipam router ports ...
> > > > creating ovn-sb database
> > > > creating ovn-nb database
> > > > starting ovn-northd
> > > > starting backup ovn-northd
> > > > 
> > > > added swp2
> > > > _uuid               : d78add30-9d8e-4d49-bea6-8f4bc947ab3b
> > > > addresses           : [router]
> > > > dhcpv4_options      : []
> > > > dhcpv6_options      : []
> > > > dynamic_addresses   : "02:00:00:00:00:02 192.168.1.2"
> > > > enabled             : []
> > > > external_ids        : {}
> > > > name                : "swp2"
> > > > options             : {router-port="rop2"}
> > > > parent_name         : []
> > > > port_security       : []
> > > > tag                 : []
> > > > tag_request         : []
> > > > type                : router
> > > > up                  : true
> > > > ../../tests/ovn.at:12278: ovn-nbctl get logical_router_port
> > > > rop$i networks
> > > > 
> > > > added swp3
> > > > _uuid               : d78add30-9d8e-4d49-bea6-8f4bc947ab3b
> > > > addresses           : [router]
> > > > dhcpv4_options      : []
> > > > dhcpv6_options      : []
> > > > dynamic_addresses   : "02:00:00:00:00:02 192.168.1.2"
> > > > enabled             : []
> > > > external_ids        : {}
> > > > name                : "swp2"
> > > > options             : {router-port="rop2"}
> > > > parent_name         : []
> > > > port_security       : []
> > > > tag                 : []
> > > > tag_request         : []
> > > > type                : router
> > > > up                  : true
> > > > 
> > > > _uuid               : 0542f623-43a8-407d-adbe-0927999c78b1
> > > > addresses           : [router]
> > > > dhcpv4_options      : []
> > > > dhcpv6_options      : []
> > > > dynamic_addresses   : "02:00:00:00:00:03 192.168.1.2"
> > > > enabled             : []
> > > > external_ids        : {}
> > > > name                : "swp3"
> > > > options             : {router-port="rop3"}
> > > > parent_name         : []
> > > > port_security       : []
> > > > tag                 : []
> > > > tag_request         : []
> > > > type                : router
> > > > up                  : true
> > > > ../../tests/ovn.at:12278: ovn-nbctl get logical_router_port
> > > > rop$i networks
> > > > --- -   2019-03-05 17:58:01.066306125 -0800
> > > > +++
> > > > /home/blp/nicira/ovs/_build/tests/testsuite.dir/at-groups/2772/stdout
> > > > 2019-03-05 17:58:01.060040470 -0800
> > > > @@ -1,2 +1,2 @@
> > > > -["192.168.1.3/24"]
> > > > +["192.168.1.2/24"]
> > > > ----------------------------------------------------------------------
> > > > 
> > > > Mark and Lorenzo, this seems like something you should look into?
> > > > 
> > > > Thanks,
> > > > 
> > > > Ben.
> > > > 
> > > 
> > 
> 


More information about the discuss mailing list