[ovs-discuss] bond_updelay being ignored?

Ben Pfaff blp at ovn.org
Mon Oct 8 20:45:40 UTC 2018


On Tue, Oct 02, 2018 at 10:28:52AM -0600, Daniel Leaberry via discuss wrote:
> I have CentOS 7 with Open vSwitch 2.9.0. The server has 4 ports in an LACP bond (called allbond) connected to a set of MLAG'd Arista switches. Here's the config:
> 
> ovs-vsctl list port allbond
> _uuid               : 9f224f2d-8bb1-4cfd-84e2-d60c6d973a7a
> bond_active_slave   : "90:e2:ba:d6:1c:44"
> bond_downdelay      : 0
> bond_fake_iface     : false
> bond_mode           : balance-tcp
> bond_updelay        : 40000
> cvlans              : []
> external_ids        : {}
> fake_bridge         : false
> interfaces          : [61b9a345-2f3d-4127-b9cd-eaca8a749574, 89ce3480-d62d-4291-9a84-bdf711016793, 941c9393-1021-490c-84ac-311250ba0343, dc49ffd3-c259-43b6-8072-2ce12c52d1b1]
> lacp                : active
> mac                 : []
> name                : allbond
> other_config        : {}
> protected           : false
> qos                 : []
> rstp_statistics     : {}
> rstp_status         : {}
> statistics          : {}
> status              : {}
> tag                 : []
> trunks              : []
> vlan_mode           : []
> 
> 
> ---- allbond ----
> bond_mode: balance-tcp
> bond may use recirculation: yes, Recirc-ID : 3
> bond-hash-basis: 0
> updelay: 40000 ms
> downdelay: 0 ms
> next rebalance: 3229 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> active slave mac: 90:e2:ba:d6:1c:44(eth5)
> 
> slave eth3: enabled
> 	may_enable: true
> 	hash 50: 1 kB load
> 	hash 162: 1 kB load
> 	hash 170: 1 kB load
> 
> slave eth4: enabled
> 	may_enable: true
> 	hash 123: 4 kB load
> 	hash 221: 12 kB load
> 
> slave eth5: enabled
> 	active slave
> 	may_enable: true
> 	hash 94: 1 kB load
> 	hash 177: 1 kB load
> 	hash 245: 1 kB load
> 
> slave eth6: enabled
> 	may_enable: true
> 	hash 97: 46 kB load
> 
> As you can see, updelay is set to 40 seconds. I go to the switch and shut down the port for eth6. It's immediately pulled from the bond. I then clear the switch counters and wait a few minutes. I would expect that when the port is set to "no shutdown", 40 seconds will go by before openvswitch brings it back into the bond. But that doesn't happen.
> 
> 2018-10-02T15:31:32.885Z|00349|bond|INFO|interface eth6: link state down
> 2018-10-02T15:31:32.885Z|00350|bond|INFO|interface eth6: disabled
> 2018-10-02T15:35:45.861Z|00352|bond|INFO|interface eth6: link state up
> 2018-10-02T15:35:45.861Z|00353|bond|INFO|interface eth6: enabled
> 2018-10-02T15:35:51.286Z|00354|bond|INFO|bond allbond: shift 93kB of load (with hash 97) from eth3 to eth6 (now carrying 6kB and 93kB load, respectively)
> 
> Immediately after the link is re-established, the port (eth6) is enabled again and traffic, as shown in the switch counters, begins to flow again. It feels like I'm doing something wrong, but I've googled for hours and can't find anything that explains why bond_updelay is being ignored.

I spent some time looking through the history here.  Ethan (CCed) added
LACP support to OVS in January 2011.  From that point forward, OVS has
always ignored updelay and downdelay for a bond when LACP is enabled.  I
don't know why, exactly.  Maybe Ethan remembers.

It would be easy to enable updelay and downdelay for LACP bonds:

diff --git a/ofproto/bond.c b/ofproto/bond.c
index f87cdba7908f..8a90ba2686af 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -1717,8 +1717,7 @@ bond_link_status_update(struct bond_slave *slave)
             VLOG_INFO_RL(&rl, "interface %s: will not be %s",
                          slave->name, up ? "disabled" : "enabled");
         } else {
-            int delay = (bond->lacp_status != LACP_DISABLED ? 0
-                         : up ? bond->updelay : bond->downdelay);
+            int delay = up ? bond->updelay : bond->downdelay;
             slave->delay_expires = time_msec() + delay;
             if (delay) {
                 VLOG_INFO_RL(&rl, "interface %s: will be %s if it stays %s "
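
To make the effect of that one-line change concrete, here is a minimal
standalone sketch of the delay selection before and after the patch.  It
is not the OVS code: the enum, the struct, and the function names
(sketch_lacp_status, bond_sketch, delay_current, delay_patched,
delay_expires_at) are hypothetical stand-ins for the fields the diff
touches, and the caller supplies the clock value that bond.c gets from
time_msec().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bond's LACP status; only the
 * "disabled or not" distinction matters for this sketch. */
enum sketch_lacp_status {
    SKETCH_LACP_DISABLED,
    SKETCH_LACP_NEGOTIATED
};

/* Hypothetical, reduced view of the bond settings involved. */
struct bond_sketch {
    enum sketch_lacp_status lacp_status;
    int updelay;    /* Milliseconds to wait before enabling a slave. */
    int downdelay;  /* Milliseconds to wait before disabling a slave. */
};

/* Behavior before the patch: with LACP enabled the delay is forced to
 * 0, so updelay and downdelay are ignored. */
static int
delay_current(const struct bond_sketch *bond, bool up)
{
    return (bond->lacp_status != SKETCH_LACP_DISABLED ? 0
            : up ? bond->updelay : bond->downdelay);
}

/* Behavior after the patch: the configured delay always applies,
 * whether or not LACP is enabled. */
static int
delay_patched(const struct bond_sketch *bond, bool up)
{
    return up ? bond->updelay : bond->downdelay;
}

/* Mirrors "delay_expires = time_msec() + delay": the time at which the
 * pending enable or disable would take effect. */
static long long
delay_expires_at(long long now_msec, int delay)
{
    return now_msec + delay;
}
```

With the settings from the report (LACP negotiated, updelay 40000,
downdelay 0), delay_current() returns 0 for a link coming up, which
matches the "link state up" and "enabled" log lines sharing a timestamp.
delay_patched() returns 40000 for the same input, so the slave would be
enabled only if the link stays up for 40 seconds.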


