[ovs-dev] [PATCH RFC] lacp: Prefer slaves with running partner when selecting lead

Zoltan Kiss zoltan.kiss at citrix.com
Tue Jul 1 19:52:18 UTC 2014


This patch modifies the LACP selection logic by prefering a slaves with up and
running partners when looking for a lead.
That fixes the following scenario:
- bond has 2 ports, A and B, their other ends are in separate chassis with
  MC-LAG sync
- the partner of port A is restarted
- port B is still working
- the partner on port A comes back, but temporarily it is using a default
  config, as MC-LAG haven't synced yet
- apparently that default config has a sys_priority which is smaller than the
  other, still running port, plus completely different sys_id
- therefore OVS choose port A despite it won't ever comes up into
  collecting-distributing state
- and port B is disabled, causing the whole bond goes down

Checking through the 802.1ax standard, when port A comes up again, the two
links fall apart due to the different LAG IDs. They should be attached to
different Aggregators, and the Aggregators should live separately. In OVS there
is no such concept as Aggregator, but I think it should be said that it has only
one Aggregator, and it has an unique policy to choose which ports can join.
Although changing the chassis' default config can also fix this, detecting
such problems quite hard, therefore I think it is still valid to improve things
in OVS side.
Btw. the Linux kernel bonding drivers' LACP implementation allows more
aggregators, and therefore it could handle this situation properly.

Signed-off-by: Zoltan Kiss <zoltan.kiss at citrix.com>
---
diff --git a/lib/lacp.c b/lib/lacp.c
index 0d30e51..da590a7 100644
--- a/lib/lacp.c
+++ b/lib/lacp.c
@@ -579,6 +579,12 @@ lacp_wait(struct lacp *lacp) OVS_EXCLUDED(mutex)
     lacp_unlock();
 }
 
+static bool
+lacp_partner_ready(struct slave *slave)
+{
+    return slave->partner.state & (LACP_STATE_COL & LACP_STATE_DIST);
+}
+
 /* Static Helpers. */
 
 /* Updates the attached status of all slaves controlled by 'lacp' and sets its
@@ -616,7 +622,8 @@ lacp_update_attached(struct lacp *lacp) OVS_REQUIRES(mutex)
         slave->attached = true;
         slave_get_priority(slave, &pri);
 
-        if (!lead || memcmp(&pri, &lead_pri, sizeof pri) < 0) {
+        if (!lead || (memcmp(&pri, &lead_pri, sizeof pri) < 0 &&
+                      lacp_partner_ready(lead))) {
             lead = slave;
             lead_pri = pri;
         }



More information about the dev mailing list