[ovs-dev] [PATCH RFC v2] lacp: Prefer slaves with running partner when selecting lead
fbl at redhat.com
Tue Aug 5 21:06:44 UTC 2014
On Mon, Aug 04, 2014 at 12:08:48PM -0700, Andy Zhou wrote:
> Sorry it took a while to get back to you. I am just coming up to
> speed on OVS LACP implementation, so my understanding may not be
> correct. Please feel free to point it out if I am wrong.
> According to the Wikipedia MC-LAG entry, there is no standard for it;
> implementations are mostly designed and built by vendors.
> After reading through the commit message, and comparing with the
> 802.1AX spec, this looks to me like a bug or configuration issue in
> the MC-LAG implementation. When the partner on port A comes
> back again, should it wait for MC-LAG sync before using the default
> profile to exchange states with OVS?
> On Mon, Jul 14, 2014 at 3:11 PM, Ben Pfaff <blp at nicira.com> wrote:
> > On Tue, Jul 08, 2014 at 05:35:57PM +0100, Zoltan Kiss wrote:
> >> This patch modifies the LACP selection logic by preferring slaves with up and
> >> running partners when looking for a lead.
> >> That fixes the following scenario:
> >> - bond has 2 ports, A and B, their other ends are in separate chassis with
> >> MC-LAG sync
> >> - the partner of port A is restarted
> >> - port B is still working
> >> - the partner on port A comes back, but temporarily it is using a default
> >> config, as MC-LAG hasn't synced yet
> >> - apparently that default config has a sys_priority which is smaller than that
> >> of the other, still running port, plus a completely different sys_id
> >> - therefore OVS chooses port A even though it will never come up into
> >> collecting-distributing state
> >> - and port B is disabled, causing the whole bond to go down
> >> Checking through the 802.1ax standard, when port A comes up again, the two
> >> links fall apart due to the different LAG IDs. They should be attached to
> >> different Aggregators, and the Aggregators should live separately. In OVS there
> >> is no such concept as an Aggregator, but I think it can be said to have only
> >> one Aggregator, with a unique policy for choosing which ports can join.
> >> Although changing the chassis' default config can also fix this, detecting
> >> such problems is quite hard, therefore I think it is still valid to improve
> >> things on the OVS side.
> >> Btw. the Linux kernel bonding driver's LACP implementation allows more than
> >> one aggregator, and therefore it could handle this situation properly.
> >> Signed-off-by: Zoltan Kiss <zoltan.kiss at citrix.com>
> > I verified that the unit tests still pass with this applied.
> > Andy Zhou said he'd review the patch.
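For readers following the thread, the selection rule described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch itself and not the code in OVS: struct member, partner_running, member_better() and choose_lead() are hypothetical names, and the partner_running flag stands in for whatever test the patch uses to decide that a partner is up and running. The sketch first prefers members whose partner is running and only then falls back to the usual comparison, where the lower sys_priority (then the lower sys_id) wins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a bond member; not the OVS structs. */
struct member {
    const char *name;
    uint16_t partner_sys_priority;  /* Lower value is preferred. */
    uint8_t partner_sys_id[6];      /* Partner system ID (MAC-sized). */
    bool partner_running;           /* Assumed flag: partner is up and running. */
};

/* Returns true if 'a' should be preferred over 'b' as the lead. */
static bool
member_better(const struct member *a, const struct member *b)
{
    /* A member with a running partner always beats one without. */
    if (a->partner_running != b->partner_running) {
        return a->partner_running;
    }
    /* Otherwise fall back to the priority comparison. */
    if (a->partner_sys_priority != b->partner_sys_priority) {
        return a->partner_sys_priority < b->partner_sys_priority;
    }
    /* Tie-break on the partner system ID. */
    return memcmp(a->partner_sys_id, b->partner_sys_id,
                  sizeof a->partner_sys_id) < 0;
}

/* Picks the lead among 'n' members; returns NULL if there are none. */
static const struct member *
choose_lead(const struct member *members, size_t n)
{
    const struct member *lead = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!lead || member_better(&members[i], lead)) {
            lead = &members[i];
        }
    }
    return lead;
}
```

With the scenario from the commit message (port A's partner advertising a smaller sys_priority but not running, port B's partner running), this sketch would pick port B, so the bond stays up until the MC-LAG peers have synced.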