[ovs-dev] [PATCH RFC v2] lacp: Prefer slaves with running partner when selecting lead

Ethan Jackson ethan at nicira.com
Tue Aug 12 05:08:10 UTC 2014

In OVS we only support a single aggregator per port.  I understand
that Linux supports multiple, and that perhaps this problem would be
easier to work around if we did as well.  But today we don't, and IMO
implementing this feature would be rather complex.  Of course, an
implementation would be welcome.

The spec is quite clear about which slaves should be enabled within a
particular LACP aggregator.  I'm not saying that the spec should be
followed in all cases, but the bar for making non-standard changes
should be extremely high.  Basically, in order to merge something like
this, I would like evidence that either my interpretation of the spec
is incorrect, or that some major LACP implementation (e.g. Cisco's) has
similar non-standard behavior.  Protocol changes like this can have
significant unforeseen consequences, which is why I'm inclined to be
conservative in this case.

I understand that this change would be a convenient workaround for a
buggy MC-LAG implementation.  However, based on my reading of this
thread so far, I'm not totally convinced that it's prudent.  I could
be convinced, but I'm not quite there yet.  I think it's best to
shelve it for now.


On Mon, Aug 11, 2014 at 6:09 PM, Flavio Leitner <fbl at redhat.com> wrote:
> On Mon, Aug 11, 2014 at 05:47:48PM +0100, Zoltan Kiss wrote:
>> On 08/08/14 02:38, Flavio Leitner wrote:
>> >On Thu, Aug 07, 2014 at 04:19:17PM +0100, Zoltan Kiss wrote:
>> >>On 05/08/14 22:16, Flavio Leitner wrote:
>> >>>On Mon, Aug 04, 2014 at 12:08:48PM -0700, Andy Zhou wrote:
>> >>>>Zoltan,
>> >>>>
>> >>>>Sorry it took a while to get back to you.  I am just coming up to
>> >>>>speed on the OVS LACP implementation, so my understanding may not be
>> >>>>correct.  Please feel free to correct me if I am wrong.
>> >>>>
>> >>>>According to the Wikipedia MC-LAG entry, there is no standard for it;
>> >>>>implementations are mostly vendor-specific designs.
>> >>>>
>> >>>>After reading through the commit message and comparing with the
>> >>>>802.1AX spec, I feel this is a bug or configuration issue in the
>> >>>>MC-LAG implementation.  When the partner on port A comes back
>> >>>>again, should it wait for MC-LAG sync before using the default
>> >>>>profile to exchange states with OVS?
>> >>>
>> >>>I agree that it sounds like a problem in the MC-LAG.  However, I also
>> >>>agree that OVS could do better.
>> >>>
>> >>>The aggregation selection policy is somewhat of a gray area not defined
>> >>>in any spec.  The bonding driver offers the ad_select= parameter, which
>> >>>allows switching to a new aggregator only if, for instance, all the
>> >>>ports in the active aggregator are down.
>> >>>
>> >>>The Team driver, which also implements 802.3ad, provides a policy
>> >>>selection parameter as well.  The default is to consider the priority
>> >>>in the LACPDU, but you can also tell it not to select another
>> >>>aggregator while the current one is still usable, or to select by
>> >>>bandwidth or by the number of available ports.
>> >>>
>> >>>My suggestion, if we want to change something, is to stick with the
>> >>>bonding driver's default behavior for selecting a new aggregator:
>> >>>"""
>> >>>stable or 0
>> >>>
>> >>>   The active aggregator is chosen by largest aggregate
>> >>>   bandwidth.
>> >>>
>> >>>   Reselection of the active aggregator occurs only when all
>> >>>   slaves of the active aggregator are down or the active
>> >>>   aggregator has no slaves.
>> >>>
>> >>>   This is the default value.
>> >>>"""
>> >>>Documentation/networking/bonding.txt
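As an aside, the "stable" policy quoted above can be sketched in a few lines of C.  This is only an illustration of the rule as documented; the types and names here are mine, not the bonding driver's:

```c
/* Sketch of the ad_select=stable rule: the active aggregator is kept as
 * long as any of its slaves is up; only when it becomes unusable is the
 * usable aggregator with the largest bandwidth chosen instead.
 * Illustrative types only, not the bonding driver's actual code. */
#include <assert.h>
#include <stddef.h>

struct agg {
    int bandwidth;   /* aggregate bandwidth of this aggregator */
    int up_slaves;   /* number of slaves currently up */
};

static struct agg *
select_active(struct agg *active, struct agg *aggs, size_t n)
{
    /* Sticky: no reselection while the active aggregator is usable. */
    if (active && active->up_slaves > 0) {
        return active;
    }
    /* Otherwise pick the usable aggregator with the largest bandwidth. */
    struct agg *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (aggs[i].up_slaves > 0
            && (!best || aggs[i].bandwidth > best->bandwidth)) {
            best = &aggs[i];
        }
    }
    return best;
}
```

Note how this differs from what OVS does today: the selection is sticky by design, so a newly appearing, higher-bandwidth aggregator never steals traffic from a working one.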
>> >>As far as I understand, there isn't really a concept of an aggregator
>> >>in OVS, but "lead" in lacp_update_attached() is something similar.
>> >>However, it is recalculated on every iteration, and it is simply the
>> >>slave with the lowest priority.
>> >
>> >Yes, that's my understanding as well.  The slave with the lowest
>> >priority carries the information about its partner, which is then used
>> >to compare with the other slaves.  The ones matching the lead's partner
>> >info are in the same trunk, so they are attached; otherwise they are
>> >either not established or in another trunk (not attached).
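The lead/attach behavior described here can be sketched as follows.  This is a minimal illustration of the description above, not the actual OVS definitions from lacp.c; all struct layouts and names are assumptions:

```c
/* Sketch of recomputing the "lead" as the CURRENT slave with the lowest
 * priority, then attaching only the slaves whose partner info matches
 * the lead's (i.e. the slaves in the same trunk).  Illustrative only. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct partner {
    uint16_t sys_priority;   /* partner system priority from the LACPDU */
    uint8_t  sys_id[6];      /* partner system MAC */
    uint16_t key;            /* partner operational key */
};

struct slave {
    uint16_t priority;       /* lower value wins lead selection */
    bool current;            /* receiving fresh LACPDUs */
    struct partner partner;  /* info learned from the partner */
    bool attached;
};

static bool
same_partner(const struct partner *a, const struct partner *b)
{
    return a->sys_priority == b->sys_priority
        && !memcmp(a->sys_id, b->sys_id, sizeof a->sys_id)
        && a->key == b->key;
}

static struct slave *
update_attached(struct slave *slaves, size_t n)
{
    struct slave *lead = NULL;
    for (size_t i = 0; i < n; i++) {
        if (slaves[i].current
            && (!lead || slaves[i].priority < lead->priority)) {
            lead = &slaves[i];
        }
    }
    for (size_t i = 0; i < n; i++) {
        slaves[i].attached = lead && slaves[i].current
            && same_partner(&slaves[i].partner, &lead->partner);
    }
    return lead;
}
```

Because the lead is recomputed from scratch every time, any newly CURRENT slave with a lower priority immediately takes over, which is exactly the behavior the scenario below runs into.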
>> >
>> >>Do you mean we should save "lead" into a new field in struct lacp, and
>> >>select a new one only if lacp_partner_ready(lead) == false?
>> >
>> >I think it will require a bit more than that, because if the "lead"
>> >slave fails for some reason, that slave is no longer CURRENT, but the
>> >LACP state might still be valid because another slave is in the
>> >CURRENT state.  So the "lead" might need to be replaced by another
>> >slave in the same trunk to keep it up.  Otherwise, the failing "lead"
>> >might receive different partner info (the default config in your case),
>> >and that would break the established trunk.
>> In my scenario one of the slaves reboots, so it becomes DEFAULTED, and
>> the other one carries on and becomes lead.  When the rebooting slave
>> comes back with the default config, it becomes CURRENT and then
>> immediately becomes lead, because its priority is lower.  However, its
>> partner is not ready: it doesn't send the collecting and distributing
>> bits.
> Yes, exactly. Changing the "lead" from the first failing port to the
> second one was correct because they were part of the same trunk.  After
> that, the first port comes up again with different partner info, so the
> current "lead" shouldn't change unless it fails.
> My point was that the "lead" must always be a port with a valid trunk;
> otherwise, its partner's info will be used to decide whether other
> ports are enabled or disabled.  If the "lead" is going to change, then
> preference should be given to another port within the same trunk.
> fbl
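To make the change discussed in this thread concrete, a lead-selection rule that prefers slaves with a running partner might look roughly like the following.  This is a sketch only: it assumes "ready" means the partner has set its collecting and distributing bits, as in the scenario above, and all names are illustrative rather than actual OVS code:

```c
/* Sketch of the proposed preference: among CURRENT slaves, a slave
 * whose partner is collecting and distributing beats one whose partner
 * is not, and priority only breaks ties within the same readiness
 * class.  Illustrative types only, not the actual OVS lacp.c code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct slave {
    uint16_t priority;          /* lower value wins among equals */
    bool current;               /* receiving fresh LACPDUs */
    bool partner_collecting;    /* partner's collecting bit */
    bool partner_distributing;  /* partner's distributing bit */
};

static bool
partner_ready(const struct slave *s)
{
    return s->partner_collecting && s->partner_distributing;
}

static struct slave *
choose_lead(struct slave *slaves, size_t n)
{
    struct slave *best = NULL;
    for (size_t i = 0; i < n; i++) {
        struct slave *s = &slaves[i];
        if (!s->current) {
            continue;
        }
        if (!best
            || (partner_ready(s) && !partner_ready(best))
            || (partner_ready(s) == partner_ready(best)
                && s->priority < best->priority)) {
            best = s;
        }
    }
    return best;
}
```

In the reboot scenario above, the returning slave is CURRENT with the lower priority but its partner is not yet collecting/distributing, so the surviving slave keeps the lead until the partner is actually ready.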
