[ovs-dev] communication failed with combination of ovs and linux native bonding
fbl at sysclose.org
Tue Oct 13 19:43:53 UTC 2015
On Wed, Sep 23, 2015 at 01:58:37PM +0800, 渔舟 wrote:
> On 9/23/15 13:46, 渔舟 wrote:
> > On 9/23/15 10:35, Jesse Gross wrote:
> >> On Tue, Sep 22, 2015 at 7:28 PM, 渔舟 <yuzhou at mogujie.com> wrote:
> >>> On 9/23/15 09:35, Jesse Gross wrote:
> >>>> On Fri, Sep 18, 2015 at 7:30 PM, 渔舟 <yuzhou at mogujie.com> wrote:
> >>>>> Hi, all
> >>>>> Communication fails intermittently (about half the time) when OVS is combined
> >>>>> with Linux native bonding (802.3ad mode), but if I use OVS's own bonding (lacp=active) instead,
> >>>>> the failure disappears. Any ideas?
> >>>>> I read the post below, which mentions that "it may not be possible to use the Linux bonding at
> >>>>> the same time as Open vSwitch for Linux before 2.6.36", but I could not find more detailed information about that.
> >>>>> http://openvswitch.org/pipermail/discuss/2010-September/004407.html
> >>>> That post was referring to a bonding mode other than LACP
> >>>> (which did not exist in OVS at the time). The type of issues referred
> >>>> to there should not affect LACP, so I don't think there is an inherent
> >>>> problem.
> >>>> What is the issue?
> >>> The network is sometimes disconnected after the host reboots.
> >>> The network configuration looked correct according to ethtool, cat /proc/net/bonding/bond0,
> >>> ifconfig, route, and /etc/sysconfig/network-scripts.
> >>> One clue is that the hardware switch's port status was "pause" when the disconnection happened.
> >>> But after I switched to OVS's own bonding (lacp=active) instead, the disconnection never happened again.
> >> That sounds like an issue with how the Linux implementation negotiates
> >> the LACP session. It doesn't seem like there is much that OVS can do
> >> to improve the situation.
> > Yes, it should be an issue with LACP negotiation,
> if I change the Linux bonding mode from 802.3ad (4) to active-backup (1),
> the disconnection never happens either, even with the combination of OVS and Linux native bonding,
> > but with the combination of the Linux native bridge and Linux bonding instead, the disconnection never happens.
Sounds like an LACP issue. Perhaps you can mirror the switch's port
and look at the traffic dump? It might tell you whether the packets are
there and whether their contents are fine.
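
For checking a dump of the mirrored port, here is a minimal sketch of how LACPDUs could be picked out of raw Ethernet frames. It relies on the standard identifiers for LACP (Slow Protocols destination MAC 01:80:C2:00:00:02, EtherType 0x8809, subtype 0x01). The function names are hypothetical, and it assumes untagged frames (no VLAN header before the EtherType).

```python
import struct

# Standard LACP identifiers (IEEE 802.3ad / 802.1AX Slow Protocols).
SLOW_PROTOCOLS_DST = bytes.fromhex("0180c2000002")
SLOW_PROTOCOLS_ETHERTYPE = 0x8809
LACP_SUBTYPE = 0x01


def is_lacpdu(frame: bytes) -> bool:
    """Return True if a raw, untagged Ethernet frame looks like a LACPDU."""
    # Need at least dst (6) + src (6) + EtherType (2) + subtype (1) bytes.
    if len(frame) < 15:
        return False
    dst = frame[0:6]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    subtype = frame[14]
    return (
        dst == SLOW_PROTOCOLS_DST
        and ethertype == SLOW_PROTOCOLS_ETHERTYPE
        and subtype == LACP_SUBTYPE
    )


def count_lacpdus(frames) -> int:
    """Count the LACPDUs in an iterable of raw frames (hypothetical helper)."""
    return sum(1 for frame in frames if is_lacpdu(frame))
```

If the count stays at zero in one direction while the link is down, that would suggest the LACPDUs are not making it to (or from) the bond.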
Another option is to enable the debug messages in the linux bond function
bond_3ad_rx_indication(). That should give you a confirmation that
the LACPDU is received.
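
One possible way to turn those messages on without rebuilding the module is the kernel's dynamic debug facility. The sketch below only builds and writes the query string. It assumes a kernel built with CONFIG_DYNAMIC_DEBUG, debugfs mounted at /sys/kernel/debug, and root privileges; the helper names are hypothetical.

```python
from pathlib import Path

# Assumed default location of the dynamic debug control file (debugfs mounted
# at /sys/kernel/debug); adjust if debugfs is mounted elsewhere.
DEFAULT_CONTROL = "/sys/kernel/debug/dynamic_debug/control"


def dynamic_debug_query(func_name: str, enable: bool = True) -> str:
    """Build a dynamic debug query that toggles printing for one function."""
    flag = "+p" if enable else "-p"
    return f"func {func_name} {flag}"


def set_function_debug(func_name: str, enable: bool = True,
                       control_path: str = DEFAULT_CONTROL) -> str:
    """Write the query to the control file and return the query that was written."""
    query = dynamic_debug_query(func_name, enable)
    Path(control_path).write_text(query + "\n")
    return query


if __name__ == "__main__":
    # Requires root; the messages should then show up in the kernel log (dmesg).
    print(set_function_debug("bond_3ad_rx_indication"))
```

Calling set_function_debug("bond_3ad_rx_indication", enable=False) afterwards turns the messages off again.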