[ovs-discuss] Possible bug with OVS LACP + VPC

Shu Shen shu.shen at gmail.com
Wed Jan 18 05:30:39 UTC 2017


On Tue, Jan 17, 2017 at 04:54:51PM -0600, Chad Norgan wrote:
> Given that the partner port_id on the rogue packet matches the slave
> it's sent out. I lean towards #1, that the LACP implementation is
> somehow mixing up the status for the slave's pdu, rather than leaking
> eth1's pdu out the eth0 interface.
> 
> -Chad

Hi Chad,

A few observations and questions below:

1) I wrote an additional test case for a slave going down and coming
back up, and it appears to be working fine. I added extra debug messages
(not in the commit referred to below, though) to trace the LACPDUs being
sent by all slaves and did not see any rogue packet. Of course, the test
case uses two ovs switches and patch ports, so it may well be far from
reproducing the problem you are having. You may find the test case
here:

    https://github.com/shushen/ovs/commit/72aa0afc6b61d5135ea9253b8aaf31a57c7c4734

And travis-ci builds with the above test case included are passing:
    https://travis-ci.org/shushen/ovs/builds/192922935

2) Could you please elaborate a bit more about how you "manually down
the eth1 interface" and "bring eth1 back up"? Did you unplug a physical
link or did you use any ovs/Linux CLI to do so? This may help me refine
the test case to reproduce what you are doing.

3) I find it interesting that, in the packet trace from the gist you
posted, the source MAC address from the peer switch is all zeros; see

    https://gist.github.com/beardymcbeards/7bd9feca87c0574e996a397d90d5ff98#file-2_tcpdump-L81

If I read it correctly, Section 6.2.11.1 of 802.1AX-2014 says:

    Protocol entities sourcing frames from within the Link Aggregation
    sublayer (e.g., LACP and the Marker protocol) use the MAC address of
    the MAC within an underlying Aggregation Port as the SA in frames
    transmitted through that Aggregation Port.

I'm not sure why the peer switch is using the all-zero MAC address, but
it probably shouldn't. I don't know how the ovs datapath handles such
packets. If the source MAC address is also all zeros when eth1 is coming
back up, could this affect how the LACPDU from eth1 is handled? I
welcome comments from you and the list.
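
To make it easier to spot such frames in a capture, here is a minimal
sketch of the check I have in mind. It is not taken from ovs; the
function name check_lacpdu_source is hypothetical, and it assumes an
untagged raw Ethernet frame (no VLAN header) carrying the Slow
Protocols EtherType 0x8809 with LACP subtype 0x01:

```python
import struct

SLOW_PROTOCOLS_ETHERTYPE = 0x8809  # EtherType used by LACP and Marker
LACP_SUBTYPE = 0x01                # Slow Protocols subtype for LACP


def check_lacpdu_source(frame: bytes):
    """Return (source MAC as text, True if it is all zeros) for an LACPDU.

    Returns None when the frame is too short or is not an LACPDU.
    Assumes an untagged Ethernet frame (hypothetical helper, not ovs code).
    """
    if len(frame) < 15:
        return None
    # Ethernet header: 6-byte destination, 6-byte source, 2-byte EtherType.
    _dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != SLOW_PROTOCOLS_ETHERTYPE or frame[14] != LACP_SUBTYPE:
        return None
    src_text = ":".join(f"{b:02x}" for b in src)
    return src_text, src == bytes(6)


# Example: a truncated LACPDU header with an all-zero source address.
sample = bytes.fromhex("0180c2000002") + bytes(6) + b"\x88\x09\x01\x01"
print(check_lacpdu_source(sample))
```

Running every captured frame through a check like this would show
whether the all-zero source address appears only on the peer's LACPDUs
or also around the time eth1 comes back up.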

I'd appreciate it if you could provide a bit more information on 2) or
any other thoughts. My intention is to investigate this problem a bit
further.

/Shu


