[ovs-dev] Issue with OVS lacp-fallback-ab option

Arun Navasivasakthivelsamy arunkum.navasiv at nutanix.com
Tue Oct 9 22:58:49 UTC 2018


Ben,

I was able to test it out, and you were right - disabling the recirculation fixed the problem! Since I’m on OVS 2.5.2, the patch I tried was slightly different:


diff --git a/ofproto/bond.c b/ofproto/bond.c
old mode 100644
new mode 100755
index 9c6079f..118a7d6
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -914,6 +914,9 @@ bond_may_recirc(const struct bond *bond, uint32_t *recirc_id,
                 uint32_t *hash_bias)
 {
     if (bond->balance == BM_TCP && bond->recirc_id) {
+        if (bond->lacp_fallback_ab && bond->lacp_status == LACP_CONFIGURED) {
+            return false;
+        }
         if (recirc_id) {
             *recirc_id = bond->recirc_id;
         }
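
For readers who want to see the effect of this hunk in isolation, here is a minimal standalone sketch of the decision it adds. It is not the OVS source: the struct, the enums, and every `sketch_`/`SKETCH_` name are hypothetical stand-ins that model only the fields the hunk reads, and the `hash_bias` handling of the real function is left out. It also assumes that the "configured" status means LACP is enabled on the bond but has not been negotiated with the peer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OVS types; not the real definitions. */
enum sketch_bond_mode { SKETCH_BM_TCP, SKETCH_BM_AB };
enum sketch_lacp_status {
    SKETCH_LACP_NEGOTIATED,   /* Assumed: peer negotiated LACP. */
    SKETCH_LACP_CONFIGURED,   /* Assumed: enabled locally, not negotiated. */
    SKETCH_LACP_DISABLED
};

struct sketch_bond {
    enum sketch_bond_mode balance;
    uint32_t recirc_id;               /* 0 means no recirculation ID. */
    bool lacp_fallback_ab;            /* The lacp-fallback-ab option. */
    enum sketch_lacp_status lacp_status;
};

/* Returns true if the bond should use recirculation for output.
 * With the extra check, a balance-tcp bond that has fallen back to
 * active-backup (LACP configured but not negotiated) never recirculates. */
static bool
sketch_bond_may_recirc(const struct sketch_bond *bond, uint32_t *recirc_id)
{
    if (bond->balance == SKETCH_BM_TCP && bond->recirc_id) {
        if (bond->lacp_fallback_ab
            && bond->lacp_status == SKETCH_LACP_CONFIGURED) {
            return false;
        }
        if (recirc_id) {
            *recirc_id = bond->recirc_id;
        }
        return true;
    }
    return false;
}
```

In this model a bond with `lacp_fallback_ab` set and status "configured" returns false and leaves `*recirc_id` untouched, while the same bond with status "negotiated" still recirculates.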

Could you elaborate on what the underlying issue is with recirculation and LACP bonding?

Thanks
-Arun


On 10/8/18, 1:57 PM, "Arun Navasivasakthivelsamy" <arunkum.navasiv at nutanix.com> wrote:

Thanks Ben. Let me try it out this week and report back.

On 10/8/18, 1:15 PM, "Ben Pfaff" <blp at ovn.org> wrote:

I think I see a problem in the implementation of bonding when
recirculation is available.  Are you able to try out a patch?  If so,
try the following.  It is not a good way to solve the issue, but it
should illustrate whether recirculation is the problem.

diff --git a/ofproto/bond.c b/ofproto/bond.c
index f87cdba7908f..bb6a80411de5 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -927,7 +928,7 @@ bond_recirculation_account(struct bond *bond)
static bool
bond_may_recirc(const struct bond *bond)
{
-    return bond->balance == BM_TCP && bond->recirc_id;
+    return bond->balance == BM_TCP && bond->recirc_id && false;
}
static void


On Fri, Oct 05, 2018 at 04:45:21AM +0000, Arun Navasivasakthivelsamy
wrote:
Also, ofproto/trace is suggesting that the packet will be hashed to a
slave (instead of just to the active port) with lacp-fallback-ab option.
This is with active-backup bond mode:
[root at frankfurter02-4 ~]# ovs-appctl ofproto/trace br0 in_port=6,dl_dst=28:99:3a:08:7a:cf
Bridge: br0
Flow: in_port=6,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=28:99:3a:08:7a:cf,dl_type=0x0000
Rule: table=0 cookie=0 priority=0
OpenFlow actions=NORMAL
forwarding to learned port
Final flow: unchanged
Megaflow: recirc_id=0,in_port=6,vlan_tci=0x0000/0x1fff,dl_src=00:00:00:00:00:00,dl_dst=28:99:3a:08:7a:cf,dl_type=0x0000
Datapath actions: 5
This is with balance-tcp with lacp-fallback-ab mode (LACP is not
negotiated):
[root at frankfurter02-4 ~]# ovs-appctl ofproto/trace br0 in_port=6,dl_dst=28:99:3a:08:7a:cf
Bridge: br0
Flow: in_port=6,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=28:99:3a:08:7a:cf,dl_type=0x0000
Rule: table=0 cookie=0 priority=0
OpenFlow actions=NORMAL
forwarding to learned port
Final flow: unchanged
Megaflow: recirc_id=0,in_port=6,vlan_tci=0x0000/0x1fff,dl_src=00:00:00:00:00:00,dl_dst=28:99:3a:08:7a:cf,dl_type=0x0000
Datapath actions: hash(hash_l4(0)),recirc(0x1)
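
The two traces differ only in their last line: the active-backup bond outputs straight to port 5, while the balance-tcp bond with lacp-fallback-ab emits a hash plus a recirc action. When comparing many traces, that line is enough to tell the two cases apart. A minimal sketch of such a check follows; the helper name is hypothetical and it does nothing more than look for the literal text "recirc(" in the line.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if a "Datapath actions:" line from
 * ofproto/trace output contains a recirc() action, i.e. the packet is
 * sent back through the datapath instead of directly to an output port. */
static bool
trace_uses_recirc(const char *datapath_actions)
{
    return datapath_actions != NULL
           && strstr(datapath_actions, "recirc(") != NULL;
}
```

Applied to the two outputs above, it returns false for "Datapath actions: 5" and true for "Datapath actions: hash(hash_l4(0)),recirc(0x1)".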
From: Arunkumar Navasiva <arunkum.navasiv at nutanix.com>
Date: Thursday, October 4, 2018 at 4:18 PM
To: "ovs-dev at openvswitch.org" <ovs-dev at openvswitch.org>
Subject: Issue with OVS lacp-fallback-ab option
Hello folks,
We're seeing an issue with the lacp-fallback-ab option on OVS 2.5/2.6/2.8.
It looks like when LACP is not enabled on the TOR switch ports, OVS on
the CentOS server does not fall back cleanly to active-backup and
continues to send some portion of the traffic through the backup
interface (we've seen this occur with switches from various TOR vendors).
Please see the attached screenshot, which shows that some traffic still
hashes to the backup interface. We looked at the TOR forwarding table, and
the MAC addresses of VMs running on this server flap between the two
TOR switch ports corresponding to the bond. I'm still in the early
stages of debugging this, but wanted to reach out to see if this
is already a known issue. If not, any help on how to debug this further
would be appreciated.
Thanks
-Arun