[ovs-discuss] OVS 2.5.1 in an LACP bond does not correctly handle unicast flooding

Kris G. Lindgren klindgren at godaddy.com
Sat Feb 25 17:35:56 UTC 2017


We recently upgraded from OVS 2.3.3 to OVS 2.5.1. After upgrading, we started seeing MACs for VMs and hypervisors learned on switch ports they were not connected to.  After a long investigation we found that OVS does not correctly handle unicast flooding: OVS floods traffic that is not destined to a local MAC back out one of the bond members.  On the switch we see:

2017 Feb 23 12:11:20 lfassi0114-02 %FWM-6-MAC_MOVE_NOTIFICATION: Host fa16.3ead.e6cf in vlan 413 is flapping between port Po19 and port Po22
2017 Feb 23 12:11:21 lfassi0114-02 %FWM-6-MAC_MOVE_NOTIFICATION: Host fa16.3ead.e6cf in vlan 413 is flapping between port Po22 and port Po19

On the host connected to Po22 (which is not where fa16.3ead.e6cf lives) we see:
12:11:20.374794 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
12:11:20.374941 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
12:11:20.376145 00:00:0c:9f:f0:01 > fa:16:3e:ad:e6:cf, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Reply 10.198.39.254 is-at 00:00:0c:9f:f0:01, length 46
12:11:21.374628 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
12:11:21.375057 00:00:0c:9f:f0:01 > fa:16:3e:ad:e6:cf, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Reply 10.198.39.254 is-at 00:00:0c:9f:f0:01, length 46
12:11:22.374578 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
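
For anyone trying to reproduce this, a quick way to see whether the flooded frames are also being learned locally is to dump the OVS MAC-learning table on the bridge that owns the bond (just a suggested check, not output from the capture above; the exact entries will vary):

# ovs-appctl fdb/show br-ext

If fa:16:3e:ad:e6:cf shows up there against the bond even though the VM lives on another host, the flooded ARPs arriving on the bond are being learned and re-flooded locally.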

Using a span port in the network, mirroring only traffic sent from the server, we were also able to see that traffic destined to 00:00:0c:9f:f0:01 was sent back out.  In this case 00:00:0c:9f:f0:01 is the virtual MAC of the HSRP gateway.  On a Cisco Nexus 3k (and I assume other Nexus products as well) configured with vPC/LACP/HSRP, any traffic destined to the virtual MAC of the HSRP gateway that arrives on the non-active HSRP side gets flooded to all ports on the non-active side.  This is done so that the ARP packet is seen by the active side; this is how this Cisco configuration has worked since day one.  We have also seen this happen in bursts where the switch sees 26k+ MAC moves in a minute, goes into defense mode, and stops MAC learning.  We haven't been able to catch a large storm event specifically, but given the way OVS is handling unicast flooding of ARP packets, we have no reason to believe it won't treat unicast flooding of other traffic the exact same way.
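
A trace like the following should show what the NORMAL action decides to do with one of these flooded ARP requests without having to wait for the switch to flood again.  This is only a sketch: the in_port, VLAN, and addresses are taken from this setup and would need adjusting, and if this OVS version does not accept an interface name for in_port, substitute the OpenFlow port number from "ovs-ofctl show br-ext":

# ovs-appctl ofproto/trace br-ext in_port=p3p1,dl_vlan=413,dl_src=fa:16:3e:ad:e6:cf,dl_dst=00:00:0c:9f:f0:01,dl_type=0x0806

If the trace output ends with the frame being output on p3p2 (the other bond member), that matches what we see on the span port.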

Under OVS 2.3.3 this unicast flooding was handled correctly: the traffic was dropped and packets were not flooded back out the bond members.  Changing the bonding mode from balance-slb to active-backup or balance-tcp makes no difference; the unicast traffic is still flooded back out the bond.
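
For anyone wanting to try the same mode changes, the standard ovs-vsctl knobs are the following (a sketch using the port name bond0 from our config below; the lacp setting is what produces the "negotiated" status shown further down):

# ovs-vsctl set port bond0 bond_mode=active-backup
# ovs-vsctl set port bond0 bond_mode=balance-tcp
# ovs-vsctl set port bond0 bond_mode=balance-slb
# ovs-vsctl set port bond0 lacp=active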

Our OVS config is as follows:
# ovs-vsctl show
ac83a7ff-0157-437c-bfba-8c038ec77c74
    Bridge br-ext
        Port br-ext
            Interface br-ext
                type: internal
        Port "bond0"
            Interface "p3p1"
            Interface "p3p2"
        Port "mgmt0"
            Interface "mgmt0"
                type: internal
        Port "ext-vlan-215"
            tag: 215
            Interface "ext-vlan-215"
                type: patch
                options: {peer="br215-ext"}
    Bridge br-int
        fail_mode: secure
        Port "int-br215"
            Interface "int-br215"
                type: patch
                options: {peer="phy-br215"}
        Port "qvo99ae272d-f8"
            tag: 1
            Interface "qvo99ae272d-f8"
        Port "qvo1d5492c0-df"
            tag: 1
            Interface "qvo1d5492c0-df"
        Port br-int
            Interface br-int
                type: internal
        Port "qvo6b7f3219-90"
            tag: 1
            Interface "qvo6b7f3219-90"
        Port "qvo3b4f81ed-f4"
            tag: 1
            Interface "qvo3b4f81ed-f4"
    Bridge "br215"
        Port "br215"
            Interface "br215"
                type: internal
        Port "phy-br215"
            Interface "phy-br215"
                type: patch
                options: {peer="int-br215"}
        Port "br215-ext"
            Interface "br215-ext"
                type: patch
                options: {peer="ext-vlan-215"}
    ovs_version: "2.5.1"

# ovs-appctl bond/show
---- bond0 ----
bond_mode: balance-slb
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
next rebalance: 2426 ms
lacp_status: negotiated
active slave mac: 00:8c:fa:eb:2b:74(p3p1)

slave p3p1: enabled
                active slave
                may_enable: true
                hash 140: 154 kB load

slave p3p2: enabled
                may_enable: true
                hash 199: 69 kB load
                hash 220: 40 kB load
                hash 234: 21 kB load

# ovs-appctl lacp/show
---- bond0 ----
                status: active negotiated
                sys_id: 00:8c:fa:eb:2b:74
                sys_priority: 65534
                aggregation key: 9
                lacp_time: slow

slave: p3p1: current attached
                port_id: 9
                port_priority: 65535
                may_enable: true

                actor sys_id: 00:8c:fa:eb:2b:74
                actor sys_priority: 65534
                actor port_id: 9
                actor port_priority: 65535
                actor key: 9
                actor state: activity aggregation synchronized collecting distributing

                partner sys_id: 02:1c:73:87:60:cd
                partner sys_priority: 32768
                partner port_id: 52
                partner port_priority: 32768
                partner key: 52
                partner state: activity aggregation synchronized collecting distributing

slave: p3p2: current attached
                port_id: 10
                port_priority: 65535
                may_enable: true

                actor sys_id: 00:8c:fa:eb:2b:74
                actor sys_priority: 65534
                actor port_id: 10
                actor port_priority: 65535
                actor key: 9
                actor state: activity aggregation synchronized collecting distributing

                partner sys_id: 02:1c:73:87:60:cd
                partner sys_priority: 32768
                partner port_id: 32820
                partner port_priority: 32768
                partner key: 52
                partner state: activity aggregation synchronized collecting distributing

# ovs-ofctl dump-flows br-ext
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=713896.614s, table=0, n_packets=1369078301, n_bytes=130805436786, idle_age=0, hard_age=65534, priority=0 actions=NORMAL

# ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
cookie=0xb367eed8ac0e9e7d, duration=713933.475s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=2,icmp_type=136 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713932.943s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=3,icmp_type=136 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713929.414s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=5,icmp_type=136 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713928.888s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=4,icmp_type=136 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713933.280s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,arp,in_port=2 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713932.660s, table=0, n_packets=149398, n_bytes=6274716, idle_age=4, hard_age=65534, priority=10,arp,in_port=3 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713929.218s, table=0, n_packets=102577, n_bytes=4308234, idle_age=7, hard_age=65534, priority=10,arp,in_port=5 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713928.620s, table=0, n_packets=61321, n_bytes=2575482, idle_age=8, hard_age=65534, priority=10,arp,in_port=4 actions=resubmit(,24)
cookie=0xb367eed8ac0e9e7d, duration=713935.656s, table=0, n_packets=1274428312, n_bytes=105873932966, idle_age=0, hard_age=65534, priority=3,in_port=1,vlan_tci=0x0000 actions=mod_vlan_vid:1,NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713945.070s, table=0, n_packets=7817, n_bytes=707680, idle_age=65534, hard_age=65534, priority=2,in_port=1 actions=drop
cookie=0xb367eed8ac0e9e7d, duration=713945.999s, table=0, n_packets=82510417, n_bytes=17955154731, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713945.936s, table=23, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
cookie=0xb367eed8ac0e9e7d, duration=713933.544s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=2,icmp_type=136,nd_target=fe80::f816:3eff:fe49:4dff actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713933.009s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=3,icmp_type=136,nd_target=fe80::f816:3eff:fec7:82b9 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713929.482s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=5,icmp_type=136,nd_target=fe80::f816:3eff:fe07:d92e actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713928.951s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=4,icmp_type=136,nd_target=fe80::f816:3eff:fe17:9919 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713933.410s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=2,arp_spa=10.26.87.153 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713933.344s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=2,arp_spa=10.26.52.87 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713932.877s, table=24, n_packets=149394, n_bytes=6274548, idle_age=4, hard_age=65534, priority=2,arp,in_port=3,arp_spa=10.26.53.163 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713932.807s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=3,arp_spa=10.26.85.208 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713932.728s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=3,arp_spa=10.26.85.209 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713929.349s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=5,arp_spa=10.26.85.218 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713929.284s, table=24, n_packets=102573, n_bytes=4308066, idle_age=7, hard_age=65534, priority=2,arp,in_port=5,arp_spa=10.26.53.86 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713928.817s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=4,arp_spa=10.26.87.99 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713928.752s, table=24, n_packets=61317, n_bytes=2575314, idle_age=8, hard_age=65534, priority=2,arp,in_port=4,arp_spa=10.26.53.197 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713928.686s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=4,arp_spa=198.71.248.104 actions=NORMAL
cookie=0xb367eed8ac0e9e7d, duration=713945.871s, table=24, n_packets=16, n_bytes=672, idle_age=65534, hard_age=65534, priority=0 actions=drop


___________________________________________________________________
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy