[ovs-dev] [Bug Report] physical switch is randomly set err-disabled while inter-connects with ovs-2.4.0

Zhang Haoyu ahzhanghaoyu at gmail.com
Mon Jul 25 02:00:57 UTC 2016


Hi all

A Cisco switch port is randomly set to err-disabled while connected to ovs-2.4.0;
this problem is very hard to reproduce.
From the Cisco switch's perspective, the port was disabled because it received a keepalive
packet that had originated from that same switch port.

The Cisco switch port generates one keepalive packet every 10s, and the keepalive
packet looks like this:
13:54:34.300201 50:1c:bf:71:08:0c > 50:1c:bf:71:08:0c, ethertype Loopback (0x9000), length 60

Normally OVS discards this keepalive packet, because its source MAC is identical
to its destination MAC.
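This can be double-checked with ofproto/trace; for example (first look up the ofport of
enp2s0, then substitute it for the placeholder):

ovs-vsctl get Interface enp2s0 ofport
ovs-appctl ofproto/trace br100 in_port=<ofport-of-enp2s0>,dl_src=50:1c:bf:71:08:0c,dl_dst=50:1c:bf:71:08:0c,dl_type=0x9000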

[Suspicion based on the reproduction results]
The problem seems to happen in the gap between the invalidation of the old datapath flow
and the failed installation of the corresponding new datapath flow,
possibly caused by some kind of concurrent processing.

[Reproduce environment]
$ ovs-vsctl show
b554244c-fc80-47ac-8355-955c9ac6667f
    Bridge br-int
        fail_mode: secure
        Port "tap4f633e04-f0"
            tag: 1
            Interface "tap4f633e04-f0"
        Port br-int
            Interface br-int
                type: internal
        Port "tapbf1e3f37-07"
            tag: 1
            Interface "tapbf1e3f37-07"
        Port "int-br100"
            Interface "int-br100"
                type: patch
                options: {peer="phy-br100"}
        Port "tape88a9db6-3d"
            tag: 1
            Interface "tape88a9db6-3d"
        Port "tap04376fe9-43"
            tag: 1
            Interface "tap04376fe9-43"
    Bridge "br100"
        Port "veth0"
            Interface "veth0"
        Port "phy-br100"
            Interface "phy-br100"
                type: patch
                options: {peer="int-br100"}
        Port "veth2"
            Interface "veth2"
        Port "enp2s0"
            Interface "enp2s0"
        Port "br100"
            Interface "br100"
                type: internal
    ovs_version: "2.4.0"

$ ovs-vsctl list mirror
_uuid               : d766e658-f2d3-4339-8f33-b4a780f6b357
external_ids        : {}
name                : "mirror2"
output_port         : 855fefdc-fa82-4fdd-9836-44b269bb094c
output_vlan         : []
select_all          : false
select_dst_port     : [e1ab14ff-c40e-4b47-b863-8d85c9b8b34c]
select_src_port     : []
select_vlan         : []
statistics          : {tx_bytes=2610235744, tx_packets=5419691}

_uuid               : f4dd02ad-ce02-46db-b1aa-b9bbcc24af58
external_ids        : {}
name                : "mirror0"
output_port         : 0459827e-d955-4f97-99cc-4f7b13249c90
output_vlan         : []
select_all          : false
select_dst_port     : []
select_src_port     : [e1ab14ff-c40e-4b47-b863-8d85c9b8b34c]
select_vlan         : []
statistics          : {tx_bytes=44966840328, tx_packets=249376758}

Additional remarks:
enp2s0 is the physical NIC that connects the Cisco switch to the OVS bridge br100,
veth0 in br100 is the output port of the ingress (select_src_port) mirror of enp2s0,
veth2 in br100 is the output port of the egress (select_dst_port) mirror of enp2s0.
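For reference, mirrors like these can be created roughly as follows (a sketch based on the
remarks above; adjust names as needed):

ovs-vsctl -- --id=@enp2s0 get Port enp2s0 \
          -- --id=@veth0 get Port veth0 \
          -- --id=@veth2 get Port veth2 \
          -- --id=@m0 create Mirror name=mirror0 select-src-port=@enp2s0 output-port=@veth0 \
          -- --id=@m2 create Mirror name=mirror2 select-dst-port=@enp2s0 output-port=@veth2 \
          -- set Bridge br100 mirrors=@m0,@m2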


[Reproduce actions]
1) tcpdump on enp2s0 to capture the keepalive packets:
tcpdump -i enp2s0 -e -nn "not ip and not arp" -w /home/enp2s0.pcap

2) Create a mirror in br100 with select_src_port set to enp2s0 and veth0 as the mirror output port,
then tcpdump on veth0:
tcpdump -i veth0 -e -nn "not ip and not arp" -w /home/veth0.pcap

Create a mirror in br100 with select_dst_port set to enp2s0 and veth2 as the mirror output port,
then tcpdump on veth2:
tcpdump -i veth2 -e -nn "not ip and not arp" -w /home/veth2.pcap

When the physical switch port is set to err-disabled, it means that it received a keepalive packet
returned from the host; at that point we can check enp2s0.pcap, veth0.pcap and veth2.pcap.
If the keepalive packet is found in veth2.pcap, we can conclude that the problem is caused by OVS.
If the keepalive packet is not found in veth0.pcap, but two successive keepalive packets with a very
short interval are found in enp2s0.pcap, we can conclude that the problem is caused by the physical NIC,
the NIC driver, or even the kernel.
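For example, each capture can be checked offline by filtering on the keepalive MAC:

tcpdump -r /home/enp2s0.pcap -e -nn 'ether dst 50:1c:bf:71:08:0c'
tcpdump -r /home/veth0.pcap -e -nn 'ether dst 50:1c:bf:71:08:0c'
tcpdump -r /home/veth2.pcap -e -nn 'ether dst 50:1c:bf:71:08:0c'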

3) ovs-dpctl dump-flows | grep <dst=mac-of-keepalive-packet> | grep -v "actions:drop"
We run the above command at a 1s interval and store the result,
so we can see where the keepalive packet comes from and where it goes (see the polling sketch after step 4).

4) ovs-appctl fdb/show br100 | grep <mac-of-keepalive-packet>
We run the above command at a 1s interval and store the result,
so we can see which port the keepalive packet's source MAC is learned on; if it is not enp2s0, something is wrong.
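The 1-second polling in steps 3) and 4) can be done with a loop along these lines (log file
paths are placeholders):

while true; do
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    # datapath flows matching the keepalive MAC, excluding plain drops
    { echo "$ts"; ovs-dpctl dump-flows | grep 'dst=50:1c:bf:71:08:0c' | grep -v 'actions:drop'; } >> /tmp/dpflows.log
    # the port on which br100 has learned the keepalive MAC
    { echo "$ts"; ovs-appctl fdb/show br100 | grep '50:1c:bf:71:08:0c'; } >> /tmp/fdb.log
    sleep 1
done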

[Reproduce result]
When the problem happened, we collected the following information.
1) kernel message in /var/log/messages
Jul 13 13:54:35 juanniu035013 kernel: e1000e: enp2s0 NIC Link is Down

2) tcpdump on enp2s0 captured two successive keepalive packets with a very short interval, less than 1ms,
shown below:
13:54:34.300201 50:1c:bf:71:08:0c > 50:1c:bf:71:08:0c, ethertype Loopback (0x9000), length 60:
13:54:34.300396 50:1c:bf:71:08:0c > 50:1c:bf:71:08:0c, ethertype Loopback (0x9000), length 60:

3) tcpdump on veth0
13:54:34.300407 50:1c:bf:71:08:0c > 50:1c:bf:71:08:0c, ethertype Loopback (0x9000), length 60:
So, at 13:54:34, the keepalive packet entered OVS.


4) tcpdump on veth2
Got nothing.

5) datapath flow from 13:54:23 to 13:54:45
2016-07-13 13:54:23
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:25, bytes:1500, used:9.650s, actions:8
2016-07-13 13:54:24
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:0.661s, actions:8
2016-07-13 13:54:25
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:1.676s, actions:8
2016-07-13 13:54:26
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:2.694s, actions:8
2016-07-13 13:54:27
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:3.709s, actions:8
2016-07-13 13:54:28
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:4.726s, actions:8
2016-07-13 13:54:30
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:5.744s, actions:8
2016-07-13 13:54:31
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:6.761s, actions:8
2016-07-13 13:54:32
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:7.777s, actions:8
2016-07-13 13:54:33
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:8.794s, actions:8
2016-07-13 13:54:34
recirc_id(0),in_port(2),eth(src=50:1c:bf:71:08:0c,dst=50:1c:bf:71:08:0c),eth_type(0x9000), packets:26, bytes:1560, used:9.812s, actions:8
2016-07-13 13:54:35
2016-07-13 13:54:36
2016-07-13 13:54:37
2016-07-13 13:54:38
2016-07-13 13:54:39
2016-07-13 13:54:40
2016-07-13 13:54:41
2016-07-13 13:54:42
2016-07-13 13:54:43
2016-07-13 13:54:44
2016-07-13 13:54:45

It is very strange: from the tcpdump on veth0 we know that the keepalive packet entered OVS,
but in the datapath flow dump the flow's 'used' time was not updated by that packet, and the flow
is invalidated immediately afterwards.
So I suspect the problem happens in the gap between the invalidation of the old datapath flow
and the failed installation of the corresponding new one, possibly caused by some kind of concurrent processing.
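If this can be reproduced again, a possible next step (not done in this reproduction) would be
to watch the revalidator/upcall side around the event, for example:

# statistics of the threads that install and revalidate datapath flows
ovs-appctl upcall/show
# more detailed logging of upcall handling and flow revalidation
ovs-appctl vlog/set ofproto_dpif_upcall:dbg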

6) fdb/show result:
2016-07-13 13:54:23
    1     0  50:1c:bf:71:08:0c   10
2016-07-13 13:54:24
    1     0  50:1c:bf:71:08:0c    1
2016-07-13 13:54:25
    1     0  50:1c:bf:71:08:0c    2
2016-07-13 13:54:26
    1     0  50:1c:bf:71:08:0c    3
2016-07-13 13:54:27
    1     0  50:1c:bf:71:08:0c    4
2016-07-13 13:54:29
    1     0  50:1c:bf:71:08:0c    5
2016-07-13 13:54:30
    1     0  50:1c:bf:71:08:0c    6
2016-07-13 13:54:31
    1     0  50:1c:bf:71:08:0c    7
2016-07-13 13:54:32
    1     0  50:1c:bf:71:08:0c    8
2016-07-13 13:54:33
    1     0  50:1c:bf:71:08:0c    9
2016-07-13 13:54:34
    1     0  50:1c:bf:71:08:0c   10
2016-07-13 13:54:35
    1     0  50:1c:bf:71:08:0c    1
2016-07-13 13:54:36
    1     0  50:1c:bf:71:08:0c    2
2016-07-13 13:54:37
    1     0  50:1c:bf:71:08:0c    3
2016-07-13 13:54:38
    1     0  50:1c:bf:71:08:0c    4
2016-07-13 13:54:39
    1     0  50:1c:bf:71:08:0c    5
2016-07-13 13:54:40
    1     0  50:1c:bf:71:08:0c    6
2016-07-13 13:54:41
    1     0  50:1c:bf:71:08:0c    7
2016-07-13 13:54:42
    1     0  50:1c:bf:71:08:0c    8
2016-07-13 13:54:43
    1     0  50:1c:bf:71:08:0c    9
2016-07-13 13:54:44
    1     0  50:1c:bf:71:08:0c   10
2016-07-13 13:54:45
    1     0  50:1c:bf:71:08:0c   11

No anomaly was found here.


Thanks,
Zhang Haoyu



