[ovs-dev] [bug][crash][ovs-dpdk] crash BUG of two hardware nic in OVS-DPDK when ONLY using `scp`.

Simon Jones batmanustc at gmail.com
Fri Nov 26 07:42:22 UTC 2021


Hi all,

I'm running OVS-DPDK version 2.13 in an OpenStack environment, the same setup
as in my earlier emails ("Why could OVS-DPDK bridge use veth-pair nic?
Why OVS-DPDK internal port could add IP?").

I have found a bug: 1) if I deploy OVS-DPDK as in scenario-1 (details
below), everything works; 2) if I deploy OVS-DPDK as in scenario-2, ovs-vswitchd
crashes, but ONLY when I `scp xxx` to the IP address assigned to br_mgmt of OVS-DPDK.

Bug details for scenario-2:
1. Deploy as in scenario-2.
2. From another server, run `scp big-file root@10.33.36.2:/root/`, i.e.
copy a big file to the server where OVS-DPDK is deployed as in scenario-2.
3. The ovsdpdk-vswitchd docker container is restarted, because
ovs-vswitchd has crashed.
Crash log: nothing is written to /var/log/kolla/openvswitch/ovs-vswitchd.log at the time of the crash.

For comparison:
1. From another server, `ping 10.33.36.2` works, and `ssh -l root
10.33.36.2` works.
2. The bug does NOT occur in scenario-1.

scenario-1:
[image: image.png]
scenario-1 detail:
```
[root@host01 ~]# docker exec ovsdpdk_db bash -c 'ovs-vsctl show'
427f600b-7a06-46e9-b273-5e63e08b1c72
    Manager "ptcp:6640:127.0.0.1"
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port vhu620cc176-e5
            tag: 2
            Interface vhu620cc176-e5
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/run/openvswitch/vhu620cc176-e5"}
        Port tap36432023-6a
            tag: 1
            Interface tap36432023-6a
                type: internal
        Port tapd60becd2-ca
            tag: 2
            Interface tapd60becd2-ca
                type: internal
        Port int-br_mgmt
            Interface int-br_mgmt
                type: patch
                options: {peer=phy-br_mgmt}
        Port int-br_ctrl
            Interface int-br_ctrl
                type: patch
                options: {peer=phy-br_ctrl}
        Port vhu97b6c2c5-20
            tag: 1
            Interface vhu97b6c2c5-20
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/run/openvswitch/vhu97b6c2c5-20"}
        Port br-int
            Interface br-int
                type: internal
        Port vhuab74cd48-12
            tag: 1
            Interface vhuab74cd48-12
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/run/openvswitch/vhuab74cd48-12"}
        Port int-br_data
            Interface int-br_data
                type: patch
                options: {peer=phy-br_data}
        Port vhub2ce3e95-86
            tag: 2
            Interface vhub2ce3e95-86
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/run/openvswitch/vhub2ce3e95-86"}
    Bridge br_data
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port phy-br_data
            Interface phy-br_data
                type: patch
                options: {peer=int-br_data}
        Port enp5s0f0
            Interface enp5s0f0
                type: dpdk
                options: {dpdk-devargs="0000:05:00.0"}
        Port br_data
            Interface br_data
                type: internal
    Bridge br_ctrl
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port enp5s0f1.1000
            Interface enp5s0f1.1000
        Port phy-br_ctrl
            Interface phy-br_ctrl
                type: patch
                options: {peer=int-br_ctrl}
        Port br_ctrl
            Interface br_ctrl
                type: internal
    Bridge br_mgmt
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port enp5s0f1.1001
            Interface enp5s0f1.1001
        Port phy-br_mgmt
            Interface phy-br_mgmt
                type: patch
                options: {peer=int-br_mgmt}
        Port br_mgmt
            Interface br_mgmt
                type: internal

[root@host01 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b8:2a:72:d0:11:d6 brd ff:ff:ff:ff:ff:ff
    inet 10.33.36.1/16 brd 10.33.255.255 scope global em1
       valid_lft forever preferred_lft forever
    inet6 fd52:65cb:592e:0:ba2a:72ff:fed0:11d6/64 scope global mngtmpaddr dynamic
       valid_lft forever preferred_lft forever
    inet6 fe80::ba2a:72ff:fed0:11d6/64 scope link
       valid_lft forever preferred_lft forever
3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether b8:2a:72:d0:11:d7 brd ff:ff:ff:ff:ff:ff
4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether b8:2a:72:d0:11:d8 brd ff:ff:ff:ff:ff:ff
5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether b8:2a:72:d0:11:d9 brd ff:ff:ff:ff:ff:ff
6: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:80 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::63f:72ff:fea4:9980/64 scope link
       valid_lft forever preferred_lft forever
7: enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:81 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::63f:72ff:fea4:9981/64 scope link
       valid_lft forever preferred_lft forever
8: enp5s0f1.1000@enp5s0f1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:81 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::63f:72ff:fea4:9981/64 scope link
       valid_lft forever preferred_lft forever
9: enp5s0f1.1001@enp5s0f1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:81 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::63f:72ff:fea4:9981/64 scope link
       valid_lft forever preferred_lft forever
10: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:45:8a:8d:6e brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:45ff:fe8a:8d6e/64 scope link
       valid_lft forever preferred_lft forever
11: ovs-netdev: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 2a:b8:3a:2d:82:6a brd ff:ff:ff:ff:ff:ff
12: br_data: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:80 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::63f:72ff:fea4:9980/64 scope link
       valid_lft forever preferred_lft forever
13: br_mgmt: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:81 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.10/24 brd 192.168.2.255 scope global br_mgmt
       valid_lft forever preferred_lft forever
    inet6 fe80::63f:72ff:fea4:9981/64 scope link
       valid_lft forever preferred_lft forever
15: br-int: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ba:95:6f:60:f9:4d brd ff:ff:ff:ff:ff:ff
17: br_ctrl: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:81 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.10/24 brd 192.168.1.255 scope global br_ctrl
       valid_lft forever preferred_lft forever
    inet6 fe80::63f:72ff:fea4:9981/64 scope link
       valid_lft forever preferred_lft forever
67: tapd60becd2-ca: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether d6:0f:db:41:8b:1a brd ff:ff:ff:ff:ff:ff
68: tap36432023-6a: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 72:a0:a3:14:c8:03 brd ff:ff:ff:ff:ff:ff
84: vetheb40bd6@if83: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
    link/ether 1e:58:0a:98:ed:9e brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::1c58:aff:fe98:ed9e/64 scope link
       valid_lft forever preferred_lft forever
```

scenario-2:
[image: image.png]
scenario-2 detail:
```
[root@edge02 ~]# docker exec ovsdpdk_db bash -c 'ovs-vsctl show'
6dcf1184-3a44-4be3-9d9d-f94d2f215062
    Manager "ptcp:6640:127.0.0.1"
    Bridge br_mgmt
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port phy-br_mgmt
            Interface phy-br_mgmt
                type: patch
                options: {peer=int-br_mgmt}
        Port br_mgmt
            Interface br_mgmt
                type: internal
        Port em1
            Interface em1
    Bridge br-nsp
        datapath_type: system
        Port br-nsp
            Interface br-nsp
                type: internal
    Bridge br_data
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port phy-br_data
            Interface phy-br_data
                type: patch
                options: {peer=int-br_data}
        Port br_data
            Interface br_data
                type: internal
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port tapf7febc89-44
            tag: 4
            Interface tapf7febc89-44
                type: internal
        Port int-br_data
            Interface int-br_data
                type: patch
                options: {peer=phy-br_data}
        Port br-int
            Interface br-int
                type: internal
        Port tape92bc241-0a
            tag: 2
            Interface tape92bc241-0a
                type: internal
        Port int-br_mgmt
            Interface int-br_mgmt
                type: patch
                options: {peer=phy-br_mgmt}

[root@edge02 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 90:b1:1c:50:52:10 brd ff:ff:ff:ff:ff:ff
    inet6 fd52:65cb:592e:0:92b1:1cff:fe50:5210/64 scope global mngtmpaddr dynamic
       valid_lft forever preferred_lft forever
3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 90:b1:1c:50:52:11 brd ff:ff:ff:ff:ff:ff
4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 90:b1:1c:50:52:12 brd ff:ff:ff:ff:ff:ff
5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 90:b1:1c:50:52:13 brd ff:ff:ff:ff:ff:ff
7: enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:ad brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.11/24 scope global enp5s0f1
       valid_lft forever preferred_lft forever
    inet6 fe80::63f:72ff:fea4:99ad/64 scope link
       valid_lft forever preferred_lft forever
8: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:4a:ab:61:5c brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:4aff:feab:615c/64 scope link
       valid_lft forever preferred_lft forever
13: ovs-netdev: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether e6:39:f2:c4:f9:45 brd ff:ff:ff:ff:ff:ff
14: br_mgmt: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 90:b1:1c:50:52:10 brd ff:ff:ff:ff:ff:ff
    inet 10.33.36.2/16 brd 10.33.255.255 scope global br_mgmt
       valid_lft forever preferred_lft forever
    inet 10.33.36.2/32 scope global br_mgmt
       valid_lft forever preferred_lft forever
    inet6 fd52:65cb:592e:0:92b1:1cff:fe50:5210/64 scope global mngtmpaddr dynamic
       valid_lft forever preferred_lft forever
    inet6 fe80::92b1:1cff:fe50:5210/64 scope link
       valid_lft forever preferred_lft forever
15: br_data: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether ee:fa:36:6e:bd:4d brd ff:ff:ff:ff:ff:ff
    inet6 fe80::ecfa:36ff:fe6e:bd4d/64 scope link
       valid_lft forever preferred_lft forever
16: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6a:0a:ed:7d:fe:1a brd ff:ff:ff:ff:ff:ff
17: br-nsp: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:8f:0f:41:55:40 brd ff:ff:ff:ff:ff:ff
18: br-int: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9e:cc:c5:dd:ad:47 brd ff:ff:ff:ff:ff:ff
22: tape92bc241-0a: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 06:98:a1:d7:f0:ab brd ff:ff:ff:ff:ff:ff
24: tapf7febc89-44: <BROADCAST,MULTICAST,PROMISC> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 12:3f:d0:dd:69:13 brd ff:ff:ff:ff:ff:ff
25: enp5s0f1.1001@enp5s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 04:3f:72:a4:99:ad brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.11/24 scope global enp5s0f1.1001
       valid_lft forever preferred_lft forever
    inet6 fe80::63f:72ff:fea4:99ad/64 scope link
       valid_lft forever preferred_lft forever
```
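
To make the two `ovs-vsctl show` dumps easier to compare, the interfaces of each bridge can be listed together with their types. The following is only a rough sketch: `interfaces_by_bridge` is a hypothetical helper name, and it assumes that an interface printed without a `type:` line has the default type, which OVS treats as "system". The sample text is the br_mgmt bridge from scenario-2.

```python
from collections import defaultdict


def interfaces_by_bridge(show_output):
    """Map bridge name -> {interface name: type} from `ovs-vsctl show` text.

    Assumption: an interface without a `type:` line has the default
    (empty) type, recorded here as "system".
    """
    bridges = defaultdict(dict)
    bridge = iface = None
    for raw in show_output.splitlines():
        line = raw.strip()
        if line.startswith("Bridge "):
            # A new bridge section starts; forget the previous interface.
            bridge, iface = line.split(None, 1)[1].strip('"'), None
        elif line.startswith("Interface ") and bridge is not None:
            iface = line.split(None, 1)[1].strip('"')
            bridges[bridge][iface] = "system"
        elif line.startswith("type: ") and iface is not None:
            # "datapath_type: ..." does not match this prefix, only "type: ...".
            bridges[bridge][iface] = line.split(": ", 1)[1]
    return dict(bridges)


# br_mgmt as shown in the scenario-2 dump above.
sample = """\
    Bridge br_mgmt
        datapath_type: netdev
        Port phy-br_mgmt
            Interface phy-br_mgmt
                type: patch
                options: {peer=int-br_mgmt}
        Port br_mgmt
            Interface br_mgmt
                type: internal
        Port em1
            Interface em1
"""

for name, ifaces in interfaces_by_bridge(sample).items():
    print(name, ifaces)
```

Run against both dumps, this shows which physical interface is attached to br_mgmt in each scenario (`enp5s0f1.1001` in scenario-1, `em1` in scenario-2).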

Differences between the two scenarios:
1. In scenario-1, br_mgmt uses a VLAN sub-interface of enp5s0f1 (enp5s0f1 is a
hardware network device). In scenario-2 (BUG), br_mgmt uses `em1`, which is also a
hardware network device.
2. In scenario-2 (BUG), /var/log/kolla/openvswitch/ovs-vswitchd.log contains many
entries like these:
```
2021-11-26T07:21:00.866Z|00109|memory|INFO|handlers:2 ofconns:3 ports:11 revalidators:2 rules:19 udpif keys:6
2021-11-26T07:25:48.500Z|00001|dpif_netdev(revalidator15)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(1),packet_type(ns=0,id=0),eth(src=e2:6d:a3:03:20:00,dst=01:00:5e:00:00:fb),eth_type(0x0800),ipv4(src=10.33.0.222,dst=224.0.0.251,proto=2,tos=0,ttl=1,frag=no)
2021-11-26T07:25:48.500Z|00002|dpif_netdev(revalidator15)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(1),packet_type(ns=0,id=0),eth(src=e2:6d:a3:03:20:00,dst=01:00:5e:00:00:02),eth_type(0x0800),ipv4(src=10.33.0.222,dst=224.0.0.2,proto=2,tos=0,ttl=1,frag=no)
2021-11-26T07:25:48.500Z|00003|dpif(revalidator15)|WARN|Dropped 2 log messages in last 297 seconds (most recently, 297 seconds ago) due to excessive rate
2021-11-26T07:25:48.500Z|00004|dpif(revalidator15)|WARN|netdev@ovs-netdev: failed to put[modify] (Invalid argument) ufid:7aa84b8c-9fd6-4f68-b9e7-a58bd03e7ad5 skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(1),packet_type(ns=0,id=0),eth(src=e2:6d:a3:03:20:00,dst=01:00:5e:00:00:fb),eth_type(0x0800),ipv4(src=10.33.0.222/0.0.0.0,dst=224.0.0.251/0.0.0.0,proto=2/0,tos=0/0,ttl=1/0,frag=no), actions:userspace(pid=0,slow_path(match))
2021-11-26T07:25:48.500Z|00005|dpif(revalidator15)|WARN|netdev@ovs-netdev: failed to put[modify] (Invalid argument) ufid:662056c6-7e7d-446d-92bf-7c8f924439f2 skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(1),packet_type(ns=0,id=0),eth(src=e2:6d:a3:03:20:00,dst=01:00:5e:00:00:02),eth_type(0x0800),ipv4(src=10.33.0.222/0.0.0.0,dst=224.0.0.2/0.0.0.0,proto=2/0,tos=0/0,ttl=1/0,frag=no), actions:userspace(pid=0,slow_path(match))
2021-11-26T07:25:49.403Z|00006|dpif_netdev(revalidator15)|ERR|internal error parsing flow key skb_priority(0),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),dp_hash(0),in_port(1),packet_type(ns=0,id=0),eth(src=3e:cd:a3:ec:55:f3,dst=01:00:5e:00:00:16),eth_type(0x0800),ipv4(src=10.33.0.76,dst=224.0.0.22,proto=2,tos=0,ttl=1,frag=no)
```
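
As a rough aid for reading this log, the flow keys that fail to parse can be grouped by source, destination and IP protocol. This is only a sketch: the function name `summarize_flow_key_errors` and the regular expression are my own assumptions, not part of OVS, and the sample lines are shortened copies of the entries above. It tolerates log entries that were wrapped across several lines.

```python
import re
from collections import Counter

# Assumed pattern for the ipv4(...) part of a flow key as printed in the log;
# an optional "/mask" after each address is skipped.
IPV4_RE = re.compile(
    r"ipv4\(src=\s*([\d.]+)(?:/[\d.]+)?,dst=\s*([\d.]+)(?:/[\d.]+)?,proto=(\d+)"
)


def summarize_flow_key_errors(log_text):
    """Count 'internal error parsing flow key' entries per (src, dst, proto)."""
    counts = Counter()
    # A new log entry starts at a newline followed by an ISO-style timestamp.
    for record in re.split(r"\n(?=\d{4}-\d\d-\d\dT)", log_text):
        flat = " ".join(record.split())  # undo line wrapping inside one entry
        if "internal error parsing flow key" not in flat:
            continue
        match = IPV4_RE.search(flat)
        if match:
            src, dst, proto = match.groups()
            counts[(src, dst, int(proto))] += 1
    return counts


# Shortened copies of entries from the log above (flow keys trimmed).
log = (
    "2021-11-26T07:25:48.500Z|00001|dpif_netdev(revalidator15)|ERR|internal\n"
    "error parsing flow key\n"
    "in_port(1),eth_type(0x0800),ipv4(src=10.33.0.222,dst=224.0.0.251,proto=2,tos=0,ttl=1,frag=no)\n"
    "2021-11-26T07:25:48.500Z|00002|dpif_netdev(revalidator15)|ERR|internal error parsing flow key "
    "in_port(1),eth_type(0x0800),ipv4(src=10.33.0.222,dst=224.0.0.2,proto=2,tos=0,ttl=1,frag=no)\n"
    "2021-11-26T07:25:48.500Z|00004|dpif(revalidator15)|WARN|netdev@ovs-netdev: "
    "failed to put[modify] (Invalid argument)\n"
)

for (src, dst, proto), n in summarize_flow_key_errors(log).items():
    print(src, dst, proto, n)
```

In the entries shown above, every failing flow key has `proto=2` (IP protocol number 2 is IGMP) and a multicast destination in 224.0.0.0/24.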

So what is the root cause of this bug?
Has anyone else run into the same problem?

Thank you~


----
Simon Jones

