[ovs-discuss] crash when restart openvswitch with huge vxlan traffic running

Gregory Rose gvrose8192 at gmail.com
Fri Dec 28 18:03:01 UTC 2018


On 12/27/2018 1:38 PM, Gregory Rose wrote:
>
> On 12/27/2018 11:40 AM, Ben Pfaff wrote:
>> Greg, this is a kernel issue.  If you have the time, will you take a
>> look at it sometime?
>
> Yep, will do.
>
> - Greg

I looked into this and there is not much we can do in our kernel
datapath.  The fix is in the kernel's GRO code, so if it is not in the
kernel you're using, you'll have to talk to your vendor about getting it
into their next distribution kernel package upgrade, or else roll (i.e.
custom build) your own kernel.

I checked all the recent 3.10.x kernel sources, up to and including
3.10.0-957.el7, using the lexer at access.redhat.com, and none of them
carries Lorenzo's fix (commit 8e1da73acded in Dave Miller's current net
tree).  3.10.0-957.el7 is RHEL 7.6, the latest release, so it looks like
RHEL 7 is stuck with the bug.
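
The same check can be approximated on an installed system instead of in
the source browser. A minimal sketch, assuming an RPM-based distribution
whose kernel changelog mentions the affected code by name (gro_cell);
changelog_mentions is a hypothetical helper name, and a missing changelog
entry does not prove the fix is absent, so a negative result is only a hint.

```shell
#!/bin/sh
# Hypothetical helper: read changelog text on stdin and succeed (exit 0)
# if any line mentions the given pattern, case-insensitively.
changelog_mentions() {
    grep -qi -e "$1"
}

# Assumed usage on an RPM-based host (not executed here):
#   if rpm -q --changelog "kernel-$(uname -r)" | changelog_mentions 'gro_cell'; then
#       echo "changelog mentions a gro_cell change; inspect it"
#   else
#       echo "no gro_cell entry found in the changelog"
#   fi
```

In an upstream git clone, `git merge-base --is-ancestor 8e1da73acded HEAD`
answers the same question directly for that tree.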

Thanks,

- Greg

>
>>
>> On Thu, Dec 20, 2018 at 12:42:43PM +0000, 王志克 wrote:
>>> Hi All,
>>>
>>> I ran the test below and hit a system crash. Does anyone know whether 
>>> there is already a fix for it?
>>>
>>> Setup:
>>> CentOS7.4 3.10.0-693.el7.x86_64,
>>> OVS: 2.10.1
>>>
>>> Step:
>>> 1.  Build OVS for userspace only, and reuse the kernel built-in 
>>> openvswitch module.
>>> 2.  On Host1, create 1 vxlan interface and add 1 VF_rep to OVS.
>>> 3.  Attach the VF to one VM; the VM does a 5-tuple swap using a 
>>> DPDK app.
>>> 4.  Use a traffic generator to send heavy traffic (7 Mpps with 
>>> several thousand connections) to the Host1 PF.
>>> 5.  The OVS rules are configured as below.
>>>
>>> VM1_PORTNAME=$1
>>> VXLAN_PORTNAME=$2
>>> VM1_PORT=$(ovs-vsctl list interface | grep $VM1_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
>>> VXLAN_PORT=$(ovs-vsctl list interface | grep $VXLAN_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
>>> ZONE=8
>>> ovs-ofctl del-flows ovs-sriov
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=1000 table=0,arp, actions=NORMAL"
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VM1_PORT,action=set_field:$VM1_PORT->reg6,goto_table:5"
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VXLAN_PORT, tun_id=0x242, action=set_field:$VXLAN_PORT->reg6,set_field:$VM1_PORT->reg7,goto_table:5"
>>>
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=5, priority=100, ip,actions=ct(table=10,zone=$ZONE)"
>>>
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new+est-rel-inv+trk actions= goto_table:15"
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel+inv+trk actions=drop"
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel-inv-trk actions=drop"
>>>
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=+new-rel-inv+trk actions= ct(commit,table=15,zone=$ZONE)"
>>>
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VM1_PORT, action=set_field:0x242->tun_id,set_field:$VXLAN_PORT->reg7,goto_table:20"
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VXLAN_PORT, actions=goto_table:20"
>>>
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=20, priority=100, ip,action=output:NXM_NX_REG7[0..15]"
>>> ovs-ofctl add-flow ovs-sriov -O openflow13 "table=200, priority=100,action=drop"
>>> 6. Run "systemctl restart openvswitch" several times; the system then crashes.
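
Step 6 can be scripted so the number of restarts is reproducible. A
minimal sketch: repeat_cmd is a hypothetical helper, and the count of 10
and the 5 second pause in the usage comment are assumptions, since the
report only says the restart was run several times.

```shell
#!/bin/sh
# Hypothetical helper: run a command COUNT times, pausing PAUSE seconds
# between runs, and stop at the first failure.
# usage: repeat_cmd COUNT PAUSE command [args...]
repeat_cmd() {
    count=$1
    pause=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" || return 1
        # No pause is needed after the final run.
        if [ "$i" -lt "$count" ]; then
            sleep "$pause"
        fi
        i=$((i + 1))
    done
}

# Assumed usage while the traffic generator is running (not executed here):
#   repeat_cmd 10 5 systemctl restart openvswitch
```
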
>>>
>>> Crash stack (2 kinds):
>>> One:
>>> [  575.459905] device vxlan_sys_4789 left promiscuous mode
>>> [  575.460103] BUG: unable to handle kernel NULL pointer dereference 
>>> at 0000000000000008
>>> [  575.460133] IP: [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
>>> [  575.460210] PGD 0
>>> [  575.460226] Oops: 0002 [#1] SMP
>>> [  575.460254] Modules linked in: vhost_net vhost macvtap macvlan 
>>> vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 
>>> nf_nat_ipv6 nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio 
>>> xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 
>>> iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 
>>> xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp 
>>> llc ebtable_filter ebtables ip6table_filter ip6_tables 
>>> iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) 
>>> ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) 
>>> ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) 
>>> ib_core(OE) mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt 
>>> iTCO_vendor_support dcdbas sb_edac edac_core intel_powerclamp 
>>> coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul
>>> [  575.460619]  ghash_clmulni_intel aesni_intel lrw gf128mul 
>>> glue_helper ablk_helper cryptd ipmi_ssif joydev pcspkr sg mei_me mei 
>>> lpc_ich ipmi_si shpchp ipmi_devintf ipmi_msghandler wmi 
>>> acpi_power_meter knem(OE) nfsd auth_rpcgss nfs_acl lockd grace 
>>> sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic 
>>> mgag200 drm_kms_helper syscopyarea ixgbe sysfillrect igb sysimgblt 
>>> fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm 
>>> ahci libahci megaraid_sas libata i2c_algo_bit i2c_core mdio ptp dca 
>>> pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
>>> devlink]
>>> [  575.460885] CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G           
>>> OE  ------------   3.10.0-693.el7.x86_64 #1
>>> [  575.460912] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 
>>> 1.3.6 06/03/2015
>>> [  575.460933] task: ffff880152ef1fa0 ti: ffff880152efc000 task.ti: 
>>> ffff880152efc000
>>> [  575.460954] RIP: 0010:[<ffffffffc09b330b>] [<ffffffffc09b330b>] 
>>> gro_cell_poll+0x4b/0x80 [vxlan]
>>> [  575.460990] RSP: 0018:ffff880152effd68  EFLAGS: 00010202
>>> [  575.461004] RAX: 0000000000000000 RBX: ffffe8dfff448818 RCX: 
>>> 0000000000000000
>>> [  575.461024] RDX: 0000000000000001 RSI: ffff881fa42ebf00 RDI: 
>>> ffffe8dfff448818
>>> [  575.461042] RBP: ffff880152effd88 R08: 0000000000019c40 R09: 
>>> ffffffff815710d7
>>> [  575.461061] R10: ffff881ffec59c40 R11: ffffea007e90ba00 R12: 
>>> 0000000000000002
>>> [  575.461079] R13: 0000000000000040 R14: ffffe8dfff448800 R15: 
>>> 0000000000000001
>>> [  575.461098] FS:  0000000000000000(0000) GS:ffff881ffec40000(0000) 
>>> knlGS:0000000000000000
>>> [  575.461119] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  575.461134] CR2: 0000000000000008 CR3: 00000000019f2000 CR4: 
>>> 00000000001427e0
>>> [  575.461153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>>> 0000000000000000
>>> [  575.461172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
>>> 0000000000000400
>>> [  575.461190] Stack:
>>> [  575.461198]  ffffe8dfff448818 0000000000000000 0000000000000040 
>>> 0000000000000000
>>> [  575.461221]  ffff880152effe08 ffffffff8158799d ffff881ffec57950 
>>> ffff881ffec57940
>>> [  575.461254]  00000001000432b7 0000012c52f09428 ffff881ffd57eb40 
>>> ffff881ffd57eb40
>>> [  575.461277] Call Trace:
>>> [  575.461290]  [<ffffffff8158799d>] net_rx_action+0x16d/0x380
>>> [  575.461308]  [<ffffffff81090b3f>] __do_softirq+0xef/0x280
>>> [  575.461324]  [<ffffffff81090d08>] run_ksoftirqd+0x38/0x50
>>> [  575.462074]  [<ffffffff810b909f>] smpboot_thread_fn+0x12f/0x180
>>> [  575.462780]  [<ffffffff810b8f70>] ? lg_double_unlock+0x40/0x40
>>> [  575.463464]  [<ffffffff810b098f>] kthread+0xcf/0xe0
>>> [  575.464169]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
>>> [  575.464862]  [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
>>> [  575.465497]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
>>> [  575.466192] Code: 49 39 f6 74 40 48 85 f6 74 3b 83 6b f8 01 48 89 
>>> df 41 83 c4 01 48 8b 0e 48 8b 46 08 48 c7 06 00 00 00 00 48 c7 46 08 
>>> 00 00 00 00 <48> 89 41 08 48 89 08 e8 29 4f bd c0 45 39 ec 74 14 48 
>>> 8b 73 e8
>>> [  575.467663] RIP  [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 
>>> [vxlan]
>>> [  575.468412]  RSP <ffff880152effd68>
>>> [  575.469197] CR2: 0000000000000008
>>>
>>> Two:
>>> [  390.626080] device vxlan_sys_4789 left promiscuous mode
>>> [  390.626345] BUG: unable to handle kernel NULL pointer dereference 
>>> at 0000000000000008
>>> [  390.626411] IP: [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
>>> [  390.626462] PGD 0
>>> [  390.626499] Oops: 0002 [#1] SMP
>>> [  390.626529] Modules linked in: vhost_net vhost macvtap macvlan 
>>> vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 
>>> nf_nat_ipv6 nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio 
>>> xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 
>>> iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 
>>> xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp 
>>> llc ebtable_filter ebtables ip6table_filter ip6_tables 
>>> iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) 
>>> ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) 
>>> ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) 
>>> ib_core(OE) mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt 
>>> iTCO_vendor_support dcdbas sb_edac edac_core intel_powerclamp 
>>> coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul
>>> [  390.627152]  ghash_clmulni_intel ipmi_ssif aesni_intel lrw 
>>> gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr joydev 
>>> ipmi_devintf ipmi_msghandler mei_me mei sg lpc_ich shpchp 
>>> acpi_power_meter wmi nfsd auth_rpcgss nfs_acl lockd knem(OE) grace 
>>> sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic 
>>> mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops 
>>> ttm drm crct10dif_pclmul crct10dif_common ixgbe crc32c_intel ahci 
>>> igb libahci libata megaraid_sas mdio i2c_algo_bit ptp i2c_core 
>>> pps_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
>>> devlink]
>>> [  390.627626] CPU: 11 PID: 6303 Comm: ovs-vswitchd Tainted: 
>>> G           OE  ------------   3.10.0-693.el7.x86_64 #1
>>> [  390.627690] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 
>>> 1.3.6 06/03/2015
>>> [  390.627738] task: ffff881fe0e89fa0 ti: ffff881fa3590000 task.ti: 
>>> ffff881fa3590000
>>> [  390.627786] RIP: 0010:[<ffffffffc09c8b4a>] [<ffffffffc09c8b4a>] 
>>> vxlan_dellink+0x9a/0xf0 [vxlan]
>>> [  390.627848] RSP: 0018:ffff881fa3593888  EFLAGS: 00010206
>>> [  390.627883] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 
>>> 0000000000000000
>>> [  390.627929] RDX: 0000000000000000 RSI: ffffea007fd7f600 RDI: 
>>> ffff881ff5fd8c00
>>> [  390.627975] RBP: ffff881fa35938b0 R08: ffff881ff5fd8b00 R09: 
>>> 000000018040000d
>>> [  390.628020] R10: 00000000f5fd8a01 R11: ffffea007fd7f600 R12: 
>>> ffff88015270e000
>>> [  390.628066] R13: ffffffff81b1caa0 R14: ffff881fa35938c0 R15: 
>>> ffffe8dfff60a1d8
>>> [  390.628112] FS:  00007f4ea1168ac0(0000) GS:ffff883ffe540000(0000) 
>>> knlGS:0000000000000000
>>> [  390.628163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  390.628201] CR2: 0000000000000008 CR3: 0000001ff9055000 CR4: 
>>> 00000000001427e0
>>> [  390.628246] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>>> 0000000000000000
>>> [  390.628292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
>>> 0000000000000400
>>> [  390.628337] Stack:
>>> [  390.628354]  ffff881fa35938c0 ffffffff81ad9d40 0000000000000001 
>>> 0000000000000000
>>> [  390.628411]  ffffffff81ad9d40 ffff881fa35938e0 ffffffff81599023 
>>> ffff881fa35938c0
>>> [  390.628468]  ffff881fa35938c0 000000001d8239fc ffff883ffd864a00 
>>> ffff881fa3593a70
>>> [  390.628535] Call Trace:
>>> [  390.628561]  [<ffffffff81599023>] rtnl_delete_link+0x43/0x80
>>> [  390.628610]  [<ffffffff8159b761>] rtnl_dellink+0x91/0xf0
>>> [  390.628649]  [<ffffffff81599bd4>] rtnetlink_rcv_msg+0xa4/0x270
>>> [  390.630373]  [<ffffffff815bacd0>] ? __netlink_lookup+0xc0/0x110
>>> [  390.632066]  [<ffffffff81599b30>] ? rtnetlink_rcv+0x30/0x30
>>> [  390.633751]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
>>> [  390.635426]  [<ffffffff81599b28>] rtnetlink_rcv+0x28/0x30
>>> [  390.637081]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
>>> [  390.638721]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
>>> [  390.640371]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
>>> [  390.642037]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
>>> [  390.643722]  [<ffffffff8156a88f>] ? sock_recvmsg+0xbf/0x100
>>> [  390.645411]  [<ffffffff8132c312>] ? put_dec+0x72/0x90
>>> [  390.647075]  [<ffffffff8132d303>] ? number.isra.2+0x323/0x360
>>> [  390.648724]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
>>> [  390.650362]  [<ffffffff811de9d2>] ? kmem_cache_free+0x1e2/0x200
>>> [  390.652010]  [<ffffffff81217af5>] ? __d_free+0x35/0x40
>>> [  390.653623]  [<ffffffff812181b0>] ? d_free+0x60/0x70
>>> [  390.655181]  [<ffffffff812186b4>] ? dentry_kill+0x154/0x1b0
>>> [  390.656702]  [<ffffffff81222744>] ? mntput+0x24/0x40
>>> [  390.658173]  [<ffffffff81203053>] ? __fput+0x183/0x260
>>> [  390.659606]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
>>> [  390.660988]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
>>> [  390.662325]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
>>> [  390.663624] Code: a0 bb c0 49 8b 3f 49 39 ff 74 be 48 85 ff 74 b9 
>>> 41 83 6f 10 01 48 8b 0f 48 8b 57 08 48 c7 07 00 00 00 00 48 c7 47 08 
>>> 00 00 00 00 <48> 89 51 08 48 89 0a e8 8a a2 ba c0 49 8b 3f 49 39 ff 
>>> 75 cc eb
>>> [  390.666406] RIP  [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 
>>> [vxlan]
>>> [  390.667674]  RSP <ffff881fa3593888>
>>> [  390.668892] CR2: 0000000000000008
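
Both traces report a NULL pointer dereference at address 0x8 but in
different functions. To tell the two kinds apart quickly in a saved log,
the faulting symbol can be pulled from the "IP:" line. A minimal sketch,
assuming the 3.10-style oops format shown above; oops_fault_symbol is a
hypothetical helper name and crash.log is an illustrative file name.

```shell
#!/bin/sh
# Hypothetical helper: print "symbol+offset/size" from each
# "IP: [<addr>] symbol" line of a kernel oops read on stdin.
# Quote prefixes such as ">>> " and the timestamp are skipped because the
# pattern only requires "IP: [<" somewhere on the line. The "RIP: 0010:"
# lines do not match, since "IP: " is not followed directly by "[<" there.
oops_fault_symbol() {
    sed -n 's/.*IP: \[<[0-9a-f]*>\] \([^ ]*\).*/\1/p'
}

# Assumed usage with a saved console log (not executed here):
#   oops_fault_symbol < crash.log
```

Run over the two traces above, this prints gro_cell_poll+0x4b/0x80 for
the first and vxlan_dellink+0x9a/0xf0 for the second.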
>>>
>>> Br,
>>> Zhike Wang
>>> JDCloud, Product Development, IaaS
>>> ------------------------------------------------------------------------------------------------ 
>>>
>>> Mobile/+86 13466719566
>>> E-mail/wangzhike at jd.com
>>> Address/5F Building A,North-Star Century Center,8 Beichen West 
>>> Street,Chaoyang District Beijing
>>> Https://JDCloud.com<https://jdcloud.com/>
>>> ------------------------------------------------------------------------------------------------ 
>>>
>>>
>>
>>
>


