[ovs-discuss] crash when restart openvswitch with huge vxlan traffic running

Lorenzo Bianconi lorenzo.bianconi at redhat.com
Thu Dec 27 20:33:09 UTC 2018


> Greg, this is a kernel issue.  If you have the time, will you take a
> look at it sometime?
>

Hi all,

I worked on a very similar issue a couple of weeks ago. Could you
please take a look at the commit below (it is already in Linus's
tree):

commit 8e1da73acded4751a93d4166458a7e640f37d26c
Author: Lorenzo Bianconi <lorenzo.bianconi at redhat.com>
Date:   Wed Dec 19 23:23:00 2018 +0100

    gro_cell: add napi_disable in gro_cells_destroy

    Add napi_disable routine in gro_cells_destroy since starting from
    commit c42858eaf492 ("gro_cells: remove spinlock protecting receive
    queues") gro_cell_poll and gro_cells_destroy can run concurrently on
    napi_skbs list producing a kernel Oops if the tunnel interface is
    removed while gro_cell_poll is running. The following Oops has been
    triggered removing a vxlan device while the interface is receiving
    traffic

Regards,
Lorenzo

> On Thu, Dec 20, 2018 at 12:42:43PM +0000, 王志克 wrote:
> > Hi All,
> >
> > I ran the test below and the system crashed. Does anyone know whether a fix already exists?
> >
> > Setup:
> > CentOS7.4 3.10.0-693.el7.x86_64,
> > OVS: 2.10.1
> >
> > Steps:
> > 1.  Build OVS for userspace only, and reuse the kernel's built-in openvswitch module.
> > 2.  On Host1, create one vxlan interface and add one VF_rep to OVS.
> > 3.  Attach the VF to one VM; the VM swaps the 5-tuple of each packet using a DPDK app.
> > 4.  Use a traffic generator to send heavy traffic (7 Mpps with several thousand connections) to the Host1 PF.
> > 5.  Configure the OVS rules as below.
> >
> > VM1_PORTNAME=$1
> > VXLAN_PORTNAME=$2
> > VM1_PORT=$(ovs-vsctl list interface | grep $VM1_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> > VXLAN_PORT=$(ovs-vsctl list interface | grep $VXLAN_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> > ZONE=8
> > ovs-ofctl del-flows ovs-sriov
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=1000 table=0,arp, actions=NORMAL"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VM1_PORT,action=set_field:$VM1_PORT->reg6,goto_table:5"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VXLAN_PORT, tun_id=0x242, action=set_field:$VXLAN_PORT->reg6,set_field:$VM1_PORT->reg7,goto_table:5"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=5, priority=100, ip,actions=ct(table=10,zone=$ZONE)"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new+est-rel-inv+trk actions= goto_table:15"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel+inv+trk actions=drop"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel-inv-trk actions=drop"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=+new-rel-inv+trk actions= ct(commit,table=15,zone=$ZONE)"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VM1_PORT, action=set_field:0x242->tun_id,set_field:$VXLAN_PORT->reg7,goto_table:20"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VXLAN_PORT, actions=goto_table:20"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=20, priority=100, ip,action=output:NXM_NX_REG7[0..15]"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=200, priority=100,action=drop"
> > 6.  Run "systemctl restart openvswitch" several times; the system then crashes.
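
The port lookup and the restart step above could be sketched more compactly
as below. This is not the reporter's script: get_ofport, restart_loop and the
OVS_VSCTL override are hypothetical helpers, and the 5-second default pause is
an arbitrary assumption. It relies on "ovs-vsctl get Interface NAME ofport",
which prints the ofport column directly instead of grepping the output of
"ovs-vsctl list interface".

```shell
#!/bin/sh
# Command used to query OVS; overridable for testing (assumption, not in the original).
OVS_VSCTL=${OVS_VSCTL:-ovs-vsctl}

# get_ofport IFACE
# Prints the OpenFlow port number that OVS assigned to interface IFACE.
get_ofport() {
    "$OVS_VSCTL" get Interface "$1" ofport
}

# restart_loop COUNT [PAUSE_SECONDS]
# Restarts the openvswitch service COUNT times, pausing between restarts.
# Stops early with a non-zero status if a restart fails.
restart_loop() {
    local count=$1 pause=${2:-5} i=1
    while [ "$i" -le "$count" ]; do
        echo "restart $i/$count"
        systemctl restart openvswitch || return 1
        sleep "$pause"
        i=$((i + 1))
    done
}
```

With these, VM1_PORT=$(get_ofport "$VM1_PORTNAME") replaces the grep/sed
pipeline, and "restart_loop 10" repeats step 6 ten times.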
> >
> > Crash stacks (two kinds):
> > One:
> > [  575.459905] device vxlan_sys_4789 left promiscuous mode
> > [  575.460103] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > [  575.460133] IP: [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
> > [  575.460210] PGD 0
> > [  575.460226] Oops: 0002 [#1] SMP
> > [  575.460254] Modules linked in: vhost_net vhost macvtap macvlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul
> > [  575.460619]  ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif joydev pcspkr sg mei_me mei lpc_ich ipmi_si shpchp ipmi_devintf ipmi_msghandler wmi acpi_power_meter knem(OE) nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea ixgbe sysfillrect igb sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm ahci libahci megaraid_sas libata i2c_algo_bit i2c_core mdio ptp dca pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: devlink]
> > [  575.460885] CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G           OE  ------------   3.10.0-693.el7.x86_64 #1
> > [  575.460912] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6 06/03/2015
> > [  575.460933] task: ffff880152ef1fa0 ti: ffff880152efc000 task.ti: ffff880152efc000
> > [  575.460954] RIP: 0010:[<ffffffffc09b330b>]  [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
> > [  575.460990] RSP: 0018:ffff880152effd68  EFLAGS: 00010202
> > [  575.461004] RAX: 0000000000000000 RBX: ffffe8dfff448818 RCX: 0000000000000000
> > [  575.461024] RDX: 0000000000000001 RSI: ffff881fa42ebf00 RDI: ffffe8dfff448818
> > [  575.461042] RBP: ffff880152effd88 R08: 0000000000019c40 R09: ffffffff815710d7
> > [  575.461061] R10: ffff881ffec59c40 R11: ffffea007e90ba00 R12: 0000000000000002
> > [  575.461079] R13: 0000000000000040 R14: ffffe8dfff448800 R15: 0000000000000001
> > [  575.461098] FS:  0000000000000000(0000) GS:ffff881ffec40000(0000) knlGS:0000000000000000
> > [  575.461119] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  575.461134] CR2: 0000000000000008 CR3: 00000000019f2000 CR4: 00000000001427e0
> > [  575.461153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  575.461172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [  575.461190] Stack:
> > [  575.461198]  ffffe8dfff448818 0000000000000000 0000000000000040 0000000000000000
> > [  575.461221]  ffff880152effe08 ffffffff8158799d ffff881ffec57950 ffff881ffec57940
> > [  575.461254]  00000001000432b7 0000012c52f09428 ffff881ffd57eb40 ffff881ffd57eb40
> > [  575.461277] Call Trace:
> > [  575.461290]  [<ffffffff8158799d>] net_rx_action+0x16d/0x380
> > [  575.461308]  [<ffffffff81090b3f>] __do_softirq+0xef/0x280
> > [  575.461324]  [<ffffffff81090d08>] run_ksoftirqd+0x38/0x50
> > [  575.462074]  [<ffffffff810b909f>] smpboot_thread_fn+0x12f/0x180
> > [  575.462780]  [<ffffffff810b8f70>] ? lg_double_unlock+0x40/0x40
> > [  575.463464]  [<ffffffff810b098f>] kthread+0xcf/0xe0
> > [  575.464169]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
> > [  575.464862]  [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
> > [  575.465497]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
> > [  575.466192] Code: 49 39 f6 74 40 48 85 f6 74 3b 83 6b f8 01 48 89 df 41 83 c4 01 48 8b 0e 48 8b 46 08 48 c7 06 00 00 00 00 48 c7 46 08 00 00 00 00 <48> 89 41 08 48 89 08 e8 29 4f bd c0 45 39 ec 74 14 48 8b 73 e8
> > [  575.467663] RIP  [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
> > [  575.468412]  RSP <ffff880152effd68>
> > [  575.469197] CR2: 0000000000000008
> >
> > Two:
> > [  390.626080] device vxlan_sys_4789 left promiscuous mode
> > [  390.626345] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > [  390.626411] IP: [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
> > [  390.626462] PGD 0
> > [  390.626499] Oops: 0002 [#1] SMP
> > [  390.626529] Modules linked in: vhost_net vhost macvtap macvlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul
> > [  390.627152]  ghash_clmulni_intel ipmi_ssif aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_si pcspkr joydev ipmi_devintf ipmi_msghandler mei_me mei sg lpc_ich shpchp acpi_power_meter wmi nfsd auth_rpcgss nfs_acl lockd knem(OE) grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common ixgbe crc32c_intel ahci igb libahci libata megaraid_sas mdio i2c_algo_bit ptp i2c_core pps_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: devlink]
> > [  390.627626] CPU: 11 PID: 6303 Comm: ovs-vswitchd Tainted: G           OE  ------------   3.10.0-693.el7.x86_64 #1
> > [  390.627690] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6 06/03/2015
> > [  390.627738] task: ffff881fe0e89fa0 ti: ffff881fa3590000 task.ti: ffff881fa3590000
> > [  390.627786] RIP: 0010:[<ffffffffc09c8b4a>]  [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
> > [  390.627848] RSP: 0018:ffff881fa3593888  EFLAGS: 00010206
> > [  390.627883] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 0000000000000000
> > [  390.627929] RDX: 0000000000000000 RSI: ffffea007fd7f600 RDI: ffff881ff5fd8c00
> > [  390.627975] RBP: ffff881fa35938b0 R08: ffff881ff5fd8b00 R09: 000000018040000d
> > [  390.628020] R10: 00000000f5fd8a01 R11: ffffea007fd7f600 R12: ffff88015270e000
> > [  390.628066] R13: ffffffff81b1caa0 R14: ffff881fa35938c0 R15: ffffe8dfff60a1d8
> > [  390.628112] FS:  00007f4ea1168ac0(0000) GS:ffff883ffe540000(0000) knlGS:0000000000000000
> > [  390.628163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  390.628201] CR2: 0000000000000008 CR3: 0000001ff9055000 CR4: 00000000001427e0
> > [  390.628246] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  390.628292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [  390.628337] Stack:
> > [  390.628354]  ffff881fa35938c0 ffffffff81ad9d40 0000000000000001 0000000000000000
> > [  390.628411]  ffffffff81ad9d40 ffff881fa35938e0 ffffffff81599023 ffff881fa35938c0
> > [  390.628468]  ffff881fa35938c0 000000001d8239fc ffff883ffd864a00 ffff881fa3593a70
> > [  390.628535] Call Trace:
> > [  390.628561]  [<ffffffff81599023>] rtnl_delete_link+0x43/0x80
> > [  390.628610]  [<ffffffff8159b761>] rtnl_dellink+0x91/0xf0
> > [  390.628649]  [<ffffffff81599bd4>] rtnetlink_rcv_msg+0xa4/0x270
> > [  390.630373]  [<ffffffff815bacd0>] ? __netlink_lookup+0xc0/0x110
> > [  390.632066]  [<ffffffff81599b30>] ? rtnetlink_rcv+0x30/0x30
> > [  390.633751]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
> > [  390.635426]  [<ffffffff81599b28>] rtnetlink_rcv+0x28/0x30
> > [  390.637081]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
> > [  390.638721]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
> > [  390.640371]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
> > [  390.642037]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
> > [  390.643722]  [<ffffffff8156a88f>] ? sock_recvmsg+0xbf/0x100
> > [  390.645411]  [<ffffffff8132c312>] ? put_dec+0x72/0x90
> > [  390.647075]  [<ffffffff8132d303>] ? number.isra.2+0x323/0x360
> > [  390.648724]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
> > [  390.650362]  [<ffffffff811de9d2>] ? kmem_cache_free+0x1e2/0x200
> > [  390.652010]  [<ffffffff81217af5>] ? __d_free+0x35/0x40
> > [  390.653623]  [<ffffffff812181b0>] ? d_free+0x60/0x70
> > [  390.655181]  [<ffffffff812186b4>] ? dentry_kill+0x154/0x1b0
> > [  390.656702]  [<ffffffff81222744>] ? mntput+0x24/0x40
> > [  390.658173]  [<ffffffff81203053>] ? __fput+0x183/0x260
> > [  390.659606]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
> > [  390.660988]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
> > [  390.662325]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
> > [  390.663624] Code: a0 bb c0 49 8b 3f 49 39 ff 74 be 48 85 ff 74 b9 41 83 6f 10 01 48 8b 0f 48 8b 57 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 51 08 48 89 0a e8 8a a2 ba c0 49 8b 3f 49 39 ff 75 cc eb
> > [  390.666406] RIP  [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
> > [  390.667674]  RSP <ffff881fa3593888>
> > [  390.668892] CR2: 0000000000000008
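
When comparing traces like the two above, it can help to pull out just the
faulting symbol. The following is a small sketch under the assumption that the
log uses the "IP: [<addr>] symbol+0xoff/0xsize [module]" layout seen here;
oops_fault_site is a hypothetical helper name.

```shell
#!/bin/sh
# oops_fault_site
# Reads kernel log text on stdin and, for each "IP: [<addr>] sym+0xoff/0xsize [mod]"
# line, prints "sym 0xoff 0xsize mod". Lines starting with "RIP" use a different
# layout and are intentionally not matched.
oops_fault_site() {
    sed -n 's#.*IP: \[<[0-9a-f]*>\] \([^+ ]*\)+\(0x[0-9a-f]*\)/\(0x[0-9a-f]*\) \[\([^]]*\)\].*#\1 \2 \3 \4#p'
}
```

Fed the two traces above, it prints "gro_cell_poll 0x4b 0x80 vxlan" and
"vxlan_dellink 0x9a 0xf0 vxlan". Both faults are writes to address 0x8 with
RCX = 0, one in the softirq poll path and one in the device removal path.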
> >
> > Br,
> > Zhike Wang
> > JDCloud, Product Development, IaaS
> > ------------------------------------------------------------------------------------------------
> > Mobile/+86 13466719566
> > E-mail/wangzhike at jd.com
> > Address/5F Building A,North-Star Century Center,8 Beichen West Street,Chaoyang District Beijing
> > https://jdcloud.com/
> > ------------------------------------------------------------------------------------------------
> >
>
>
>
> > _______________________________________________
> > discuss mailing list
> > discuss at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

