[ovs-dev] kernel crash bug caused by ixgbevf kernel module of centos-3.10.0-229.20.1.el7

Sam batmanustc at gmail.com
Tue Jan 30 02:23:23 UTC 2018


Details are below. The bug occurs on a bond of enp1s16 and enp1s16f1.

[huanghuai-test at yf-mos-test-net14 ~]$ sudo /usr/local/share/openvswitch/scripts/dpdk_nic_bind --status

Network devices using DPDK-compatible driver
============================================
0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe
0000:01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe

Network devices using kernel driver
===================================
0000:01:10.0 'X540 Ethernet Controller Virtual Function' if=enp1s16 drv=ixgbevf unused=bak,igb_uio
0000:01:10.1 'X540 Ethernet Controller Virtual Function' if=enp1s16f1 drv=ixgbevf unused=bak,igb_uio
0000:08:00.0 'I350 Gigabit Network Connection' if=eth2 drv=igb unused=igb_uio
0000:08:00.1 'I350 Gigabit Network Connection' if=eth3 drv=igb unused=igb_uio

Other network devices
=====================
<none>

2018-01-30 10:19 GMT+08:00 Sam <batmanustc at gmail.com>:

> I found a bug in the ixgbevf kernel module in centos-3.10.0-229.20.1.el7.
> The same bug is also present in 3.10.0-514.10.2.el7.
>
> How to reproduce: enable SR-IOV, put heavy network traffic on the VF port,
> then run ifdown/ifup on the VF port repeatedly. After many iterations the bug triggers.
>
> BUG:
>
> [308026.586026] ixgbevf 0000:01:10.0: NIC Link is Down
> [308026.586037] ixgbevf 0000:01:10.1: NIC Link is Down
> [308026.683724] bonding: bond1: link status definitely down for interface enp1s16, disabling it
> [308026.683728] bonding: bond1: now running without any active interface !
> [308026.683729] bonding: bond1: link status definitely down for interface enp1s16f1, disabling it
> [308028.266060] bonding: bond1: Removing slave enp1s16.
> [308028.266135] bonding: bond1: Warning: the permanent HWaddr of enp1s16 - 4e:cd:a6:59:26:2c - is still in use by bond1. Set the HWaddr of enp1s16 to a different address to avoid conflicts.
> [308028.266139] bonding: bond1: releasing active interface enp1s16
> [308028.359872] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [308028.361319] IP: [<ffffffffa0494970>] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
> [308028.362049] PGD 0
> [308028.362777] Oops: 0000 [#1] SMP
> [308028.363481] Modules linked in: ixgbevf(OF) igb_uio(OF) iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter nbd(OF) vhost_net macvtap macvlan udp_diag unix_diag af_packet_diag netlink_diag tun tcp_diag inet_diag uio bonding ext4 mbcache jbd2 intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mgag200 aesni_intel iTCO_wdt lrw dcdbas gf128mul syscopyarea sysfillrect iTCO_vendor_support glue_helper sysimgblt ablk_helper ttm cryptd ipmi_devintf igb ixgbe drm_kms_helper drm i2c_algo_bit ptp i2c_core ipmi_si pps_core sg mdio ipmi_msghandler dca sb_edac mei_me mei shpchp lpc_ich pcspkr mfd_core edac_core wmi acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_common ahci libahci
> [308028.368487]  libata megaraid_sas [last unloaded: ixgbevf]
> [308028.369345] CPU: 0 PID: 21971 Comm: kworker/0:1 Tainted: GF       W  O--------------   3.10.0-229.el7.x86_64 #1
> [308028.370226] Hardware name: Dell Inc. PowerEdge R720/068CDY, BIOS 2.5.2 01/28/2015
> [308028.371132] Workqueue: events ixgbevf_service_task [ixgbevf]
> [308028.372038] task: ffff88022b0dad80 ti: ffff88010905c000 task.ti: ffff88010905c000
> [308028.372965] RIP: 0010:[<ffffffffa0494970>]  [<ffffffffa0494970>] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
> [308028.373949] RSP: 0018:ffff88010905fd10  EFLAGS: 00010287
> [308028.374900] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000000
> [308028.375895] RDX: 0000000000000000 RSI: 00000000000001ff RDI: ffff8800b82061c0
> [308028.376841] RBP: ffff88010905fd48 R08: 0000000000000282 R09: 0000000000000001
> [308028.377780] R10: 0000000000000004 R11: 0000000000000005 R12: 0000000000000000
> [308028.378702] R13: 00000000fffffe00 R14: 00000000000001ff R15: ffff8800b82061c0
> [308028.379628] FS:  0000000000000000(0000) GS:ffff882f7fa00000(0000) knlGS:0000000000000000
> [308028.380540] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [308028.381471] CR2: 0000000000000008 CR3: 000000000190a000 CR4: 00000000001427f0
> [308028.382376] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [308028.383291] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [308028.384180] Stack:
> [308028.385051]  ffff8832d1b58bc0 ffff88010905fd28 ffff8832d1b588c0 0000000000000009
> [308028.385933]  ffff8832d1b58bc0 ffff8800b82061c0 0000000000001028 ffff88010905fdb8
> [308028.386804]  ffffffffa0496ba3 ffff8832d1b58e58 000000022b1e2000 00000000819e2108
> [308028.387693] Call Trace:
> [308028.388520]  [<ffffffffa0496ba3>] ixgbevf_configure+0x5d3/0x7d0 [ixgbevf]
> [308028.389363]  [<ffffffffa0498135>] ixgbevf_reinit_locked+0x65/0x90 [ixgbevf]
> [308028.390213]  [<ffffffffa049a3e4>] ixgbevf_service_task+0x324/0x420 [ixgbevf]
> [308028.391043]  [<ffffffff8108f1db>] process_one_work+0x17b/0x470
> [308028.391888]  [<ffffffff8108ffbb>] worker_thread+0x11b/0x400
> [308028.392728]  [<ffffffff8108fea0>] ? rescuer_thread+0x400/0x400
> [308028.393576]  [<ffffffff8109739f>] kthread+0xcf/0xe0
> [308028.394434]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
> [308028.395339]  [<ffffffff8161497c>] ret_from_fork+0x7c/0xb0
> [308028.396205]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
> [308028.397068] Code: c5 41 89 f6 49 89 c4 48 8d 14 40 48 8b 47 28 49 c1 e4 04 4c 03 67 20 48 8d 1c d0 0f b7 47 4c 41 29 c5 66 0f 1f 84 00 00 00 00 00 <48> 83 7b 08 00 74 73 8b 53 10 48 8b 03 48 01 d0 49 83 c4 10 48
> [308028.398959] RIP  [<ffffffffa0494970>] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
> [308028.399910]  RSP <ffff88010905fd10>
> [308028.400846] CR2: 0000000000000008
>
>
>
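
The reproduction steps quoted above (heavy traffic on the VF port, then repeated ifdown/ifup) could be scripted roughly as follows. This is only a sketch: the function name flap_vf, the iteration count, the pause between steps, and the RUN dry-run hook are illustrative assumptions and are not taken from the original report. Generating the traffic is left to a separate tool.

```shell
#!/bin/sh
# Sketch: repeatedly bounce a VF interface while traffic is running.
# Assumptions (not from the original report): the function name, the
# iteration count, the pause length, and the RUN hook used for dry runs.

# flap_vf IFACE COUNT PAUSE
# Runs "ifdown IFACE" followed by "ifup IFACE" COUNT times, sleeping
# PAUSE seconds after each step. Set RUN=echo to print the commands
# instead of executing them.
flap_vf() {
    iface=$1
    count=$2
    pause=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # ${RUN:-} expands to nothing when RUN is unset, so the real
        # ifdown/ifup commands are executed in that case.
        ${RUN:-} ifdown "$iface"
        sleep "$pause"
        ${RUN:-} ifup "$iface"
        sleep "$pause"
        i=$((i + 1))
    done
}
```

With RUN=echo set, a call such as `flap_vf enp1s16 3 0` only prints the ifdown/ifup commands, which is a safe way to check the loop. Running it for real (as root, without RUN, and with a much larger count) takes the interface down repeatedly and may crash an affected kernel, so it should only be tried on a test machine.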

