[ovs-dev] [PATCH V2] datapath: Prevent panic

Gregory Rose gvrose8192 at gmail.com
Mon Apr 23 16:20:47 UTC 2018



On 4/23/2018 9:19 AM, Gregory Rose wrote:
>
> On 4/20/2018 9:36 AM, Gregory Rose wrote:
>> On 4/20/2018 9:03 AM, Gregory Rose wrote:
>>> On 4/20/2018 5:39 AM, Eric Garver wrote:
>>>> On Thu, Apr 19, 2018 at 08:07:33PM -0700, Gregory Rose wrote:
>>>>> Fantastic, I'll test this and whip up a patch.
>>>>> Thanks!
>>>>>
>>>>> - Greg
>>>> I'll be on the lookout for it. Thanks.
>>>>
>>>> [..]
>>>
>>> Eric,
>>>
>>> with the above patch I'm getting this on a stock RHEL 7.4 kernel:
>>>
>>> [  599.659110] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
>>> [ovstest:23936]
>>> [  599.659464] Modules linked in: nf_nat_tftp nf_conntrack_tftp 
>>> nf_nat_ftp nf_conntrack_ftp nf_conntrack_netlink bonding 8021q garp 
>>> mrp veth netconsole ip6t_rpfilter ipt_REJECT nf_reject_ipv4 
>>> ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat 
>>> ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 
>>> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security 
>>> ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
>>> nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security 
>>> iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables 
>>> iptable_filter snd_hda_codec_generic snd_hda_intel snd_hda_codec 
>>> snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm iosf_mbi 
>>> snd_timer ppdev snd crc32_pclmul ghash_clmulni_intel aesni_intel sg 
>>> lrw gf128mul glue_helper ablk_helper pcspkr i2c_piix4 joydev cryptd 
>>> soundcore virtio_balloon parport_pc parport ip_tables xfs libcrc32c 
>>> sr_mod cdrom ata_generic pata_acpi qxl drm_kms_helper syscopyarea 
>>> sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk 
>>> virtio_net ata_piix libata crct10dif_pclmul crct10dif_common 
>>> crc32c_intel serio_raw floppy i2c_core virtio_pci virtio_ring virtio 
>>> dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip6_udp_tunnel]
>>> [  599.662118] CPU: 3 PID: 23936 Comm: ovstest Tainted: G OE 
>>> ------------   3.10.0-693.el7.x86_64 #1
>>> [  599.662351] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> [  599.662573] task: ffff880235871fa0 ti: ffff880218668000 task.ti: 
>>> ffff880218668000
>>> [  599.662823] RIP: 0010:[<ffffffffc0437210>] [<ffffffffc0437210>] 
>>> nf_ct_iterate_cleanup+0xb0/0x210 [nf_conntrack]
>>> [  599.663082] RSP: 0018:ffff88021866b870  EFLAGS: 00000246
>>> [  599.663325] RAX: 0000000000000001 RBX: ffff88017cebec00 RCX: 
>>> 0000000000000000
>>> [  599.663583] RDX: 0000000000006b35 RSI: 0000000000000000 RDI: 
>>> ffff88022de5e140
>>> [  599.663828] RBP: ffff88021866b8b0 R08: 0000000000000000 R09: 
>>> 0000000000000000
>>> [  599.664070] R10: ffff88017cebec00 R11: 0000000000000400 R12: 
>>> 0000000000000000
>>> [  599.664310] R13: ffff88021866b8b0 R14: 0000000000000000 R15: 
>>> 0000000000000000
>>> [  599.664548] FS:  00007f3c91f91a00(0000) GS:ffff88023fd80000(0000) 
>>> knlGS:0000000000000000
>>> [  599.664790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  599.665030] CR2: 00000000004be7a0 CR3: 000000021d745000 CR4: 
>>> 00000000001406e0
>>> [  599.665274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>>> 0000000000000000
>>> [  599.665514] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
>>> 0000000000000400
>>> [  599.665753] Stack:
>>> [  599.665984]  00005d8000000000 ffffffffc044c594 ffffffff81ad9d40 
>>> ffff88021866b948
>>> [  599.666251]  ffffffff81ad9d40 ffff8802347ed400 0000000000000014 
>>> ffffffff81ad9d40
>>> [  599.666535]  ffff88021866b8c0 ffffffffc043738c ffff88021866b938 
>>> ffffffffc053688b
>>> [  599.666809] Call Trace:
>>> [  599.667056]  [<ffffffffc043738c>] 
>>> nf_conntrack_flush_report+0x1c/0x20 [nf_conntrack]
>>> [  599.667310]  [<ffffffffc053688b>] 
>>> ctnetlink_del_conntrack+0x19b/0x1d0 [nf_conntrack_netlink]
>>> [  599.667587]  [<ffffffff8134bfa2>] ? nla_parse+0x32/0x120
>>> [  599.667842]  [<ffffffffc04e050f>] nfnetlink_rcv_msg+0x24f/0x260 
>>> [nfnetlink]
>>> [  599.668105]  [<ffffffffc04e02c0>] ? nfnetlink_bind+0x60/0x60 
>>> [nfnetlink]
>>> [  599.668356]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
>>> [  599.668622]  [<ffffffffc04e088f>] nfnetlink_rcv+0x27f/0x50a 
>>> [nfnetlink]
>>> [  599.668921]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
>>> [  599.669179]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
>>> [  599.669434]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
>>> [  599.669690]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
>>> [  599.669939]  [<ffffffff81328da2>] ? radix_tree_lookup_slot+0x22/0x50
>>> [  599.670199]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
>>> [  599.670438]  [<ffffffff815baac0>] ? netlink_insert+0x1a0/0x2f0
>>> [  599.670701]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
>>> [  599.670950]  [<ffffffff815691b2>] ? move_addr_to_user+0xb2/0xd0
>>> [  599.671207]  [<ffffffff8156929c>] ? SYSC_getsockname+0xcc/0xe0
>>> [  599.671458]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
>>> [  599.671704]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
>>> [  599.671943]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
>>> [  599.672186] Code: 74 16 e9 3c 01 00 00 0f 1f 40 00 4d 8b 3f 41 f6 
>>> c7 01 0f 85 2b 01 00 00 41 80 7f 37 00 75 ec 4d 8d 6f f0 4c 89 e6 4c 
>>> 89 ef ff d3 <85> c0 74 dc f0 41 ff 45 00 48 8b 7d c8 ff 14 25 50 31 
>>> a1 81 e8
>>>
>>> So perhaps there's something else going on?
>>>
>>
>> I added back this part of the original patch:
>>
>> diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
>> index f23a406..2b9b691 100644
>> --- a/tests/system-kmod-macros.at
>> +++ b/tests/system-kmod-macros.at
>> @@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
>>                 on_exit 'modprobe -q -r mod'
>>                ])
>>     on_exit 'ovs-dpctl del-dp ovs-system'
>> +   on_exit 'ovs-appctl dpctl/flush-conntrack'
>>     _OVS_VSWITCHD_START([])
>>     dnl Add bridges, ports, etc.
>>     AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], 
>> [| uuidfilt]
>> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
>> index 4dc3151..4c3ad0f 100644
>> --- a/utilities/ovs-lib.in
>> +++ b/utilities/ovs-lib.in
>> @@ -616,6 +616,7 @@ force_reload_kmod () {
>>      for dp in `ovs-dpctl dump-dps`; do
>>          action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
>>      done
>> +    action "ovs-appctl dpctl/flush-conntrack"
>>
>>      for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
>>          action "Removing $vport module" rmmod $vport
>>
>> Since then the test has been running successfully, so I'll keep
>> that part of the patch as well.
>>
>> - Greg
>
> The suggested change to delete the conntrack timer has not worked.  I
> would still like this version of the patch to be used, as it is more
> reliable.
>
> Pravin,
>
> Unless there is some other objection, can we get this patch committed?
>
> Thanks,
>
> - Greg

Oops, actually adding Pravin.

- Greg

