[ovs-dev] Hypervisor down during upgrade OVS 2.10.x to 2.10.y

Han Zhou zhouhan at gmail.com
Sat Sep 7 00:05:34 UTC 2019


Good finding. So it is a known kernel bug and it won't get fixed in that
kernel version. :(
From OVS's point of view, the problem is that if such a bug exists in the
kernel, it prevents ovs-vswitchd from being killed (even by SIGKILL),
because the process is stuck in a system call that is blocked by the kernel
bug. At that point, there is no way to either proceed with or roll back the
OVS reload.
Does anyone have a good idea for how to deal with such a problem during an
OVS reload?
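
To make that concrete, here is a rough sketch (not actual ovs-ctl code; all
the helper names are invented for illustration) of the kind of guard
force-reload-kmod could grow: refuse to touch the kernel module when the
daemon cannot actually be reaped, so at least we fail loudly instead of
rolling forward:

```
# Hypothetical guard for force-reload-kmod -- a sketch only, not the real
# ovs-ctl code; save_flows, stop_ovs_vswitchd, etc. are made-up helpers.
force_reload_kmod () {
    save_flows || return 1

    if ! stop_ovs_vswitchd; then
        # The daemon is stuck in an uninterruptible system call (the
        # "waiting for br0 to become free" refcount leak), so removing the
        # datapath module now would only make the host worse.  Abort and
        # leave the old module loaded so forwarding keeps working.
        echo "ovs-vswitchd did not exit; aborting kernel module reload" >&2
        return 1
    fi

    remove_kmod && insert_kmod && start_ovs_vswitchd && restore_flows
}
```

That does not solve the stuck-process problem itself, but it would avoid
removing the module underneath a half-dead datapath.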

Thanks,
Han

On Fri, Sep 6, 2019 at 4:17 PM aginwala <aginwala at asu.edu> wrote:

> Hi:
>
> Adding the correct ovs-discuss ML. I did get a chance to take a look at
> it a bit. I think this is a bug in the 4.4.0-104-generic kernel on Ubuntu
> 16.04, as it is being discussed on the Ubuntu bug tracker at
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407, where it
> can be hit all of a sudden, matching the kernel logs shared here:
> "unregister_netdevice: waiting for br0 to become free. Usage count = 1".
> Folks on that bug are proposing an upgrade to a newer kernel to get rid of
> the issue. Upstream Linux proposed a relevant fix at
> https://github.com/torvalds/linux/commit/ee60ad219f5c7c4fb2f047f88037770063ef785f
> to address related issues. I guess the kernel folks can comment on this
> more. Not sure if I missed anything else.
>
> Maybe we can do some improvements in force-reload-kmod: right now
> stop_forwarding stops ovs-vswitchd, but the system stalls because br0
> (eth0 is added to br0) is busy, causing loss of network connectivity. In
> the current case the host recovers only after a restart. Not sure if we
> need to handle this corner case in OVS?
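>
> One rough sketch (hypothetical, not existing ovs-ctl code) of handling
> it: before force-reload-kmod removes the openvswitch module, check the
> kernel log for the tell-tale refcount-leak message and bail out if it is
> present, since rmmod at that point can only make things worse:
>
> ```
> # Hypothetical pre-check, for illustration only.
> netdev_stuck_in_unregister () {
>     dmesg | tail -n 200 | \
>         grep -q 'unregister_netdevice: waiting for .* to become free'
> }
>
> if netdev_stuck_in_unregister; then
>     echo "kernel is stuck unregistering a netdevice; not reloading" >&2
>     exit 1
> fi
> ```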
>
> On Wed, Aug 28, 2019 at 2:21 PM Jin, Liang via dev <
> ovs-dev at openvswitch.org> wrote:
>
>>
>> Hi,
>> We recently upgraded OVS from one 2.10 version to another 2.10 version.
>> On some HVs the upgrade takes the host down while running
>> force-reload-kmod. In the ovs-ctl log, killing ovs-vswitchd fails, but
>> the script still goes on to reload the modules.
>> ```
>> ovsdb-server is running with pid 2431
>> ovs-vswitchd is running with pid 2507
>> Thu Aug 22 23:13:49 UTC 2019:stop
>> 2019-08-22T23:13:50Z|00001|fatal_signal|WARN|terminating with signal 14
>> (Alarm clock)
>> Alarm clock
>> 2019-08-22T23:13:51Z|00001|fatal_signal|WARN|terminating with signal 14
>> (Alarm clock)
>> Alarm clock
>> * Exiting ovs-vswitchd (2507)
>> * Killing ovs-vswitchd (2507)
>> * Killing ovs-vswitchd (2507) with SIGKILL
>> * Killing ovs-vswitchd (2507) failed
>> * Exiting ovsdb-server (2431)
>> Thu Aug 22 23:14:58 UTC 2019:load-kmod
>> Thu Aug 22 23:14:58 UTC 2019:start --system-id=random --no-full-hostname
>> /usr/share/openvswitch/scripts/ovs-ctl: unknown option
>> "--no-full-hostname" (use --help for help)
>> * Starting ovsdb-server
>> * Configuring Open vSwitch system IDs
>> * ovs-vswitchd is already running
>> * Enabling remote OVSDB managers
>> ovsdb-server is running with pid 3860447
>> ovs-vswitchd is running with pid 2507
>> ovsdb-server is running with pid 3860447
>> ovs-vswitchd is running with pid 2507
>> Thu Aug 22 23:15:09 UTC 2019:load-kmod
>> Thu Aug 22 23:15:09 UTC 2019:force-reload-kmod --system-id=random
>> --no-full-hostname
>> /usr/share/openvswitch/scripts/ovs-ctl: unknown option
>> "--no-full-hostname" (use --help for help)
>> * Detected internal interfaces: br-int
>> Thu Aug 22 23:37:08 UTC 2019:stop
>> 2019-08-22T23:37:09Z|00001|fatal_signal|WARN|terminating with signal 14
>> (Alarm clock)
>> Alarm clock
>> 2019-08-22T23:37:10Z|00001|fatal_signal|WARN|terminating with signal 14
>> (Alarm clock)
>> Alarm clock
>> * Exiting ovs-vswitchd (2507)
>> * Killing ovs-vswitchd (2507)
>> * Killing ovs-vswitchd (2507) with SIGKILL
>> * Killing ovs-vswitchd (2507) failed
>> * Exiting ovsdb-server (3860447)
>> Thu Aug 22 23:40:42 UTC 2019:load-kmod
>> * Inserting openvswitch module
>> Thu Aug 22 23:40:42 UTC 2019:start --system-id=random --no-full-hostname
>> /usr/share/openvswitch/scripts/ovs-ctl: unknown option
>> "--no-full-hostname" (use --help for help)
>> * Starting ovsdb-server
>> * Configuring Open vSwitch system IDs
>> * Starting ovs-vswitchd
>> * Enabling remote OVSDB managers
>> ovsdb-server is running with pid 2399
>> ovs-vswitchd is running with pid 2440
>> ovsdb-server is running with pid 2399
>> ovs-vswitchd is running with pid 2440
>> Thu Aug 22 23:46:18 UTC 2019:load-kmod
>> Thu Aug 22 23:46:18 UTC 2019:force-reload-kmod --system-id=random
>> --no-full-hostname
>> /usr/share/openvswitch/scripts/ovs-ctl: unknown option
>> "--no-full-hostname" (use --help for help)
>> * Detected internal interfaces: br-int br0
>> * Saving flows
>> * Exiting ovsdb-server (2399)
>> * Starting ovsdb-server
>> * Configuring Open vSwitch system IDs
>> * Flush old conntrack entries
>> * Exiting ovs-vswitchd (2440)
>> * Saving interface configuration
>> * Removing datapath: system@ovs-system
>> * Removing openvswitch module
>> rmmod: ERROR: Module vxlan is in use by: i40e
>> * Forcing removal of vxlan module
>> * Inserting openvswitch module
>> * Starting ovs-vswitchd
>> * Restoring saved flows
>> * Enabling remote OVSDB managers
>> * Restoring interface configuration
>> ```
>>
>> But in kern.log we see the messages below: the process could not exit
>> because it was waiting for br0 to be released. ovs-ctl then tried
>> `kill -TERM` and `kill -9` on the process, but neither worked because the
>> kernel was stuck in an infinite loop. Then, when ovs-ctl tried to save
>> the flows, a core dump happened in the kernel, and the HV stayed down
>> until we restarted it.
>> ```
>> Aug 22 16:13:45 slx11c-9gjm kernel: [21177057.998961] device br0 left
>> promiscuous mode
>> Aug 22 16:13:55 slx11c-9gjm kernel: [21177068.044859]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:14:05 slx11c-9gjm kernel: [21177078.068429]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:14:15 slx11c-9gjm kernel: [21177088.312001]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:14:25 slx11c-9gjm kernel: [21177098.359556]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:14:35 slx11c-9gjm kernel: [21177108.603058]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:14:45 slx11c-9gjm kernel: [21177118.658643]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:14:55 slx11c-9gjm kernel: [21177128.678211]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:15:05 slx11c-9gjm kernel: [21177138.721732]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:15:15 slx11c-9gjm kernel: [21177148.817317]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:15:25 slx11c-9gjm kernel: [21177158.828853]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:15:35 slx11c-9gjm kernel: [21177168.860459]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:15:46 slx11c-9gjm kernel: [21177179.104006]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:15:56 slx11c-9gjm kernel: [21177189.143518]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:16:06 slx11c-9gjm kernel: [21177199.183075]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:16:16 slx11c-9gjm kernel: [21177209.262617]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:16:26 slx11c-9gjm kernel: [21177219.510207]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:16:36 slx11c-9gjm kernel: [21177229.521757]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:16:46 slx11c-9gjm kernel: [21177239.765278]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:16:57 slx11c-9gjm kernel: [21177250.008826]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.016400]
>> unregister_netdevice: waiting for br0 to become free. Usage count = 1
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.256456] INFO: task
>> revalidator221:3859855 blocked for more than 120 seconds.
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.264121]       Tainted: G
>>        OE   4.4.0-104-generic #127-Ubuntu
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271176] "echo 0 >
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271179] revalidator221  D
>> ffff8826844eba48     0 3859855   2506 0x00000084
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271180]  ffff8826844eba48
>> 0000000000000000 ffff884fc2e69540 ffff882758688000
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271181]  ffff8826844ec000
>> ffffffffc0463564 ffff882758688000 00000000ffffffff
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271182]  ffffffffc0463568
>> ffff8826844eba60 ffffffff81840765 ffffffffc0463560
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271182] Call Trace:
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271191]
>> [<ffffffff81840765>] schedule+0x35/0x80
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271192]
>> [<ffffffff81840a0e>] schedule_preempt_disabled+0xe/0x10
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271193]
>> [<ffffffff81842649>] __mutex_lock_slowpath+0xb9/0x130
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271195]
>> [<ffffffff818426df>] mutex_lock+0x1f/0x30
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271207]
>> [<ffffffffc042ef56>] ovs_dp_cmd_get+0x46/0xf0 [openvswitch]
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271211]
>> [<ffffffff8176d287>] genl_family_rcv_msg+0x1e7/0x3c0
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271213]
>> [<ffffffff8176d460>] ? genl_family_rcv_msg+0x3c0/0x3c0
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271214]
>> [<ffffffff8176d4d6>] genl_rcv_msg+0x76/0xb0
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271215]
>> [<ffffffff8176a97c>] ? netlink_lookup+0xdc/0x140
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271217]
>> [<ffffffff8176c9a4>] netlink_rcv_skb+0xa4/0xc0
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271218]
>> [<ffffffff8176d088>] genl_rcv+0x28/0x40
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271219]
>> [<ffffffff8176c37a>] netlink_unicast+0x18a/0x240
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271221]
>> [<ffffffff8176c72b>] netlink_sendmsg+0x2fb/0x3a0
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271225]
>> [<ffffffff813a1c11>] ? aa_sock_msg_perm+0x61/0x150
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271227]
>> [<ffffffff8171bad8>] sock_sendmsg+0x38/0x50
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271228]
>> [<ffffffff8171c581>] ___sys_sendmsg+0x281/0x290
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271231]
>> [<ffffffff810badf7>] ? should_numa_migrate_memory+0x57/0x130
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271233]
>> [<ffffffff811c2713>] ? handle_mm_fault+0xbc3/0x1820
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271235]
>> [<ffffffff812101b4>] ? new_sync_read+0x94/0xd0
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271237]
>> [<ffffffff8122e9e5>] ? __fget_light+0x25/0x60
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271238]
>> [<ffffffff8171ced1>] __sys_sendmsg+0x51/0x90
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271239]
>> [<ffffffff8171cf22>] SyS_sendmsg+0x12/0x20
>> Aug 22 16:17:07 slx11c-9gjm kernel: [21177260.271240]
>> [<ffffffff81844872>] entry_SYSCALL_64_fastpath+0x16/0x71
>> ```
>> My questions are:
>>
>>   1.  Why do we need to stop the openvswitch service before backing up
>> the flows?
>>   2.  If openvswitch is in an exiting state, why can it still receive
>> requests?
>>   3.  Why did br0 take so long to exit?
>>
>> I think this is a bug that needs to be addressed for upgrades.
>>
>> Thanks
>> L.J.
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>

