[ovs-discuss] dpdk watchdog stuck? (was: ovsrcu_synchronize() blocking while indefinitely waiting for thread to quiesce)

Ben Pfaff blp at ovn.org
Mon Jan 25 18:50:38 UTC 2016


On Mon, Jan 25, 2016 at 03:09:09PM +0100, Patrik Andersson R wrote:
> During robustness testing, where VMs are booted and deleted using nova
> boot/delete in rather rapid succession, VMs get stuck in the spawning
> state after a few test cycles. Presumably this is because OVS no longer
> responds to port additions and deletions, or rather because responses to
> these requests become painfully slow. Other requests to vswitchd also
> fail to complete in any reasonable time frame; ovs-appctl vlog/set is
> one example.
> 
> The only conclusion I can draw at the moment is that some thread (I've
> observed main and dpdk_watchdog3) is blocking the ovsrcu_synchronize()
> operation for "infinite" time and there is no fallback to get out of this.
> To recover, the minimum operation seems to be a restart of the
> openvswitch-switch service, but that seems to cause other issues longer term.
> 
> In the vswitch log when this happens the following can be observed:
> 
> 2016-01-24T20:36:14.601Z|02742|ovs_rcu(vhost_thread2)|WARN|blocked 1000 ms waiting for dpdk_watchdog3 to quiesce

This looks like a bug somewhere in the DPDK code.  The watchdog code is
really simple:

    static void *
    dpdk_watchdog(void *dummy OVS_UNUSED)
    {
        struct netdev_dpdk *dev;

        pthread_detach(pthread_self());

        for (;;) {
            ovs_mutex_lock(&dpdk_mutex);
            LIST_FOR_EACH (dev, list_node, &dpdk_list) {
                ovs_mutex_lock(&dev->mutex);
                check_link_status(dev);
                ovs_mutex_unlock(&dev->mutex);
            }
            ovs_mutex_unlock(&dpdk_mutex);
            xsleep(DPDK_PORT_WATCHDOG_INTERVAL);
        }

        return NULL;
    }

Although at first glance the loop doesn't appear to quiesce, xsleep() does
that internally, so the thread is quiescent for the whole sleep.  Since it
still isn't quiescing, I guess check_link_status() must be hanging.
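
To make the reasoning concrete, here is a rough sketch of the pattern, not
the actual OVS source.  Every sketch_* name is a hypothetical stand-in
(the real hooks are ovsrcu_quiesce_start() and ovsrcu_quiesce_end(), as far
as I recall), and the flag and counter exist only so the behavior can be
observed:

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-ins for the RCU quiesce hooks.  They only record
 * state so that the pattern can be observed; they are not OVS code. */
static bool sketch_quiescent = false;
static unsigned int sketch_quiesce_periods = 0;

static void
sketch_quiesce_start(void)
{
    sketch_quiescent = true;
    sketch_quiesce_periods++;
}

static void
sketch_quiesce_end(void)
{
    sketch_quiescent = false;
}

/* Sleep wrapper in the style of xsleep(): the thread is marked quiescent
 * for the whole sleep, so RCU synchronization need not wait for it. */
static void
sketch_xsleep(unsigned int seconds)
{
    sketch_quiesce_start();
    sleep(seconds);
    sketch_quiesce_end();
}

/* Hypothetical check callback that returns promptly. */
static void
sketch_check_ok(void)
{
}

/* One pass of a watchdog-style loop.  If check() never returns, the
 * quiescent sleep is never reached and other threads waiting for this
 * one to quiesce stay blocked. */
static void
sketch_watchdog_pass(void (*check)(void), unsigned int interval)
{
    check();
    sketch_xsleep(interval);
}
```

With a callback that returns, each pass goes through exactly one quiescent
period.  With a callback that hangs, the pass never gets that far, which
would match the "blocked ... waiting for dpdk_watchdog3 to quiesce" warning.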
