[ovs-discuss] Restarting network kills ovs-vswitchd (and network)... ?

SCHAER Frederic frederic.schaer at cea.fr
Thu May 16 09:34:28 UTC 2019

I'm facing an issue with openvswitch, which I think is new (though I'm not even sure).
Here is the description:

* What you did that make the problem appear.

I am configuring openstack (compute, network) nodes using OVS networks for main interfaces and RHEL network scripts, basically using openvswitch to create bridges, set the bridges IPs, and include the real Ethernet devices in the bridges.
On a compute machine (not in production, so not using 3 or more interfaces), I have for instance brflat -> em1.
Brflat has multiple IPs defined using IPADDR1, IPADDR2, etc..
Now: at boot, the machine has network. But if I ever change anything in the network scripts and issue a network restart, an ifup or an ifdown, the network breaks and connectivity is lost.

Also, on network restarts, I'm getting these logs in the network journal :
May 16 10:26:41 cloud1 ovs-vsctl[1766678]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-br brflat
May 16 10:26:51 cloud1 ovs-vsctl[1766678]: ovs|00002|fatal_signal|WARN|terminating with signal 14 (Alarm clock)
May 16 10:26:51 cloud1 network[1766482]: Bringing up interface brflat:  2019-05-16T08:26:51Z|00002|fatal_signal|WARN|terminating with signal 14 (Alarm clock)

* What you expected to happen.

On network restart... to get back a working network. Not be forced to log in using ipmi console and fix network manually.

* What actually happened.

What actually happens is that on ifup/ifdown/network restart, the ovs-vswitchd daemon stops working. According to systemctl, it is actually exiting with code 0.
If I do an ifdown on one interface, then ovs-vswitchd is down.
After ovs-vswitchd restart, I then can ifup that interface : network is still down (no ping, nothing).
Ovs-vswitchd is again dead/stopped/exited 0.
Then : manually starting ovs-vswitchd restores connectivity.

Please also include the following information:
* The Open vSwitch version number (as output by ovs-vswitchd --version).
ovs-vswitchd (Open vSwitch) 2.10.1

The following are also handy sometimes:
* The kernel version on which Open vSwitch is running (from /proc/version) and the distribution and version number of your OS (e.g. "Centos 5.0").
# cat /proc/version
Linux version 3.10.0-957.12.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Mon Apr 29 14:59:59 UTC 2019

(CentOS 7.1810 host)

* The contents of the vswitchd configuration database (usually /etc/openvswitch/conf.db).
Attaching file with redacted IPs. Quite a big file...

* The output of ovs-dpctl show.
2019-05-16T09:01:08Z|00001|dpif_netlink|INFO|The kernel module does not support meters.
system@ovs-system:
  lookups: hit:31178159 missed:7298072 lost:13985
  flows: 816
  masks: hit:259641852 total:15 hit/pkt:6.75
  port 0: ovs-system (internal)
  port 1: br-int (internal)
  port 2: gre_sys (gre: packet_type=ptap)
  port 3: br-tun (internal)
  port 4: brflat (internal)
  port 5: em1
  port 6: qvoe59e1a9b-54
  port 7: qvoaf90276f-fa

* If you have Open vSwitch configured to connect to an OpenFlow controller, the output of ovs-ofctl show <bridge> for each <bridge> configured in the vswitchd configuration database.
I haven't done that

* A fix or workaround, if you have one.
Manually restart ovs-vswitchd after any ifup/ifdown, but that really is not a workaround.
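
A minimal sketch of that manual stopgap, assuming systemd manages a unit named ovs-vswitchd as on this host. The function name ifup_with_vswitchd is hypothetical; it only automates the restart described above and does not address why the daemon exits.

```shell
# Hypothetical helper: bring an interface up, then check that ovs-vswitchd
# is still active and start it again if it is not.
# Assumption: the systemd unit is named ovs-vswitchd.
ifup_with_vswitchd() {
    iface="$1"
    ifup "$iface"
    # "systemctl is-active --quiet" returns non-zero when the unit is not active
    if ! systemctl is-active --quiet ovs-vswitchd; then
        systemctl start ovs-vswitchd
    fi
}
```

Usage would be along the lines of: ifup_with_vswitchd em1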

* Any other information that you think might be relevant.

I have tried to set debug on ovs-vswitchd using this command : ovs-appctl vlog/set ANY:file:DBG
I actually ran :

#interface member of brflat
ifdown em1
systemctl start ovs-vswitchd
ovs-appctl vlog/set ANY:file:DBG
ifup em1
#ovs-vswitchd exits/dies => logs have a hole
#and again :
systemctl start ovs-vswitchd

I am attaching the logfile.
These are the log lines when vswitchd "stops" (why ?) and before I start it again (it loses the debug mode) :

#not dead yet, but no goodbye and no DBG after this.
2019-05-16T09:08:49.236Z|00242|poll_loop(urcu9)|DBG|wakeup due to [POLLIN] on fd 45 (FIFO pipe:[8645312]) at lib/ovs-rcu.c:235 (0% CPU usage)
#the 2nd vswitchd start
2019-05-16T09:09:37.300Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
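
One way to locate such a hole in a large log is to look for consecutive entries whose timestamps are far apart. The sketch below assumes every log entry starts with a timestamp of the form YYYY-MM-DDTHH:MM:SS.mmmZ followed by "|", as in the two lines above; the function name find_log_gaps and the 10-second default threshold are hypothetical choices.

```shell
# Hypothetical helper: report consecutive log entries that are more than
# MAX seconds apart.  Usage: find_log_gaps [MAX [FILE...]]
find_log_gaps() {
    max="${1:-10}"
    if [ "$#" -gt 0 ]; then shift; fi
    awk -F'|' -v max="$max" '
    # Convert "YYYY-MM-DDTHH:MM:SS.mmmZ" to a count of seconds.
    # Only differences are used, so the day count has an arbitrary origin.
    function to_secs(ts,    y, m, d, h, mi, s, days) {
        y = substr(ts, 1, 4) + 0;  m = substr(ts, 6, 2) + 0
        d = substr(ts, 9, 2) + 0;  h = substr(ts, 12, 2) + 0
        mi = substr(ts, 15, 2) + 0; s = substr(ts, 18, 6) + 0
        # Treat January and February as months 13 and 14 of the previous year
        if (m <= 2) { y -= 1; m += 12 }
        days = 365 * y + int(y / 4) - int(y / 100) + int(y / 400) \
             + int((153 * (m - 3) + 2) / 5) + d
        return days * 86400 + h * 3600 + mi * 60 + s
    }
    # Skip anything that does not start with a timestamp (e.g. "#" notes)
    $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ {
        now = to_secs($1)
        if (seen && now - prev > max)
            printf "%.3f s gap: %s -> %s\n", now - prev, prev_ts, $1
        prev = now; prev_ts = $1; seen = 1
    }' "$@"
}
```

For example: find_log_gaps 10 /var/log/openvswitch/ovs-vswitchd.log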

Note : my interfaces configuration files :
[root@cloud1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-brflat
# File Managed by Puppet
[root@cloud1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-em1
# File Managed by Puppet

===> Am I doing something wrong? It seems the CentOS repos have openvswitch 2.11, but only for openstack stein. I haven't had time yet to try an upgrade, so I can "play" with a node and upgrade only openvswitch using the stein version, but the CentOS version is not the latest 2.11.1, "only" 2.11.0.

Thank you and best regards
Frederic Schaer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ovs.conf.db
Type: application/octet-stream
Size: 65295 bytes
Desc: ovs.conf.db
URL: <http://mail.openvswitch.org/pipermail/ovs-discuss/attachments/20190516/6e522744/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ovs-vswitchd.log
Type: application/octet-stream
Size: 270526 bytes
Desc: ovs-vswitchd.log
URL: <http://mail.openvswitch.org/pipermail/ovs-discuss/attachments/20190516/6e522744/attachment-0003.obj>
