[ovs-discuss] systemd ovs-vswitchd starts too early
fbl at redhat.com
Wed Feb 3 20:09:05 UTC 2016
On Tue, 2 Feb 2016 02:06:43 -0500
Mark Mielke <mark.mielke at gmail.com> wrote:
> On Mon, Feb 1, 2016 at 1:56 PM, Flavio Leitner <fbl at redhat.com> wrote:
> > On Mon, 1 Feb 2016 07:54:46 -0800
> > Guru Shetty <guru at ovn.org> wrote:
> > > On 31 January 2016 at 14:47, Mark Mielke <mark.mielke at gmail.com>
> > > wrote:
> > > > This is now working... By *not* enabling openvswitch.service,
> > > > and letting ifup-ovs start up openvswitch on demand, the system
> > > > is coming up reliably whether clean shutdown or force reset (I
> > > > want the server to be crash-safe, so I explicitly test this
> > > > case).... But, I'm now concerned about the direction of Fedora
> > > > and openvswitch-nonetwork.service, and I am wondering if my
> > > > work-around of not enabling openvswitch.service makes sense,
> > > > and is part of the design of ifup-ovs that will be supported
> > > > going forwards, or is just lucky that it works, and this could
> > > > break with a future openvswitch update, or a future version of
> > > > Fedora?
> > Since the migration to systemd, it has been designed to start the
> > service on demand, and that can't change anymore.
> > You are supposed to be able to enable the 'openvswitch' service, and
> > that should make no difference in your setup since the interfaces are
> > brought up by 'network.target' which 'openvswitch' runs after. By
> > that time, all interfaces are up including the OVS ones which
> > started the service on demand. In summary, enabling openvswitch
> > should be a no-op.
> I think it is not a no-op, at least on Fedora. Now that you have
> explained the expectation a little bit more, I think I am seeing that
> this is probably a race condition. Mainly, I think that on my
> hardware, with "openvswitch.service" enabled, systemd sees that
> "openvswitch-nonetwork.service" and "network.service" should both be
> started; "openvswitch-nonetwork.service" is being started at the same
> time as "network.service", and "openvswitch" is just *faster* than
> "network.service" to try to access the network interfaces. Because
> "network.service" takes a while to start up and to begin the
> modprobe and other activities that activate the network interface,
> "openvswitch" gets ahead of it and encounters several failures.
OK, the issue seems to be that network.service is a SysV script
which gets translated into a systemd service, but it doesn't have
any relation to network.target.
Could you please try this patch? It orders openvswitch.service
after network.service, which was the initial intention.
diff --git a/rhel/usr_lib_systemd_system_openvswitch.service b/rhel/usr_lib_systemd_system_openvswitch.service
index f0bc16f..a391dfe 100644
@@ -1,6 +1,6 @@
-After=syslog.target network.target openvswitch-nonetwork.service
+After=syslog.target network.target openvswitch-nonetwork.service network.service
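
To test the same ordering without rebuilding the package, a systemd
drop-in should be roughly equivalent. This is only a sketch: the file
name "after-network.conf" is an arbitrary choice and not part of the
patch, and it assumes the unit is installed as openvswitch.service.
After= is a list setting, so the drop-in adds to the existing After=
line instead of replacing it.

```ini
# /etc/systemd/system/openvswitch.service.d/after-network.conf
# Hypothetical drop-in (file name is an assumption, not from the patch).
# Orders openvswitch.service after the SysV-generated network.service.
[Unit]
After=network.service
```

After creating the file, "systemctl daemon-reload" picks it up, and
"systemctl show -p After openvswitch.service" should then list
network.service among the units it is ordered after.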
> With the second commit you referred to, it may be able to work around
> this problem, but I think there is still a race problem here that
> should be discussed further?
> Thanks for considering this.