[ovs-discuss] systemd ovs-vswitchd starts too early
fbl at redhat.com
Thu Feb 4 02:22:22 UTC 2016
On Wed, 3 Feb 2016 20:15:11 -0500
Mark Mielke <mark.mielke at gmail.com> wrote:
> > On Tue, 2 Feb 2016 02:06:43 -0500 Mark Mielke
> > <mark.mielke at gmail.com> wrote:
> >> I think it is not a no-op, at least on Fedora. Now that you have
> >> explained the expectation a little bit more, I think I am seeing
> >> that this is probably a race condition. Mainly, I think that on my
> >> hardware, with "openvswitch.service" enabled, systemd sees that
> >> "openvswitch-nonetwork.service" and "network.service" should both
> >> be started, so it starts "openvswitch-nonetwork.service" at
> >> the same time as "network.service", and "openvswitch" is just
> >> *faster* than "network.service" to try to access the network
> >> interfaces. Because "network.service" takes a while to
> >> start up and to begin the modprobe and other activities that
> >> activate the network interfaces, "openvswitch" gets ahead of it and
> >> encounters several failures.
> On Wed, Feb 3, 2016 at 3:09 PM, Flavio Leitner <fbl at redhat.com> wrote:
> > OK, the issue seems to be that network.service is a SysV
> > script which gets translated to a systemd service, but it doesn't have
> > any relation to network.target.
> > Could you please try this patch? It will require
> > openvswitch.service to run after network.service, which was the
> > initial intention.
> > diff --git a/rhel/usr_lib_systemd_system_openvswitch.service b/rhel/usr_lib_systemd_system_openvswitch.service
> > index f0bc16f..a391dfe 100644
> > --- a/rhel/usr_lib_systemd_system_openvswitch.service
> > +++ b/rhel/usr_lib_systemd_system_openvswitch.service
> > @@ -1,6 +1,6 @@
> >  [Unit]
> >  Description=Open vSwitch
> > -After=syslog.target network.target openvswitch-nonetwork.service
> > +After=syslog.target network.target openvswitch-nonetwork.service network.service
> >  Requires=openvswitch-nonetwork.service
> >  [Service]
> I updated the /usr/lib/systemd/system/openvswitch.service in place,
> and it appears to have no effect. The network.service may be a SysV
> script, but it does define:
> ### BEGIN INIT INFO
> # Provides: $network
> ### END INIT INFO
> I believe this is a hint to systemd to do the right thing, and that
> this helps it align with network.target which has:
> I think this has no effect, because although it causes
> openvswitch.service to be delayed until after network.service (which
> is actually less restricted than after network.target?), it does not
> prevent openvswitch-nonetwork.service from starting too early. The
> only constraint on openvswitch-nonetwork.service is:
> As soon as syslog.target is achieved, openvswitch-nonetwork.service
> may start. With a fast enough machine, and a slow enough
> network.service-initiated physical interface initialization,
> openvswitch-nonetwork.service will start before the physical
> interfaces are ready, which results in the problems we are discussing.
> If this is a correct summary of the situation, then it makes me think
> that openvswitch-nonetwork.service isn't really working as
> intended, at least not with network.service? Or, it only works as
> intended if your use case does not include openvswitch managing your
> physical network interfaces?
> I'm getting a headache... :-)
It should _not_ prevent openvswitch-nonetwork.service from starting.
Actually, that is the reason for its name. :-)
The openvswitch-nonetwork service is meant to be started on demand. So
either openvswitch.service starts it at the appropriate time, or
ifup-ovs does when configuring an OVS port, because 'network.service'
is running and processing all the ifcfg- files.
We need to split the issues. One is that enabling openvswitch.service
causes it to run too soon. I think the proposed patch should resolve
that (check journalctl -xe). The other is having stale interfaces in
the DB when openvswitch-nonetwork.service is started by ifup-ovs.
That causes the interfaces to appear before they are actually
available. My patch doesn't fix the second one.
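To check whether the new ordering was really picked up for the first
issue, something like the sketch below might help. It relies only on
"systemctl show -p After"; the ordered_after name and the SYSTEMCTL
override are made up for illustration. Keep in mind that systemd needs
a "systemctl daemon-reload" after a unit file is edited in place.

```shell
# Sketch only. ordered_after and SYSTEMCTL are illustrative names, not
# part of any OVS or systemd package.
# Succeeds if unit $2 appears in the After= list of unit $1.
ordered_after() {
    ${SYSTEMCTL:-systemctl} show -p After "$1" \
        | tr ' =' '\n\n' \
        | grep -xF "$2" >/dev/null
}

# Example use after editing the unit file:
#   systemctl daemon-reload
#   ordered_after openvswitch.service network.service && echo "ordering loaded"
```

If the function fails, the After= dependency is not in the unit as
systemd currently has it loaded, and the journal will not tell you much
about the patch yet.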
The only way to fix the second issue is to clean up what needs to be
cleaned up when the ifcfg- file is being processed. That's what we did
for OVSPort, but it is missing for OVSBond. So, when configuring the
bond interface, it should first delete the bond if it exists, then
start from scratch with the parameters in the ifcfg- file. Same for the
physical ports attached to the bond.
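
As a rough, untested sketch of that cleanup for the bond case: drop any
stale records first, then add the bond back in a single ovs-vsctl
transaction. recreate_bond and the OVS_VSCTL override are hypothetical
names, and deleting the members as standalone ports is only my reading
of "same for the physical ports". The real ifup-ovs would also pass the
options from the ifcfg- file, which are left out here.

```shell
# Sketch only. recreate_bond and OVS_VSCTL are illustrative names.
# Usage: recreate_bond BRIDGE BOND IFACE...
recreate_bond() {
    bridge=$1; bond=$2; shift 2
    # Members may still be in the DB as ports of their own; drop those.
    for iface in "$@"; do
        ${OVS_VSCTL:-ovs-vsctl} --if-exists del-port "$bridge" "$iface"
    done
    # Delete the stale bond (if any) and re-create it from scratch,
    # both in the same transaction.
    ${OVS_VSCTL:-ovs-vsctl} -- --if-exists del-port "$bridge" "$bond" \
        -- add-bond "$bridge" "$bond" "$@"
}

# Example: recreate_bond ovsbr0 bond0 eth0 eth1
```

Using --if-exists keeps the delete harmless on a clean DB, so the same
path works for the first boot and for a restart with leftovers.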