[ovs-git] [openvswitch/ovs] 2110d0: Make pid_exists() more robust against empty pid ar...

Michele Baldessari noreply at github.com
Thu Aug 29 14:41:56 UTC 2019


  Branch: refs/heads/branch-2.7
  Home:   https://github.com/openvswitch/ovs
  Commit: 2110d00f3a1d52a49522854276744f981628a268
      https://github.com/openvswitch/ovs/commit/2110d00f3a1d52a49522854276744f981628a268
  Author: Michele Baldessari <michele at acksyn.org>
  Date:   2019-08-29 (Thu, 29 Aug 2019)

  Changed paths:
    M utilities/ovs-lib.in

  Log Message:
  -----------
  Make pid_exists() more robust against empty pid argument

In some of our destructive testing of ovn-dbs inside containers managed
by pacemaker we reached a situation where /var/run/openvswitch had
empty .pid files. The current code does not deal well with them
and pidfile_is_running() returns true in such a case and this confuses
the OCF resource agent.

- Before this change:
Inside a container run:
  killall ovsdb-server;
  echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > /var/run/openvswitch/ovnsb_db.pid

We will observe that the cluster is unable to ever recover because
it believes the ovn processes to be running when they really aren't and
eventually just fails:
 podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master controller-0
   ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Stopped controller-1
   ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave controller-2

Let's make sure pid_exists() returns false when the pid is an empty
string.

- After this change the cluster is able to recover from this state and
correctly start the resource:
 podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master controller-0
   ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Slave controller-1
   ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave controller-2

Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.")

Signed-off-by: Michele Baldessari <michele at acksyn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>




More information about the git mailing list