[ovs-dev] [PATCH] Make pidfile_is_running more robust against empty pidfiles

Ilya Maximets i.maximets at samsung.com
Tue Aug 20 09:22:24 UTC 2019


On 20.08.2019 12:16, Ilya Maximets wrote:
> On 20.08.2019 11:48, Numan Siddique wrote:
>>
>>
>> On Wed, Aug 14, 2019 at 9:21 PM Michele Baldessari <michele at acksyn.org> wrote:
>>
>>     On Wed, Aug 14, 2019 at 02:28:13PM +0300, Ilya Maximets wrote:
>>     > On 14.08.2019 11:39, Michele Baldessari wrote:
>>     > > In some of our destructive testing of ovn-dbs inside containers managed
>>     > > by pacemaker, we reached a situation where /var/run/openvswitch contained
>>     > > empty .pid files. The current code does not handle them well:
>>     > > pidfile_is_running() returns true in that case, which confuses
>>     > > the OCF resource agent.
>>     > >
>>     > > - Before this change:
>>     > > Inside a container run:
>>     > >   killall ovsdb-server;
>>     > >   echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > /var/run/openvswitch/ovnsb_db.pid
>>     > >
>>     > > We will observe that the cluster is unable to ever recover because
>>     > > it believes the ovn processes to be running when they really aren't and
>>     > > eventually just fails:
>>     > >  podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
>>     > >    ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master controller-0
>>     > >    ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Stopped controller-1
>>     > >    ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave controller-2
>>     > >
>>     > > - After this change the cluster is able to recover from this state and
>>     > > correctly start the resource:
>>     > >  podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
>>     > >    ovn-dbs-bundle-0     (ocf::ovn:ovndb-servers):       Master controller-0
>>     > >    ovn-dbs-bundle-1     (ocf::ovn:ovndb-servers):       Slave controller-1
>>     > >    ovn-dbs-bundle-2     (ocf::ovn:ovndb-servers):       Slave controller-2
>>     > >
>>     > > Signed-off-by: Michele Baldessari <michele at acksyn.org>
>>     > > ---
>>     > >  ovn/utilities/ovn-ctl | 2 +-
>>     > >  1 file changed, 1 insertion(+), 1 deletion(-)
>>     > >
>>     > > diff --git a/ovn/utilities/ovn-ctl b/ovn/utilities/ovn-ctl
>>     > > index 7e5cd469c83c..65f03e28ddba 100755
>>     > > --- a/ovn/utilities/ovn-ctl
>>     > > +++ b/ovn/utilities/ovn-ctl
>>     > > @@ -35,7 +35,7 @@ ovn_northd_db_conf_file="$etcdir/ovn-northd-db-params.conf"
>>     > > 
>>     > >  pidfile_is_running () {
>>     > >      pidfile=$1
>>     > > -    test -e "$pidfile" && pid=`cat "$pidfile"` && pid_exists "$pid"
>>     > > +    test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && pid_exists "$pid"
>>     >
>>     > Hi. Thanks for the fix!
>>     >
>>     > Maybe it's better to add an additional check for an empty argument to the
>>     > 'pid_exists' function instead? This will cover more cases, like invocations
>>     > from utilities/ovs-lib.in.
>>     >
>>     > I think you may also add the following tag to the commit message in this case:
>>     > Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.")
>>     >
>>     > This patch will also be needed in the ovn-org/ovn repository.
>>     > (Use the 'PATCH ovn' subject prefix when sending patches targeted at the ovn repo.)
>>     >
>>     > Best regards, Ilya Maximets.
>>
>>     Thanks for the feedback Ilya, I have amended things (hopefully correctly) in
>>     http://patchwork.ozlabs.org/patch/1147111/ (I could not figure out how
>>     to update an existing patch in patchwork, I hope this is okay)
>>
>>
>> Hi Michele and Ilya,
>>
>> I applied this fix to the OVN repo. It's possible that the fix addressing this issue in ovs-lib.in could
>> be missing in some deployments if an older OVS version is used. I thought there is no harm in having
>> the fixes in both ovn-ctl and ovs-lib.in.
>>
> 
> Hi Numan,
> 
> There is already a v2 of this patch (slightly renamed):
> OVS: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361678.html
> OVN: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361679.html

Sorry, maybe I misunderstood what you wanted to do.
Are you suggesting applying v1 to the OVN repo and v2 to the OVS repo?
What about applying v2 to the OVN repo as well?

> 
> Best regards, Ilya Maximets.
> 
> 

