[ovs-dev] [PATCH] ovn pacemaker: Provide the option to configure inactivity probe value

Miguel Angel Ajo Pelayo majopela at redhat.com
Tue Oct 17 15:26:18 UTC 2017


Acked-By: Miguel Angel Ajo <majopela at redhat.com>

It makes sense to be able to configure the inactivity probe time. I agree
with Ben that disabling the echo requests on the server would also make
sense, in a future patch.
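
To illustrate why a configurable probe interval helps here, below is a
minimal sketch of the behaviour described in the quoted thread. It is a
toy model, not ovsdb-server's actual logic: the Client class, the
dropped_clients helper, and the reply delays are all hypothetical, and
treating an interval of 0 as "probing disabled" is an assumption of this
model.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    # Hypothetical: seconds a busy worker takes to answer an echo request.
    echo_reply_delay_s: float


def dropped_clients(clients, probe_interval_s):
    """Return the names of clients dropped under this toy model.

    A client is dropped when its echo reply takes longer than the probe
    interval. An interval of 0 is treated as "probing disabled".
    """
    if probe_interval_s == 0:
        return []
    return [c.name for c in clients if c.echo_reply_delay_s > probe_interval_s]


# Made-up delays for three loaded workers.
workers = [
    Client("worker-1", 0.2),
    Client("worker-2", 7.5),
    Client("worker-3", 12.0),
]

print(dropped_clients(workers, 5))   # 5 second probe, as in the thread
print(dropped_clients(workers, 60))  # a longer, configured probe
print(dropped_clients(workers, 0))   # probing disabled
```

With the made-up delays above, only the slow workers are dropped at a 5
second interval, and none are dropped once the interval is raised or
probing is turned off.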

On Mon, Oct 16, 2017 at 9:48 PM, Ben Pfaff <blp at ovn.org> wrote:

> On Mon, Oct 16, 2017 at 10:58:43AM -0700, Ben Pfaff wrote:
> > On Mon, Oct 16, 2017 at 02:50:48PM +0530, Numan Siddique wrote:
> > > On Sat, Oct 14, 2017 at 2:56 AM, Ben Pfaff <blp at ovn.org> wrote:
> > >
> > > > On Fri, Oct 13, 2017 at 12:06:56PM -0400, Russell Bryant wrote:
> > > > > > On Fri, Oct 13, 2017 at 8:30 AM, Numan Siddique <nusiddiq at redhat.com> wrote:
> > > > > > > On Fri, Oct 13, 2017 at 6:05 AM, Andy Zhou <azhou at ovn.org> wrote:
> > > > > >
> > > > > >> Hi, Numan,
> > > > > >>
> > > > > > >> I am curious why the default 5-second inactivity time does not
> > > > > > >> work. Do you have more details?
> > > > > > >>
> > > > > > >> Does the glitch usually happen around the HA switchover? If this
> > > > > > >> happens during normal operation, then it is not an HA-specific
> > > > > > >> issue but an indication of some connectivity issue.
> > > > > >>
> > > > > >
> > > > > > > Hi Andy. This happens in an OpenStack deployment when the
> > > > > > > neutron-server is busy handling lots of API requests.
> > > > > > > Normally the deployment has 3 controller nodes, with
> > > > > > > neutron-server running on each node. On each controller node,
> > > > > > > neutron-server starts around 10 - 12 neutron workers (which are
> > > > > > > separate processes). The number of API workers is a configuration
> > > > > > > option; if it is not configured, the number of neutron workers
> > > > > > > normally equals the number of cores.
> > > > > >
> > > > > > > I have tested both a physical-node deployment and a virtual
> > > > > > > deployment (3 controllers running as VMs on one node). The neutron
> > > > > > > workers together open around 40 connections to the OVN north
> > > > > > > ovsdb-server in the physical deployment and around 15 in the
> > > > > > > virtual deployment. When neutron-server is loaded with many API
> > > > > > > requests, I have noticed that ovsdb-server drops the connections
> > > > > > > when it doesn't get the echo reply every 5 seconds. This leads to
> > > > > > > a lot of reconnections to the ovsdb-server, and the response from
> > > > > > > the neutron-server is very slow. With this patch it seems to work
> > > > > > > fine.
> > > > > >
> > > > > > > The issue is not caused by any network problem but by the large
> > > > > > > number of connections from the neutron-server workers to the
> > > > > > > ovsdb-server and the failure of the idl clients to reply to the
> > > > > > > echo request every 5 seconds when the neutron-server is loaded.
> > > > >
> > > > > > We have had to disable the inactivity probe everywhere each time
> > > > > > we have done performance testing so far.
> > > >
> > > > > Really this seems like a bug (or inadequacy) in ovsdb-server. It's
> > > > > pretty sad that ovsdb-server can't reply within 5 seconds.
> > >
> > >
> > > > It's actually the ovsdb Python idl client which is not able to reply
> > > > within 5 seconds to the echo request from ovsdb-server.
> >
> > Oh, I'm surprised that ovsdb-server is doing the echo-requests, I
> > thought that we generally did them from the client end.
>
> One perfectly acceptable approach might be to simply disable
> echo-requests on the server side entirely and do them from the client.