[ovs-dev] [PATCH] ovn pacemaker: Provide the option to configure inactivity probe value

Numan Siddique nusiddiq at redhat.com
Mon Oct 16 08:32:54 UTC 2017


On Fri, Oct 13, 2017 at 9:36 PM, Russell Bryant <russell at ovn.org> wrote:

> On Fri, Oct 13, 2017 at 8:30 AM, Numan Siddique <nusiddiq at redhat.com>
> wrote:
> > On Fri, Oct 13, 2017 at 6:05 AM, Andy Zhou <azhou at ovn.org> wrote:
> >
> >> Hi, Numan,
> >>
> >> I am curious why the default 5-second inactivity time does not work. Do
> >> you have more details?
> >>
> >> Does the glitch usually happen around the HA switchover?  If this
> >> happens during normal operation, then this is not an HA-specific issue
> >> but an indication of some connectivity issues.
> >>
> >
> > Hi Andy. This happens in the OpenStack deployment when the
> > neutron-server is busy handling lots of API requests.
> > Normally the deployment has 3 controller nodes, with neutron-server
> > running on each node.  On each controller node, neutron-server starts
> > around 10 - 12 neutron workers (which are separate processes).  The
> > number of API workers is a configuration option; if not configured, it
> > normally equals the number of cores.
> >
> > I have tested in both a physical-node deployment and a virtual
> > deployment (3 controllers running as VMs on one node). Around 40
> > connections are opened to the OVN north ovsdb-server by all the neutron
> > workers in the physical deployment, and around 15 connections are opened
> > in the virtual deployment. When neutron-server is loaded with many API
> > requests, I have noticed that ovsdb-server drops the connections when it
> > doesn't get the echo reply every 5 seconds. This leads to a lot of
> > reconnections to the ovsdb-server, and responses from the neutron-server
> > become very slow.  With this patch it seems to work fine.
> >
> > The issue is not caused by any network problem but by the large number
> > of connections from the neutron-server workers to the ovsdb-server, and
> > by the failure of the idl clients to reply to the echo request every 5
> > seconds when the neutron-server is loaded.
>
> We have had to disable the inactivity probe everywhere each time we have
> done performance testing so far.
>
> > I can make the patch provide a configuration option to override the
> > inactivity probe value so that it doesn't affect others who use the OVN
> > OCF pacemaker script.
> >
> > Let me know your comments.
>
> I think the default through this script should match the normal
> default.  It looks like it defaults to 60s in this patch instead of
> 5s?  I would make it match.


Ack. Will do that in the next patch.
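
As a rough sketch of what matching the normal default might look like (an
illustration only, not the actual follow-up patch): the default becomes
5000 ms, and `resolve_inactive_probe` is a hypothetical helper, not something
that exists in ovndb-servers.ocf, which falls back to the default when the
OCF parameter is unset or is not a plain number.

```shell
#!/bin/sh
# Sketch only: default the probe interval to 5000 ms (the 5 second default
# discussed in this thread) unless the OCF parameter overrides it.
: ${INACTIVE_PROBE_DEFAULT="5000"}

# Hypothetical helper (assumption, not part of the patch): print the
# inactivity probe interval in milliseconds that the script would use.
resolve_inactive_probe() {
    probe=${OCF_RESKEY_inactive_probe_interval:-${INACTIVE_PROBE_DEFAULT}}
    case "$probe" in
        ''|*[!0-9]*)
            # Empty or non-numeric value: fall back to the default.
            probe=${INACTIVE_PROBE_DEFAULT}
            ;;
    esac
    echo "$probe"
}
```

With OCF_RESKEY_inactive_probe_interval unset the helper prints 5000; with it
set to 60000 it prints 60000, so OpenStack deployments could still raise the
value without changing the default for other users of the script.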

Thanks


> I do like exposing the ability to change
> it, though.  We could consider setting a different default through our
> OpenStack work.
>
> >
> > Thanks
> > Numan
> >
> >
> >>
> >> On Thu, Oct 12, 2017 at 11:08 AM, Andy Zhou <azhou at ovn.org> wrote:
> >> > Sure, I will take a look.
> >> >
> >> > On Thu, Oct 12, 2017 at 10:49 AM, Ben Pfaff <blp at ovn.org> wrote:
> >> >> Hi Andy.  In the IRC meeting today, Numan suggested that you might be
> >> >> an appropriate reviewer for this patch, so if you agree and you have a
> >> >> chance to look at this then it would be appreciated.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Ben.
> >> >>
> >> >> On Wed, Oct 11, 2017 at 02:22:33PM +0530, nusiddiq at redhat.com wrote:
> >> >>> From: Numan Siddique <nusiddiq at redhat.com>
> >> >>>
> >> >>> In the case of OVN HA deployments with openstack, it has been noticed
> >> >>> that the 5 seconds inactivity probe interval is not enough and
> >> >>> ovsdb-servers time out.
> >> >>> This patch
> >> >>>    - provides an option to configure this value.
> >> >>>    - creates a connection row in NB/SB dbs and sets the target and
> >> >>>      inactivity_probe values when the node is promoted to master.
> >> >>>
> >> >>> CC: Andy Zhou <azhou at ovn.org>
> >> >>> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
> >> >>> ---
> >> >>>  ovn/utilities/ovndb-servers.ocf | 27 +++++++++++++++++++++++++++
> >> >>>  1 file changed, 27 insertions(+)
> >> >>>
> >> >>> diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
> >> >>> index fe1207c22..92620af6a 100755
> >> >>> --- a/ovn/utilities/ovndb-servers.ocf
> >> >>> +++ b/ovn/utilities/ovndb-servers.ocf
> >> >>> @@ -8,6 +8,8 @@
> >> >>>  : ${SB_MASTER_PORT_DEFAULT="6642"}
> >> >>>  : ${SB_MASTER_PROTO_DEFAULT="tcp"}
> >> >>>  : ${MANAGE_NORTHD_DEFAULT="no"}
> >> >>> +: ${INACTIVE_PROBE_DEFAULT="60000"}
> >> >>> +
> >> >>>  CRM_MASTER="${HA_SBIN_DIR}/crm_master -l reboot"
> >> >>>  CRM_ATTR_REPL_INFO="${HA_SBIN_DIR}/crm_attribute --type crm_config --name OVN_REPL_INFO -s ovn_ovsdb_master_server"
> >> >>>  OVN_CTL=${OCF_RESKEY_ovn_ctl:-${OVN_CTL_DEFAULT}}
> >> >>> @@ -17,6 +19,7 @@ NB_MASTER_PROTO=${OCF_RESKEY_nb_master_protocol:-${NB_MASTER_PROTO_DEFAULT}}
> >> >>>  SB_MASTER_PORT=${OCF_RESKEY_sb_master_port:-${SB_MASTER_PORT_DEFAULT}}
> >> >>>  SB_MASTER_PROTO=${OCF_RESKEY_sb_master_protocol:-${SB_MASTER_PROTO_DEFAULT}}
> >> >>>  MANAGE_NORTHD=${OCF_RESKEY_manage_northd:-${MANAGE_NORTHD_DEFAULT}}
> >> >>> +INACTIVE_PROBE=${OCF_RESKEY_inactive_probe_interval:-${INACTIVE_PROBE_DEFAULT}}
> >> >>>
> >> >>>  # Invalid IP address is an address that can never exist in the network, as
> >> >>>  # mentioned in rfc-5737. The ovsdb servers connects to this IP address till
> >> >>> @@ -101,6 +104,14 @@ ovsdb_server_metadata() {
> >> >>>    <content type="string" />
> >> >>>    </parameter>
> >> >>>
> >> >>> +  <parameter name="inactive_probe_interval" unique="1">
> >> >>> +  <longdesc lang="en">
> >> >>> +  Inactive probe interval to set for ovsdb-server.
> >> >>> +  </longdesc>
> >> >>> +  <shortdesc lang="en">Set inactive probe interval</shortdesc>
> >> >>> +  <content type="string" />
> >> >>> +  </parameter>
> >> >>> +
> >> >>>    </parameters>
> >> >>>
> >> >>>    <actions>
> >> >>> @@ -138,6 +149,22 @@ ovsdb_server_notify() {
> >> >>>              ${OVN_CTL} --ovn-manage-ovsdb=no start_northd
> >> >>>          fi
> >> >>>
> >> >>> +        conn=`ovn-nbctl get NB_global . connections`
> >> >>> +        if [ "$conn" == "[]" ]
> >> >>> +        then
> >> >>> +            ovn-nbctl -- --id=@conn_uuid create Connection \
> >> >>> +target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${MASTER_IP}" \
> >> >>> +inactivity_probe=$INACTIVE_PROBE -- set NB_Global . connections=@conn_uuid
> >> >>> +        fi
> >> >>> +
> >> >>> +        conn=`ovn-sbctl get SB_global . connections`
> >> >>> +        if [ "$conn" == "[]" ]
> >> >>> +        then
> >> >>> +            ovn-sbctl -- --id=@conn_uuid create Connection \
> >> >>> +target="p${SB_MASTER_PROTO}\:${SB_MASTER_PORT}\:${MASTER_IP}" \
> >> >>> +inactivity_probe=$INACTIVE_PROBE -- set SB_Global . connections=@conn_uuid
> >> >>> +        fi
> >> >>> +
> >> >>>      else
> >> >>>          if [ "$MANAGE_NORTHD" = "yes" ]; then
> >> >>>              # Stop ovn-northd service. Set --ovn-manage-ovsdb=no so that
> >> >>> --
> >> >>> 2.13.5
> >> >>>
> >> >>> _______________________________________________
> >> >>> dev mailing list
> >> >>> dev at openvswitch.org
> >> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
>
>
>
> --
> Russell Bryant
>

