[ovs-dev] [PATCH] ovn pacemaker: Provide the option to configure inactivity probe value

Numan Siddique nusiddiq at redhat.com
Fri Oct 13 12:30:06 UTC 2017


On Fri, Oct 13, 2017 at 6:05 AM, Andy Zhou <azhou at ovn.org> wrote:

> Hi, Numan,
>
> I am curious why default 5 seconds inactivity time does not work? Do
> you have more details?
>
> Does the glitch usually happen around the HA switch over?  If this
> happens during normal operation,
> Then this is not HA specific issue, but an indication of some
> connectivity issues.
>

Hi Andy. This happens in OpenStack deployments when neutron-server is
busy handling lots of API requests.
Normally a deployment has 3 controller nodes, with neutron-server running
on each node.  On each controller node, neutron-server starts around
10 - 12 neutron workers (which are separate processes).  The number of API
workers is a configuration option; if it is not configured, the number of
neutron workers normally equals the number of cores.

I have tested in both a physical deployment and a virtual deployment (3
controllers running as VMs on one node). Together, the neutron workers open
around 40 connections to the OVN north ovsdb-server in the physical
deployment and around 15 connections in the virtual deployment.
When neutron-server is loaded with many API requests, I have noticed that
ovsdb-server drops a connection when it doesn't get an echo reply within
the 5 second probe interval. This leads to a lot of reconnections to the
ovsdb-server, and neutron-server becomes very slow to respond.  With this
patch it seems to work fine.

The issue is not caused by network problems. It is caused by the large
number of connections from the neutron-server workers to the ovsdb-server,
and by the IDL clients failing to reply to the echo request within 5
seconds when neutron-server is loaded.
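
For anyone who wants to check this on a running setup, here is a rough
sketch of how the probe interval on the NB Connection row can be inspected
and raised by hand. NBCTL and the two helper names are only for
illustration and are not part of the patch. It assumes the Connection table
has exactly one row; the value is in milliseconds.

```shell
#!/bin/sh
# Sketch only: NBCTL and the helper names are hypothetical. Wrapping
# ovn-nbctl in a variable allows a dry run with NBCTL=echo.
NBCTL=${NBCTL:-ovn-nbctl}

# Show the listener target and the probe interval (in milliseconds) of
# every row in the NB Connection table.
show_nb_probe() {
    $NBCTL --columns=target,inactivity_probe list Connection
}

# Set the probe interval, in milliseconds, on the Connection row.
# "." is assumed to pick the row because the table has exactly one.
set_nb_probe() {
    $NBCTL set Connection . inactivity_probe="$1"
}
```

For example, "set_nb_probe 60000" would raise the interval to the 60
seconds used as the default in the patch.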

I can change the patch so that the inactivity probe value is overridden
only through a configuration option. That way it doesn't affect others who
use the OVN OCF pacemaker script.
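
A minimal sketch of what I mean by opt-in, assuming the parameter keeps the
name inactive_probe_interval from the patch. probe_args is a hypothetical
helper, not something in the script today: it produces the column
assignment only when the operator has set the parameter, so an unconfigured
resource leaves the ovsdb-server default untouched.

```shell
#!/bin/sh
# Sketch only: OCF_RESKEY_inactive_probe_interval is the parameter name used
# in the patch; probe_args is a hypothetical helper.

# Print the inactivity_probe column assignment if the operator configured a
# value. Print nothing otherwise, so the caller does not set the column.
probe_args() {
    if [ -n "${OCF_RESKEY_inactive_probe_interval:-}" ]; then
        printf 'inactivity_probe=%s' "${OCF_RESKEY_inactive_probe_interval}"
    fi
}
```

The notify action could then append the output of probe_args to the
"create Connection" command only when it is non-empty.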

Let me know your comments.

Thanks
Numan


>
> On Thu, Oct 12, 2017 at 11:08 AM, Andy Zhou <azhou at ovn.org> wrote:
> > Sure, I will take a look.
> >
> > On Thu, Oct 12, 2017 at 10:49 AM, Ben Pfaff <blp at ovn.org> wrote:
> >> Hi Andy.  In the IRC meeting today, Numan suggested that you might be an
> >> appropriate reviewer for this patch, so if you agree and you have a
> >> chance to look at this then it would be appreciated.
> >>
> >> Thanks,
> >>
> >> Ben.
> >>
> >> On Wed, Oct 11, 2017 at 02:22:33PM +0530, nusiddiq at redhat.com wrote:
> >>> From: Numan Siddique <nusiddiq at redhat.com>
> >>>
> >>> In the case of OVN HA deployments with openstack, it has been noticed
> >>> that the 5 seconds inactivity probe interval is not enough and
> >>> ovsdb-servers time out.
> >>> This patch
> >>>    - provides an option to configure this value.
> >>>    - creates a connection row in NB/SB dbs and sets the target and
> >>>      inactivity_probe values when the node is promoted to master.
> >>>
> >>> CC: Andy Zhou <azhou at ovn.org>
> >>> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
> >>> ---
> >>>  ovn/utilities/ovndb-servers.ocf | 27 +++++++++++++++++++++++++++
> >>>  1 file changed, 27 insertions(+)
> >>>
> >>> diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
> >>> index fe1207c22..92620af6a 100755
> >>> --- a/ovn/utilities/ovndb-servers.ocf
> >>> +++ b/ovn/utilities/ovndb-servers.ocf
> >>> @@ -8,6 +8,8 @@
> >>>  : ${SB_MASTER_PORT_DEFAULT="6642"}
> >>>  : ${SB_MASTER_PROTO_DEFAULT="tcp"}
> >>>  : ${MANAGE_NORTHD_DEFAULT="no"}
> >>> +: ${INACTIVE_PROBE_DEFAULT="60000"}
> >>> +
> >>>  CRM_MASTER="${HA_SBIN_DIR}/crm_master -l reboot"
> >>>  CRM_ATTR_REPL_INFO="${HA_SBIN_DIR}/crm_attribute --type crm_config --name OVN_REPL_INFO -s ovn_ovsdb_master_server"
> >>>  OVN_CTL=${OCF_RESKEY_ovn_ctl:-${OVN_CTL_DEFAULT}}
> >>> @@ -17,6 +19,7 @@ NB_MASTER_PROTO=${OCF_RESKEY_nb_master_protocol:-${NB_MASTER_PROTO_DEFAULT}}
> >>>  SB_MASTER_PORT=${OCF_RESKEY_sb_master_port:-${SB_MASTER_PORT_DEFAULT}}
> >>>  SB_MASTER_PROTO=${OCF_RESKEY_sb_master_protocol:-${SB_MASTER_PROTO_DEFAULT}}
> >>>  MANAGE_NORTHD=${OCF_RESKEY_manage_northd:-${MANAGE_NORTHD_DEFAULT}}
> >>> +INACTIVE_PROBE=${OCF_RESKEY_inactive_probe_interval:-${INACTIVE_PROBE_DEFAULT}}
> >>>
> >>>  # Invalid IP address is an address that can never exist in the network, as
> >>>  # mentioned in rfc-5737. The ovsdb servers connects to this IP address till
> >>> @@ -101,6 +104,14 @@ ovsdb_server_metadata() {
> >>>    <content type="string" />
> >>>    </parameter>
> >>>
> >>> +  <parameter name="inactive_probe_interval" unique="1">
> >>> +  <longdesc lang="en">
> >>> +  Inactive probe interval to set for ovsdb-server.
> >>> +  </longdesc>
> >>> +  <shortdesc lang="en">Set inactive probe interval</shortdesc>
> >>> +  <content type="string" />
> >>> +  </parameter>
> >>> +
> >>>    </parameters>
> >>>
> >>>    <actions>
> >>> @@ -138,6 +149,22 @@ ovsdb_server_notify() {
> >>>              ${OVN_CTL} --ovn-manage-ovsdb=no start_northd
> >>>          fi
> >>>
> >>> +        conn=`ovn-nbctl get NB_global . connections`
> >>> +        if [ "$conn" == "[]" ]
> >>> +        then
> >>> +            ovn-nbctl -- --id=@conn_uuid create Connection \
> >>> +target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${MASTER_IP}" \
> >>> +inactivity_probe=$INACTIVE_PROBE -- set NB_Global . connections=@conn_uuid
> >>> +        fi
> >>> +
> >>> +        conn=`ovn-sbctl get SB_global . connections`
> >>> +        if [ "$conn" == "[]" ]
> >>> +        then
> >>> +            ovn-sbctl -- --id=@conn_uuid create Connection \
> >>> +target="p${SB_MASTER_PROTO}\:${SB_MASTER_PORT}\:${MASTER_IP}" \
> >>> +inactivity_probe=$INACTIVE_PROBE -- set SB_Global . connections=@conn_uuid
> >>> +        fi
> >>> +
> >>>      else
> >>>          if [ "$MANAGE_NORTHD" = "yes" ]; then
> >>>              # Stop ovn-northd service. Set --ovn-manage-ovsdb=no so that
> >>> --
> >>> 2.13.5
> >>>
> >>> _______________________________________________
> >>> dev mailing list
> >>> dev at openvswitch.org
> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

