[ovs-discuss] Question to OVN DB pacemaker script

aginwala aginwala at asu.edu
Thu May 10 21:21:48 UTC 2018


On Thu, May 10, 2018 at 1:54 PM, aginwala <aginwala at asu.edu> wrote:

> Hi,
>
> A further update: with the additional code changes below, I am able to
> re-open the tcp port in the failover scenario when a new master is
> promoted. The changes do require stopping the ovs service on the newly
> selected master to reset the tcp settings:
>
>
> diff --git a/ovn/utilities/ovndb-servers.ocf
> b/ovn/utilities/ovndb-servers.ocf
> index 164b6bc..8cb4c25 100755
> --- a/ovn/utilities/ovndb-servers.ocf
> +++ b/ovn/utilities/ovndb-servers.ocf
> @@ -295,8 +295,8 @@ ovsdb_server_start() {
>
>      set ${OVN_CTL}
>
> -    set $@ --db-nb-addr=${MASTER_IP} --db-nb-port=${NB_MASTER_PORT}
> -    set $@ --db-sb-addr=${MASTER_IP} --db-sb-port=${SB_MASTER_PORT}
> +    set $@ --db-nb-port=${NB_MASTER_PORT}
> +    set $@ --db-sb-port=${SB_MASTER_PORT}
>
>      if [ "x${NB_MASTER_PROTO}" = xtcp ]; then
>          set $@ --db-nb-create-insecure-remote=yes
> @@ -307,6 +307,8 @@ ovsdb_server_start() {
>      fi
>
>      if [ "x${present_master}" = x ]; then
> +        set $@ --db-nb-create-insecure-remote=yes
> +        set $@ --db-sb-create-insecure-remote=yes
>          # No master detected, or the previous master is not among the
>          # set starting.
>          #
> @@ -316,6 +318,8 @@ ovsdb_server_start() {
>          set $@ --db-nb-sync-from-addr=${INVALID_IP_ADDRESS}
> --db-sb-sync-from-addr=${INVALID_IP_ADDRESS}
>
>      elif [ ${present_master} != ${host_name} ]; then
> +        set $@ --db-nb-create-insecure-remote=no
> +        set $@ --db-sb-create-insecure-remote=no
>          # An existing master is active, connect to it
>          set $@ --db-nb-sync-from-addr=${MASTER_IP}
> --db-sb-sync-from-addr=${MASTER_IP}
>          set $@ --db-nb-sync-from-port=${NB_MASTER_PORT}
> @@ -416,6 +420,8 @@ ovsdb_server_promote() {
>              ;;
>      esac
>
> +    ${OVN_CTL} stop_ovsdb
> +    ovsdb_server_start
>      ${OVN_CTL} promote_ovnnb
>      ${OVN_CTL} promote_ovnsb
>
>
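
The start-time part of the diff above can be sketched in isolation. This is a simplified illustration, not the patch itself: insecure_remote_for_role is a hypothetical helper, and answering "yes" when this node already is the master is an assumption (the diff leaves that case to the existing NB_MASTER_PROTO check).

```shell
#!/bin/sh
# Hypothetical helper (not part of ovndb-servers.ocf): pick the value for
# --db-nb-create-insecure-remote / --db-sb-create-insecure-remote.
#   $1 = present_master (empty when no master is detected)
#   $2 = host_name of this node
insecure_remote_for_role() {
    if [ "x$1" = x ]; then
        # No master detected: this node may be promoted, so open the tcp port.
        echo yes
    elif [ "x$1" != "x$2" ]; then
        # Another node is master: this node is a slave, keep the tcp port closed.
        echo no
    else
        # Assumption: this node is already the master, keep the port open.
        echo yes
    fi
}
```

With these definitions, calling the helper with an empty first argument prints "yes", and calling it with two different host names prints "no".
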
>
> Below are the scenarios tested:
>
>>> Updating the test scenario table, which got dropped in the copy from Confluence:

> Master         Slave         Scenario        Result
>
> Reboot master  NA            reboot/failure  New master gets promoted with tcp
>                                              ports enabled to start taking LB
>                                              traffic.
>
> NA             Reboot slave  reboot/failure  No change; the current master
>                                              continues taking traffic and the
>                                              slave continues to sync from the
>                                              master.
>
> Reboot master  Reboot slave  reboot/failure  New master gets promoted with tcp
>                                              ports enabled to start taking LB
>                                              traffic.
>
> Also sync on slaves from master works as expected:
> # On master
> ovn-nbctl --db=tcp:10.169.129.33:6641 ls-add  556
> # on slave port is shutdown as expected
> ovn-nbctl --db=tcp:10.169.129.34:6641 show
> ovn-nbctl: tcp:10.169.129.34:6641: database connection failed (Connection
> refused)
> # on slave local unix socket, above lswitch 556 gets replicated too as
> --sync-from=tcp:10.149.4.252:6641
> ovn-nbctl show
> switch 2bd07b67-fd6b-401d-9612-da75e8f9ffc8 (556)
>
> # Same testing for sb db too
> # Slave port 6642 is shut down too, so this command hangs:
> ovn-sbctl --db=tcp:10.169.129.34:6642 show
> # Using the master ip works:
>  ovn-sbctl --db=tcp:10.169.129.33:6642 show
> Chassis "21f12bd6-e9e8-4ee2-afeb-28b331df6715"
>     hostname: "test-pace2-2365308.lvs02.dev.ebayc3.com"
>     Encap geneve
>         ip: "10.169.129.34"
>         options: {csum="true"}
>
>
>
> # Accessing via LB vip works fine too as only one member is active:
> for i in `seq 1 500`; do ovn-sbctl --db=tcp:10.149.4.252:6642 show; done
>
>>> Typo, it should be: for i in `seq 1 500`; do ovn-nbctl --db=tcp:10.149.4.252:6641 show; done

> switch 2bd07b67-fd6b-401d-9612-da75e8f9ffc8 (556)
> switch 2bd07b67-fd6b-401d-9612-da75e8f9ffc8 (556)
> switch 2bd07b67-fd6b-401d-9612-da75e8f9ffc8 (556)
> switch 2bd07b67-fd6b-401d-9612-da75e8f9ffc8 (556)
> switch 2bd07b67-fd6b-401d-9612-da75e8f9ffc8 (556)
>
>
> Everything works as expected. Let me know if I missed any corner case.
> I will submit a formal patch that uses LISTEN_ON_MASTER_IP_ONLY for the
> LB-with-tcp case, so that existing functionality is not broken.
>
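
A minimal sketch of how such a flag could gate the listen address, assuming the parameter name LISTEN_ON_MASTER_IP_ONLY proposed in this thread and a "yes" default. build_listen_args is a hypothetical helper for illustration only; the formal patch may look different.

```shell
#!/bin/sh
# Illustrative defaults; in the resource agent these come from the OCF
# parameters. LISTEN_ON_MASTER_IP_ONLY is the name proposed in this thread.
MASTER_IP=${MASTER_IP:-10.149.4.252}
NB_MASTER_PORT=${NB_MASTER_PORT:-6641}
SB_MASTER_PORT=${SB_MASTER_PORT:-6642}
LISTEN_ON_MASTER_IP_ONLY=${LISTEN_ON_MASTER_IP_ONLY:-yes}

# Hypothetical helper: print the listen-related options for ovn-ctl.
build_listen_args() {
    if [ "x${LISTEN_ON_MASTER_IP_ONLY}" = xyes ]; then
        # Existing behavior: bind the nb and sb dbs to the master IP only.
        echo "--db-nb-addr=${MASTER_IP} --db-nb-port=${NB_MASTER_PORT}" \
             "--db-sb-addr=${MASTER_IP} --db-sb-port=${SB_MASTER_PORT}"
    else
        # LB case: omit the address so the default listen address is used
        # (0.0.0.0, as noted elsewhere in this thread).
        echo "--db-nb-port=${NB_MASTER_PORT} --db-sb-port=${SB_MASTER_PORT}"
    fi
}
```

Keeping "yes" as the default leaves current deployments unchanged; only setups behind an LB would set the flag to "no".
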
>
>
> Regards,
> Aliasgar
>
>
>
> On Thu, May 10, 2018 at 9:55 AM, aginwala <aginwala at asu.edu> wrote:
>
>> Thanks folks for the suggestions.
>>
>> For LB vip configurations, I did further testing, and yes, requests do
>> hit the slave db, as the logs below show. They fail because the slave
>> does not have write permission, which the LB is not aware of:
>> for i in `seq 1 500`; do ovn-nbctl --db=tcp:10.149.4.252:6641 ls-add
>> $i590;done
>> ovn-nbctl: transaction error: {"details":"insert operation not allowed
>> when database server is in read only mode","error":"not allowed"}
>> ovn-nbctl: transaction error: {"details":"insert operation not allowed
>> when database server is in read only mode","error":"not allowed"}
>> ovn-nbctl: transaction error: {"details":"insert operation not allowed
>> when database server is in read only mode","error":"not allowed"}
>>
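
Since the LB cannot tell a read-only slave from the master, one way to check a node's role by hand is to inspect its replication state. The sketch below is a hypothetical helper, and it assumes that the ovsdb-server/sync-status output contains a line of the form "state: active" or "state: backup".

```shell
#!/bin/sh
# Hypothetical helper: read sync-status text on stdin and succeed only when
# the server reports itself as active. The "state: ..." line format is an
# assumption about the ovsdb-server/sync-status output.
is_active_db() {
    grep -q '^state: active$'
}
```

It could be fed from something like "ovs-appctl -t <nb db control socket> ovsdb-server/sync-status | is_active_db", where the control socket path depends on the installation.
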
>> Hence, with a few more code changes (in the same patch, without the
>> suggested flag variable), I am able to shut down the tcp port on the
>> slave, and it works fine as below:
>> #Master Node
>> # ovn-nbctl --db=tcp:10.169.129.33:6641 ls-add test444
>> #Slave Node
>> # ovn-nbctl --db=tcp:10.169.129.34:6641 ls-add test444
>> ovn-nbctl: tcp:10.169.129.34:6641: database connection failed
>> (Connection refused)
>>
>> Code to shut down the tcp port on the slave db so that only the master
>> listens on tcp ports:
>> diff --git a/ovn/utilities/ovndb-servers.ocf
>> b/ovn/utilities/ovndb-servers.ocf
>> index 164b6bc..b265df6 100755
>> --- a/ovn/utilities/ovndb-servers.ocf
>> +++ b/ovn/utilities/ovndb-servers.ocf
>> @@ -295,8 +295,8 @@ ovsdb_server_start() {
>>
>>      set ${OVN_CTL}
>>
>> -    set $@ --db-nb-addr=${MASTER_IP} --db-nb-port=${NB_MASTER_PORT}
>> -    set $@ --db-sb-addr=${MASTER_IP} --db-sb-port=${SB_MASTER_PORT}
>> +    set $@ --db-nb-port=${NB_MASTER_PORT}
>> +    set $@ --db-sb-port=${SB_MASTER_PORT}
>>
>>      if [ "x${NB_MASTER_PROTO}" = xtcp ]; then
>>          set $@ --db-nb-create-insecure-remote=yes
>> @@ -307,6 +307,8 @@ ovsdb_server_start() {
>>      fi
>>
>>      if [ "x${present_master}" = x ]; then
>> +        set $@ --db-nb-create-insecure-remote=yes
>> +        set $@ --db-sb-create-insecure-remote=yes
>>          # No master detected, or the previous master is not among the
>>          # set starting.
>>          #
>> @@ -316,6 +318,8 @@ ovsdb_server_start() {
>>          set $@ --db-nb-sync-from-addr=${INVALID_IP_ADDRESS}
>> --db-sb-sync-from-addr=${INVALID_IP_ADDR
>>
>>      elif [ ${present_master} != ${host_name} ]; then
>> +        set $@ --db-nb-create-insecure-remote=no
>> +        set $@ --db-sb-create-insecure-remote=no
>>
>>
>> But I noticed that if the slave becomes active after failover following
>> an active node reboot/failure, pacemaker shows it online but I am not
>> able to access the dbs.
>>
>> # crm status
>> Online: [ test-pace2-2365308 ]
>> OFFLINE: [ test-pace1-2365293 ]
>>
>> Full list of resources:
>>
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>      Masters: [ test-pace2-2365308 ]
>>      Stopped: [ test-pace1-2365293 ]
>>
>>
>> # ovn-nbctl --db=tcp:10.169.129.33:6641 ls-add test444
>> ovn-nbctl: tcp:10.169.129.33:6641: database connection failed
>> (Connection refused)
>> # ovn-nbctl --db=tcp:10.169.129.34:6641 ls-add test444
>> ovn-nbctl: tcp:10.169.129.34:6641: database connection failed
>> (Connection refused)
>>
>> Hence, if failover happens, the slave is already running with
>> --sync-from=lbVIP:6641/6642 for the nb and sb db respectively. Thus, the
>> tcp ports for the nb and sb db are not re-opened automatically on the
>> slave that is being promoted to master.
>>
>> Let me know if there is a valid approach that I am missing to handle
>> this in the slave promote logic. I will make further code changes
>> accordingly.
>>
>> Note: The current code changes for use with an LB will need to be handled
>> for ssl too. That will have to be done separately, but I want to get tcp
>> working first; we can add ssl support later.
>>
>>
>> Regards,
>> Aliasgar
>>
>> On Wed, May 9, 2018 at 12:19 PM, Numan Siddique <nusiddiq at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, May 10, 2018 at 12:44 AM, Han Zhou <zhouhan at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Wed, May 9, 2018 at 11:51 AM, Numan Siddique <nusiddiq at redhat.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, May 10, 2018 at 12:15 AM, Han Zhou <zhouhan at gmail.com> wrote:
>>>>>
>>>>>> Thanks Ali for the quick patch. Please see my comments inline.
>>>>>>
>>>>>> On Wed, May 9, 2018 at 9:30 AM, aginwala <aginwala at asu.edu> wrote:
>>>>>> >
>>>>>> > Thanks Han and Numan for the clarity to help sort it out.
>>>>>> >
>>>>>> > To make the vip work with an LB in my two node setup, I changed
>>>>>> the code below to skip setting the master IP when creating the pcs
>>>>>> resource for ovndbs and to listen on 0.0.0.0 instead. Hence, the
>>>>>> discussion seems in line with the code change, which is small, as below:
>>>>>> >
>>>>>> >
>>>>>> > diff --git a/ovn/utilities/ovndb-servers.ocf
>>>>>> b/ovn/utilities/ovndb-servers.ocf
>>>>>> > index 164b6bc..d4c9ad7 100755
>>>>>> > --- a/ovn/utilities/ovndb-servers.ocf
>>>>>> > +++ b/ovn/utilities/ovndb-servers.ocf
>>>>>> > @@ -295,8 +295,8 @@ ovsdb_server_start() {
>>>>>> >
>>>>>> >      set ${OVN_CTL}
>>>>>> >
>>>>>> > -    set $@ --db-nb-addr=${MASTER_IP} --db-nb-port=${NB_MASTER_PORT}
>>>>>> > -    set $@ --db-sb-addr=${MASTER_IP} --db-sb-port=${SB_MASTER_PORT}
>>>>>> > +    set $@ --db-nb-port=${NB_MASTER_PORT}
>>>>>> > +    set $@ --db-sb-port=${SB_MASTER_PORT}
>>>>>> >
>>>>>> >      if [ "x${NB_MASTER_PROTO}" = xtcp ]; then
>>>>>> >          set $@ --db-nb-create-insecure-remote=yes
>>>>>> >
>>>>>>
>>>>>> This change solves the IP binding problem. It will just listen on
>>>>>> 0.0.0.0.
>>>>>>
>>>>>
>>>>> One problem I see with this approach is that it would listen on all
>>>>> the IPs. Maybe that's not a good idea, and it may have some security
>>>>> issues.
>>>>>
>>>>> Can we instead check the value of the MASTER_IP param, something like
>>>>> below?
>>>>>
>>>>>  if [ "$MASTER_IP" == "0.0.0.0" ]; then
>>>>>      set $@ --db-nb-addr=${MASTER_IP} --db-nb-port=${NB_MASTER_PORT}
>>>>>      set $@ --db-sb-addr=${MASTER_IP} --db-sb-port=${SB_MASTER_PORT}
>>>>> else
>>>>>      set $@ --db-nb-port=${NB_MASTER_PORT}
>>>>>      set $@ --db-sb-port=${SB_MASTER_PORT}
>>>>> fi
>>>>>
>>>>> And when you create OVN pacemaker resource in your deployment, you can
>>>>> pass master_ip=0.0.0.0
>>>>>
>>>>> Will this work ?
>>>>>
>>>>>
>>>> Maybe some misunderstanding here. We still need to use master_ip = LB
>>>> VIP, so that the standby nodes can "sync-from" the active node. So we
>>>> cannot pass 0.0.0.0 explicitly.
>>>>
>>>
>>> I misunderstood earlier. I thought you wouldn't need master ip at all.
>>> Thanks for the clarification.
>>>
>>>>
>>>> I didn't understand your code above either. Why would we specify the
>>>> master_ip if we know it is 0.0.0.0? Or do you mean the other way around but
>>>> just a typo in the code?
>>>>
>>>> As for the security of listening on any IP, I am not quite sure. It may
>>>> be a problem if the nodes sit on multiple networks, some of them are
>>>> considered insecure, and you want to listen on the secure one only. If
>>>> this is the concern, we can add a parameter, e.g. LISTEN_ON_MASTER_IP_ONLY,
>>>> and set it to true by default. What do you think?
>>>>
>>>
>>> I would prefer adding the parameter as you have suggested so that the
>>> existing behavior remains intact.
>>>
>>> Thanks
>>> Numan
>>>
>>>
>>>> Thanks,
>>>> Han
>>>>
>>>>
>>>
>>
>