[ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are created for the same logical-switch

Han Zhou zhouhan at gmail.com
Fri Jul 31 02:30:35 UTC 2020


On Thu, Jul 30, 2020 at 7:24 PM Tony Liu <tonyliu0592 at hotmail.com> wrote:

> Hi Han,
>
>
>
> Continuing this thread. Regarding your comment in another thread:
>
> ===================
>
> 2) OVSDB clients usually monitor and sync all (interested) data from the
> server to a local copy, so when they do declarative processing they can
> correct problems by themselves. In fact, ovn-northd does this check and
> deletes duplicated datapaths. I did a simple test and it did the cleanup by itself:
>
> 2020-07-30T18:55:53.057Z|00006|ovn_northd|INFO|ovn-northd lock acquired.
> This ovn-northd instance is now active.
> 2020-07-30T19:02:10.465Z|00007|ovn_northd|INFO|deleting Datapath_Binding
> abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate
> external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e
>
>
>
> I am not sure why northd was stuck in your case, but I agree there must be
> something wrong. Please collect northd logs if you encounter this again so
> we can dig further.
>
> ===================
>
>
>
> You are right that ovn-northd will try to clean up the duplication, but
> there are ports in Port_Binding referencing this Datapath_Binding, so
> ovn-northd fails to delete the datapath-binding. I had to manually delete
> those ports to be able to delete the datapath-binding. I believe ovn-northd
> does not support deleting a configuration that is still being referenced.
> Is that right? If so, should we fix it, or is that intentional?
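>
> For completeness, the manual cleanup was roughly the following (a sketch;
> the UUIDs are placeholders for the duplicate Datapath_Binding and the
> Port_Bindings that reference it):
>
>   # list the Port_Binding rows that still reference the duplicate datapath
>   ovn-sbctl --columns=_uuid,logical_port find Port_Binding \
>       datapath=<duplicate-datapath-uuid>
>
>   # delete each referencing Port_Binding, then the Datapath_Binding itself
>   ovn-sbctl destroy Port_Binding <port-binding-uuid>
>   ovn-sbctl destroy Datapath_Binding <duplicate-datapath-uuid>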
>
>
>

Yes, good point!
It is definitely a bug and we should fix it. I think the best fix is to
change the schema and add "logical_datapath" as an index, but we'll need to
make it backward compatible to avoid upgrade issues.
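
In the meantime, a quick way to spot the condition in the SB DB is to look
for Datapath_Binding rows that share the same external_ids (a sketch using
only standard ovn-sbctl output options; duplicated maps show up as repeated
lines):

  ovn-sbctl --bare --columns=external_ids list Datapath_Binding | grep . | sort | uniq -d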


>
>
> Thanks!
>
>
>
> Tony
>
>
>
> *From: *Tony Liu <tonyliu0592 at hotmail.com>
> *Sent: *Thursday, July 23, 2020 7:51 PM
> *To: *Han Zhou <zhouhan at gmail.com>; Ben Pfaff <blp at ovn.org>
> *Cc: *ovs-dev <ovs-dev at openvswitch.org>; ovs-discuss at openvswitch.org
> *Subject: *Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are
> created for the same logical-switch
>
>
>
> Hi Han,
>
>
>
> Thanks for taking the time to look into this. This problem is not
> consistently reproduced.
>
> Developers normally ignore it. :) I think we collected enough context and we
> can let it go for now.
>
> I will rebuild the setup, tune the RAFT election timer, and rerun the test.
> I will keep you posted.
>
>
>
>
>
> Thanks again!
>
>
>
> Tony
>
>
>
> *From:* Han Zhou <zhouhan at gmail.com>
> *Sent:* July 23, 2020 06:53 PM
> *To:* Tony Liu <tonyliu0592 at hotmail.com>; Ben Pfaff <blp at ovn.org>
> *Cc:* Numan Siddique <numans at ovn.org>; ovs-dev <ovs-dev at openvswitch.org>;
> ovs-discuss at openvswitch.org
> *Subject:* Re: [ovs-dev] OVN: Two datapath-bindings are created for the
> same logical-switch
>
>
>
>
> On Thu, Jul 23, 2020 at 10:33 AM Tony Liu <tonyliu0592 at hotmail.com> wrote:
> >
> > Changed the title for this specific problem.
> > I looked into the logs and have more findings.
> > The problem happened when the sb-db leader switched.
>
>
>
> Hi Tony,
>
>
>
> Thanks for this detailed information. Could you confirm which version of
> OVS is used (to understand the OVSDB behavior)?
>
>
>
> >
> > For the ovsdb cluster, what may trigger the leader switch? Given the log,
> >   2020-07-21T01:08:38.119Z|00074|raft|INFO|term 2: 1135 ms timeout expired, starting election
> > the election is requested by a follower node. Is that because the connection
> > from the follower to the leader timed out, so the follower assumes the
> > leader is dead and starts an election?
>
>
>
> You are right, the RAFT heartbeat would time out when the server is too busy
> and the election timer is too small (the default is 1s). For a large-scale
> test, please increase the election timer with:
>
> ovn-appctl -t <ovn-sb> cluster/change-election-timer OVN_Southbound <time>
>
> I suggest setting <time> to at least 10000 in your case. (You need to
> increase the value gradually - 2000, 4000, 8000, 16000 - so it will take
> four commands to reach that from the initial default value of 1000; not
> very convenient, I know. :)
>
>
>
> Here, <ovn-sb> is the path to the ctl socket file of ovn-sb, usually under
> /var/run/ovn.
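>
> A sketch of the full sequence on a typical installation (the ctl file path
> depends on your packaging; each command must be run on the current SB
> leader, and each step can at most double the previous value):
>
>   ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 2000
>   ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 4000
>   ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 8000
>   ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 16000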
>
>
>
> >
>
> > For ovn-northd (3 instances), they all connect to the sb-db leader, and
> > whoever has the lock is the master. When the sb-db leader switches, all
> > ovn-northd instances look for the new leader. In this case, there is no
> > guarantee that the old ovn-northd master keeps the role; another
> > ovn-northd instance may find the leader and acquire the lock first. So,
> > the sb-db leader switch may also cause an ovn-northd master switch.
> > Such a switch may happen in the middle of an ovn-northd transaction; in
> > that case, is there any guarantee of transaction completeness? My guess
> > is that the old master created a datapath-binding for a logical-switch,
> > the switch happened before this transaction completed, and then the new
> > master/leader created another datapath-binding for the same
> > logical-switch. Does that make any sense?
>
>
>
> I agree with you that it could be related to the failover and the lock
> behavior during the failover. It could be a lock problem causing two northds
> to become active at the same time for a short moment. However, I still can't
> imagine how the duplicated entries are created with different tunnel keys.
> If both northds created the datapath binding for the same LS at the same
> time, they should allocate the same tunnel key, and then one of them should
> fail during the transaction commit because of the index conflict in the DB.
> But here they have different keys, so both were inserted into the DB.
>
>
>
> (An OVSDB transaction is atomic even during failover, and no client should
> see partial data of a transaction.)
>
>
>
> (cc Ben to comment more on the possibility of both clients acquiring the
> lock during failover)
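>
> (As an aside, to check which northd currently holds the lock and which DB
> server is the RAFT leader at a given moment, something like the following
> should work - a sketch assuming default ctl socket locations:
>
>   # reports "Status: active" or "Status: standby" for this northd instance
>   ovn-appctl -t ovn-northd status
>
>   # shows the cluster role (leader/follower) of this SB DB server
>   ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
> )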
>
>
>
> >
> > From the log, when the sb-db leader switched, the ovn-northd master
> > connected to the new leader and lost the master role, but immediately it
> > acquired the lock and became master again. I am not sure how this happened.
>
>
>
> From the ovn-northd logs, the ovn-northd on .86 first connected to the SB DB
> on .85, which suggests that it regarded .85 as the leader (otherwise it
> would disconnect and retry another server), and then immediately after
> connecting to .85 and acquiring the lock, it disconnected because it somehow
> noticed that .85 is not the leader. It then retried and connected to .86
> (the new leader) and found that the lock was already acquired by the .85
> northd, so it switched to standby. The .85 northd luckily connected to .86
> on the first try, so it was able to acquire the lock on the leader node
> first. Maybe the key thing is to figure out why the .86 northd initially
> connected to the .85 DB, which is not the leader, and acquired the lock.
>
>
>
> Thanks,
>
> Han
>
>
>
> >
> > Here are some loggings.
> > ==== .84 sb-db leader =================
> > 2020-07-21T01:08:20.221Z|01408|raft|INFO|current entry eid
> 639238ba-bc00-4efe-bb66-6ac766bb5f4b does not match prerequisite
> 78e8e167-8b4c-4292-8e25-d9975631b010 in execute_command_request
> >
> > 2020-07-21T01:08:38.450Z|01409|timeval|WARN|Unreasonably long 1435ms
> poll interval (1135ms user, 43ms system)
> > 2020-07-21T01:08:38.451Z|01410|timeval|WARN|faults: 5942 minor, 0 major
> > 2020-07-21T01:08:38.451Z|01411|timeval|WARN|disk: 0 reads, 50216 writes
> > 2020-07-21T01:08:38.452Z|01412|timeval|WARN|context switches: 60
> voluntary, 25 involuntary
> > 2020-07-21T01:08:38.453Z|01413|coverage|INFO|Skipping details of
> duplicate event coverage for hash=45329980
> >
> > 2020-07-21T01:08:38.455Z|01414|raft|WARN|ignoring vote request received
> as leader
> > 2020-07-21T01:08:38.456Z|01415|raft|INFO|server 1f9e is leader for term 2
> > 2020-07-21T01:08:38.457Z|01416|raft|INFO|rejected append_reply (not
> leader)
> > 2020-07-21T01:08:38.471Z|01417|raft|INFO|rejected append_reply (not
> leader)
> >
> > 2020-07-21T01:23:00.890Z|01418|timeval|WARN|Unreasonably long 1336ms
> poll interval (1102ms user, 20ms system)
> >
> > ==== .85 sb-db ==================
> > 2020-07-21T01:05:25.734Z|00061|coverage|INFO|Skipping details of
> duplicate event coverage for hash=a1e16f5b
> >
> > 2020-07-21T01:08:38.166Z|00062|raft|INFO|server 1f9e is leader for term 2
> > 2020-07-21T01:08:38.455Z|00063|raft|INFO|rejecting term 1 < current term
> 2 received in append_request message from server 39a9
> >
> > 2020-07-21T01:23:44.500Z|00064|timeval|WARN|Unreasonably long 1447ms
> poll interval (1118ms user, 33ms system)
> >
> > ==== .86 sb-db =================
> > 2020-07-21T01:02:35.607Z|00073|coverage|INFO|70 events never hit
> >
> > 2020-07-21T01:08:38.119Z|00074|raft|INFO|term 2: 1135 ms timeout
> expired, starting election
> > 2020-07-21T01:08:38.165Z|00075|raft|INFO|term 2: elected leader by 2+ of
> 3 servers
> > 2020-07-21T01:08:38.455Z|00076|raft|INFO|rejecting term 1 < current term
> 2 received in append_request message from server 39a9
> > 2020-07-21T01:08:38.456Z|00077|raft|INFO|rejecting term 1 < current term
> 2 received in vote_reply message from server 39a9
> >
> > 2020-07-21T01:12:51.363Z|00078|timeval|WARN|Unreasonably long 1466ms
> poll interval (1186ms user, 27ms system)
> >
> > ==== .85 ovn-northd master ==============
> > 2020-07-21T01:08:25.925Z|00479|poll_loop|INFO|wakeup due to [POLLIN] on
> fd 8 (10.6.20.85:45478<->10.6.20.84:6641) at lib/stream-fd.c:157 (54% CPU
> usage)
> >
> > 2020-07-21T01:08:38.464Z|00480|ovsdb_idl|INFO|tcp:10.6.20.84:6642:
> clustered database server is not cluster leader; trying another server
> > 2020-07-21T01:08:38.465Z|00481|reconnect|INFO|tcp:10.6.20.84:6642:
> connection attempt timed out
> > 2020-07-21T01:08:38.465Z|00482|reconnect|INFO|tcp:10.6.20.86:6642:
> connecting...
> > 2020-07-21T01:08:38.465Z|00483|reconnect|INFO|tcp:10.6.20.86:6642:
> connected
> > 2020-07-21T01:08:38.465Z|00484|ovn_northd|INFO|ovn-northd lock lost.
> This ovn-northd instance is now on standby.
> > 2020-07-21T01:08:38.465Z|00485|ovn_northd|INFO|ovn-northd lock acquired.
> This ovn-northd instance is now active.
> > 2020-07-21T01:08:38.966Z|00486|ovn_northd|INFO|deleting Datapath_Binding
> 108cf745-db82-43c0-a9d3-afe27a41e4aa with duplicate
> external-ids:logical-switch/router 8a5d1d3c-e9fc-4cbe-a461-98ff838e6473
> >
> > 2020-07-20T21:32:48.803Z|00019|reconnect|INFO|tcp:10.6.20.84:6641:
> connected
> >
> > ==== .86 ovn-northd ==============
> > 2020-07-21T01:08:38.463Z|00020|ovsdb_idl|INFO|tcp:10.6.20.84:6642:
> clustered database server is not cluster leader; trying another server
> > 2020-07-21T01:08:38.464Z|00021|reconnect|INFO|tcp:10.6.20.84:6642:
> connection attempt timed out
> > 2020-07-21T01:08:38.464Z|00022|reconnect|INFO|tcp:10.6.20.85:6642:
> connecting...
> > 2020-07-21T01:08:38.464Z|00023|reconnect|INFO|tcp:10.6.20.85:6642:
> connected
> > 2020-07-21T01:08:38.464Z|00024|ovn_northd|INFO|ovn-northd lock acquired.
> This ovn-northd instance is now active.
> > 2020-07-21T01:08:38.465Z|00025|ovsdb_idl|INFO|tcp:10.6.20.85:6642:
> clustered database server is not cluster leader; trying another server
> > 2020-07-21T01:08:38.466Z|00026|reconnect|INFO|tcp:10.6.20.85:6642:
> connection attempt timed out
> > 2020-07-21T01:08:39.467Z|00027|reconnect|INFO|tcp:10.6.20.86:6642:
> connecting...
> > 2020-07-21T01:08:39.468Z|00028|reconnect|INFO|tcp:10.6.20.86:6642:
> connected
> > 2020-07-21T01:08:39.468Z|00029|ovn_northd|INFO|ovn-northd lock lost.
> This ovn-northd instance is now on standby.
> >
> > 2020-07-21T02:04:22.718Z|00030|ovsdb_idl|INFO|tcp:10.6.20.86:6642:
> clustered database server is not cluster leader; trying another server
> >
> >
> > Thanks!
> >
> > Tony
> > ________________________________
> >
> > From: Numan Siddique <numans at ovn.org>
> > Sent: July 23, 2020 03:15 AM
> > To: Tony Liu <tonyliu0592 at hotmail.com>
> > Cc: ovs-dev <ovs-dev at openvswitch.org>; ovs-discuss at openvswitch.org
> > Subject: Re: [ovs-discuss] OVN nb-db and sb-db out of sync
> >
> >
> >
> > On Thu, Jul 23, 2020 at 8:22 AM Tony Liu <tonyliu0592 at hotmail.com> wrote:
> > Hi,
> >
> > I see why sb-db broke at the 1568th port-binding.
> > The 1568th datapath-binding in sb-db references the same logical-switch as the 1567th:
> >
> > _uuid               : 108cf745-db82-43c0-a9d3-afe27a41e4aa
> > external_ids        :
> {logical-switch="8a5d1d3c-e9fc-4cbe-a461-98ff838e6473",
> name=neutron-e907dc17-f1e8-4217-a37d-86e9a98c86c2, name2=net-97-192}
> > tunnel_key          : 1567
> >
> > _uuid               : d934ed92-2f3c-4b31-8a76-2a5047a3bb46
> > external_ids        :
> {logical-switch="8a5d1d3c-e9fc-4cbe-a461-98ff838e6473",
> name=neutron-e907dc17-f1e8-4217-a37d-86e9a98c86c2, name2=net-97-192}
> > tunnel_key          : 1568
> >
> > I don't believe this is supposed to happen. Any idea how it could happen?
> > Then ovn-northd is stuck trying to delete this duplication, and it
> > ignores all of the following updates.
> > That caused the out-of-sync between nb-db and sb-db.
> > Is there any way I can fix it manually, like deleting it with ovn-sbctl?
> >
> > If you delete the OVN SB DB resources manually, ovn-northd should sync
> > them up.
> >
> > But I'm surprised that ovn-northd didn't sync earlier. There's something
> > wrong related to RAFT going on here. Not sure what.
> >
> > Thanks
> > Numan
> >
> >
> >
> >
> > Thanks!
> >
> > Tony
> >
> > ________________________________
> > From: dev <ovs-dev-bounces at openvswitch.org> on behalf of Tony Liu <tonyliu0592 at hotmail.com>
> > Sent: July 22, 2020 11:33 AM
> > To: ovs-dev <ovs-dev at openvswitch.org>
> > Subject: [ovs-dev] OVN nb-db and sb-db out of sync
> >
> > Hi,
> >
> > During a scaling test where 4000 networks are created from OpenStack, I
> > see that nb-db and sb-db are out of sync. All 4000 logical switches and
> > 8000 LS ports (the GW port and service port of each network) are created
> > in nb-db. In sb-db, there are only 1567 port-bindings; 4000 are expected.
> >
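> > (For reference, the counts can be compared with something like the
> > following - a sketch; table and column names as reported by
> > ovn-nbctl/ovn-sbctl:
> >
> >   ovn-nbctl --bare --columns=_uuid list Logical_Switch_Port | grep . | wc -l
> >   ovn-sbctl --bare --columns=_uuid list Port_Binding | grep . | wc -l
> > )
> >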
> > [root at ovn-db-2 ~]# ovn-nbctl list nb_global
> > _uuid               : b7b3aa05-f7ed-4dbc-979f-10445ac325b8
> > connections         : []
> > external_ids        : {"neutron:liveness_check_at"="2020-07-22
> 04:03:17.726917+00:00"}
> > hv_cfg              : 312
> > ipsec               : false
> > name                : ""
> > nb_cfg              : 2636
> > options             : {mac_prefix="ca:e8:07",
> svc_monitor_mac="4e:d0:3a:80:d4:b7"}
> > sb_cfg              : 2005
> > ssl                 : []
> >
> > [root at ovn-db-2 ~]# ovn-sbctl list sb_global
> > _uuid               : 3720bc1d-b0da-47ce-85ca-96fa8d398489
> > connections         : []
> > external_ids        : {}
> > ipsec               : false
> > nb_cfg              : 312
> > options             : {mac_prefix="ca:e8:07",
> svc_monitor_mac="4e:d0:3a:80:d4:b7"}
> > ssl                 : []
> >
> > Is there any way to force ovn-northd to rebuild sb-db to sync with nb-db,
> > like manipulating nb_cfg or anything else? Note, it's a 3-node RAFT
> > cluster for both nb-db and sb-db.
> >
> > Is the "incremental update" implemented in 20.03?
> > If not, in which release is it going to be available?
> >
> >
> > Thanks!
> >
> > Tony
> >
>
>
>