[ovs-dev] raft ovsdb clustering with scale test

aginwala aginwala at asu.edu
Wed Apr 18 01:07:19 UTC 2018


Thanks, Numan:


Will also try to dig a little deeper and see if we get a few more hints by
increasing the log levels to debug, and will wait for further input from you
and Ben. Hope we can find the culprit soon :)
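
For the debug logging, a rough sketch of what I plan to run on each DB node
is below; the ctl socket path is an assumption (it depends on how the
clustered ovsdb-server was started), so adjust as needed:

  ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl vlog/set raft:file:dbg
  ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl vlog/set jsonrpc:file:dbg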


Regards,


On Tue, Apr 17, 2018 at 12:13 AM, Numan Siddique <nusiddiq at redhat.com>
wrote:

>
>
> On Tue, Apr 17, 2018 at 5:52 AM, aginwala <aginwala at asu.edu> wrote:
>
>> Hi Ben/Numan:
>>
>>
>> Just following up again. Let me know if you have any further
>> improvements/suggestions on this.
>>
>>
> Hi Aliasgar,
>
> I need to educate myself first by studying raft and Ben's patches :). I
> hope to do it soon.
>
> Thanks
> Numan
>
>
>
>
>> Regards,
>>
>>
>> On Mon, Apr 9, 2018 at 10:53 AM, aginwala <aginwala at asu.edu> wrote:
>>
>> > Further analysis indicates that the cluster is not able to elect a new
>> > leader, and hence every node keeps retrying:
>> >
>> > ovn-northd logs around the same time on all three nodes complain about
>> > disconnection from the cluster and failure to choose a leader:
>> >
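>> > (Side note: each member's own view of leadership can be dumped with
>> > the cluster/status appctl command, assuming this build exposes it;
>> > e.g. something like the following on every node, with the ctl socket
>> > path depending on how the db was started:
>> >   ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl cluster/status OVN_Southbound
>> > It reports the member's role, current term, and which server it
>> > believes is the leader.)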
>> >
>> > #Node1
>> > 2018-04-06T01:19:06.481Z|00466|ovn_northd|INFO|ovn-northd lock lost.
>> This
>> > ovn-northd instance is now on standby.
>> > 2018-04-06T01:19:07.668Z|00467|ovn_northd|INFO|ovn-northd lock
>> acquired.
>> > This ovn-northd instance is now active.
>> > 2018-04-06T01:19:07.671Z|00468|ovsdb_idl|INFO|tcp:192.168.220.102:6642:
>> > clustered database server is not cluster leader; trying another server
>> > 2018-04-06T01:19:15.673Z|00469|reconnect|INFO|tcp:192.168.220.103:6642:
>> > connected
>> > 2018-04-06T01:19:15.673Z|00470|ovn_northd|INFO|ovn-northd lock lost.
>> This
>> > ovn-northd instance is now on standby.
>> >
>> > #Node2
>> > 2018-04-06T01:19:58.249Z|00487|ovn_northd|INFO|ovn-northd lock
>> acquired.
>> > This ovn-northd instance is now active.
>> > 2018-04-06T01:19:58.255Z|00488|ovsdb_idl|INFO|tcp:192.168.220.103:6642:
>> > clustered database server is disconnected from cluster; trying another
>> > server
>> > 2018-04-06T01:20:09.261Z|00489|reconnect|INFO|tcp:192.168.220.102:6642:
>> > connected
>> > 2018-04-06T01:20:09.261Z|00490|ovn_northd|INFO|ovn-northd lock lost.
>> This
>> > ovn-northd instance is now on standby.
>> >
>> >
>> > #Node3
>> > 2018-04-06T01:19:01.654Z|00964|ovn_northd|INFO|ovn-northd lock
>> acquired.
>> > This ovn-northd instance is now active.
>> > 2018-04-06T01:19:01.711Z|00965|ovsdb_idl|INFO|tcp:192.168.220.102:6642:
>> > clustered database server is disconnected from cluster; trying another
>> > server
>> > 2018-04-06T01:19:09.715Z|00966|reconnect|INFO|tcp:192.168.220.103:6642:
>> > connected
>> > 2018-04-06T01:19:09.716Z|00967|ovn_northd|INFO|ovn-northd lock lost.
>> This
>> > ovn-northd instance is now on standby.
>> >
>> >
>> >
>> > Regards,
>> > Aliasgar
>> >
>> > On Fri, Apr 6, 2018 at 2:16 PM, aginwala <aginwala at asu.edu> wrote:
>> >
>> >> Hi Ben/Numan:
>> >>
>> >> So I went ahead and tried out the clustered db in a scale env, and the
>> >> results are not as expected.
>> >> OVS branch used is master;
>> >> commit-id (2062840612904ff0873d46b2f4f226bc23f2296d)
>> >>
>> >> Setup is using 3 nodes.
>> >>
>> >> Also disabled inactivity_probe:
>> >> ovn-nbctl --db="tcp:192.168.220.101:6641,tcp:192.168.220.102:6641,tcp:192.168.220.103:6641" \
>> >>     set NB . external_ids:inactivity_probe=0
>> >> ovn-sbctl --db="tcp:192.168.220.101:6642,tcp:192.168.220.102:6642,tcp:192.168.220.103:6642" \
>> >>     set SB . external_ids:inactivity_probe=0
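>> >>
>> >> (Note: in case the ovsdb-server side probes also matter here, the
>> >> Connection table has an inactivity_probe column as well; if the
>> >> listeners are configured through that table, something along these
>> >> lines could be used -- an untested sketch, assuming a single
>> >> Connection row:
>> >> ovn-sbctl --db="tcp:192.168.220.101:6642,tcp:192.168.220.102:6642,tcp:192.168.220.103:6642" \
>> >>     set connection . inactivity_probe=0
>> >> )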
>> >>
>> >>
>> >> 1. With wait_up = true for 10k lports and 1k HVs, which internally uses
>> >> wait-until, it was able to create around 1k lports before rally exited
>> >> due to a db connection error.
>> >> The cause of failure ("database connection failed") is at
>> >> https://raw.githubusercontent.com/noah8713/ovn-scale-test/e398ccdd77b5c0bfed3d1cfe37c7e7ac5d8d8f81/results/output_raft_fail.html
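>> >>
>> >> (For reference, the per-lport wait that rally does internally is
>> >> presumably equivalent to something like the command below, i.e. one
>> >> blocking poll against the NB db per created port; the port name here
>> >> is just an example:
>> >> ovn-nbctl --timeout=60 wait-until Logical_Switch_Port lport_xyz up=true
>> >> )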
>> >>
>> >> Some data from sb db logs are as below
>> >> 2018-04-05T06:29:32.628Z|00065|poll_loop|INFO|wakeup due to 9-ms
>> timeout
>> >> at lib/reconnect.c:643 (90% CPU usage)
>> >> 2018-04-05T06:30:19.432Z|00066|raft|INFO|current entry eid
>> >> 23a85453-70a1-49ae-bf8f-22a1eeda6a60 does not match prerequisite
>> >> 966d3234-961a-4dc5-b2ad-4f12766fd610 in execute_command_request
>> >> 2018-04-05T06:31:03.428Z|00067|raft|INFO|current entry eid
>> >> f0e3517a-e59a-4533-9e60-a3ded2ba704b does not match prerequisite
>> >> 43ed7278-e1de-44b8-83ea-1cc9578c4d41 in execute_command_request
>> >> 2018-04-05T06:31:03.469Z|00068|raft|INFO|current entry eid
>> >> f0e3517a-e59a-4533-9e60-a3ded2ba704b does not match prerequisite
>> >> 43ed7278-e1de-44b8-83ea-1cc9578c4d41 in execute_command_request
>> >> 2018-04-05T06:31:03.492Z|00069|raft|INFO|current entry eid
>> >> f0e3517a-e59a-4533-9e60-a3ded2ba704b does not match prerequisite
>> >> 43ed7278-e1de-44b8-83ea-1cc9578c4d41 in execute_command_request
>> >> 2018-04-05T06:31:03.505Z|00070|raft|INFO|current entry eid
>> >> f0e3517a-e59a-4533-9e60-a3ded2ba704b does not match prerequisite
>> >> 43ed7278-e1de-44b8-83ea-1cc9578c4d41 in execute_command_request
>> >> 2018-04-05T06:31:03.510Z|00071|raft|INFO|current entry eid
>> >> f0e3517a-e59a-4533-9e60-a3ded2ba704b does not match prerequisite
>> >> 43ed7278-e1de-44b8-83ea-1cc9578c4d41 in execute_command_request
>> >> 2018-04-05T06:32:04.056Z|00072|memory|INFO|peak resident set size grew
>> >> 51% in last 35031.7 seconds, from 24748 kB to 37332 kB
>> >> 2018-04-05T06:32:04.056Z|00073|memory|INFO|cells:59908 json-caches:1
>> >> monitors:3 sessions:431
>> >> 2018-04-05T06:32:27.956Z|00074|raft|WARN|ignoring vote request
>> received
>> >> as leader
>> >> 2018-04-05T06:32:27.958Z|00075|timeval|WARN|Unreasonably long 1278ms
>> >> poll interval (1044ms user, 232ms system)
>> >> 2018-04-05T06:32:27.958Z|00076|timeval|WARN|faults: 61669 minor, 0
>> major
>> >> 2018-04-05T06:32:27.958Z|00077|timeval|WARN|context switches: 0
>> >> voluntary, 3 involuntary
>> >> 2018-04-05T06:32:27.958Z|00078|coverage|INFO|Event coverage, avg rate
>> >> over last: 5 seconds, last minute, last hour,  hash=1dd2af20:
>> >> 2018-04-05T06:32:27.959Z|00079|coverage|INFO|hmap_pathological
>> >>  25.0/sec    24.633/sec        1.5550/sec   total: 19274
>> >> 2018-04-05T06:32:27.959Z|00080|coverage|INFO|hmap_expand
>> >> 4202.6/sec  4041.417/sec      578.9233/sec   total: 14158936
>> >> 2018-04-05T06:32:27.959Z|00081|coverage|INFO|lockfile_lock
>> >> 0.0/sec     0.000/sec        0.0000/sec   total: 1
>> >> 2018-04-05T06:32:27.960Z|00082|coverage|INFO|poll_create_node
>> >>  50281.4/sec 47195.233/sec    29424.6886/sec   total: 1001238470
>> >> 2018-04-05T06:32:27.960Z|00083|coverage|INFO|poll_zero_timeout
>> >> 9.8/sec    12.167/sec        8.2392/sec   total: 312540
>> >> 2018-04-05T06:32:27.960Z|00084|coverage|INFO|seq_change
>> >>  122.8/sec   115.967/sec       68.2761/sec   total: 2309640
>> >> 2018-04-05T06:32:27.961Z|00085|coverage|INFO|pstream_open
>> >>  0.0/sec     0.000/sec        0.0000/sec   total: 4
>> >> 2018-04-05T06:32:27.961Z|00086|coverage|INFO|stream_open
>> >> 0.0/sec     0.000/sec        0.0000/sec   total: 9
>> >> 2018-04-05T06:32:27.961Z|00087|coverage|INFO|util_xalloc
>> >> 229362.2/sec 218086.767/sec    46494.5058/sec   total: 1344641131
>> >>
>> >>
>> >>
>> >> some data from nb db logs:
>> >> 2018-04-05T23:35:33.458Z|00020|raft|INFO|adding b7bd (b7bd at tcp:
>> >> 192.168.220.103:6645) to cluster a36d succeeded (completed)
>> >> 2018-04-05T23:53:06.360Z|00021|jsonrpc|WARN|tcp:192.168.10.10:57298:
>> >> receive error: Connection reset by peer
>> >> 2018-04-05T23:53:06.360Z|00022|reconnect|WARN|tcp:192.168.10.10:57298:
>> >> connection dropped (Connection reset by peer)
>> >> 2018-04-05T23:53:09.071Z|00023|jsonrpc|WARN|tcp:192.168.10.10:57348:
>> >> receive error: Connection reset by peer
>> >> 2018-04-05T23:53:09.071Z|00024|reconnect|WARN|tcp:192.168.10.10:57348:
>> >> connection dropped (Connection reset by peer)
>> >> 2018-04-06T00:02:03.864Z|00076|raft|INFO|current entry eid
>> >> 0b9eeff2-8377-4f6d-a907-cb4e98396d14 does not match prerequisite
>> >> 277c09b4-96b3-43e9-b482-a9df29576988
>> >> 2018-04-06T00:02:06.882Z|00077|poll_loop|INFO|Dropped 6187 log
>> messages
>> >> in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
>> >> 2018-04-06T00:02:06.882Z|00078|poll_loop|INFO|wakeup due to [POLLIN]
>> on
>> >> fd 31 (192.168.220.102:6645<->192.168.10.10:54782) at
>> >> lib/stream-fd.c:157 (62% CPU usage)
>> >> 2018-04-06T00:02:09.130Z|00079|jsonrpc|WARN|tcp:192.168.10.10:33592:
>> >> receive error: Connection reset by peer
>> >> 2018-04-06T00:02:09.131Z|00080|reconnect|WARN|tcp:192.168.10.10:33592:
>> >> connection dropped (Connection reset by peer)
>> >> 2018-04-06T00:02:33.986Z|00081|poll_loop|INFO|Dropped 1509 log
>> messages
>> >> in last 26 seconds (most recently, 24 seconds ago) due to excessive
>> rate
>> >> 2018-04-06T00:02:33.986Z|00082|poll_loop|INFO|wakeup due to [POLLIN]
>> on
>> >> fd 24 (FIFO pipe:[40290445]) at ovsdb/log.c:994 (51% CPU usage)
>> >> 2018-04-06T00:02:33.986Z|00083|poll_loop|INFO|wakeup due to [POLLIN]
>> on
>> >> fd 29 (192.168.220.102:6645<->192.168.10.10:55738) at
>> >> lib/stream-fd.c:157 (51% CPU usage)
>> >>
>> >>
>> >>
>> >> 2018-04-06T00:13:48.141Z|00037|raft|INFO|rejecting term 4 < current
>> term
>> >> 5 received in vote_reply message from server b7bd
>> >> 2018-04-06T00:14:21.138Z|00038|jsonrpc|WARN|tcp:192.168.10.10:33990:
>> >> receive error: Connection reset by peer
>> >> 2018-04-06T00:14:21.138Z|00039|reconnect|WARN|tcp:192.168.10.10:33990:
>> >> connection dropped (Connection reset by peer)
>> >> 2018-04-06T00:17:01.532Z|00040|raft|INFO|current entry eid
>> >> e07229fe-99d8-436b-8ff5-ee533ed3c825 does not match prerequisite
>> >> d47dbdf1-0981-4077-baab-372b05aeb0bb
>> >> 2018-04-06T00:17:01.532Z|00041|raft|INFO|current entry eid
>> >> e07229fe-99d8-436b-8ff5-ee533ed3c825 does not match prerequisite
>> >> d47dbdf1-0981-4077-baab-372b05aeb0bb
>> >> 2018-04-06T00:17:01.532Z|00042|raft|INFO|current entry eid
>> >> e07229fe-99d8-436b-8ff5-ee533ed3c825 does not match prerequisite
>> >> d47dbdf1-0981-4077-baab-372b05aeb0bb
>> >> 2018-04-06T00:17:01.533Z|00043|raft|INFO|current entry eid
>> >> e07229fe-99d8-436b-8ff5-ee533ed3c825 does not match prerequisite
>> >> d47dbdf1-0981-4077-baab-372b05aeb0bb
>> >> 2018-04-06T00:17:01.533Z|00044|raft|INFO|current entry eid
>> >> e07229fe-99d8-436b-8ff5-ee533ed3c825 does not match prerequisite
>> >> d47dbdf1-0981-4077-baab-372b05aeb0bb
>> >> 2018-04-06T00:19:32.039Z|00045|timeval|WARN|Unreasonably long 1373ms
>> >> poll interval (1340ms user, 24ms system)
>> >> 2018-04-06T00:19:32.039Z|00046|timeval|WARN|faults: 319 minor, 0 major
>> >> 2018-04-06T00:19:32.039Z|00047|timeval|WARN|disk: 0 reads, 368 writes
>> >> 2018-04-06T00:19:32.039Z|00048|timeval|WARN|context switches: 5
>> >> voluntary, 19 involuntary
>> >> 2018-04-06T00:19:32.039Z|00049|coverage|INFO|Event coverage, avg rate
>> >> over last: 5 seconds, last minute, last hour,  hash=1dd2af20:
>> >> 2018-04-06T00:19:32.039Z|00050|coverage|INFO|hmap_pathological
>> >> 0.0/sec     0.000/sec        3.4164/sec   total: 12306
>> >> 2018-04-06T00:19:32.039Z|00051|coverage|INFO|hmap_expand
>> >>  73.6/sec    71.267/sec      729.9656/sec   total: 2628980
>> >> 2018-04-06T00:19:32.039Z|00052|coverage|INFO|lockfile_lock
>> >> 0.0/sec     0.000/sec        0.0003/sec   total: 1
>> >> 2018-04-06T00:19:32.039Z|00053|coverage|INFO|poll_create_node
>> >>  129.6/sec   122.333/sec      866.7178/sec   total: 3120304
>> >> 2018-04-06T00:19:32.039Z|00054|coverage|INFO|poll_zero_timeout
>> >> 0.0/sec     0.267/sec       27.4550/sec   total: 98838
>> >> 2018-04-06T00:19:32.039Z|00055|coverage|INFO|seq_change
>> >> 10.8/sec    10.217/sec      135.4125/sec   total: 487498
>> >> 2018-04-06T00:19:32.039Z|00056|coverage|INFO|pstream_open
>> >>  0.0/sec     0.000/sec        0.0011/sec   total: 4
>> >> 2018-04-06T00:19:32.039Z|00057|coverage|INFO|stream_open
>> >> 0.0/sec     0.000/sec        0.0019/sec   total: 7
>> >> 2018-04-06T00:19:32.039Z|00058|coverage|INFO|util_xalloc
>> >> 1542.2/sec  1488.750/sec    21346.2483/sec   total: 76872987
>> >> 2018-04-06T00:19:32.039Z|00059|coverage|
>> >>
>> >>
>> >>
>> >> To confirm that wait-until was the only bottleneck, I skipped wait-until
>> >> and the test completed in 2.4 hours, but things did not go as expected.
>> >> All the chassis ovn-controllers crashed and ports did not actually get
>> >> bound all the way down to the computes, as the ovn sb db was running
>> >> super hot.
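>> >>
>> >> (One thing worth grabbing while the sb db is pegged is its memory/show
>> >> output, which reports the cells/monitors/sessions counts for the
>> >> server; a rough sketch, with the ctl socket path depending on how the
>> >> server was started:
>> >> ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl memory/show
>> >> )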
>> >>
>> >> Below is the controller log from a sample HV without wait-until.
>> >> 2018-04-06T04:50:40.274Z|01703|poll_loop|INFO|Dropped 5903 log
>> messages
>> >> in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
>> >> 2018-04-06T04:50:40.274Z|01704|poll_loop|INFO|wakeup due to
>> >> [POLLIN][POLLHUP] on fd 13 (<->/root/sandbox-192.168.72.45/br-int.mgmt)
>> >> at ../lib/stream-fd.c:157 (100% CPU usage)
>> >> 2018-04-06T04:50:41.060Z|01705|ovsdb_idl|INFO|tcp:192.168.220.102:6642:
>> >> clustered database server is disconnected from cluster; trying another
>> >> server
>> >> 2018-04-06T04:50:41.066Z|01706|binding|INFO|Claiming lport
>> >> lport_a3060c_mlAOEq for this chassis.
>> >> 2018-04-06T04:50:41.066Z|01707|binding|INFO|lport_a3060c_mlAOEq:
>> >> Claiming a2:1c:47:32:30:52 172.145.23.37
>> >> 2018-04-06T04:50:41.066Z|01708|vconn_stream|ERR|send: Broken pipe
>> >> 2018-04-06T04:50:41.066Z|01709|rconn|WARN|unix:/root/sandbox-
>> >> 192.168.72.45/br-int.mgmt: connection dropped (Broken pipe)
>> >> 2018-04-06T04:50:41.067Z|01710|rconn|INFO|unix:/root/sandbox-
>> >> 192.168.72.45/br-int.mgmt: connected
>> >> 2018-04-06T04:50:41.067Z|01711|rconn|WARN|unix:/root/sandbox-
>> >> 192.168.72.45/br-int.mgmt: connection dropped (Broken pipe)
>> >> 2018-04-06T04:50:41.067Z|01712|util|EMER|../lib/ofp-msgs.c:1062:
>> >> assertion version >= info->min_version && version <= info->max_version
>> >> failed in raw_instance_get()
>> >>
>> >>
>> >>
>> >> Also northd sees many transaction errors without wait-until:
>> >> northd.log
>> >> 2018-04-06T04:25:26.594Z|01076|ovsdb_idl|WARN|transaction error:
>> >> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>> to
>> >> have identical values (44d6726d-2a3a-428d-b8d4-b0d1e4c9593b and 104)
>> for
>> >> index on columns \"datapath\" and \"tunnel_key\".  First row, with UUID
>> >> 489f777f-d7f6-4b92-80b9-c1c64c083ea4, existed in the database before
>> >> this transaction and was not modified by the transaction.  Second row,
>> with
>> >> UUID b59a2500-3da1-40ab-932c-5e2dd4acc3b4, was inserted by this
>> >> transaction.","error":"constraint violation"}
>> >> 2018-04-06T04:25:27.202Z|01077|ovsdb_idl|WARN|transaction error:
>> >> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>> to
>> >> have identical values (44d6726d-2a3a-428d-b8d4-b0d1e4c9593b and 26)
>> for
>> >> index on columns \"datapath\" and \"tunnel_key\".  First row, with UUID
>> >> 0668b300-55f8-4bdd-b0a8-8038258d4502, was inserted by this
>> transaction.
>> >> Second row, with UUID 63972d45-8845-4489-ac3e-916909a593c7, existed in
>> >> the database before this transaction and was not modified by the
>> >> transaction.","error":"constraint violation"}
>> >> 2018-04-06T04:25:32.036Z|01078|ovsdb_idl|WARN|transaction error:
>> >> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>> to
>> >> have identical values (44d6726d-2a3a-428d-b8d4-b0d1e4c9593b and 224)
>> for
>> >> index on columns \"datapath\" and \"tunnel_key\".  First row, with UUID
>> >> 528dea47-f284-40d6-bc7e-a97a084f49eb, existed in the database before
>> >> this transaction and was not modified by the transaction.  Second row,
>> with
>> >> UUID 6fcc2500-fedb-4ee9-b54d-ea548dd36936, was inserted by this
>> >> transaction.","error":"constraint violation"}
>> >> 2018-04-06T04:25:32.482Z|01079|ovsdb_idl|WARN|transaction error:
>> >> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>> to
>> >> have identical values (44d6726d-2a3a-428d-b8d4-b0d1e4c9593b and 23)
>> for
>> >> index on columns \"datapath\" and \"tunnel_key\".  First row, with UUID
>> >> 925eb001-adeb-4748-a55f-6c5812e70a7e, was inserted by this
>> transaction.
>> >> Second row, with UUID 5ed7bbc5-b7f5-4d4e-8e99-d8c4adcb2b34, existed in
>> >> the database before this transaction and was not modified by the
>> >> transaction.","error":"constraint violation"}
>> >> 2018-04-06T04:25:35.204Z|01080|ovsdb_idl|WARN|transaction error:
>> >> {"details":"Transaction causes multiple rows in \"Port_Binding\" table
>> to
>> >> have identical values (44d6726d-2a3a-428d-b8d4-b0d1e4c9593b and 18)
>> for
>> >> index on columns \"datapath\" and \"tunnel_key\".  First row, with UUID
>> >> 4b429b00-75ca-4e38-904d-80a11e6c0f38, was inserted by this
>> transaction.
>> >> Second row, with UUID c2db03fe-18b6-4a2b-8db0-bf37ef93db92, existed in
>> >> the database before this transaction and was not modified by the
>> >> transaction.","error":"constraint violation"}
>> >> 2018-04-06T04:25:35.933Z|01081|ovsdb_idl|INFO|tcp:192.168.220.101:6642:
>> >> clustered database server is not cluster leader; trying another server
>> >>
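>> >> (The constraint violations above are duplicate (datapath, tunnel_key)
>> >> pairs in Port_Binding; the rows already holding a given key can be
>> >> listed with the generic find command -- an untested sketch using the
>> >> datapath uuid and one tunnel_key from the log above:
>> >> ovn-sbctl --columns=logical_port,tunnel_key find Port_Binding \
>> >>     datapath=44d6726d-2a3a-428d-b8d4-b0d1e4c9593b tunnel_key=104
>> >> )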
>> >>
>> >> To summarize:
>> >>
>> >>    - ovn nb db has issues with wait-until and works fine without
>> >>    wait-until
>> >>    - ovn sb db causes ovn-controller to crash with many HVs and just a
>> >>    few hundred ports, with no solid root cause yet, without wait-until.
>> >>    Even sbctl commands hang as the sb db is at 100% cpu.
>> >>    - ovn northd sees transaction errors without wait-until
>> >>
>> >>
>> >>
>> >> Will continue to troubleshoot and would love to get any
>> >> improvements/suggestions.
>> >>
>> >>
>> >> Regards,
>> >>
>> >>
>> >
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
>

