[ovs-discuss] ovn-controller and northd thrashing 100% cpu due to l3 logical flow update2->transaction->update2->...

Flaviof flavio at flaviof.com
Mon Jul 18 17:16:34 UTC 2016


On Mon, Jul 18, 2016 at 12:36 PM, Guru Shetty <guru at ovn.org> wrote:

>
>
> On 17 July 2016 at 19:00, Flaviof <flavio at flaviof.com> wrote:
>
>> Hi folks,
>>
>> I may be configuring something wrong, but my test VM setup consistently
>> spins at 100% CPU utilization after I apply the following config:
>>
>>    3 VM: db, compute1, compute2
>>    2 ls, each with 1 lsp, 1 lr that has logical ports on both ls
>>
>> https://gist.github.com/a5547b0b98a9e29f6e52b7142072b905
>>
>> # Create logical switches "ls1" and "ls2".
>> sudo ovn-nbctl ls-add ls1
>> sudo ovn-nbctl ls-add ls2
>>
>> # Create logical port on "ls1" and "ls2".
>> sudo ovn-nbctl lsp-add ls1 ls1-port1
>> sudo ovn-nbctl lsp-add ls2 ls2-port1
>>
>> # Set a MAC address for each of the two logical ports.
>> sudo ovn-nbctl lsp-set-addresses ls1-port1 00:00:00:00:00:01
>> sudo ovn-nbctl lsp-set-addresses ls2-port1 00:00:00:00:00:02
>>
>> # Set up port security for the two logical ports.
>> sudo ovn-nbctl lsp-set-port-security ls1-port1 00:00:00:00:00:01
>> sudo ovn-nbctl lsp-set-port-security ls2-port1 00:00:00:00:00:02
>>
>> # Add a logical router, so 1.0.0.1 can reach 2.0.0.1
>> sudo ovn-nbctl lr-add lr0
>>
>> sudo ovn-nbctl lrp-add lr0 lrp1 00:00:00:01:00:01 1.0.0.2/24 peer=lrp1-attachment
>>
>
> The above is wrong. "peer" should only be used to connect 2 routers
> together. The OVN unit tests have a bug that does the same thing. I am
> looking at all the places where this problem exists and will send a fix.
>
>

Aha! And that was a copy-and-paste from ovn.at, so now I know why I did it.
;)

Yup, as soon as I removed the peer attribute things look _a_lot_ better.
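
For reference, a rough sketch of the router port setup with the peer
attribute dropped and everything else from the quoted config left
unchanged. The run helper and the APPLY variable are hypothetical, made up
for this sketch so the commands can be previewed before touching the
northbound DB; they are not part of OVN.

```shell
#!/bin/sh
# Sketch only: router port setup from the quoted config, minus "peer".
# run() and APPLY are hypothetical helpers for this example, not part of OVN.
# By default each command is only printed; set APPLY=1 to execute it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        sudo "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Router port on lr0. No peer=... here: "peer" is only for connecting
# two routers together, and the other end is a switch port.
run ovn-nbctl lrp-add lr0 lrp1 00:00:00:01:00:01 1.0.0.2/24

# The switch-side port still refers to the router port through
# options:router-port, exactly as in the original config.
run ovn-nbctl -- lsp-add ls1 lrp1-attachment \
              -- set Logical_Switch_Port lrp1-attachment \
                 type=router \
                 options:router-port=lrp1 \
                 addresses='"00:00:00:01:00:01 1.0.0.2"'

# Same again for the second switch.
run ovn-nbctl lrp-add lr0 lrp2 00:00:00:01:00:02 2.0.0.2/24
run ovn-nbctl -- lsp-add ls2 lrp2-attachment \
              -- set Logical_Switch_Port lrp2-attachment \
                 type=router \
                 options:router-port=lrp2 \
                 addresses='"00:00:00:01:00:02 2.0.0.2"'
```

Running it as-is only prints the four commands; APPLY=1 runs them through
sudo.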

Thank you Guru!

-- flaviof





>> sudo ovn-nbctl -- lsp-add ls1 lrp1-attachment \
>>                -- set Logical_Switch_Port lrp1-attachment \
>>                   type=router \
>>                   options:router-port=lrp1 \
>>                   addresses='"00:00:00:01:00:01 1.0.0.2"'
>>
>> sudo ovn-nbctl lrp-add lr0 lrp2 00:00:00:01:00:02 2.0.0.2/24 peer=lrp2-attachment
>> sudo ovn-nbctl -- lsp-add ls2 lrp2-attachment \
>>                -- set Logical_Switch_Port lrp2-attachment \
>>                   type=router \
>>                   options:router-port=lrp2 \
>>                   addresses='"00:00:00:01:00:02 2.0.0.2"'
>>
>> Note that I can make this happen even without creating any OVS ports on
>> the compute nodes. The logs are available here [1]. From them, northd and
>> the ovn-controllers appear to be reacting to ovsdb update2 notifications,
>> and that is causing this vicious cycle in northd [2]:
>>
>> central/db:  https://gist.github.com/62acf7b41860b3ed510e1f7802677264
>> c1: https://gist.github.com/6556f6fc30656a7d5dd6f7d3051173d5
>> c2: https://gist.github.com/9437c2b0b5fbd64c7af14d196e0cf50f
>>
>> Output of flows and client dumps:
>>
>> central: https://gist.github.com/63eb195977a28d790d74cb0cf500c72d
>> c1: https://gist.github.com/12fcb1dfe091d5cbcc53b657f67f080a
>> c2: https://gist.github.com/87dffff7ccb6de9231031e72606e68b8
>>
>> Any ideas or suggestions for debugging this further? Maybe you see
>> something I'm doing wrong?
>>
>> Thanks,
>>
>> -- flaviof
>>
>> [1]:
>> https://www.dropbox.com/sh/r5neb4nfdyktgi9/AACRGmtKZNPky1EE4QRw4CwRa?dl=0
>>
>> [2]:
>>
>> central:
>> 2016-07-18T01:26:28.147Z|25613|poll_loop|DBG|wakeup due to [POLLIN] on fd
>> 12 (<->/var/run/openvswitch/ovnsb_db.sock) at lib/stream-fd.c:155 (47% CPU
>> usage)
>> 2016-07-18T01:26:28.147Z|25614|jsonrpc|DBG|unix:/var/run/openvswitch/ovnsb_db.sock:
>> received notification, method="update2",
>> params=[null,{"Logical_Flow":{"e3651924-6a8c-4bc2-9696-be12719b52d9":{"delete":null},"f5b5091e-23ee-4ce8-8b9f-8e2630901bd6":{"insert":{"match":"outport
>> == \"lrp2-attachment\" && reg0 ==
>> 2.0.0.2","pipeline":"ingress","priority":100,"logical_datapath":["uuid","39e023cc-594a-43a1-8934-ff530d47602c"],"table_id":5,"actions":"eth.dst
>> = 00:00:00:01:00:02;
>> next;"}},"7a6b2ce7-f3f9-486b-a094-15eadd331b4c":{"insert":{"match":"outport
>> == \"lrp1-attachment\" && reg0 ==
>> 1.0.0.2","pipeline":"ingress","priority":100,"logical_datapath":["uuid","f3d761aa-4350-400c-9472-43efe9a81cc5"],"table_id":5,"actions":"eth.dst
>> = 00:00:00:01:00:01;
>> next;"}},"0927e555-45eb-4c8d-9ad7-22a14545e31f":{"delete":null}}}]
>> 2016-07-18T01:26:28.147Z|25615|jsonrpc|DBG|unix:/var/run/openvswitch/ovnsb_db.sock:
>> received reply,
>> result=[{"count":1},{"uuid":["uuid","f5b5091e-23ee-4ce8-8b9f-8e2630901bd6"]},{"count":1},{"uuid":["uuid","7a6b2ce7-f3f9-486b-a094-15eadd331b4c"]}],
>> id=5049
>> 2016-07-18T01:26:28.147Z|25616|poll_loop|DBG|wakeup due to 0-ms timeout
>> at lib/ovsdb-idl.c:3505 (47% CPU usage)
>> 2016-07-18T01:26:28.148Z|25617|jsonrpc|DBG|unix:/var/run/openvswitch/ovnsb_db.sock:
>> send request, method="transact",
>> params=["OVN_Southbound",{"uuid-name":"rowae010ada_2573_4d58_babf_7cacdd193683","row":{"pipeline":"ingress","match":"outport
>> == \"lrp1-attachment\" && reg0 ==
>> 1.0.0.2","priority":100,"logical_datapath":["uuid","f3d761aa-4350-400c-9472-43efe9a81cc5"],"table_id":5,"actions":"eth.dst
>> = 00:00:00:01:00:01;
>> next;","external_ids":["map",[["stage-name","lr_in_arp_resolve"]]]},"op":"insert","table":"Logical_Flow"},{"where":[["_uuid","==",["uuid","f5b5091e-23ee-4ce8-8b9f-8e2630901bd6"]]],"op":"delete","table":"Logical_Flow"},{"uuid-name":"row9ace491f_7392_49ae_af90_c8981d5332d4","row":{"pipeline":"ingress","match":"outport
>> == \"lrp2-attachment\" && reg0 ==
>> 2.0.0.2","priority":100,"logical_datapath":["uuid","39e023cc-594a-43a1-8934-ff530d47602c"],"table_id":5,"actions":"eth.dst
>> = 00:00:00:01:00:02;
>> next;","external_ids":["map",[["stage-name","lr_in_arp_resolve"]]]},"op":"insert","table":"Logical_Flow"},{"where":[["_uuid","==",["uuid","7a6b2ce7-f3f9-486b-a094-15eadd331b4c"]]],"op":"delete","table":"Logical_Flow"}],
>> id=5050
>> ===
>> c1:
>> 2016-07-18T01:26:24.343Z|05024|jsonrpc|DBG|tcp:192.168.33.11:6642:
>> received notification, method="update2",
>> params=[null,{"Logical_Flow":{"0a3e67e2-fcbd-41e9-936a-f88455f170e4":{"insert":{"match":"outport
>> == \"lrp2-attachment\" && reg0 ==
>> 2.0.0.2","pipeline":"ingress","priority":100,"logical_datapath":["uuid","39e023cc-594a-43a1-8934-ff530d47602c"],"table_id":5,"external_ids":["map",[["stage-name","lr_in_arp_resolve"]]],"actions":"eth.dst
>> = 00:00:00:01:00:02;
>> next;"}},"d75e17c9-f71b-4b61-a792-bf0f79f58acf":{"delete":null},"fa8c8b4e-ec34-47a5-ae28-1ec78573aa04":{"insert":{"match":"outport
>> == \"lrp1-attachment\" && reg0 ==
>> 1.0.0.2","pipeline":"ingress","priority":100,"logical_datapath":["uuid","f3d761aa-4350-400c-9472-43efe9a81cc5"],"table_id":5,"external_ids":["map",[["stage-name","lr_in_arp_resolve"]]],"actions":"eth.dst
>> = 00:00:00:01:00:01;
>> next;"}},"dc541d60-0969-4da5-83df-cad58e541fa0":{"delete":null}}}]
>> ===
>> c2:
>> 2016-07-18T01:26:25.884Z|06227|jsonrpc|DBG|tcp:192.168.33.11:6642:
>> received notification, method="update2",
>> params=[null,{"Logical_Flow":{"9ba59604-cbc8-489e-8c09-b67625df8f1e":{"delete":null},"25063148-af81-4242-883c-b3cc729488ec":{"insert":{"match":"outport
>> == \"lrp1-attachment\" && reg0 ==
>> 1.0.0.2","pipeline":"ingress","priority":100,"logical_datapath":["uuid","f3d761aa-4350-400c-9472-43efe9a81cc5"],"table_id":5,"external_ids":["map",[["stage-name","lr_in_arp_resolve"]]],"actions":"eth.dst
>> = 00:00:00:01:00:01;
>> next;"}},"8d0bff01-9672-4dc7-a761-41fa840b86af":{"delete":null},"7983f3e5-9808-457e-9ad3-23c390dd419b":{"insert":{"match":"outport
>> == \"lrp2-attachment\" && reg0 ==
>> 2.0.0.2","pipeline":"ingress","priority":100,"logical_datapath":["uuid","39e023cc-594a-43a1-8934-ff530d47602c"],"table_id":5,"external_ids":["map",[["stage-name","lr_in_arp_resolve"]]],"actions":"eth.dst
>> = 00:00:00:01:00:02; next;"}}}}]
>>