[ovs-discuss] [External] : Re: /etc/openvswitch/conf.db filling up with lots of "ovn-controller: modifying OVS tunnels" updates

Brendan Doyle brendan.doyle at oracle.com
Thu Oct 28 09:19:23 UTC 2021


Numan,

Just wondering if you got a chance to look at those logs?

Thanks

Brendan

On 27/10/2021 11:25, Brendan Doyle wrote:
> Hi,
>
> I finally got some debug logs. I truncated them after the failure 
> occurs, since the truncated entries are just repeated updates of the 
> same entry.
>
> To shed some more light on this: it seems to be a timing issue. The 
> test being run involves creating a number of Logical Switches (LS), 
> Logical Routers (LR) and Distributed Router Port gateways (DR), and 
> then immediately deleting them, with the last created DR being 
> deleted first. Our CMS is using the ovsdbapp Python lib to do this.
>
> So it occurs to me that perhaps the objects get created in NB, but 
> before they have been
> propagated to SB and to the HV chassis, we get the delete, and this 
> causes updates to
> be sent to the chassis for a logical port that does not exist? Just a 
> hypothesis.
>
> ovn-nbctl has a synchronization flag (--wait) to guard against such 
> behavior. I wonder whether ovsdbapp has an equivalent?
>
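
A minimal sketch of one way to put such a barrier between the create and
delete phases when the NB DB is driven from Python: shell out to
`ovn-nbctl --wait=hv sync`, which waits for the chassis to catch up with the
NB DB. This assumes ovn-nbctl is on PATH and can reach the NB DB. The names
`sync_command`, `wait_for_hv` and the `runner` parameter are hypothetical,
and whether ovsdbapp offers an equivalent is not established in this thread.

```python
import subprocess


def sync_command(wait="hv"):
    """Build the ovn-nbctl barrier command; wait is "sb" or "hv"."""
    if wait not in ("sb", "hv"):
        raise ValueError("wait must be 'sb' or 'hv'")
    return ["ovn-nbctl", f"--wait={wait}", "sync"]


def wait_for_hv(runner=subprocess.run, timeout=60):
    """Block until the chassis have caught up with the NB DB (hypothetical helper).

    runner is injectable so the helper can be exercised without OVN installed.
    """
    # check=True makes subprocess.run raise if ovn-nbctl exits non-zero.
    return runner(sync_command("hv"), check=True, timeout=timeout)
```

Calling `wait_for_hv()` after the last create and before the first delete
would replace the fixed delay with an explicit wait, if the timing
hypothesis above is right.
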
> In any case, the test fails (we see a runaway conf.db) pretty 
> regularly, but not every time. The failure is always observed on the 
> delete operations. If I put a delay after create and before delete, 
> then we don't see the failure.
>
> If anyone can shed light on this from the logs, it would be much appreciated.
>
> Thanks
>
> Brendan
>
> On 26/10/2021 17:11, Brendan Doyle wrote:
>>
>>
>> On 26/10/2021 15:50, Numan Siddique wrote:
>>> On Tue, Oct 26, 2021 at 8:20 AM Brendan Doyle 
>>> <brendan.doyle at oracle.com> wrote:
>>>> Hi,
>>>>
>>>>
>>>> What is very odd here is that I have used ovn-nbctl to delete the NB
>>>> config, so both of these return nothing:
>>>> # ovn-nbctl show
>>>> # ovn-sbctl lflow-list
>>>>
>>>> Yet I still see /etc/openvswitch/conf.db growing with updates for
>>>> Logical switch ports that no longer exist!
>>>>
>>>> "],["ct-zone-ln-ls_vcn9195577_external_ugw","220"],["ct-zone-ln-ls_vcn9206002_external_igw","110"],["ct-zone-ln-ls_vcn9210052_external_igw","110"],["ct-zone-ln-ls_vcn9232395_external_ugw","75"],["ct-zone-ln-ls_vcn9236987_external_igw","110"],["ct-zone-ln-ls_vcn9236987_external_ugw","78"],["ct-zone-ln-ls_vcn9255861_external_igw","118"],["ct-zone-ln-ls_vcn9255861_external_ugw","100"],["ct-zone-ln-ls_vcn9319435_external_igw","87"],["ct-zone-ln-ls_vcn9352502_external_igw","40"],["ct-zone-ln-ls_vcn9402504_external_ugw","99"],["ct-zone-ln-ls_vcn9403404_external_igw","133"],["ct-zone-ln-ls_vcn9403404_external_ugw","114"],["ct-zone-ln-ls_vcn9461566_external_ugw","191"],["ct-zone-ln-ls_vcn9480000_external_igw","254"],["ct-zone-ln-ls_vcn9480000_external_ugw","236"],["ct-zone-ln-ls_vcn9492134_external_igw","262"],["ct-zone-ln-ls_vcn9523503_external_igw","207"],["ct-zone-ln-ls_vcn9542102_external_igw","133"],["ct-zone-ln-ls_vcn9542102_external_ugw","115"],["ct-zone-ln-ls_vcn9559658_external_igw","125"],["ct-zone-ln-ls_vcn9559658_external_ugw","78"],["ct-zone-ln-ls_vcn9594034_external_igw","49"],["ct-zone-ln-ls_vcn9619021_external_igw","133"],["ct-zone-ln-ls_vcn9634773_external_igw","292"],["ct-zone-ln-ls_vcn9649169_external_igw","132"],["ct-zone-ln-ls_vcn9649169_external_ugw","110"],["ct-zone-ln-ls_vcn9661290_external_ugw","78"],["ct-zone-ln-ls_vcn9734192_external_ugw","114"],["ct-zone-ln-ls_vcn9774252_external_igw","262"],["ct-zone-ln-ls_vcn9796262_external_igw","72"],["ct-zone-ln-ls_vcn9796262_external_ugw","54"],["ct-zone-ln-ls_vcn9805903_external_igw","147"],["ct-zone-ln-ls_vcn9805903_external_ugw","126"],["ct-zone-ln-ls_vcn9809895_external_igw","246"],["ct-zone-ln-ls_vcn9812576_external_ugw","78"],["ct-zone-ln-ls_vcn9834728_external_igw","110"],["ct-zone-ln-ls_vcn9886683_external_ugw","114"],["ct-zone-ln-ls_vcn9903419_external_ugw","235"],["ct-zone-ln-ls_vcn9917510_external_igw","56"],["ct-zone-ln-ls_vcn9917510_external_ugw","38"]]]}},"_comment":"ovn-controller: 
>>>>
>>>> modifying OVS tunnels 'pcacn001'"}
>>>>
>>>> That is a shortened version of one entry. Could it be that switch
>>>> ports must be deleted before deleting the switch? I was under the
>>>> impression that once a switch is deleted its ports get deleted too?
>>> Yes.  If you delete the switch,  the switch ports get deleted too.
>>>
>>> After deleting the logical switch (or switch ports), do you see them
>>> being deleted by ovn-northd in the SB DB?
>>>
>>> Run - ovn-sbctl list port_binding <deleted_port>
>>> or/and
>>>
>>> ovn-sbctl list datapath_binding <deleted_lswitch>
>>>
>>> I'd suggest you enable jsonrpc debug in ovn-controller and see 
>>> what's happening.
>>> It would be helpful if you can share the ovn-controller debug logs.
>>>
>>> ovn-appctl -t ovn-controller vlog/set jsonrpc:dbg
>>>
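
The SB check suggested above can be sketched as a small polling loop. This
is an illustration only: it assumes ovn-sbctl is on PATH, and it uses the
generic database command `find` with `--bare --columns=_uuid` so that an
empty result means "no such row" rather than an error exit. The helper
names, the `runner` and `sleep` parameters, and the retry defaults are
hypothetical.

```python
import subprocess
import time


def find_port_binding_cmd(logical_port):
    """Command that prints the UUID of the Port_Binding row, or nothing."""
    return ["ovn-sbctl", "--bare", "--columns=_uuid", "find",
            "port_binding", f"logical_port={logical_port}"]


def wait_until_gone(logical_port, runner=subprocess.run,
                    attempts=30, delay=1.0, sleep=time.sleep):
    """Poll the SB DB until the port binding disappears.

    Returns True once the row is gone, False if it is still present
    after the given number of attempts.
    """
    for _ in range(attempts):
        result = runner(find_port_binding_cmd(logical_port),
                        capture_output=True, text=True, check=True)
        if not result.stdout.strip():
            return True  # ovn-northd has removed the row
        sleep(delay)
    return False
```
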
>>
>>
>> So in my test I create a simple network then delete it so NB DB and 
>> SB DB
>> are empty.
>>
>> # ovn-sbctl list port_binding
>> # ovn-sbctl list datapath_binding
>> #
>>
>> The network has a number of LSs and LRs and two Distributed Router 
>> (DR) ports (on separate LRs). When I create just one DR all seems 
>> fine, but when I add the second into the mix I get a runaway 
>> openvswitch/conf.db, but NOT on all chassis. I have 4 chassis that I 
>> can schedule the DR ports to. In this latest test I observed the 
>> runaway conf.db on pcacn003 & pcacn005. The logs are too large to 
>> send in email; is there an FTP server that I can upload to?
>>
>> I will redo the test with debug enabled and collect updated logs. The 
>> conf.db on both pcacn003 & pcacn005 is several GB.
>>
>>
>> The only way to recover is to stop the OVS/OVN procs, then delete 
>> /etc/openvswitch/conf.db
>> and restart them.
>>
>> Brendan
>>
>>
>>
>>
>>> Thanks
>>> Numan
>>>
>>>>
>>>> switch 712757c3-2481-4f8b-940c-05dc13ce37a5 
>>>> (ls_vcn9319435_external_ugw)
>>>>       port ls_vcn9319435_external_ugw-lr_vcn9319435
>>>>           type: router
>>>>           router-port: lr_vcn9319435-ls_vcn9319435_external_ugw
>>>>       port ln-ls_vcn9319435_external_ugw
>>>>           type: localnet
>>>>           addresses: ["unknown"]
>>>>
>>>> router 80c281af-319b-416b-8a17-0ce7b8901bb1 (lr_vcn9319435)
>>>>       port lr_vcn9319435-ls_vcn9319435_external_ugw
>>>>           mac: "00:13:97:88:31:90"
>>>>           networks: ["253.255.80.4/16"]
>>>>           gateway chassis: [pcacn002 pcacn003 pcacn001]
>>>>       port lr_vcn9319435-lsb_vcn9319435
>>>>           mac: "00:13:97:d4:26:ec"
>>>>           networks: ["253.255.29.2/25"]
>>>>       nat 6c87050f-cd27-423e-815e-deda74bd9bc6
>>>>           external ip: "253.255.80.4"
>>>>           logical ip: "10.221.0.0/16"
>>>>           type: "snat"
>>>>
>>>> Does each port have to be deleted, or is it OK to just delete the
>>>> switch and router?
>>>>
>>>> Brendan
>>>>
>>>> On 25/10/2021 16:10, Brendan Doyle wrote:
>>>>>
>>>>> On 25/10/2021 15:08, Numan Siddique wrote:
>>>>>> On Fri, Oct 22, 2021 at 9:30 AM Brendan Doyle
>>>>>> <brendan.doyle at oracle.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>
>>>>>>> Looking at /etc/openvswitch/conf.db I see it getting very large:
>>>>>>>
>>>>>>> [root at pcacn001 ~]#  ls -l /etc/openvswitch/conf.db
>>>>>>> -rw-r--r--. 1 root root 6069248828 Oct 22 11:55
>>>>>>> /etc/openvswitch/conf.db
>>>>>>>
>>>>>>> It has lots and lots of (mostly) "ovn-controller: modifying OVS 
>>>>>>> tunnels" update entries, like the one below.
>>>>>>> What are these? This does not seem normal.
>>>>>>> OVSDB JSON 4687 00e8788dd5d9af2aac5ca7724759017c52ddd580
>>>>>>> {"_date":1634903752117,"Bridge":{"745726c4-0451-4f52-a52b-1f9c5e85c703":{"external_ids":["map",[["ct-zone-0dca7370-1c18-4117-84e4-a72f277ccc6c_dnat","4"],["ct-zone-0dca7370-1c18-4117-84e4-a72f277ccc6c_snat","1"],["ct-zone-11637f38-8725-4c77-adfe-f9c4c804ae8c_dnat","4"],["ct-zone-11637f38-8725-4c77-adfe-f9c4c804ae8c_snat","5"],["ct-zone-1de487d1-f3a5-4b15-bae4-aa8cf794fcf9_dnat","17"],["ct-zone-1de487d1-f3a5-4b15-bae4-aa8cf794fcf9_snat","7"],["ct-zone-22c71c2a-0e59-41cc-a2da-91d3c7276c11_dnat","9"],["ct-zone-22c71c2a-0e59-41cc-a2da-91d3c7276c11_snat","10"],["ct-zone-3228b120-4192-476b-ab67-51fb45e786d6_dnat","3"],["ct-zone-3228b120-4192-476b-ab67-51fb45e786d6_snat","4"],["ct-zone-3753ff1a-d0cf-48e4-b06a-640f0467d202_dnat","19"],["ct-zone-3753ff1a-d0cf-48e4-b06a-640f0467d202_snat","18"],["ct-zone-3c1c02f4-31c9-45d4-9c63-54ad2122bb15_dnat","10"],["ct-zone-3c1c02f4-31c9-45d4-9c63-54ad2122bb15_snat","16"],["ct-zone-423896cb-5573-4c54-b6e2-38f192eacae3_dnat","9"],["ct-zone-423896cb-5573 
>>>>>>>
>>>>>>>
>>>>> -4c54-b6e2-38f192eacae3_snat","12"],["ct-zone-46b7b247-31a7-4fbb-88b9-0f3db042409c_dnat","10"],["ct-zone-46b7b247-31a7-4fbb-88b9-0f3db042409c_snat","11"],["ct-zone-51376927-fca0-49b3-b0ba-1aa22153b366_dnat","2"],["ct-zone-51376927-fca0-49b3-b0ba-1aa22153b366_snat","5"],["ct-zone-58033baa-916d-47d4-bcf0-d95f7fb1f861_dnat","18"],["ct-zone-58033baa-916d-47d4-bcf0-d95f7fb1f861_snat","3"],["ct-zone-5f92f974-f0dc-4820-bb43-a14cc16d851f_dnat","12"],["ct-zone-5f92f974-f0dc-4820-bb43-a14cc16d851f_snat","11"],["ct-zone-87055326-0535-4042-a0ff-bf0e9f494433_dnat","10"],["ct-zone-87055326-0535-4042-a0ff-bf0e9f494433_snat","12"],["ct-zone-8a840bfe-118f-4041-ac72-0637d6373ffc_dnat","1"],["ct-zone-8a840bfe-118f-4041-ac72-0637d6373ffc_snat","11"],["ct-zone-8fff9b0b-0fd6-42f9-ab77-e9f1475a5d82_dnat","2"],["ct-zone-8fff9b0b-0fd6-42f9-ab77-e9f1475a5d82_snat","13"],["ct-zone-913c36a1-f987-4084-9119-f279b317c72f_dnat","11"],["ct-zone-913c36a1-f987-4084-9119-f279b317c72f_snat","12"],["ct-zone-9498aca9-762 
>>>>>
>>>>>
>>>>> 3-4ce0-a0ff-d4d5c17d7223_dnat","19"],["ct-zone-9498aca9-7623-4ce0-a0ff-d4d5c17d7223_snat","15"],["ct-zone-9c373522-fd02-424f-a2b3-14dc359062d2_dnat","18"],["ct-zone-9c373522-fd02-424f-a2b3-14dc359062d2_snat","17"],["ct-zone-a28b45db-2dfb-4d38-905c-c5eb44da8c9c_dnat","13"],["ct-zone-a28b45db-2dfb-4d38-905c-c5eb44da8c9c_snat","10"],["ct-zone-b1e8636a-5cf8-48ba-9693-793a59e5430d_dnat","8"],["ct-zone-b1e8636a-5cf8-48ba-9693-793a59e5430d_snat","14"],["ct-zone-bbcc6e17-ee1e-4e82-b404-1dd0f1307002_dnat","12"],["ct-zone-bbcc6e17-ee1e-4e82-b404-1dd0f1307002_snat","11"],["ct-zone-bd3b86b7-2aba-4ff7-a5f7-975612692aca_dnat","13"],["ct-zone-bd3b86b7-2aba-4ff7-a5f7-975612692aca_snat","10"],["ct-zone-cb94affd-f2aa-4bdd-9407-1e16ac046596_dnat","9"],["ct-zone-cb94affd-f2aa-4bdd-9407-1e16ac046596_snat","1"],["ct-zone-ce71f6db-4dab-41ca-bd10-cd6204687b9d_dnat","16"],["ct-zone-ce71f6db-4dab-41ca-bd10-cd6204687b9d_snat","15"],["ct-zone-cfa46699-cc79-445e-a902-f1e37ff99806_dnat","5"],["ct-zone-cfa46699-c 
>>>>>
>>>>>
>>>>> c79-445e-a902-f1e37ff99806_snat","2"],["ct-zone-cr-lr_vcn0747157-ls_vcn0747157_external_ugw","9"],["ct-zone-cr-lr_vcn1645571_igw-ls_vcn1645571_external_igw","21"],["ct-zone-cr-lr_vcn7319607-ls_vcn7319607_external_ugw","14"],["ct-zone-cr-lr_vcn7319607_igw-ls_vcn7319607_external_igw","21"],["ct-zone-cr-lr_vcn7395327_igw-ls_vcn7395327_external_igw","21"],["ct-zone-cr-lr_vcn9567153-ls_vcn9567153_external_ugw","1"],["ct-zone-d0232f68-8d26-454c-87bf-e79066a1ed62_dnat","9"],["ct-zone-d0232f68-8d26-454c-87bf-e79066a1ed62_snat","8"],["ct-zone-d161aaef-e73e-452c-9d77-f465718f1f67_dnat","3"],["ct-zone-d161aaef-e73e-452c-9d77-f465718f1f67_snat","6"],["ct-zone-e2f0a229-15b0-4255-b52d-71b078239ed2_dnat","12"],["ct-zone-e2f0a229-15b0-4255-b52d-71b078239ed2_snat","13"],["ct-zone-e6986bf4-e813-4df0-9bfe-1de95ceb2e30_dnat","15"],["ct-zone-e6986bf4-e813-4df0-9bfe-1de95ceb2e30_snat","14"],["ct-zone-e93b7a93-8507-4036-8281-f2be764a44da_dnat","16"],["ct-zone-e93b7a93-8507-4036-8281-f2be764a44da_snat","17 
>>>>>
>>>>>
>>>>> "],["ct-zone-f3b9843a-d498-41dc-8244-0f87d9bc1384_dnat","6"],["ct-zone-f3b9843a-d498-41dc-8244-0f87d9bc1384_snat","7"],["ct-zone-f42fcb51-0af6-426f-974b-1478a169a70c_dnat","13"],["ct-zone-f42fcb51-0af6-426f-974b-1478a169a70c_snat","11"],["ct-zone-f708c12e-34b6-4657-b7d0-4b5ac5e0d6c7_dnat","20"],["ct-zone-f708c12e-34b6-4657-b7d0-4b5ac5e0d6c7_snat","19"],["ct-zone-ln-ls_vcn6603036_external_ugw","7"],["ct-zone-ln-ls_vcn7319607_external_igw","20"],["ct-zone-ln-ls_vcn7395327_external_ugw","7"],["ct-zone-ln-ls_vcn7836024_external_igw","20"],["ct-zone-ln-ls_vcn9567153_external_igw","21"],["ct-zone-ln-ls_vcn9567153_external_ugw","8"]]]}},"_comment":"ovn-controller: 
>>>>>
>>>>>
>>>>>>> modifying OVS tunnels 'pcacn001'"}
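
To put a number on "lots and lots", one option is to tally the `_comment`
field of every transaction record in the file. The sketch below assumes the
on-disk layout visible in the excerpt above, where an `OVSDB JSON <length>
<hash>` header line is followed by one line of JSON per record. The function
name `comment_counts` is hypothetical.

```python
import json
from collections import Counter


def comment_counts(lines):
    """Count transaction records per "_comment" string.

    lines is any iterable of text lines, e.g. an open conf.db file,
    so a multi-GB file is streamed rather than loaded into memory.
    """
    counts = Counter()
    for line in lines:
        # Skip the "OVSDB JSON <length> <hash>" headers and blank lines.
        if not line.startswith("{"):
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a record that was still being written
        comment = record.get("_comment")
        if comment is not None:
            counts[comment] += 1
    return counts
```

Passing an open file handle for /etc/openvswitch/conf.db and printing
`comment_counts(f).most_common(5)` would show which writer dominates the log.
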
>>>>>> In which OVN version are you seeing this?
>>>>> ovs-vsctl -V
>>>>> ovs-vsctl (Open vSwitch) 2.14.0_r0.0.0
>>>>> DB Schema 8.2.0
>>>>> # ovn-nbctl -V
>>>>> ovn-nbctl 20.09.0_r1.0.0
>>>>> Open vSwitch Library 2.14.0
>>>>> DB Schema 5.27.0
>>>>>
>>>>>
>>>>>
>>>>>> I wonder if you're seeing this issue -
>>>>>> https://github.com/ovn-org/ovn/commit/e7788554a7f5e824fc0d8afc6cbf20e94fe4245f
>>>>>>
>>>>>>
>>>>> I have to step out for a bit; I will look at this when I can.
>>>>> What I can say is that we are using ovsdbapp to configure central, 
>>>>> and I see /etc/openvswitch/conf.db getting up to several GB! So much
>>>>> so that systemd times out when you try to start the service using it.
>>>>> I am also seeing ovs-vswitchd getting a SEGV on a regular basis, which
>>>>> I think is related.
>>>>> I am wondering if this patch might help:
>>>>>
>>>>> [External] : Re: [ovs-dev] [PATCH branch-2.14] python: idl: Avoid
>>>>> sending transactions when the DB is not synced up.
>>> I'm not sure.   /etc/openvswitch/conf.db is the local ovsdb-server 
>>> database
>>> and not the OVN database.
>>>
>>> Numan
>>>
>>>>>> If you run a tail on /etc/openvswitch/conf.db, do you see the ct 
>>>>>> zone IDs toggling between 2 values constantly?
>>>>>>
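
Checking for that toggling by eye is hard when each record is several KB. A
rough sketch of automating it: stream the records, pull the `ct-zone-*`
entries out of each Bridge row's `external_ids` map, and report keys whose
value keeps flipping between two values. It assumes each record carries the
full `external_ids` map, as in the excerpt quoted in this thread. The
function names and the `min_changes` threshold are hypothetical.

```python
import json
from collections import defaultdict


def ct_zone_history(lines):
    """Map each ct-zone key to the sequence of values it changed through."""
    history = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip "OVSDB JSON <length> <hash>" headers and blank lines.
        if not line.startswith("{"):
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a partially written last record
        bridges = record.get("Bridge")
        if not isinstance(bridges, dict):
            continue
        for row in bridges.values():
            ext = row.get("external_ids") if isinstance(row, dict) else None
            # OVSDB encodes a map column as ["map", [[key, value], ...]].
            if not (isinstance(ext, list) and len(ext) == 2 and ext[0] == "map"):
                continue
            for key, value in ext[1]:
                if key.startswith("ct-zone-"):
                    seq = history[key]
                    if not seq or seq[-1] != value:
                        seq.append(value)  # record only actual changes
    return history


def toggling_keys(history, min_changes=3):
    """Keys that changed at least min_changes times between exactly two values."""
    return sorted(key for key, seq in history.items()
                  if len(seq) - 1 >= min_changes and len(set(seq)) == 2)
```

Feeding an open file handle for /etc/openvswitch/conf.db to
`ct_zone_history` and passing the result to `toggling_keys` would list the
zone keys that oscillate, if any do.
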
>>>>>> Thanks
>>>>>> Numan
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Brendan
>>>>>>> _______________________________________________
>>>>>>> discuss mailing list
>>>>>>> discuss at openvswitch.org
>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>>>>>>
>>>>>>>
>>>>
>>
