[ovs-dev] [ovn] bug: ovn-northd was blocked for changes handling

Vladislav Odintsov odivlad at gmail.com
Wed Oct 20 10:23:51 UTC 2021


Hi Han,

could you take a look on this, please?

Regards,
Vladislav Odintsov

> On 5 Oct 2021, at 01:02, Vladislav Odintsov <odivlad at gmail.com> wrote:
> 
> Hi,
> 
> I’ve faced with a next issue using latest OVN (master branch) with OVN-IC enabled:
> ovn-northd CPU utilisation was at ~70-80% and ovsdb-server serving OVN_Southbound DB was also under heavy load.
> 
> In ovn-northd.log file there were warnings:
> 2021-10-01T11:14:43.548Z|18845|northd|INFO|deleting Datapath_Binding dd6af7f7-ea46-4496-988a-e7f9828924d0 with duplicate external-ids:logical-switch/router ec35e3e0-2674-47e7-b645-2c9b8b31865b
> 2021-10-01T11:14:44.148Z|18846|poll_loop|INFO|Dropped 32 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
> 2021-10-01T11:14:44.148Z|18847|poll_loop|INFO|wakeup due to [POLLIN] on fd 19 (172.20.33.110:55202<->172.20.33.102:16642) at lib/stream-ssl.c:832 (69% CPU usage)
> 2021-10-01T11:14:48.336Z|18848|ovsdb_idl|WARN|Dropped 307 log messages in last 60 seconds (most recently, 0 seconds ago) due to excessive rate
> 2021-10-01T11:14:48.336Z|18849|ovsdb_idl|WARN|transaction error: {"details":"Deletion of 1 weak reference(s) to deleted (or never-existing) rows from column \"datapath\" in \"IP_Multicast\" row 72bac803-e484-4358-9e48-11911c8aa16f caused this column to become empty, but constraints on this column disallow an empty column.","error":"constraint violation"}
> 
> I checked datapath by reported logical-switch id:
> 
> ~]# ovn-sbctl find datap external_ids:logical-switch=ec35e3e0-2674-47e7-b645-2c9b8b31865b
> _uuid               : 7e045551-7700-4a50-b0aa-02cb4e1be59d
> external_ids        : {interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global, logical-switch="ec35e3e0-2674-47e7-b645-2c9b8b31865b", name=vpc-CCF01DF6-rtb-216BABB1-global}
> load_balancers      : []
> tunnel_key          : 16712519
> 
> _uuid               : dd6af7f7-ea46-4496-988a-e7f9828924d0
> external_ids        : {interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global, logical-switch="ec35e3e0-2674-47e7-b645-2c9b8b31865b", name=vpc-CCF01DF6-rtb-216BABB1-global}
> load_balancers      : []
> tunnel_key          : 5
> 
> 
> It refers to ovn-ic transit switch:
> 
> ~]# ovn-ic-sbctl list datapath vpc-CCF01DF6-rtb-216BABB1-global
> _uuid               : fc159fa4-d5ba-46ed-a54c-d00745091021
> external_ids        : {}
> transit_switch      : vpc-CCF01DF6-rtb-216BABB1-global
> tunnel_key          : 16712519
> 
> ~]# ovn-ic-nbctl list transit vpc-CCF01DF6-rtb-216BABB1-global
> _uuid               : b5312889-92f9-40fe-98f9-2ea7ce3debcc
> external_ids        : {}
> name                : vpc-CCF01DF6-rtb-216BABB1-global
> other_config        : {}
> 
> The problem ip-multicast document:
> 
> ~]# ovn-sbctl find ip-mul datapath=dd6af7f7-ea46-4496-988a-e7f9828924d0
> _uuid               : 72bac803-e484-4358-9e48-11911c8aa16f
> datapath            : dd6af7f7-ea46-4496-988a-e7f9828924d0
> enabled             : false
> eth_src             : ""
> idle_timeout        : 300
> ip4_src             : ""
> ip6_src             : ""
> querier             : true
> query_interval      : 150
> query_max_resp      : 1
> seq_no              : 0
> table_size          : 2048
> 
> I tried manually destroy this ip_multicast document, which blocked deletion of datapath document:
> ~]# ovn-sbctl destroy ip-mul 72bac803-e484-4358-9e48-11911c8aa16f
> 
> ~]# ovn-sbctl list ip-mul 72bac803-e484-4358-9e48-11911c8aa16f
> ovn-sbctl: no row "72bac803-e484-4358-9e48-11911c8aa16f" in table IP_Multicast
> 
> 
> And problem was resolved. ovn-northd daemon stopped consuming CPU and excess datapath was deleted automatically.
> 
> ~]# ovn-sbctl find datapath external_ids:interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global
> _uuid               : 7e045551-7700-4a50-b0aa-02cb4e1be59d
> external_ids        : {interconn-ts=vpc-CCF01DF6-rtb-216BABB1-global, logical-switch="ec35e3e0-2674-47e7-b645-2c9b8b31865b", name=vpc-CCF01DF6-rtb-216BABB1-global}
> load_balancers      : []
> tunnel_key          : 16712519
> 
> The main problem here is that northd was entirely blocked. No processing to SB was done. Even ovn-controllers couldn’t claim new ports. No changes were propagated, so this is critial issue.
> 
> Can anybody help understand the reason for this?
> 
> Regards,
> Vladislav Odintsov



More information about the dev mailing list