[ovs-dev] northd-ddlog slowness when adding the first regular LSP to a LS full of router ports

Ben Pfaff blp at ovn.org
Thu Oct 7 00:42:49 UTC 2021

On Thu, Sep 30, 2021 at 12:23:57AM -0700, Han Zhou wrote:
> Hi Ben,
> I understand that progress on northd-ddlog has been difficult, but I
> still want to try it before it goes away. I tested with the latest version,
> and it is super fast for most of the operations. With a large NB & SB
> created, the C northd takes ~8 seconds for any change computation. For the
> same DB, northd-ddlog usually takes less than 1 sec for most operations.
> However, I did find an interesting problem. I tried to create one more LSP
> on a LS that already has 800 gateway-routers connected to it, which means
> there are already 800 LSPs on the LS of the type "router". Creating the
> extra LSP (without type) took 12 sec, which is even longer than a full
> compute of the C version. What's more interesting is, when I create another
> LSP on the same LS, it takes only 100+ms, and same for the 3rd, 4th LSPs,
> etc. When I remove any one of the extra LSPs I created, it is also fast,
> just 100+ms. But when I remove the last LSP that I just created it takes 12
> sec again. Then I tried creating a LSP with type=router on the same LS, and it
> was very fast, less than 100ms. Basically, only creating the first
> non-router LSP or removing the last non-router LSP takes a very long time.
> I haven't debugged it yet (and I'm not sure I am capable of doing so), but
> I think it might be useful to report it first.

This kind of thing usually happens because there is a lot of
intermediate data being built up in internal relations that changes when
the input changes.  My usual strategy for figuring it out is to
configure with --output-internal-relations (that might not be exactly
the right option name; the documentation should say) and then replay
using the CLI to see what changes in those relations before and after
the operation in question.  The difference is usually big.
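
A minimal sketch of that before/after comparison, in Python rather than
anything shipped with OVN or DDlog.  It assumes you have saved two text
dumps of the relations, one taken before the slow operation and one
after, with one record per line and the relation name appearing before
the first "{".  That dump format and the names churn_by_relation and
relation_of are assumptions made for illustration; adjust the parsing to
whatever the CLI actually prints.

```python
from collections import Counter


def relation_of(record):
    """Return the text before the first '{' (assumed to be the relation name)."""
    return record.split("{", 1)[0].strip()


def churn_by_relation(before, after):
    """Count records added or removed per relation between two dumps.

    `before` and `after` are the full text of the two dumps.  The result
    is a list of (relation, changed_record_count) pairs, largest first.
    """
    old = Counter(line.strip() for line in before.splitlines() if line.strip())
    new = Counter(line.strip() for line in after.splitlines() if line.strip())

    # Counter subtraction keeps only positive counts, so the sum of the two
    # differences is every record present in one dump but not the other.
    changed = (old - new) + (new - old)

    churn = Counter()
    for record, count in changed.items():
        churn[relation_of(record)] += count
    return churn.most_common()
```

Feeding it the two dumps and reading the top few entries should point at
the internal relation whose contents are being rebuilt when the first
non-router LSP appears or the last one goes away.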
