[ovs-dev] northd-ddlog slowness when adding the first regular LSP to a LS full of router ports
blp at ovn.org
Thu Oct 7 00:42:49 UTC 2021
On Thu, Sep 30, 2021 at 12:23:57AM -0700, Han Zhou wrote:
> Hi Ben,
> I understand that we have difficulties for northd-ddlog progress, but I
> still want to try it before it goes away. I tested with the latest version,
> and it is super fast for most of the operations. With a large NB & SB
> created, the C northd takes ~8 seconds for any change computation. For the
> same DB, northd-ddlog usually takes less than 1 sec for most operations.
> However, I did find an interesting problem. I tried to create one more LSP
> on a LS that already has 800 gateway-routers connected to it, which means
> there are already 800 LSPs on the LS of the type "router". Creating the
> extra LSP (without type) took 12 sec, which is even longer than a full
> compute of the C version. What's more interesting is, when I create another
> LSP on the same LS, it takes only 100+ms, and same for the 3rd, 4th LSPs,
> etc. When I remove any one of the extra LSPs I created, it is also fast,
> just 100+ms. But when I remove the last LSP that I just created it takes 12
> sec again. Then I tried creating a LSP with type=router on the same LS, it
> is very fast, less than 100ms. Basically, only creating the first
> non-router LSP or removing the last non-router LSP takes a very long time.
> I haven't debugged it yet (and am not sure I am capable of it), but I think it
> might be useful to report it first.
This kind of thing usually happens because there is a lot of
intermediate data being built up in internal relations that changes when
the input changes. My usual strategy for figuring it out is to
configure with --output-internal-relations (that might not be exactly
the right option name; the documentation should say) and then replay
using the CLI to see what changes in those relations before and after
the change in question. The difference is usually big.
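
A minimal sketch of that before/after comparison, in Python. It assumes the
internal relations have been dumped to text twice, once before and once after
the change. The dump format here is an assumption made for illustration (a
line ending in ":" names a relation, and each following non-blank line is one
record), not the real DDlog output format, and the relation names in the
usage lines are made up.

```python
from collections import Counter


def parse_dump(text):
    """Group records by relation name.

    Assumed (hypothetical) format: a line ending in ':' names a relation,
    and each following non-blank line is one record of that relation.
    """
    relations = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            current = line[:-1]
            relations.setdefault(current, Counter())
        elif current is not None:
            relations[current][line] += 1
    return relations


def churn(before, after):
    """Return {relation: (added, removed)} for every relation that changed."""
    result = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name, Counter())
        new = after.get(name, Counter())
        added = sum((new - old).values())    # records only in the new dump
        removed = sum((old - new).values())  # records only in the old dump
        if added or removed:
            result[name] = (added, removed)
    return result


# Hypothetical dumps: one port is added and two derived records are replaced.
before = parse_dump("Ports:\nlsp1 router\nlsp2 router\nFlows:\nf1\nf2\n")
after = parse_dump("Ports:\nlsp1 router\nlsp2 router\nlsp3\nFlows:\nf3\nf4\n")
for name, (added, removed) in churn(before, after).items():
    print(f"{name}: +{added} -{removed}")
```

Sorting the result by added plus removed would put the relation with the most
churn first, which is usually the place to start looking.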