[ovs-dev] [PATCH ovn 20/21] ovn-northd-ddlog: Remove Router.static_routes.

Ben Pfaff blp at ovn.org
Sat Mar 27 00:31:46 UTC 2021

From: Leonid Ryzhyk <lryzhyk at vmware.com>

This is another instance of a performance bug when a change to a
router object forces a cascade of changes to relations that reference
the object.  This time around the problem was caused by the
`Router.static_routes` field, which is copied from
`nb::Logical_Router`.  Luckily, this field was only used in one rule
and was easy to remove.

Here is how we diagnosed the issue (this may be useful for posterity):

- We started with a benchmark that executed several hundreds of similar
  transactions (in this case, these transactions were adding new router
  ports).  We recorded execution of the benchmark in a DDlog command
  file (replay.txt) and added `timestamp;` commands after each
  transaction in the file.

- Run `make NORTHD_CLI=1` to generate the ovn_northd_cli executable and
  use it to execute the command file:
  ./ovn_northd_ddlog/target/release/ovn_northd_cli -w 1  < replay.txt > replay.dump

- Extract only the timestamps from replay.dump and plot differences
  between successive timestamps (i.e., individual transaction times).
  I use gnumeric.  It would be nice to have some automation for this
  in the future.  We observe that one of the transactions in the
  benchmark loop slows down linearly as the size of the network
  topology grows:
  Clearly some of the rules in the program are getting more expensive
  as the number of ports goes up.  Another interesting observation is
  that the size of the delta output at each iteration of the benchmark
  remains constant (the delta mostly consists of new network flows).
  This suggests that whatever extra work DDlog is doing at each
  iteration might be redundant.

- To identify where the wasted work happens, we re-compile the program
  passing the `--output-internal-relations` flag to DDlog, which tells it
  to dump changes to all intermediate relations, not just output
  relation.  We replay the trace again.  We locate the expensive
  transaction in the log and compare its output from one of the first
  iterations vs one from the end of the log.  We now see a whole bunch of
  intermediate relations that only had a few modified records in the
  first transaction versus hundreds in the second one.  We further
  observe that all of these changes simply update the `static_routes`
  field (as explained above).

- After removing the field, we observe that changes to intermediate
  relations no longer grow with the topology, and transaction timing
  increases much more slowly:

Signed-off-by: Leonid Ryzhyk <lryzhyk at vmware.com>
Signed-off-by: Ben Pfaff <blp at ovn.org>
 northd/lrouter.dl | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 38a19e7ea324..7ed5358c9431 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -434,7 +434,6 @@ typedef Router = Router {
     /* Fields copied from nb::Logical_Router. */
     _uuid:              uuid,
     name:               string,
-    static_routes:      Set<uuid>,
     policies:           Set<uuid>,
     enabled:            Option<bool>,
     nat:                Set<uuid>,
@@ -459,7 +458,6 @@ relation Router[Intern<Router>]
         ._uuid         =    lr._uuid,
         .name          =    lr.name,
-        .static_routes =    lr.static_routes,
         .policies      =    lr.policies,
         .enabled       =    lr.enabled,
         .nat           =    lr.nat,
@@ -730,7 +728,8 @@ RouterStaticRoute_(.router = router,
                    .nexthop = route.nexthop,
                    .output_port = route.output_port,
                    .ecmp_symmetric_reply = route.ecmp_symmetric_reply) :-
-    router in &Router(.static_routes = routes),
+    router in &Router(),
+    nb::Logical_Router(._uuid = router._uuid, .static_routes = routes),
     var route_id = FlatMap(routes),
     route in &StaticRoute(.lrsr = nb::Logical_Router_Static_Route{._uuid = route_id}).

More information about the dev mailing list