[ovs-dev] [PATCH ovn] inc-proc-eng: Improve debug logging.

Han Zhou hzhou at ovn.org
Fri Jul 2 18:39:41 UTC 2021


On Fri, Jul 2, 2021 at 2:08 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> Time how long change/run handlers take and log this at debug level.
> I-P engine debug logs are not so verbose so enabling them is quite
> common when debugging scale/control plane latency related issues.
>
> One of the major missing pieces though was a log about how long each I-P
> node change/run handler took to run.  This commit adds it and also logs
> a message when a node is recomputed because it misses an input change
> handler.
>
> Signed-off-by: Dumitru Ceara <dceara at redhat.com>

Thanks Dumitru. This will be very helpful. I have just some minor
suggestions.


> ---
>  lib/inc-proc-eng.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/lib/inc-proc-eng.c b/lib/inc-proc-eng.c
> index c349efb22..bb96ef1a3 100644
> --- a/lib/inc-proc-eng.c
> +++ b/lib/inc-proc-eng.c
> @@ -27,6 +27,7 @@
>  #include "openvswitch/hmap.h"
>  #include "openvswitch/vlog.h"
>  #include "inc-proc-eng.h"
> +#include "timeval.h"
>  #include "unixctl.h"
>
>  VLOG_DEFINE_THIS_MODULE(inc_proc_eng);
> @@ -333,8 +334,10 @@ engine_recompute(struct engine_node *node, bool
forced, bool allowed)
>      }
>
>      /* Run the node handler which might change state. */
> +    long long int now = time_msec();
>      node->run(node, node->data);
>      node->stats.recompute++;
> +    VLOG_DBG("node: %s, recompute took %lldms", node->name, time_msec()
- now);

We already log an entry at the beginning of this function. I think it is
better to move that one here and combine with the time spent into just one
log.

Secondly, the log in the beginning prints if it is triggered or forced, but
as you mentioned there can be two cases for "triggered" and we didn't have
an extra log to tell when it is triggered by missing a handler, I'd suggest
instead of adding another log, we can change the "forced" argument to
"reason" so we can log the reason directly in the same log entry in this
function, and remove the log for calling this function. We need to add
another argument to this function to tell the current input that triggers
this recompute. It can be something like:

"node: abc, recompute (forced/missing handler/failed handler), current
input: xyz, time spent: ..."

>  }
>
>  /* Return true if the node could be computed, false otherwise. */
> @@ -350,7 +353,12 @@ engine_compute(struct engine_node *node, bool
recompute_allowed)
>              /* If the input change can't be handled incrementally, run
>               * the node handler.
>               */
> -            if (!node->inputs[i].change_handler(node, node->data)) {
> +            long long int now = time_msec();
> +            bool handled = node->inputs[i].change_handler(node,
node->data);
> +            VLOG_DBG("node: %s, handling change for input %s took
%lldms",
> +                     node->name, node->inputs[i].node->name,
> +                     time_msec() - now);
> +            if (!handled) {
>                  VLOG_DBG("node: %s, can't handle change for input %s, "
>                           "fall back to recompute",
>                           node->name, node->inputs[i].node->name);
> @@ -369,8 +377,11 @@ engine_run_node(struct engine_node *node, bool
recompute_allowed)
>  {
>      if (!node->n_inputs) {
>          /* Run the node handler which might change state. */
> +        long long int now = time_msec();
>          node->run(node, node->data);
>          node->stats.recompute++;
> +        VLOG_DBG("node: %s, recompute took %lldms", node->name,
> +                 time_msec() - now);

This is a pure input node, so it's better not calling it "recompute". Since
they are evaluated in every iteration and usually very fast, I wonder if we
should avoid this log, which could be just noise.

>          return;
>      }
>
> @@ -389,6 +400,9 @@ engine_run_node(struct engine_node *node, bool
recompute_allowed)
>
>              /* Trigger a recompute if we don't have a change handler. */
>              if (!node->inputs[i].change_handler) {
> +                VLOG_DBG("node %s, no handler for input %s, "
> +                         "fall back to recompute",
> +                         node->name, node->inputs[i].node->name);

This is the one I mentioned above that could be avoided by passing the
reason and current input node name to the function call and get logged in
engine_recompute().

Thanks,
Han

>                  engine_recompute(node, false, recompute_allowed);
>                  return;
>              }
> --
> 2.27.0
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list