[ovs-dev] [PATCH V4 1/3] coverage: Reimplement the "ovs-appclt coverage/show" command

Ben Pfaff blp at nicira.com
Fri Aug 16 23:04:36 UTC 2013


Thanks.  Sorry about causing you extra work.

On Fri, Aug 16, 2013 at 04:03:39PM -0700, Alex Wang wrote:
> Thanks for the feedback,
> 
> I'll review your patch and fix the oddities in wording,
> 
> 
> On Fri, Aug 16, 2013 at 3:51 PM, Ben Pfaff <blp at nicira.com> wrote:
> 
> > On Fri, Aug 16, 2013 at 10:48:44AM -0700, Alex Wang wrote:
> > > This commit changes the "ovs-appclt coverage/show" command to show the
> > > secondly, minutely and hourly rates of function invocation. More
> > > importantly, this makes using coverage counter an easy way to monitor
> > > the execution of specific functions.
> > >
> > > Signed-off-by: Alex Wang <alexw at nicira.com>
> >
> > Thanks for the patch.
> >
> > Superficially, I noticed some oddities in wording.  The program for
> > talking to a running OVS daemon is "ovs-appctl" (not ovs-appclt).  But
> > it's really the words "secondly" and "minutely" that jumped out at me.
> > "Secondly" doesn't mean "per second".  "Minutely" can mean "per
> > minute" but it usually means "precisely".
> >
> > I also noticed a cast to float.  There's no benefit to float over
> > double these days, so I changed this to use double instead.
> >
> > All this is trivia.  I fixed it up locally and I'm appending a revised
> > version.
> >
> > But there's a bigger problem that I think needs to get fixed first.
> > The coverage module, as is, is not thread safe.  I think that it will
> > not cause any grave problem such as a segfault or wild pointer
> > dereference, but every thread's use of coverage interferes with every
> > other thread's use.  I do not want to potentially make this worse, by
> > adding additional sophistication and complexity to the coverage
> > module, before fixing it.  So I've sent out a patch that make coverage
> > thread safe for review:
> >        http://openvswitch.org/pipermail/dev/2013-August/030833.html
> > and after it or some equivalent is in, it will be time to revise and
> > review this patch.
> >
> > Thanks,
> >
> > Ben.
> >
> > --8<--------------------------cut here-------------------------->8--
> >
> > From: Alex Wang <alexw at nicira.com>
> > Date: Fri, 16 Aug 2013 15:25:15 -0700
> > Subject: [PATCH] coverage: Reimplement the "ovs-appctl coverage/show"
> > command
> >
> > This commit changes the "ovs-appctl coverage/show" command to show the
> > per-second, per-minute and per-hour rates of function invocation.  More
> > importantly, this makes using coverage counter an easy way to monitor
> > the execution of specific functions.
> >
> > Signed-off-by: Alex Wang <alexw at nicira.com>
> > Signed-off-by: Ben Pfaff <blp at nicira.com>
> > Acked-by: Andy Zhou <azhou at nicira.com>
> > ---
> >  lib/coverage-unixctl.man |    3 +-
> >  lib/coverage.c           |   96
> > +++++++++++++++++++++++++++++++++++++++-------
> >  lib/coverage.h           |   22 +++++++++--
> >  lib/timeval.c            |    1 -
> >  tests/ofproto-dpif.at    |   32 ++++++++++++++++
> >  vswitchd/ovs-vswitchd.c  |    2 +
> >  6 files changed, 136 insertions(+), 20 deletions(-)
> >
> > diff --git a/lib/coverage-unixctl.man b/lib/coverage-unixctl.man
> > index 9718894..49e8daa 100644
> > --- a/lib/coverage-unixctl.man
> > +++ b/lib/coverage-unixctl.man
> > @@ -8,4 +8,5 @@ main loop takes unusually long to run.
> >  Coverage counters are useful mainly for performance analysis and
> >  debugging.
> >  .IP "\fBcoverage/show\fR"
> > -Displays the values of all of the coverage counters.
> > +Displays the per-second, per-minute and per-hour rates and total count
> > +of all of the coverage counters.
> > diff --git a/lib/coverage.c b/lib/coverage.c
> > index f152474..3e1226a 100644
> > --- a/lib/coverage.c
> > +++ b/lib/coverage.c
> > @@ -1,5 +1,5 @@
> >  /*
> > - * Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc.
> > + * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
> >   *
> >   * Licensed under the Apache License, Version 2.0 (the "License");
> >   * you may not use this file except in compliance with the License.
> > @@ -47,9 +47,19 @@ struct coverage_counter *coverage_counters[] = {
> >  #define n_coverage_counters ARRAY_SIZE(coverage_counters)
> >  #endif  /* !USE_LINKER_SECTIONS */
> >
> > -static unsigned int epoch;
> > +static long long int coverage_run_time = LLONG_MIN;
> > +
> > +/* Defines the moving average array index variables. */
> > +static unsigned int min_idx = 0;
> > +static unsigned int hr_idx = 0;
> > +
> > +/* Index counter used to compute the moving average array's index. */
> > +static unsigned int idx_count = 0;
> >
> >  static void coverage_read(struct svec *);
> > +static unsigned int coverage_array_sum(const unsigned int *arr,
> > +                                       const unsigned int len);
> > +
> >
> >  static void
> >  coverage_unixctl_show(struct unixctl_conn *conn, int argc OVS_UNUSED,
> > @@ -170,7 +180,7 @@ coverage_log(void)
> >          uint32_t hash = coverage_hash();
> >          if (coverage_hit(hash)) {
> >              VLOG_INFO("Skipping details of duplicate event coverage for "
> > -                      "hash=%08"PRIx32" in epoch %u", hash, epoch);
> > +                      "hash=%08"PRIx32, hash);
> >          } else {
> >              struct svec lines;
> >              const char *line;
> > @@ -189,8 +199,17 @@ coverage_log(void)
> >  static void
> >  coverage_read_counter(struct svec *lines, const struct coverage_counter
> > *c)
> >  {
> > -    svec_add_nocopy(lines, xasprintf("%-24s %5u / %9llu",
> > -                                     c->name, c->count, c->count +
> > c->total));
> > +    /* The per-second rate is calculated via averaging the counts in
> > +     * the most recent COVERAGE_RUN_INTERVAL interval. */
> > +    svec_add_nocopy(lines,
> > +        xasprintf("%-24s %5.1f/sec %7u/min "
> > +                  "%9u/hr   total: %llu",
> > +                  c->name,
> > +                  (c->min[(idx_count - 1) % MIN_AVG_LEN]
> > +                   * 1000.0 / COVERAGE_RUN_INTERVAL),
> > +                  coverage_array_sum(c->min, MIN_AVG_LEN),
> > +                  coverage_array_sum(c->hr,  HR_AVG_LEN),
> > +                  c->count + c->total));
> >  }
> >
> >  /* Adds coverage counter information to 'lines'. */
> > @@ -204,8 +223,8 @@ coverage_read(struct svec *lines)
> >      hash = coverage_hash();
> >
> >      n_never_hit = 0;
> > -    svec_add_nocopy(lines, xasprintf("Event coverage (epoch %u/entire
> > run), "
> > -                                     "hash=%08"PRIx32":", epoch, hash));
> > +    svec_add_nocopy(lines, xasprintf("Event coverage (frequency), "
> > +                                     "hash=%08"PRIx32":", hash));
> >      for (i = 0; i < n_coverage_counters; i++) {
> >          struct coverage_counter *c = coverage_counters[i];
> >          if (c->count) {
> > @@ -225,16 +244,65 @@ coverage_read(struct svec *lines)
> >      svec_add_nocopy(lines, xasprintf("%zu events never hit",
> > n_never_hit));
> >  }
> >
> > -/* Advances to the next epoch of coverage, resetting all the counters to
> > 0. */
> > +/* Runs approximately every COVERAGE_RUN_INTERVAL seconds to update the
> > + * coverage counters' 'min' and 'hr' array.  'min' array is for cumulating
> > + * per-second counts into per-minute count.  'hr' array is for cumulating
> > + * per-minute counts into hourly count. */
> >  void
> > -coverage_clear(void)
> > +coverage_run(void)
> >  {
> > +    long long int now = time_msec();
> > +
> > +    /* Initialize the coverage_run_time. */
> > +    if (coverage_run_time == LLONG_MIN) {
> > +        coverage_run_time = now;
> > +    }
> > +
> > +    if (now >= coverage_run_time) {
> > +        size_t i, j;
> > +        /* Computes the number of COVERAGE_RUN_INTERVAL slots, since
> > +         * it is possible that the actual run interval is multiple of
> > +         * COVERAGE_RUN_INTERVAL. */
> > +        int slots = (now - coverage_run_time) / COVERAGE_RUN_INTERVAL + 1;
> > +
> > +        for (i = 0; i < slots; i++) {
> > +            /* Updates the index variables. */
> > +            /* The min_idx is increased from 0 to MIN_AVG_LEN - 1. Every
> > +             * time the min_idx finishes a cycle (a cycle is one minute),
> > +             * the hr_idx is incremented by 1. */
> > +            min_idx = idx_count % MIN_AVG_LEN;
> > +            hr_idx  = idx_count / MIN_AVG_LEN;
> > +
> > +            for (j = 0; j < n_coverage_counters; j++) {
> > +                struct coverage_counter *c = coverage_counters[j];
> > +                int portion;
> > +
> > +                /* The count over the time interval is evenly distributed
> > +                 * among slots by calculating the portion. */
> > +                portion = c->count / slots + (i == (slots - 1) ?
> > +                                              c->count % slots : 0);
> > +                c->total += portion;
> > +                c->count = (i == (slots - 1) ? 0 : c->count);
> > +                c->min[min_idx] = portion;
> > +                c->hr[hr_idx] = min_idx == 0 ? portion
> > +                                             : (c->hr[hr_idx] + portion);
> > +            }
> > +            /* This is to guarantee that hr_idx ranges from 0 to 59. */
> > +            idx_count = (idx_count + 1) % (MIN_AVG_LEN * HR_AVG_LEN);
> > +        }
> > +        /* Updates the run time. */
> > +        coverage_run_time = now + COVERAGE_RUN_INTERVAL;
> > +    }
> > +}
> > +
> > +static unsigned int
> > +coverage_array_sum(const unsigned int *arr, const unsigned int len)
> > +{
> > +    unsigned int sum = 0;
> >      size_t i;
> >
> > -    epoch++;
> > -    for (i = 0; i < n_coverage_counters; i++) {
> > -        struct coverage_counter *c = coverage_counters[i];
> > -        c->total += c->count;
> > -        c->count = 0;
> > +    for (i = 0; i < len; i++) {
> > +        sum += arr[i];
> >      }
> > +    return sum;
> >  }
> > diff --git a/lib/coverage.h b/lib/coverage.h
> > index 968c489..7af2cb1 100644
> > --- a/lib/coverage.h
> > +++ b/lib/coverage.h
> > @@ -29,11 +29,24 @@
> >
> >  #include "vlog.h"
> >
> > +/* Makes coverage_run run every 5000 ms (5 seconds).
> > + * If this value is redefined, the new value must
> > + * divide 60000. */
> > +#define COVERAGE_RUN_INTERVAL    5000
> > +
> > +/* Defines the moving average array length. */
> > +#define MIN_AVG_LEN (60000/COVERAGE_RUN_INTERVAL)
> > +#define HR_AVG_LEN  60
> > +
> >  /* A coverage counter. */
> >  struct coverage_counter {
> >      const char *name;           /* Textual name. */
> > -    unsigned int count;         /* Count within the current epoch. */
> > -    unsigned long long int total; /* Total count over all epochs. */
> > +    unsigned int count;         /* Count within the current second. */
> > +    unsigned long long int total; /* Total count over all time. */
> > +
> > +    /* The moving average arrays. */
> > +    unsigned int min[MIN_AVG_LEN];
> > +    unsigned int hr[HR_AVG_LEN];
> >  };
> >
> >  /* Defines COUNTER.  There must be exactly one such definition at file
> > scope
> > @@ -57,11 +70,12 @@ struct coverage_counter {
> >
> >  void coverage_init(void);
> >  void coverage_log(void);
> > -void coverage_clear(void);
> > +void coverage_run(void);
> >
> >  /* Implementation detail. */
> >  #define COVERAGE_DEFINE__(COUNTER)                              \
> >          extern struct coverage_counter counter_##COUNTER;       \
> > -        struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0 }
> > +        struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0, \
> > +                                                      {0}, {0} }
> >
> >  #endif /* coverage.h */
> > diff --git a/lib/timeval.c b/lib/timeval.c
> > index faf8e7b..2380a0a 100644
> > --- a/lib/timeval.c
> > +++ b/lib/timeval.c
> > @@ -304,7 +304,6 @@ time_poll(struct pollfd *pollfds, int n_pollfds, long
> > long int timeout_when,
> >      if (*last_wakeup) {
> >          log_poll_interval(*last_wakeup);
> >      }
> > -    coverage_clear();
> >      start = time_msec();
> >      blocked = false;
> >
> > diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> > index b093998..fa058f7 100644
> > --- a/tests/ofproto-dpif.at
> > +++ b/tests/ofproto-dpif.at
> > @@ -2689,3 +2689,35 @@ AT_CHECK([tail -1 stdout], [0], [Datapath actions: 5
> >  ])
> >  OVS_VSWITCHD_STOP
> >  AT_CLEANUP
> > +
> > +# Unit test for appctl coverage/show command
> > +AT_SETUP([ofproto-dpif - coverage/show])
> > +OVS_VSWITCHD_START
> > +
> > +ovs-appctl time/stop
> > +# before the first 5 seconds, nothing can be calculated but the total
> > count.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received           0.0/sec       0/min         0/hr   total: 2
> > +])
> > +
> > +ovs-appctl time/warp 5000
> > +# at first 5 second time instant, should have stats.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received           0.6/sec       3/min         3/hr   total: 4
> > +])
> > +
> > +for i in `seq 0 10`; do ovs-appctl time/warp 5000; done
> > +# advance to first 1 minute time instant.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received           0.2/sec      15/min        15/hr   total: 16
> > +])
> > +
> > +ovs-appctl time/warp 60000
> > +# advance to next 1 minute time instant directly, should observe the
> > per-minute
> > +# rate drop.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received           0.4/sec       2/min        17/hr   total: 18
> > +])
> > +
> > +OVS_VSWITCHD_STOP
> > +AT_CLEANUP
> > diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
> > index bc45dac..a9db1fd 100644
> > --- a/vswitchd/ovs-vswitchd.c
> > +++ b/vswitchd/ovs-vswitchd.c
> > @@ -28,6 +28,7 @@
> >  #include "bridge.h"
> >  #include "command-line.h"
> >  #include "compiler.h"
> > +#include "coverage.h"
> >  #include "daemon.h"
> >  #include "dirs.h"
> >  #include "dpif.h"
> > @@ -119,6 +120,7 @@ main(int argc, char *argv[])
> >          bridge_run_fast();
> >          unixctl_server_run(unixctl);
> >          netdev_run();
> > +        coverage_run();
> >
> >          signal_wait(sighup);
> >          memory_wait();
> > --
> > 1.7.10.4
> >
> >



More information about the dev mailing list