[ovs-dev] [PATCH V4 1/3] coverage: Reimplement the "ovs-appclt coverage/show" command
Ben Pfaff
blp at nicira.com
Fri Aug 16 23:04:36 UTC 2013
Thanks. Sorry about causing you extra work.
On Fri, Aug 16, 2013 at 04:03:39PM -0700, Alex Wang wrote:
> Thanks for the feedback,
>
> I'll review your patch and fix the oddities in wording,
>
>
> On Fri, Aug 16, 2013 at 3:51 PM, Ben Pfaff <blp at nicira.com> wrote:
>
> > On Fri, Aug 16, 2013 at 10:48:44AM -0700, Alex Wang wrote:
> > > This commit changes the "ovs-appclt coverage/show" command to show the
> > > secondly, minutely and hourly rates of function invocation. More
> > > importantly, this makes using coverage counter an easy way to monitor
> > > the execution of specific functions.
> > >
> > > Signed-off-by: Alex Wang <alexw at nicira.com>
> >
> > Thanks for the patch.
> >
> > Superficially, I noticed some oddities in wording. The program for
> > talking to a running OVS daemon is "ovs-appctl" (not ovs-appclt). But
> > it's really the words "secondly" and "minutely" that jumped out at me.
> > "Secondly" doesn't mean "per second". "Minutely" can mean "per
> > minute" but it usually means "precisely".
> >
> > I also noticed a cast to float. There's no benefit to float over
> > double these days, so I changed this to use double instead.
> >
> > All this is trivia. I fixed it up locally and I'm appending a revised
> > version.
> >
> > But there's a bigger problem that I think needs to get fixed first.
> > The coverage module, as is, is not thread safe. I think that it will
> > not cause any grave problem such as a segfault or wild pointer
> > dereference, but every thread's use of coverage interferes with every
> > other thread's use. I do not want to potentially make this worse, by
> > adding additional sophistication and complexity to the coverage
> > module, before fixing it. So I've sent out a patch that make coverage
> > thread safe for review:
> > http://openvswitch.org/pipermail/dev/2013-August/030833.html
> > and after it or some equivalent is in, it will be time to revise and
> > review this patch.
> >
> > Thanks,
> >
> > Ben.
> >
> > --8<--------------------------cut here-------------------------->8--
> >
> > From: Alex Wang <alexw at nicira.com>
> > Date: Fri, 16 Aug 2013 15:25:15 -0700
> > Subject: [PATCH] coverage: Reimplement the "ovs-appctl coverage/show"
> > command
> >
> > This commit changes the "ovs-appctl coverage/show" command to show the
> > per-second, per-minute and per-hour rates of function invocation. More
> > importantly, this makes using coverage counter an easy way to monitor
> > the execution of specific functions.
> >
> > Signed-off-by: Alex Wang <alexw at nicira.com>
> > Signed-off-by: Ben Pfaff <blp at nicira.com>
> > Acked-by: Andy Zhou <azhou at nicira.com>
> > ---
> > lib/coverage-unixctl.man | 3 +-
> > lib/coverage.c | 96
> > +++++++++++++++++++++++++++++++++++++++-------
> > lib/coverage.h | 22 +++++++++--
> > lib/timeval.c | 1 -
> > tests/ofproto-dpif.at | 32 ++++++++++++++++
> > vswitchd/ovs-vswitchd.c | 2 +
> > 6 files changed, 136 insertions(+), 20 deletions(-)
> >
> > diff --git a/lib/coverage-unixctl.man b/lib/coverage-unixctl.man
> > index 9718894..49e8daa 100644
> > --- a/lib/coverage-unixctl.man
> > +++ b/lib/coverage-unixctl.man
> > @@ -8,4 +8,5 @@ main loop takes unusually long to run.
> > Coverage counters are useful mainly for performance analysis and
> > debugging.
> > .IP "\fBcoverage/show\fR"
> > -Displays the values of all of the coverage counters.
> > +Displays the per-second, per-minute and per-hour rates and total count
> > +of all of the coverage counters.
> > diff --git a/lib/coverage.c b/lib/coverage.c
> > index f152474..3e1226a 100644
> > --- a/lib/coverage.c
> > +++ b/lib/coverage.c
> > @@ -1,5 +1,5 @@
> > /*
> > - * Copyright (c) 2009, 2010, 2011, 2012 Nicira, Inc.
> > + * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
> > *
> > * Licensed under the Apache License, Version 2.0 (the "License");
> > * you may not use this file except in compliance with the License.
> > @@ -47,9 +47,19 @@ struct coverage_counter *coverage_counters[] = {
> > #define n_coverage_counters ARRAY_SIZE(coverage_counters)
> > #endif /* !USE_LINKER_SECTIONS */
> >
> > -static unsigned int epoch;
> > +static long long int coverage_run_time = LLONG_MIN;
> > +
> > +/* Defines the moving average array index variables. */
> > +static unsigned int min_idx = 0;
> > +static unsigned int hr_idx = 0;
> > +
> > +/* Index counter used to compute the moving average array's index. */
> > +static unsigned int idx_count = 0;
> >
> > static void coverage_read(struct svec *);
> > +static unsigned int coverage_array_sum(const unsigned int *arr,
> > + const unsigned int len);
> > +
> >
> > static void
> > coverage_unixctl_show(struct unixctl_conn *conn, int argc OVS_UNUSED,
> > @@ -170,7 +180,7 @@ coverage_log(void)
> > uint32_t hash = coverage_hash();
> > if (coverage_hit(hash)) {
> > VLOG_INFO("Skipping details of duplicate event coverage for "
> > - "hash=%08"PRIx32" in epoch %u", hash, epoch);
> > + "hash=%08"PRIx32, hash);
> > } else {
> > struct svec lines;
> > const char *line;
> > @@ -189,8 +199,17 @@ coverage_log(void)
> > static void
> > coverage_read_counter(struct svec *lines, const struct coverage_counter
> > *c)
> > {
> > - svec_add_nocopy(lines, xasprintf("%-24s %5u / %9llu",
> > - c->name, c->count, c->count +
> > c->total));
> > + /* The per-second rate is calculated via averaging the counts in
> > + * the most recent COVERAGE_RUN_INTERVAL interval. */
> > + svec_add_nocopy(lines,
> > + xasprintf("%-24s %5.1f/sec %7u/min "
> > + "%9u/hr total: %llu",
> > + c->name,
> > + (c->min[(idx_count - 1) % MIN_AVG_LEN]
> > + * 1000.0 / COVERAGE_RUN_INTERVAL),
> > + coverage_array_sum(c->min, MIN_AVG_LEN),
> > + coverage_array_sum(c->hr, HR_AVG_LEN),
> > + c->count + c->total));
> > }
> >
> > /* Adds coverage counter information to 'lines'. */
> > @@ -204,8 +223,8 @@ coverage_read(struct svec *lines)
> > hash = coverage_hash();
> >
> > n_never_hit = 0;
> > - svec_add_nocopy(lines, xasprintf("Event coverage (epoch %u/entire
> > run), "
> > - "hash=%08"PRIx32":", epoch, hash));
> > + svec_add_nocopy(lines, xasprintf("Event coverage (frequency), "
> > + "hash=%08"PRIx32":", hash));
> > for (i = 0; i < n_coverage_counters; i++) {
> > struct coverage_counter *c = coverage_counters[i];
> > if (c->count) {
> > @@ -225,16 +244,65 @@ coverage_read(struct svec *lines)
> > svec_add_nocopy(lines, xasprintf("%zu events never hit",
> > n_never_hit));
> > }
> >
> > -/* Advances to the next epoch of coverage, resetting all the counters to
> > 0. */
> > +/* Runs approximately every COVERAGE_RUN_INTERVAL seconds to update the
> > + * coverage counters' 'min' and 'hr' array. 'min' array is for cumulating
> > + * per-second counts into per-minute count. 'hr' array is for cumulating
> > + * per-minute counts into hourly count. */
> > void
> > -coverage_clear(void)
> > +coverage_run(void)
> > {
> > + long long int now = time_msec();
> > +
> > + /* Initialize the coverage_run_time. */
> > + if (coverage_run_time == LLONG_MIN) {
> > + coverage_run_time = now;
> > + }
> > +
> > + if (now >= coverage_run_time) {
> > + size_t i, j;
> > + /* Computes the number of COVERAGE_RUN_INTERVAL slots, since
> > + * it is possible that the actual run interval is multiple of
> > + * COVERAGE_RUN_INTERVAL. */
> > + int slots = (now - coverage_run_time) / COVERAGE_RUN_INTERVAL + 1;
> > +
> > + for (i = 0; i < slots; i++) {
> > + /* Updates the index variables. */
> > + /* The min_idx is increased from 0 to MIN_AVG_LEN - 1. Every
> > + * time the min_idx finishes a cycle (a cycle is one minute),
> > + * the hr_idx is incremented by 1. */
> > + min_idx = idx_count % MIN_AVG_LEN;
> > + hr_idx = idx_count / MIN_AVG_LEN;
> > +
> > + for (j = 0; j < n_coverage_counters; j++) {
> > + struct coverage_counter *c = coverage_counters[j];
> > + int portion;
> > +
> > + /* The count over the time interval is evenly distributed
> > + * among slots by calculating the portion. */
> > + portion = c->count / slots + (i == (slots - 1) ?
> > + c->count % slots : 0);
> > + c->total += portion;
> > + c->count = (i == (slots - 1) ? 0 : c->count);
> > + c->min[min_idx] = portion;
> > + c->hr[hr_idx] = min_idx == 0 ? portion
> > + : (c->hr[hr_idx] + portion);
> > + }
> > + /* This is to guarantee that hr_idx ranges from 0 to 59. */
> > + idx_count = (idx_count + 1) % (MIN_AVG_LEN * HR_AVG_LEN);
> > + }
> > + /* Updates the run time. */
> > + coverage_run_time = now + COVERAGE_RUN_INTERVAL;
> > + }
> > +}
> > +
> > +static unsigned int
> > +coverage_array_sum(const unsigned int *arr, const unsigned int len)
> > +{
> > + unsigned int sum = 0;
> > size_t i;
> >
> > - epoch++;
> > - for (i = 0; i < n_coverage_counters; i++) {
> > - struct coverage_counter *c = coverage_counters[i];
> > - c->total += c->count;
> > - c->count = 0;
> > + for (i = 0; i < len; i++) {
> > + sum += arr[i];
> > }
> > + return sum;
> > }
> > diff --git a/lib/coverage.h b/lib/coverage.h
> > index 968c489..7af2cb1 100644
> > --- a/lib/coverage.h
> > +++ b/lib/coverage.h
> > @@ -29,11 +29,24 @@
> >
> > #include "vlog.h"
> >
> > +/* Makes coverage_run run every 5000 ms (5 seconds).
> > + * If this value is redefined, the new value must
> > + * divide 60000. */
> > +#define COVERAGE_RUN_INTERVAL 5000
> > +
> > +/* Defines the moving average array length. */
> > +#define MIN_AVG_LEN (60000/COVERAGE_RUN_INTERVAL)
> > +#define HR_AVG_LEN 60
> > +
> > /* A coverage counter. */
> > struct coverage_counter {
> > const char *name; /* Textual name. */
> > - unsigned int count; /* Count within the current epoch. */
> > - unsigned long long int total; /* Total count over all epochs. */
> > + unsigned int count; /* Count within the current second. */
> > + unsigned long long int total; /* Total count over all time. */
> > +
> > + /* The moving average arrays. */
> > + unsigned int min[MIN_AVG_LEN];
> > + unsigned int hr[HR_AVG_LEN];
> > };
> >
> > /* Defines COUNTER. There must be exactly one such definition at file
> > scope
> > @@ -57,11 +70,12 @@ struct coverage_counter {
> >
> > void coverage_init(void);
> > void coverage_log(void);
> > -void coverage_clear(void);
> > +void coverage_run(void);
> >
> > /* Implementation detail. */
> > #define COVERAGE_DEFINE__(COUNTER) \
> > extern struct coverage_counter counter_##COUNTER; \
> > - struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0 }
> > + struct coverage_counter counter_##COUNTER = { #COUNTER, 0, 0, \
> > + {0}, {0} }
> >
> > #endif /* coverage.h */
> > diff --git a/lib/timeval.c b/lib/timeval.c
> > index faf8e7b..2380a0a 100644
> > --- a/lib/timeval.c
> > +++ b/lib/timeval.c
> > @@ -304,7 +304,6 @@ time_poll(struct pollfd *pollfds, int n_pollfds, long
> > long int timeout_when,
> > if (*last_wakeup) {
> > log_poll_interval(*last_wakeup);
> > }
> > - coverage_clear();
> > start = time_msec();
> > blocked = false;
> >
> > diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> > index b093998..fa058f7 100644
> > --- a/tests/ofproto-dpif.at
> > +++ b/tests/ofproto-dpif.at
> > @@ -2689,3 +2689,35 @@ AT_CHECK([tail -1 stdout], [0], [Datapath actions: 5
> > ])
> > OVS_VSWITCHD_STOP
> > AT_CLEANUP
> > +
> > +# Unit test for appctl coverage/show command
> > +AT_SETUP([ofproto-dpif - coverage/show])
> > +OVS_VSWITCHD_START
> > +
> > +ovs-appctl time/stop
> > +# before the first 5 seconds, nothing can be calculated but the total
> > count.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received 0.0/sec 0/min 0/hr total: 2
> > +])
> > +
> > +ovs-appctl time/warp 5000
> > +# at first 5 second time instant, should have stats.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received 0.6/sec 3/min 3/hr total: 4
> > +])
> > +
> > +for i in `seq 0 10`; do ovs-appctl time/warp 5000; done
> > +# advance to first 1 minute time instant.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received 0.2/sec 15/min 15/hr total: 16
> > +])
> > +
> > +ovs-appctl time/warp 60000
> > +# advance to next 1 minute time instant directly, should observe the
> > per-minute
> > +# rate drop.
> > +AT_CHECK([ovs-appctl coverage/show | sed -n '/^unixctl_received.*/p'],
> > [], [dnl
> > +unixctl_received 0.4/sec 2/min 17/hr total: 18
> > +])
> > +
> > +OVS_VSWITCHD_STOP
> > +AT_CLEANUP
> > diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
> > index bc45dac..a9db1fd 100644
> > --- a/vswitchd/ovs-vswitchd.c
> > +++ b/vswitchd/ovs-vswitchd.c
> > @@ -28,6 +28,7 @@
> > #include "bridge.h"
> > #include "command-line.h"
> > #include "compiler.h"
> > +#include "coverage.h"
> > #include "daemon.h"
> > #include "dirs.h"
> > #include "dpif.h"
> > @@ -119,6 +120,7 @@ main(int argc, char *argv[])
> > bridge_run_fast();
> > unixctl_server_run(unixctl);
> > netdev_run();
> > + coverage_run();
> >
> > signal_wait(sighup);
> > memory_wait();
> > --
> > 1.7.10.4
> >
> >
More information about the dev
mailing list