[ovs-dev] [PATCH 1/3] Add performance measuring API

Thu Jan 4 23:58:12 UTC 2018

On Mon, Dec 18, 2017 at 04:51:37PM -0600, Mark Michelson wrote:
> This is similar to the existing coverage and perf-counter APIs in OVS.
> However, rather than keeping counters, this is aimed at timing how long
> operations take to perform. "Operations" in this case can be anything
> from a loop iteration, to a function, to something more complex.
> 
> The library will keep track of how long it takes to perform the
> particular operations and will maintain statistics of those running
> times.
> 
> Statistics for a particular operation can be queried from the command
> line by using ovs-appctl -t <target> performance/show <operation name>.
> 
> The API is designed to be pretty small. The expected steps are as
> follows:
> 
> 1) Create a performance measurement, providing a unique name, using
> performance_create()
> 2) Add calls to start_sample() and end_sample() to mark the start and
> stop of the operation you wish to measure.
> 3) Periodically call performance_run() in order to compile statistics.
> 4) Upon completion (likely program shutdown), call performance_destroy()
> to clean up.
> 
> Signed-off-by: Mark Michelson <mmichels at redhat.com>

Thanks.  I guess that this will be useful from time to time.

It would be helpful to have a comment on each public function explaining
how to use it.  In particular, the meaning of parameters isn't
necessarily obvious (e.g. I guess that sample_rate is in msec?).

In performance_init(), usually we would use ovsthread_once instead of
raw pthread_once().

In performance_create(), you can use xzalloc() instead of xcalloc().

Usually we put struct and data definitions at the top of a file, unless
there's a good reason otherwise.  In this case, I'd move struct stats
and struct performance and 'performances' to the top of the file.

This doesn't look thread-safe; that would require a mutex (etc.) to
protect 'performances'.  Or it could be thread-local.

I think that 'performances' should be static.

Usually, in simple cases like this we put comments next to struct
members, e.g.:

struct sample {
    long long int start_time;   /* Time when we started this sample. */
    long long int end_time;     /* Time when we ended this sample. */
    long long int elapsed;      /* Elapsed time: end_time - start_time. */
};

It bothers me a little to see find_earliest use a size_t to iterate
through the samples and then return -1 to indicate failure.

There should probably be documentation on the precision here.  I think
it's 1 ms, so only fairly long kinds of activities can be timed with any
accuracy.

Maybe percentile() should document the valid range for 'percentile'.  In
particular, a 'percentile' of 100 will read past the end of the array.

I'd prefer if the names of start_sample() and end_sample() started with
performance_, for consistency.

sort_times() could use xmemdup().

Usually we line up function declarations more like this:
    static int
    get_stats(const struct sample_vec *vec, long long int age_ms,
              struct stats *stats)
with the second line of parameters indented just past the (.

In many cases, we've found that for "show" kinds of commands, it is
helpful to allow an empty set of parameters to show all of the entities
of that type.

unixctl_command_register() should pass "NAME" as the second argument, to
let the user know what parameter is expected (or "[NAME]", if you make
it optional).

Is there value in calculating stats periodically in performance_run()?
It looks to me like nothing is lost if they are calculated only when
requested via performance/show.

I think that running cull_old_times() in each call to performance_run()
means that we will normally cull off only a single element, which is
expensive due to the copying.  Maybe there should be a strategy to cull
less frequently so that on average maybe half of the elements are
culled.

Thanks,

Ben.