[ovs-dev] [patch_v4 2/2] dpif-netdev: Refactor some pmd stats.

Jan Scheurich jan.scheurich at ericsson.com
Thu Aug 17 13:48:01 UTC 2017


Hi Darrell,

Please find my specific comments below.

I have a general concern about the pmd-stats-show counters: the output refers to the number of processed packets (e.g. processing cycles/pkt), but the data actually presented is the total number of *passes of packets through the datapath*.

In the early OVS days there was little difference, but with the growing number of recirculations (for tunnel decap, pop_mpls, bond selection, conntrack), each packet nowadays passes through the datapath several times, and the output becomes confusing for users, especially as it gives no indication of the average number of datapath passes per packet.

I would strongly suggest that we add a new PMD stats counter for the packets actually received, which would allow us to calculate and include the average number of passes per packet and the true average processing cost per packet.

What do you think?

BR, Jan

> The per-packet stats presently overlap in that the miss stats include the
> lost stats; make these stats non-overlapping for clarity and make this clear
> in the dp_stat_type enum.  This also eliminates the need to increment two
> 'miss' stats for a single packet.
> 
> The subtable lookup stat is renamed to make it clear that it relates to
> masked lookups.
> The stats that total to the number of packets seen are defined in
> dp_stat_type, and an API is created to total the stats in case these stats
> are further divided in the future.
> 
> The pmd stats test is enhanced to include megaflow stats counting and
> checking.
> Also, miss and lost stats are annotated to make it clear what they mean.
> 
> Signed-off-by: Darrell Ball <dlu998 at gmail.com>
> ---
>  lib/dpif-netdev.c | 78 ++++++++++++++++++++++++++++++++++---------------------
>  tests/pmd.at      | 31 +++++++++++++++++-----
>  2 files changed, 74 insertions(+), 35 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 17e1666..dfc6684 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -323,12 +323,19 @@ static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp,
>      OVS_REQUIRES(dp->port_mutex);
> 
>  enum dp_stat_type {
> -    DP_STAT_EXACT_HIT,          /* Packets that had an exact match (emc). */
> -    DP_STAT_MASKED_HIT,         /* Packets that matched in the flow table. */
> -    DP_STAT_MISS,               /* Packets that did not match. */
> -    DP_STAT_LOST,               /* Packets not passed up to the client. */
> -    DP_STAT_LOOKUP_HIT,         /* Number of subtable lookups for flow table
> -                                   hits */
> +    DP_STAT_EXACT_HIT,  /* Packets that had an exact match (emc). */
> +    DP_STAT_MASKED_HIT, /* Packets that matched in the flow table. */
> +    DP_STAT_MISS,       /* Packets that did not match and upcall was
> +                           done. */
> +    DP_STAT_LOST,       /* Packets that did not match and upcall was
> +                           not done. */

If I understand the implementation correctly, this counts packets for which an upcall was done but failed, so that the packet was dropped. There are no packets with a miss for which we currently do not do an upcall. Why do you suggest changing the meaning of the counter?

> +    DP_N_PER_PKT_CNT,   /* The above statistics account for the total
> +                           number of packets seen and should not be
> +                           overlapping with each other. */

I find this definition very obscure. I think it would be much clearer (and less likely to break again in the future) to explicitly sum up the above four non-overlapping counters to obtain the total number of packets, as suggested by Ilya. It would be good to keep the comment, though.

> +    DP_STAT_MASKED_LOOKUP = DP_N_PER_PKT_CNT,  /* Number of subtable lookups
> +                                                  for flow table hits. Each
> +                                                  MASKED_HIT hit will have
> +                                                  >= 1 MASKED_LOOKUP(s). */
>      DP_N_STATS
>  };

+1 for the clarification of the comment.

> +static unsigned long long
> +dp_netdev_calcul_total_packets(unsigned long long *stats)
> +{
> +    unsigned long long total_packets = 0;
> +    for (uint8_t i = 0; i < DP_N_PER_PKT_CNT; i++) {
> +        total_packets += stats[i];
> +    }
> +    return total_packets;
> +}

Please compute the sum explicitly. That is much clearer and more maintainable.

>      ds_put_format(reply,
>                    "\temc hits:%llu\n\tmegaflow hits:%llu\n"
> -                  "\tavg. subtable lookups per hit:%.2f\n"
> -                  "\tmiss:%llu\n\tlost:%llu\n",
> +                  "\tavg. subtable lookups per megaflow hit:%.2f\n"
> +                  "\tmiss(upcall done):%llu\n\tlost(upcall not done):%llu\n",
>                    stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT],
>                    stats[DP_STAT_MASKED_HIT] > 0
> -                  ? (1.0*stats[DP_STAT_LOOKUP_HIT])/stats[DP_STAT_MASKED_HIT]
> +                  ? (1.0 * stats[DP_STAT_MASKED_LOOKUP])
> +                     / stats[DP_STAT_MASKED_HIT]
>                    : 0,
>                    stats[DP_STAT_MISS], stats[DP_STAT_LOST]);

We should not lightly break backward compatibility in the output. There are plenty of scripts out there in test labs and production deployments that parse this output for performance monitoring and troubleshooting purposes.

It is perhaps OK to add new lines, but please do not unnecessarily change existing lines!

> -static inline void
> +static inline int
>  handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>                       struct dp_packet *packet,
>                       const struct netdev_flow_key *key,
>                       struct ofpbuf *actions, struct ofpbuf *put_actions,
> -                     int *lost_cnt, long long now)
> +                     int *miss_no_upcall, long long now)
>  {
>      struct ofpbuf *add_actions;
>      struct dp_packet_batch b;
>      struct match match;
>      ovs_u128 ufid;
> -    int error;
> +    int error = 0;
> 
>      match.tun_md.valid = false;
>      miniflow_expand(&key->mf, &match.flow);
> @@ -4749,8 +4765,8 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>                               put_actions);
>      if (OVS_UNLIKELY(error && error != ENOSPC)) {
>          dp_packet_delete(packet);
> -        (*lost_cnt)++;
> -        return;
> +        (*miss_no_upcall)++;
> +        return error;
>      }

If you return the result of the upcall explicitly, there is no need to also increase the lost_cnt passed by reference. Better to step the appropriate counter in the calling function.

> 
>      /* The Netlink encoding of datapath flow keys cannot express
> @@ -4790,6 +4806,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>          ovs_mutex_unlock(&pmd->flow_mutex);
>          emc_probabilistic_insert(pmd, key, netdev_flow);
>      }
> +    return error;
>  }
> 
>  static inline void
> @@ -4811,7 +4828,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>      struct dpcls *cls;
>      struct dpcls_rule *rules[PKT_ARRAY_SIZE];
>      struct dp_netdev *dp = pmd->dp;
> -    int miss_cnt = 0, lost_cnt = 0;
> +    int miss_upcall_cnt = 0, miss_no_upcall_cnt = 0;

Better upcall_ok_cnt and upcall_drop_cnt?

> @@ -4853,9 +4870,12 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>                  continue;
>              }
> 
> -            miss_cnt++;
> -            handle_packet_upcall(pmd, packets[i], &keys[i], &actions,
> -                                 &put_actions, &lost_cnt, now);
> +            int error = handle_packet_upcall(pmd, packets[i], &keys[i],
> +                           &actions, &put_actions, &miss_no_upcall_cnt,
> +                           now);
> +
> +            if (OVS_LIKELY(!error)) {
> +                miss_upcall_cnt++;
> +            }

Better:
if (OVS_LIKELY(!error)) {
    upcall_ok_cnt++;
} else {
    upcall_drop_cnt++;
}

>              if (OVS_UNLIKELY(!rules[i])) {
>                  dp_packet_delete(packets[i]);
> -                lost_cnt++;
> -                miss_cnt++;
> +                miss_no_upcall_cnt++;

upcall_drop_cnt++;

>              }
>          }
>      }
> @@ -4885,10 +4904,11 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>          dp_netdev_queue_batches(packet, flow, &keys[i].mf, batches,
>                                  n_batches);
>      }
> 
> -    dp_netdev_count_packet(pmd, DP_STAT_MASKED_HIT, cnt - miss_cnt);
> -    dp_netdev_count_packet(pmd, DP_STAT_LOOKUP_HIT, lookup_cnt);
> -    dp_netdev_count_packet(pmd, DP_STAT_MISS, miss_cnt);
> -    dp_netdev_count_packet(pmd, DP_STAT_LOST, lost_cnt);
> +    dp_netdev_count_packet(pmd, DP_STAT_MASKED_HIT, cnt - miss_upcall_cnt -
> +                           miss_no_upcall_cnt);
> +    dp_netdev_count_packet(pmd, DP_STAT_MASKED_LOOKUP, lookup_cnt);
> +    dp_netdev_count_packet(pmd, DP_STAT_MISS, miss_upcall_cnt);
> +    dp_netdev_count_packet(pmd, DP_STAT_LOST, miss_no_upcall_cnt);
>  }

dp_netdev_count_packet(pmd, DP_STAT_MASKED_HIT, cnt - upcall_ok_cnt - upcall_drop_cnt);
dp_netdev_count_packet(pmd, DP_STAT_MISS, upcall_ok_cnt);
dp_netdev_count_packet(pmd, DP_STAT_LOST, upcall_drop_cnt);


