[ovs-dev] [v12 08/16] dpif-netdev: Add a partial HWOL PMD statistic.

Ferriter, Cian cian.ferriter at intel.com
Thu Jun 10 14:46:07 UTC 2021


Hi Ian,

Thanks for the review. My responses are inline.

Also, one thing I spotted while looking at your later reviews: we should update the "lib/dpif-netdev-unixctl.man" file, since it prints an example of the PMD statistics output.
I'll fix that in the next version.
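
Roughly what I have in mind there (placeholder values, and the exact surrounding lines depend on the example output the man file currently shows) is adding the new counter alongside the existing per-level hit lines, e.g.:

      Datapath passes:   <N>  (<N.NN> passes/pkt)
    + - PHWOL hits:      <N>  (<NN.N> %)
      - EMC hits:        <N>  (<NN.N> %)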

> -----Original Message-----
> From: Stokes, Ian <ian.stokes at intel.com>
> Sent: Wednesday 2 June 2021 19:10
> To: Ferriter, Cian <cian.ferriter at intel.com>; ovs-dev at openvswitch.org; Van Haaren, Harry <harry.van.haaren at intel.com>; Gaetan
> Rivet <gaetanr at nvidia.com>; Sriharsha Basavapatna <sriharsha.basavapatna at broadcom.com>
> Cc: i.maximets at ovn.org
> Subject: RE: [ovs-dev] [v12 08/16] dpif-netdev: Add a partial HWOL PMD statistic.
> 
> > It is possible for packets traversing the userspace datapath to match a
> > flow before hitting on EMC by using a mark ID provided by a NIC. Add a
> > PMD statistic for this hit.
> >
> > Signed-off-by: Cian Ferriter <cian.ferriter at intel.com>
> 
> Thanks for the Patch Cian, comments inline.
> 
> So I have a general query here. In the bigger picture, why is the patch required for the DPIF work?
> Is it purely to track packet hits that have touched the software datapath (unlike a full offload use case)?
> 

This patch is not strictly required for the DPIF work, but we found it much easier to test the HWOL functionality in the DPIF with a hit statistic. As we added EMC, SMC and DPCLS capabilities to the AVX512 DPIF, we used the PMD stats to check that each lookup level was being hit as we expected. When we tested HWOL, the counter just seemed like a missing feature, and adding it helped our testing. We think it could be useful for others too.
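
For example, when testing partial HWOL through the new DPIF we would send known traffic and then check the per-level counters with something like (the same command the pmd.at tests use; the grep is just for illustration):

    ovs-appctl dpif-netdev/pmd-stats-show | grep -E 'phwol|emc|smc|megaflow'

and confirm that the phwol counter increments when the NIC is providing a flow mark, while the emc/smc/megaflow counters behave as expected for the levels below it.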

> I also think it would be good to get some input from others involved in the HWOL work to date, as they may have opinions on this, so
> I've included some folks from Nvidia and Broadcom in case they have input.
> The implementation seems straight forward enough but good to gauge in case this goes against ongoing HWOL work.
> 

Good idea. Input from others involved in HWOL is much appreciated.

> Thanks
> Ian
> > ---
> >  NEWS                     | 2 ++
> >  lib/dpif-netdev-avx512.c | 3 +++
> >  lib/dpif-netdev-perf.c   | 3 +++
> >  lib/dpif-netdev-perf.h   | 1 +
> >  lib/dpif-netdev.c        | 9 +++++++--
> >  tests/pmd.at             | 6 ++++--
> >  6 files changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 402ce5969..3eab5f4fa 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -5,6 +5,8 @@ Post-v2.15.0
> >     - Userspace datapath:
> >       * Auto load balancing of PMDs now partially supports cross-NUMA polling
> >         cases, e.g if all PMD threads are running on the same NUMA node.
> > +     * Add a partial HWOL PMD statistic counting hits similar to existing
> > +     * EMC/SMC/DPCLS stats.
> 
> No need for the * on the second line above, only the first line.
> 

Good catch, this is a typo. I'll fix it in the next version.

> >     - ovs-ctl:
> >       * New option '--no-record-hostname' to disable hostname configuration
> >         in ovsdb on startup.
> > diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c
> > index e255c9d17..5c6f54799 100644
> > --- a/lib/dpif-netdev-avx512.c
> > +++ b/lib/dpif-netdev-avx512.c
> > @@ -114,6 +114,7 @@ dp_netdev_input_outer_avx512(struct
> > dp_netdev_pmd_thread *pmd,
> >      const uint32_t emc_enabled = pmd->ctx.emc_insert_min != 0;
> >      const uint32_t smc_enabled = pmd->ctx.smc_enable_db;
> >
> > +    uint32_t phwol_hits = 0;
> 
> Minor nit, I'd expect new variables to be added after the existing variables below.
> 

I'll add "phwol_hits" below the other *_hits variables. I was thinking from the OVS datapath pipeline POV (HWOL->EMC->SMC).

I'll fix this in the next version.
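
For reference, the ordering I'll use in the next version (same initialisation as in this patch, with the new variable simply moved below the existing hit counters):

    uint32_t emc_hits = 0;
    uint32_t smc_hits = 0;
    uint32_t phwol_hits = 0;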

> >      uint32_t emc_hits = 0;
> >      uint32_t smc_hits = 0;
> >
> > @@ -147,6 +148,7 @@ dp_netdev_input_outer_avx512(struct
> > dp_netdev_pmd_thread *pmd,
> >                  pkt_meta[i].tcp_flags = parse_tcp_flags(packet);
> >
> >                  pkt_meta[i].bytes = dp_packet_size(packet);
> > +                phwol_hits++;
> >                  hwol_emc_smc_hitmask |= (1 << i);
> >                  continue;
> >              }
> > @@ -235,6 +237,7 @@ dp_netdev_input_outer_avx512(struct
> > dp_netdev_pmd_thread *pmd,
> >
> >      /* At this point we don't return error anymore, so commit stats here. */
> >      pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_RECV,
> > batch_size);
> > +    pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_PHWOL_HIT,
> > phwol_hits);
> >      pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_EXACT_HIT,
> > emc_hits);
> >      pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SMC_HIT,
> > smc_hits);
> >      pmd_perf_update_counter(&pmd->perf_stats,
> > PMD_STAT_MASKED_HIT,
> > diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c
> > index 9560e7c3c..7103a2d4d 100644
> > --- a/lib/dpif-netdev-perf.c
> > +++ b/lib/dpif-netdev-perf.c
> > @@ -246,6 +246,7 @@ pmd_perf_format_overall_stats(struct ds *str, struct
> > pmd_perf_stats *s,
> >          ds_put_format(str,
> >              "  Rx packets:        %12"PRIu64"  (%.0f Kpps, %.0f cycles/pkt)\n"
> >              "  Datapath passes:   %12"PRIu64"  (%.2f passes/pkt)\n"
> > +            "  - PHWOL hits:      %12"PRIu64"  (%5.1f %%)\n"
> >              "  - EMC hits:        %12"PRIu64"  (%5.1f %%)\n"
> >              "  - SMC hits:        %12"PRIu64"  (%5.1f %%)\n"
> >              "  - Megaflow hits:   %12"PRIu64"  (%5.1f %%, %.2f "
> > @@ -255,6 +256,8 @@ pmd_perf_format_overall_stats(struct ds *str, struct
> > pmd_perf_stats *s,
> >              rx_packets, (rx_packets / duration) / 1000,
> >              1.0 * stats[PMD_CYCLES_ITER_BUSY] / rx_packets,
> >              passes, rx_packets ? 1.0 * passes / rx_packets : 0,
> > +            stats[PMD_STAT_PHWOL_HIT],
> > +            100.0 * stats[PMD_STAT_PHWOL_HIT] / passes,
> >              stats[PMD_STAT_EXACT_HIT],
> >              100.0 * stats[PMD_STAT_EXACT_HIT] / passes,
> >              stats[PMD_STAT_SMC_HIT],
> > diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h
> > index 72645b6b3..8b1a52387 100644
> > --- a/lib/dpif-netdev-perf.h
> > +++ b/lib/dpif-netdev-perf.h
> > @@ -56,6 +56,7 @@ extern "C" {
> >  /* Set of counter types maintained in pmd_perf_stats. */
> >
> >  enum pmd_stat_type {
> > +    PMD_STAT_PHWOL_HIT,     /* Packets that had a partial HWOL hit
> > (phwol). */
> >      PMD_STAT_EXACT_HIT,     /* Packets that had an exact match (emc). */
> >      PMD_STAT_SMC_HIT,       /* Packets that had a sig match hit (SMC). */
> >      PMD_STAT_MASKED_HIT,    /* Packets that matched in the flow table. */
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index b35ccbe3b..f161ae7f5 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -660,6 +660,7 @@ pmd_info_show_stats(struct ds *reply,
> >                    "  packets received: %"PRIu64"\n"
> >                    "  packet recirculations: %"PRIu64"\n"
> >                    "  avg. datapath passes per packet: %.02f\n"
> > +                  "  phwol hits: %"PRIu64"\n"
> >                    "  emc hits: %"PRIu64"\n"
> >                    "  smc hits: %"PRIu64"\n"
> >                    "  megaflow hits: %"PRIu64"\n"
> > @@ -668,7 +669,8 @@ pmd_info_show_stats(struct ds *reply,
> >                    "  miss with failed upcall: %"PRIu64"\n"
> >                    "  avg. packets per output batch: %.02f\n",
> >                    total_packets, stats[PMD_STAT_RECIRC],
> > -                  passes_per_pkt, stats[PMD_STAT_EXACT_HIT],
> > +                  passes_per_pkt, stats[PMD_STAT_PHWOL_HIT],
> > +                  stats[PMD_STAT_EXACT_HIT],
> >                    stats[PMD_STAT_SMC_HIT],
> >                    stats[PMD_STAT_MASKED_HIT], lookups_per_hit,
> >                    stats[PMD_STAT_MISS], stats[PMD_STAT_LOST],
> > @@ -1685,6 +1687,7 @@ dpif_netdev_get_stats(const struct dpif *dpif,
> > struct dpif_dp_stats *stats)
> >      CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
> >          stats->n_flows += cmap_count(&pmd->flow_table);
> >          pmd_perf_read_counters(&pmd->perf_stats, pmd_stats);
> > +        stats->n_hit += pmd_stats[PMD_STAT_PHWOL_HIT];
> >          stats->n_hit += pmd_stats[PMD_STAT_EXACT_HIT];
> >          stats->n_hit += pmd_stats[PMD_STAT_SMC_HIT];
> >          stats->n_hit += pmd_stats[PMD_STAT_MASKED_HIT];
> > @@ -6697,7 +6700,7 @@ dfc_processing(struct dp_netdev_pmd_thread
> > *pmd,
> >                 bool md_is_valid, odp_port_t port_no)
> >  {
> >      struct netdev_flow_key *key = &keys[0];
> > -    size_t n_missed = 0, n_emc_hit = 0;
> > +    size_t n_missed = 0, n_emc_hit = 0, n_phwol_hit = 0;
> >      struct dfc_cache *cache = &pmd->flow_cache;
> >      struct dp_packet *packet;
> >      const size_t cnt = dp_packet_batch_size(packets_);
> > @@ -6739,6 +6742,7 @@ dfc_processing(struct dp_netdev_pmd_thread
> > *pmd,
> >              flow = mark_to_flow_find(pmd, mark);
> >              if (OVS_LIKELY(flow)) {
> >                  tcp_flags = parse_tcp_flags(packet);
> > +                n_phwol_hit++;
> >                  if (OVS_LIKELY(batch_enable)) {
> >                      dp_netdev_queue_batches(packet, flow, tcp_flags, batches,
> >                                              n_batches);
> > @@ -6801,6 +6805,7 @@ dfc_processing(struct dp_netdev_pmd_thread
> > *pmd,
> >      /* Count of packets which are not flow batched. */
> >      *n_flows = map_cnt;
> >
> > +    pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_PHWOL_HIT,
> > n_phwol_hit);
> >      pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_EXACT_HIT,
> > n_emc_hit);
> >
> >      if (!smc_enable_db) {
> > diff --git a/tests/pmd.at b/tests/pmd.at
> > index cc5371d5a..34a59d502 100644
> > --- a/tests/pmd.at
> > +++ b/tests/pmd.at
> > @@ -202,11 +202,12 @@ dummy at ovs-dummy: hit:0 missed:0
> >      p0 7/1: (dummy-pmd: configured_rx_queues=4,
> > configured_tx_queues=<cleared>, requested_rx_queues=4,
> > requested_tx_queues=<cleared>)
> >  ])
> >
> > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed
> > SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 9], [0], [dnl
> > +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed
> > SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 10], [0], [dnl
> >  pmd thread numa_id <cleared> core_id <cleared>:
> >    packets received: 0
> >    packet recirculations: 0
> >    avg. datapath passes per packet: 0.00
> > +  phwol hits: 0
> >    emc hits: 0
> >    smc hits: 0
> >    megaflow hits: 0
> > @@ -233,11 +234,12 @@ AT_CHECK([cat ovs-vswitchd.log |
> > filter_flow_install | strip_xout], [0], [dnl
> >
> > recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:77,dst
> > =50:54:00:00:01:78),eth_type(0x0800),ipv4(frag=no), actions: <del>
> >  ])
> >
> > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed
> > SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 9], [0], [dnl
> > +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed
> > SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 10], [0], [dnl
> >  pmd thread numa_id <cleared> core_id <cleared>:
> >    packets received: 20
> >    packet recirculations: 0
> >    avg. datapath passes per packet: 1.00
> > +  phwol hits: 0
> >    emc hits: 19
> >    smc hits: 0
> >    megaflow hits: 0
> > --
> > 2.31.1
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev


