[ovs-dev] [PATCH v3] dpif-netdev: dfc_process optimization by prefetching EMC entry.

Ilya Maximets i.maximets at samsung.com
Fri Mar 29 17:33:05 UTC 2019


Hi.
I made few tests on PVP with bonded PHY setup and found no significant
difference in maximum performance with low and medium number of flows
(8 - 8192). In case of big number of flows (64K - 256K) I see performance
drop in about 2-3%. I think that because of prefetching of a second cacheline,
which is unneeded because current_entry->key.hash likely != key->hash
and we don't need to compare miniflows while emc_lookup. Changing the code
to prefetch the first cacheline only decreases the drop to ~1% (However,
this increases the consumed processing cycles for medium numbers of flows
described below).

OTOH, I see a slight decrease (~1%) of consumed cycles per packet for the
thread that polls HW NICs and send packets to VM, which is good.
This improvement observed for a medium small number of flows: 512 - 8192.
For a low (1 - 256) and high (64K - 256K) numbers of flows value of consumed
processing cycles per packet fro this thread was not affected by the patch.

Tests made with average 512B packets, EMC enabled, SMC disabled, TX flushing
interval 50us. Note that the bottleneck in this case is the VM --> bonded PHY
part which is the case of 5tuple hash calculation.

See review comments inline.

Best regards, Ilya Maximets.

On 22.03.2019 11:44, Yanqin Wei (Arm Technology China) wrote:
> Hi , OVS Maintainers,
> 
> Could you help to have a look at this patch? Thanks a lot.
> 
> Best Regards,
> Wei Yanqin
> 
> -----Original Message-----
> From: Yanqin Wei <Yanqin.Wei at arm.com> 
> Sent: Wednesday, March 13, 2019 1:28 PM
> To: dev at openvswitch.org
> Cc: nd <nd at arm.com>; Gavin Hu (Arm Technology China) <Gavin.Hu at arm.com>; Yanqin Wei (Arm Technology China) <Yanqin.Wei at arm.com>
> Subject: [ovs-dev][PATCH v3] dpif-netdev: dfc_process optimization by prefetching EMC entry.
> 
> It is observed that the throughput of multi-flow is worse than single-flow in the EMC NIC to NIC cases. It is because CPU cache-miss increasing in EMC lookup. Each flow need load at least one EMC entry to CPU cache(several cache lines) and compare it with packet miniflow.
> This patch improve it by prefetching EMC entry in advance. Hash value can be obtained from dpdk rss hash, so this step can be advanced ahead of
> miniflow_extract() and prefetch EMC entry there. The prefetching size is defined as ROUND_UP(128,CACHE_LINE_SIZE), which can cover majority traffic including TCP/UDP protocol and need 2 cache lines in most modern CPU.
> Performance test was run in some arm platform. 1000/10000 flows NIC2NIC test achieved around 10% throughput improvement in thunderX2(aarch64 platform).
> 
> Signed-off-by: Yanqin Wei <Yanqin.Wei at arm.com>
> Reviewed-by: Gavin Hu <Gavin.Hu at arm.com>
> ---
>  lib/dpif-netdev.c | 80 ++++++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 52 insertions(+), 28 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 4d6d0c3..982082c 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -189,6 +189,10 @@ struct netdev_flow_key {
>  #define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX /                     \
>                                      DEFAULT_EM_FLOW_INSERT_INV_PROB)
>  
> +/* DEFAULT_EMC_PREFETCH_SIZE can cover majority traffic including 
> +TCP/UDP
> + * protocol. */
> +#define DEFAULT_EMC_PREFETCH_SIZE ROUND_UP(128,CACHE_LINE_SIZE)

Space after the ',' needed.

> +
>  struct emc_entry {
>      struct dp_netdev_flow *flow;
>      struct netdev_flow_key key;   /* key.hash used for emc hash value. */
> @@ -6166,15 +6170,20 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd, struct dp_packet *packet_,  }
>  
>  static inline uint32_t
> -dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet *packet,
> -                                const struct miniflow *mf)
> +dpif_netdev_packet_get_packet_rss_hash(struct dp_packet *packet,
> +                                bool md_is_valid)

1. 'packet' word appears twice in the function name.
How about 'dpif_netdev_packet_get_rss_hash' and
'dpif_netdev_packet_get_5tuple_hash' for the second function?

2.  Please align the second line to the level of next char after '('.

3. 'md_is_valid' is meaningless name inside the function.
    What about renaming to 'bool account_recirc_id'?

>  {
> -    uint32_t hash;
> +    uint32_t hash,recirc_depth;

missing space.

>  
> -    if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> -        hash = dp_packet_get_rss_hash(packet);
> -    } else {
> -        hash = miniflow_hash_5tuple(mf, 0);
> +    hash = dp_packet_get_rss_hash(packet);
> +
> +    if (md_is_valid) {
> +        /* The RSS hash must account for the recirculation depth to avoid
> +         * collisions in the exact match cache */
> +        recirc_depth = *recirc_depth_get_unsafe();
> +        if (OVS_UNLIKELY(recirc_depth)) {

We're in recirculation if md is valid. So, 'OVS_UNLIKELY' is wrong here.
I think that we don't need to check the value at all.
'OVS_UNLIKELY' should be moved to the upper 'if'.

> +            hash = hash_finish(hash, recirc_depth);
> +        }
>          dp_packet_set_rss_hash(packet, hash);

This could be moved inside the 'if', because we need to update rss hash
only if it was changed.

>      }
>  
> @@ -6182,24 +6191,23 @@ dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet *packet,  }
>  
>  static inline uint32_t
> -dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
> -                                const struct miniflow *mf)
> +dpif_netdev_packet_get_hash_5tuple(struct dp_packet *packet,
> +                                const struct miniflow *mf,
> +                                bool md_is_valid)

Alignments.

>  {
> -    uint32_t hash, recirc_depth;
> +    uint32_t hash,recirc_depth;

Space.

>  
> -    if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> -        hash = dp_packet_get_rss_hash(packet);
> -    } else {
> -        hash = miniflow_hash_5tuple(mf, 0);
> -        dp_packet_set_rss_hash(packet, hash);
> -    }
> +    hash = miniflow_hash_5tuple(mf, 0);
> +    dp_packet_set_rss_hash(packet, hash);
>  
> -    /* The RSS hash must account for the recirculation depth to avoid
> -     * collisions in the exact match cache */
> -    recirc_depth = *recirc_depth_get_unsafe();
> -    if (OVS_UNLIKELY(recirc_depth)) {
> -        hash = hash_finish(hash, recirc_depth);
> -        dp_packet_set_rss_hash(packet, hash);
> +    if (md_is_valid) {
> +        /* The RSS hash must account for the recirculation depth to avoid
> +         * collisions in the exact match cache */
> +        recirc_depth = *recirc_depth_get_unsafe();
> +        if (OVS_UNLIKELY(recirc_depth)) {

Same as previous.

> +            hash = hash_finish(hash, recirc_depth);
> +            dp_packet_set_rss_hash(packet, hash);

This function sets the rss to dp_packet twice. You could avoid that
by moving the 'dp_packet_set_rss_hash' call to the end of this function.

> +        }
>      }
>      return hash;
>  }
> @@ -6390,6 +6398,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
>      bool smc_enable_db;
>      size_t map_cnt = 0;
>      bool batch_enable = true;
> +    bool is_5tuple_hash_needed;
>  
>      atomic_read_relaxed(&pmd->dp->smc_enable_db, &smc_enable_db);
>      pmd_perf_update_counter(&pmd->perf_stats,
> @@ -6436,16 +6445,31 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
>              }
>          }
>  
> -        miniflow_extract(packet, &key->mf);
> -        key->len = 0; /* Not computed yet. */
>          /* If EMC and SMC disabled skip hash computation */
>          if (smc_enable_db == true || cur_min != 0) {
> -            if (!md_is_valid) {
> -                key->hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
> -                        &key->mf);
> +            if (OVS_LIKELY(dp_packet_rss_valid(packet))) {

I'd not put the 'LIKELY' macro here, because all the packets from virtual
ports has no RSS and this is the common case.

> +                is_5tuple_hash_needed = false;
> +                key->hash =
> +                   dpif_netdev_packet_get_packet_rss_hash(packet,md_is_valid);

Space.

> +                if (cur_min) {
> +                    ovs_prefetch_range(
> +                      &cache->emc_cache.entries[key->hash & EM_FLOW_HASH_MASK],
> +                      DEFAULT_EMC_PREFETCH_SIZE);
> +                }
>              } else {
> -                key->hash = dpif_netdev_packet_get_rss_hash(packet, &key->mf);
> +                is_5tuple_hash_needed = true;
>              }
> +        } else {
> +            is_5tuple_hash_needed = false;
> +        }
> +
> +        miniflow_extract(packet, &key->mf);
> +        key->len = 0; /* Not computed yet. */
> +
> +        /* If 5tuple hash is needed */

This comment is not that useful. IMHO, the variable is self-documenting.

> +        if (is_5tuple_hash_needed) {
> +            key->hash = dpif_netdev_packet_get_hash_5tuple(packet, &key->mf,
> +                                                           
> + md_is_valid);
>          }
>          if (cur_min) {
>              flow = emc_lookup(&cache->emc_cache, key);
> --
> 2.7.4
> 
> 
> 


More information about the dev mailing list