[ovs-dev] [PATCH v4] dpif-netdev: Avoid reading RSS hash when EMC is disabled

O Mahony, Billy billy.o.mahony at intel.com
Fri Sep 1 13:24:05 UTC 2017


Hi Antonio,

Sorry for the late follow up on this version of this patch.

I'm not sure if you are withdrawing the v4 of this patch. But I think my comments are valid for both v4 and v3 versions.

> -----Original Message-----
> From: ovs-dev-bounces at openvswitch.org [mailto:ovs-dev-
> bounces at openvswitch.org] On Behalf Of Fischetti, Antonio
> Sent: Saturday, August 26, 2017 5:58 PM
> To: Darrell Ball <dball at vmware.com>; dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v4] dpif-netdev: Avoid reading RSS hash when
> EMC is disabled
> 
> 
> > -----Original Message-----
> > From: Darrell Ball [mailto:dball at vmware.com]
> > Sent: Saturday, August 26, 2017 1:45 AM
> > To: Fischetti, Antonio <antonio.fischetti at intel.com>;
> > dev at openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH v4] dpif-netdev: Avoid reading RSS hash
> > when EMC is disabled
> >
> >
> >
> > On 8/25/17, 3:21 PM, "Fischetti, Antonio" <antonio.fischetti at intel.com>
> wrote:
> >
> >
> >
> >     > -----Original Message-----
> >     > From: Darrell Ball [mailto:dball at vmware.com]
> >     > Sent: Friday, August 25, 2017 6:19 PM
> >     > To: Fischetti, Antonio <antonio.fischetti at intel.com>;
> dev at openvswitch.org
> >     > Subject: Re: [ovs-dev] [PATCH v4] dpif-netdev: Avoid reading RSS
> > hash when EMC
> >     > is disabled
> >     >
> >     > Hi Antonio
> >     >
> >     > On 8/25/17, 6:56 AM, "ovs-dev-bounces at openvswitch.org on behalf of
> >     > antonio.fischetti at intel.com" <ovs-dev-bounces at openvswitch.org on
> > behalf of
> >     > antonio.fischetti at intel.com> wrote:
> >     >
> >     >     When EMC is disabled the reading of RSS hash is skipped.
> >     >     Also, for packets that are not recirculated it retrieves
> >     >     the hash value without considering the recirc id.

 [[BO'M]] I know you must be tired of repackaging this changes but I these two changes are independent of each and are easier to consider in separate patches and not even as part of a larger patchset. 

> >     >
> >     >     Signed-off-by: Antonio Fischetti <antonio.fischetti at intel.com>
> >     >     ---
> >     >
> >     >      V4
> >     >       - reworked to remove dependencies from other patches in
> >     >         patchset "Skip EMC for recirc pkts and other optimizations."
> >     >         https://urldefense.proofpoint.com/v2/url?u=https-
> >     > 3A__mail.openvswitch.org_pipermail_ovs-2Ddev_2017-
> >     >
> >
> 2DAugust_337320.html&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA
> 09CGX7JQ5Ih-
> >     >
> >
> uZnsw&m=7UGZunJeutRugFmidRAykhZFRYscZ2rUUmhF7Ys9IM8&s=hzVGT1
> bI3zOj_x9J5cu18Q2D5
> >     > WLHdEC1qhQ-Gu--TEE&e=
> >     >
> >     >       - measurements were repeated with the latest head of master.
> >     >
> >     >      Port-to-Port Test
> >     >      =================
> >     >             Software built with "-O2 -march=native -g".
> >     >
> >     >             I measured the Rx rate regardless of pkt loss by sending 1
> > UDP
> >     > flow,
> >     >             64B packets, at the line-rate.
> >     >
> >     >             2 PMDs with 3 Tx queues.
> >     >
> >     >             Flows setup:
> >     >               in_port=dpdk0 actions=output:dpdk1
> >     >               in_port=dpdk1 actions=output:dpdk0
> >     >
> >     >             Results
> >     >             -------
> >     >             Values are for the Rx rate in Mpps, regardless of packet
> > loss.
> >     >
> >     >             RSS column:
> >     >                Yes: RSS hash is provided by the NIC
> >     >                No: RSS is disabled and the 5-tuple hash must be
> >     >                    computed in software.
> >     >
> >     >                    Note: to simulate RSS disabled I added the line
> >     >                    dp_packet_rss_valid(struct dp_packet *p)
> >     >                    {
> >     >                    +return false;
> >     >                    #ifdef DPDK_NETDEV
> >     >
> >     >             EMC column:
> >     >                Yes: default probability insertion,
> >     >                No: EMC disabled.
> >     >
> >     >             Orig means Commit ID:
> >     >               75f9e007e7f7eb91461e238f882d1c539c56bb8d
> >     >
> >     > [Darrell]
> >     > It looks like the main benefit (+6% throughput) occurs
> >     > when RSS is not calculated by the NIC and emc is disabled. What
> > are the common
> >     > cases when RSS is not generated by the NIC ?
> >
> >     [Antonio]
> >     A common case could be when packets are coming in from a VM.
> >
> > Well, there is no pnic in this case.
> > I am thinking gateway usage is predominate in importance, but the RSS
> > cases are about equal based on the sampling below, so we could almost
> > ‘remove them from the equation’, as it were.
> >
> > For non-RSS:
> > Using emc, there is about 0.7 % degradation, but it is more common to
> > use emc So overall, it is not clear; what do you think ?
> 
> [Antonio]
> Yes, maybe it's better to keep this patch as it was in v3 - ie dependent on
> patch #1
> "dpif-netdev: Skip EMC lookup/insert for recirc packets" which is under
> discussion - because these changes can fit on those already present in patch
> #1.
> 
> When these changes are put in a stand-alone independent patch we can
> save some cpu cycle for the hash part, but we have some more branch like
> +            if (!md_is_valid) {
> that could explain the 0.7% degradation.
> 
> Thanks anyway for your review and comments.
> 
> >
> >     Also when the NIC does not provide a hash value, but this shouldn't be
> >     considered a common case I guess.
> >
> > I agree.
> >
> >
> >     >
> >     >
> >     >
> >     >                         +--------------+----------------+
> >     >             +-----+-----+    Orig      | Orig + this    |
> >     >             | RSS | EMC |              |    patch       |
> >     >             +-----+-----+--------------+----------------+
> >     >             | Yes | Yes | 12.02, 11.38 | 12.18, 11.32   |
> >     >             | Yes |  No |  8.35,  8.36 |  8.36,  8.38   |
> >     >             +-----+-----+--------------+----------------+
> >     >             |  No | Yes |  9.91, 11.15 |  9.84, 11.08   |
> >     >             |  No |  No |  7.83,  7.87 |  8.33,  8.35   |
> >     >             +-----+-----+--------------+----------------+
[[BO'M]] Unsurprisingly we only see an improvement in the special case of no EMC and no hash from the NIC. 

But the figures also show a slight degradation in other (more common) combinations.

Theoretically this is a little surprising. You think it is down to the extra if statement but cur_min was already being tested in the 'Orig' tests. Also as the test is in a loop the branch predictor while it may be wrong for the first packet in the batch should be correct for all the subsequent packets.

I think you need to run the tests several times to get an idea of the noise level in the tests. If the 'degradation' see above is within the noise level then all is well. I expect that the degradation we see will most likely be just noise in the results.

The two columns (e.g 12.02 and 11.38) refer to two separate directions of traffic? A>B and B<A. I would remove the bidirectional element of the results as it just distracts. We will see the effects more clearly with just one direction.

Also this test does not say anything about the "for non-recirculated packets retrieve the hash value without considering the recirc id" part of the patch.
 
> >     >
> >     >     ---
> >     >      lib/dpif-netdev.c | 32 ++++++++++++++++++++++++++++----
> >     >      1 file changed, 28 insertions(+), 4 deletions(-)
> >     >
> >     >     diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> >     >     index e2cd931..1157998 100644
> >     >     --- a/lib/dpif-netdev.c
> >     >     +++ b/lib/dpif-netdev.c
> >     >     @@ -4567,6 +4567,22 @@ dp_netdev_upcall(struct
> dp_netdev_pmd_thread
> > *pmd,
> >     > struct dp_packet *packet_,
> >     >      }
> >     >
> >     >      static inline uint32_t
> >     >     +dpif_netdev_packet_get_rss_hash_orig_pkt(struct dp_packet
> *packet,
> >     >     +                                const struct miniflow *mf)
> >     >     +{
> >     >     +    uint32_t hash;
> >     >     +
> >     >     +    if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> >     >     +        hash = dp_packet_get_rss_hash(packet);
> >     >     +    } else {
> >     >     +        hash = miniflow_hash_5tuple(mf, 0);
> >     >     +        dp_packet_set_rss_hash(packet, hash);
> >     >     +    }
> >     >     +
> >     >     +    return hash;
> >     >     +}
> >     >     +
> >     >     +static inline uint32_t
> >     >      dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
> >     >                                      const struct miniflow *mf)
> >     >      {
> >     >     @@ -4701,10 +4717,18 @@ emc_processing(struct
> dp_netdev_pmd_thread
> > *pmd,
> >     >              }
> >     >              miniflow_extract(packet, &key->mf);
> >     >              key->len = 0; /* Not computed yet. */
> >     >     -        key->hash = dpif_netdev_packet_get_rss_hash(packet, &key-
> > >mf);
> >     >     -
> >     >     -        /* If EMC is disabled skip emc_lookup */
> >     >     -        flow = (cur_min == 0) ? NULL: emc_lookup(flow_cache, key);
> >     >     +        /* If EMC is disabled skip hash computation and emc_lookup
> > */
> >     >     +        if (OVS_LIKELY(cur_min)) {
> >     >     +            if (!md_is_valid) {

 [[BO'M]]
The code already avoids hashing the recirc depth for non-recirculated packets but it does this by checking recirc depth value rather that using md_is_valid to decide which version of a function to call.

Can it really be more efficient to introduce a second slightly different version of dpif_netdev_packet_get_rss_hash and decide which one to call based on md_is_valid rather than just passing md_is_valid into the existing function and having the 'if' in there?

Is md_is_valid==True the same as recirc_depth>=1?  In which case one can replace the other. Ie would it be more efficient to pass and test md_is_valid rather than retrieving recirc depth (Is that function which is generated by a macro inlined? Or is there some other penalty for accessing thread local storage?

> >     >     +                key->hash =
> >     > dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
> >     >     +                        &key->mf);
> >     >     +            } else {
> >     >     +                key->hash = dpif_netdev_packet_get_rss_hash(packet,
> > &key-
> >     > >mf);
> >     >     +            }
> >     >     +            flow = emc_lookup(flow_cache, key);
> >     >     +        } else {
> >     >     +            flow = NULL;
> >     >     +        }
> >     >              if (OVS_LIKELY(flow)) {
> >     >                  dp_netdev_queue_batches(packet, flow, &key->mf, batches,
> >     >                                          n_batches);
> >     >     --
> >     >     2.4.11
> >     >
> >     >     _______________________________________________
> >     >     dev mailing list
> >     >     dev at openvswitch.org
> >     >     https://urldefense.proofpoint.com/v2/url?u=https-
> >     > 3A__mail.openvswitch.org_mailman_listinfo_ovs-
> >     >
> 2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-
> >     >
> >
> uZnsw&m=7UGZunJeutRugFmidRAykhZFRYscZ2rUUmhF7Ys9IM8&s=azu46p
> bDSUX7cT7WFQXNZKhOT
> >     > fE96wI4CF4xgXGi7Go&e=
> >     >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> 
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list