[ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

O Mahony, Billy billy.o.mahony at intel.com
Mon Sep 11 08:28:00 UTC 2017


Hi Wang,

I believe that the PMD stats "processing cycles" figure includes EMC processing time.

I mention this only because your results are surprising: the bug could be a factor if you are running code that contains it. The patch carries a Fixes: tag (I think) that should help you figure out whether your results were affected by this issue.

Regards,
/Billy. 
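(As a sketch of how a Fixes: tag helps here: the tag names the commit that introduced the bug, so you can check whether that commit is an ancestor of your checkout. The throwaway repository and commits below are a self-contained stand-in, not the actual OVS hashes.)

```shell
# Sketch with hypothetical commits: check whether a given commit -- e.g.
# the one named in a patch's "Fixes:" tag -- is contained in your
# current checkout. Demonstrated on a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "buggy commit"
buggy=$(git rev-parse HEAD)   # stands in for the hash from the Fixes: tag
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "later work"
# --is-ancestor exits 0 when $buggy is reachable from HEAD,
# i.e. your branch contains the buggy commit.
if git merge-base --is-ancestor "$buggy" HEAD; then
    echo "buggy commit present"
fi
```

If the buggy commit is present and the fix is not, measurements taken on that build are suspect.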

> -----Original Message-----
> From: 王志克 [mailto:wangzhike at jd.com]
> Sent: Monday, September 11, 2017 3:00 AM
> To: O Mahony, Billy <billy.o.mahony at intel.com>; ovs-
> dev at openvswitch.org; Jan Scheurich <jan.scheurich at ericsson.com>; Darrell
> Ball <dball at vmware.com>; ovs-discuss at openvswitch.org; Kevin Traynor
> <ktraynor at redhat.com>
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> In my test, almost all traffic went through the EMC, so the fix does not
> affect the result, especially since we care about the relative difference
> (not the exact numbers).
> 
> Can you test to get some data? Thanks.
> 
> Br,
> Wang Zhike
> 
> -----Original Message-----
> From: O Mahony, Billy [mailto:billy.o.mahony at intel.com]
> Sent: Friday, September 08, 2017 11:18 PM
> To: 王志克; ovs-dev at openvswitch.org; Jan Scheurich; Darrell Ball; ovs-
> discuss at openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337309.html
> 
> I see it's been acked and is due to be pushed to master with other changes
> on the dpdk merge branch, so you'll have to apply it manually for now.
> 
> /Billy.
> 
> > -----Original Message-----
> > From: 王志克 [mailto:wangzhike at jd.com]
> > Sent: Friday, September 8, 2017 11:48 AM
> > To: ovs-dev at openvswitch.org; Jan Scheurich
> > <jan.scheurich at ericsson.com>; O Mahony, Billy
> > <billy.o.mahony at intel.com>; Darrell Ball <dball at vmware.com>; ovs-
> > discuss at openvswitch.org; Kevin Traynor <ktraynor at redhat.com>
> > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Billy,
> >
> > I used OVS 2.7.0. I searched the git log but am not sure which commit
> > it is. Do you happen to know?
> >
> > Yes, I cleared the stats after starting the traffic.
> >
> > Br,
> > Wang Zhike
> >
> >
> > From: "O Mahony, Billy" <billy.o.mahony at intel.com>
> > To: "wangzhike at jd.com" <wangzhike at jd.com>, Jan Scheurich
> > 	<jan.scheurich at ericsson.com>, Darrell Ball <dball at vmware.com>,
> > 	"ovs-discuss at openvswitch.org" <ovs-discuss at openvswitch.org>,
> > 	"ovs-dev at openvswitch.org" <ovs-dev at openvswitch.org>, Kevin
> Traynor
> > 	<ktraynor at redhat.com>
> > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > 	physical port
> >
> > Hi Wang,
> >
> > Thanks for the figures. Unexpected results as you say. Two things come
> > to
> > mind:
> >
> > I'm not sure what code you are using, but the cycles-per-packet
> > statistic was broken for a while recently. Ilya posted a patch to fix
> > it, so make sure you have that patch included.
> >
> > Also remember to reset the pmd stats after you start your traffic and
> > then measure after a short duration.
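As an aside, the reset-then-measure procedure Billy describes maps onto two ovs-appctl commands (a sketch only; it assumes a running OVS with the netdev datapath, and the 10-second interval is arbitrary):

```shell
# Start traffic first, then zero the per-PMD counters so the numbers
# reflect steady state rather than startup effects.
ovs-appctl dpif-netdev/pmd-stats-clear

# Let the counters accumulate over a short, fixed interval.
sleep 10

# Read back the per-PMD statistics; "avg cycles per packet" and
# "avg processing cycles per packet" are derived from these counters.
ovs-appctl dpif-netdev/pmd-stats-show
```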
> >
> > Regards,
> > Billy.
> >
> >
> >
> > From: 王志克 [mailto:wangzhike at jd.com]
> > Sent: Friday, September 8, 2017 8:01 AM
> > To: Jan Scheurich <jan.scheurich at ericsson.com>; O Mahony, Billy
> > <billy.o.mahony at intel.com>; Darrell Ball <dball at vmware.com>; ovs-
> > discuss at openvswitch.org; ovs-dev at openvswitch.org; Kevin Traynor
> > <ktraynor at redhat.com>
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> >
> > Hi All,
> >
> >
> >
> > I tested the cases below and collected some performance data. The data
> > show little impact from cross-NUMA communication, which differs from my
> > expectation. (Previously I mentioned that crossing the NUMA boundary
> > added 60% to the cycle count, but I can no longer reproduce that.)
> >
> >
> >
> > @Jan,
> >
> > You mentioned cross-NUMA communication would cost many more cycles.
> > Can you share your data? I am not sure whether I made some mistake.
> >
> >
> >
> > @All,
> >
> > Welcome your data if you have data for similar cases. Thanks.
> >
> >
> >
> > Case1: VM0->PMD0->NIC0
> >
> > Case2:VM1->PMD1->NIC0
> >
> > Case3:VM1->PMD0->NIC0
> >
> > Case4:NIC0->PMD0->VM0
> >
> > Case5:NIC0->PMD1->VM1
> >
> > Case6:NIC0->PMD0->VM1
> >
> >
> >
> > Case      VM Tx Mpps   Host Tx Mpps   Avg cycles/packet   Avg processing cycles/packet
> > Case1     1.4          1.4            512                 415
> > Case2     1.3          1.3            537                 436
> > Case3     1.35         1.35           514                 390
> >
> > Case      VM Rx Mpps   Host Rx Mpps   Avg cycles/packet   Avg processing cycles/packet
> > Case4     1.3          1.3            549                 533
> > Case5     1.3          1.3            559                 540
> > Case6     1.28         1.28           568                 551
> >
> >
> >
> > Br,
> >
> > Wang Zhike
> >
> >
> >
> > -----Original Message-----
> > From: Jan Scheurich [mailto:jan.scheurich at ericsson.com]
> > Sent: Wednesday, September 06, 2017 9:33 PM
> > To: O Mahony, Billy; 王志克; Darrell Ball; ovs-discuss at openvswitch.org;
> > ovs-dev at openvswitch.org; Kevin Traynor
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> >
> >
> > Hi Billy,
> >
> >
> >
> > > You are going to have to take the hit crossing the NUMA boundary at
> > > some point if your NIC and VM are on different NUMAs.
> >
> > > So are you saying that it is more expensive to cross the NUMA boundary
> > > from the PMD to the VM than to cross it from the NIC to the PMD?
> >
> >
> >
> > Indeed, that is the case: If the NIC crosses the QPI bus when storing
> > packets in the remote NUMA there is no cost involved for the PMD. (The
> > QPI bandwidth is typically not a bottleneck.) The PMD only performs
> > local memory access.
> >
> >
> >
> > On the other hand, if the PMD crosses the QPI when copying packets
> > into a remote VM, there is a huge latency penalty involved, consuming
> > lots of PMD cycles that cannot be spent on processing packets. We at
> > Ericsson have observed exactly this behavior.
> >
> >
> >
> > This latency penalty becomes even worse when the LLC cache hit rate is
> > degraded due to LLC cache contention with real VNFs and/or unfavorable
> > packet buffer re-use patterns as exhibited by real VNFs compared to
> > typical synthetic benchmark apps like DPDK testpmd.
> >
> >
> >
> > >
> >
> > > If so then in that case you'd like to have two (for example) PMDs
> > > polling 2
> > queues on the same NIC. With the PMDs on each of the
> >
> > > NUMA nodes forwarding to the VMs local to that NUMA?
> >
> > >
> >
> > > Of course your NIC would then also need to be able know which VM (or
> > > at
> > least which NUMA the VM is on) in order to send the frame
> >
> > > to the correct rxq.
> >
> >
> >
> > That would indeed be optimal but hard to realize in the general case
> > (e.g. with VXLAN encapsulation), as the actual destination is only known
> > after tunnel pop. Here perhaps some probabilistic steering of RSS hash
> > values based on the measured distribution of final destinations might
> > help in the future.
> >
> >
> >
> > But even without that in place, we need PMDs on both NUMAs anyhow (for
> > NUMA-aware polling of vhostuser ports), so why not use them to also
> > poll remote eth ports. We can achieve better average performance with
> > fewer PMDs than with the current limitation to NUMA-local polling.
> >
> >
> >
> > BR, Jan
> >
> >
> >
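Jan's proposal above, PMDs on both NUMA nodes also polling remote physical ports, corresponds to configuration along these lines (a sketch only: the core numbers, mask, and port name dpdk0 are hypothetical, and whether a cross-NUMA rx-queue pinning is honored for physical ports depends on the OVS version in use):

```shell
# Run PMDs on one core from each NUMA node, e.g. core 2 (NUMA 0) and
# core 22 (NUMA 1); the mask value below encodes those two bits.
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x400004

# Pin the physical port's rx queues explicitly: queue 0 to the
# NUMA-local core 2, queue 1 to the remote-NUMA core 22.
ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:2,1:22"
```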
