[ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port
ktraynor at redhat.com
Wed Sep 6 13:49:35 UTC 2017
On 09/06/2017 02:33 PM, Jan Scheurich wrote:
> Hi Billy,
>> You are going to have to take the hit crossing the NUMA boundary at some point if your NIC and VM are on different NUMAs.
>> So are you saying that it is more expensive to cross the NUMA boundary from the pmd to the VM than to cross it from the NIC to the
> Indeed, that is the case: If the NIC crosses the QPI bus when storing packets in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth is typically not a bottleneck.) The PMD only performs local memory access.
> On the other hand, if the PMD crosses the QPI when copying packets into a remote VM, there is a huge latency penalty involved, consuming lots of PMD cycles that cannot be spent on processing packets. We at Ericsson have observed exactly this behavior.
> This latency penalty becomes even worse when the LLC cache hit rate is degraded due to LLC cache contention with real VNFs and/or unfavorable packet buffer re-use patterns as exhibited by real VNFs compared to typical synthetic benchmark apps like DPDK testpmd.
>> If so then in that case you'd like to have two (for example) PMDs polling 2 queues on the same NIC. With the PMDs on each of the
>> NUMA nodes forwarding to the VMs local to that NUMA?
>> Of course your NIC would then also need to be able to know which VM (or at least which NUMA the VM is on) in order to send the frame
>> to the correct rxq.
> That would indeed be optimal but hard to realize in the general case (e.g. with VXLAN encapsulation) as the actual destination is only known after tunnel pop. Here perhaps some probabilistic steering of RSS hash values based on measured distribution of final destinations might help in the future.
> But even without that in place, we need PMDs on both NUMAs anyhow (for NUMA-aware polling of vhostuser ports), so why not use them to also poll remote eth ports? We can achieve better average performance with fewer PMDs than with the current limitation to NUMA-local polling.
If the user has some knowledge of the NUMA locality of ports and can
place VMs accordingly, default cross-NUMA assignment can harm
performance. Also, it would make for very unpredictable performance from
test to test and even from flow to flow on a datapath.
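
To make the two policies concrete, here is a toy sketch in Python. It is not
OVS code: assign_rxqs, the queue names and the core numbers are all made up
for illustration, and the round-robin is only an assumption about how a
default assignment might spread queues. It contrasts NUMA-local assignment
with spreading rxqs over every PMD regardless of NUMA.

```python
from collections import defaultdict
from itertools import cycle


def assign_rxqs(rxqs, pmds, cross_numa=False):
    """Toy model: map each rx queue to a PMD core.

    rxqs: list of (queue_name, numa_node) tuples (hypothetical names)
    pmds: list of (core_id, numa_node) tuples (hypothetical cores)
    Returns {queue_name: (core_id or None, polled_from_remote_numa)}.
    """
    # Group PMD cores by NUMA node so each node has its own round-robin.
    by_numa = defaultdict(list)
    for core, numa in pmds:
        by_numa[numa].append(core)
    numa_of = dict(pmds)
    local_rr = {numa: cycle(cores) for numa, cores in by_numa.items()}
    all_rr = cycle([core for core, _ in pmds])

    placement = {}
    for name, numa in rxqs:
        if cross_numa:
            # Spread over all PMDs, ignoring where the queue lives.
            core = next(all_rr) if pmds else None
        elif numa in local_rr:
            # Strict policy: only a PMD on the queue's own NUMA node.
            core = next(local_rr[numa])
        else:
            # No local PMD: the queue stays unpolled in this model.
            core = None
        remote = core is not None and numa_of[core] != numa
        placement[name] = (core, remote)
    return placement


rxqs = [("dpdk0:q0", 0), ("dpdk0:q1", 0), ("vhost1:q0", 1)]
pmds = [(2, 0), (22, 1)]
strict = assign_rxqs(rxqs, pmds)
spread = assign_rxqs(rxqs, pmds, cross_numa=True)
```

With the strict policy no queue in this example is polled from a remote
node. With the spread policy, which queues end up remote depends only on
the order in which they are handed out, which is the unpredictability I
mean above.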
> BR, Jan