<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:SimSun;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:"MS Gothic";
        panose-1:2 11 6 9 7 2 5 8 2 4;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"Segoe UI";
        panose-1:2 11 5 2 4 2 4 2 2 3;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
        {font-family:"\@MS Gothic";
        panose-1:2 11 6 9 7 2 5 8 2 4;}
@font-face
        {font-family:"\@SimSun";
        panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        text-align:justify;
        font-size:10.5pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {mso-style-priority:99;
        mso-style-link:"Plain Text Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:10.5pt;
        font-family:"Calibri",sans-serif;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Balloon Text Char";
        margin:0cm;
        margin-bottom:.0001pt;
        text-align:justify;
        font-size:9.0pt;
        font-family:"Calibri",sans-serif;}
span.PlainTextChar
        {mso-style-name:"Plain Text Char";
        mso-style-priority:99;
        mso-style-link:"Plain Text";
        font-family:Consolas;}
span.BalloonTextChar
        {mso-style-name:"Balloon Text Char";
        mso-style-priority:99;
        mso-style-link:"Balloon Text";
        font-family:"Segoe UI",sans-serif;}
p.a, li.a, div.a
        {mso-style-name:纯文本;
        mso-style-link:"纯文本 Char";
        margin:0cm;
        margin-bottom:.0001pt;
        text-align:justify;
        font-size:10.5pt;
        font-family:"Calibri",sans-serif;}
span.Char
        {mso-style-name:"纯文本 Char";
        mso-style-priority:99;
        mso-style-link:纯文本;
        font-family:"Calibri",sans-serif;}
p.a0, li.a0, div.a0
        {mso-style-name:批注框文本;
        mso-style-link:"批注框文本 Char";
        margin:0cm;
        margin-bottom:.0001pt;
        text-align:justify;
        font-size:10.5pt;
        font-family:"Calibri",sans-serif;}
span.Char0
        {mso-style-name:"批注框文本 Char";
        mso-style-priority:99;
        mso-style-link:批注框文本;}
span.EmailStyle25
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
span.EmailStyle27
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
/* Page Definitions */
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple" style="text-justify-trim:punctuation">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Hi Wang,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Thanks for the figures. Unexpected results as you say. Two things come to mind:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">I’m not sure what code you are using, but the cycles-per-packet statistic was broken for a while recently. Ilya posted a patch to fix it, so make sure you have that patch included.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Also remember to reset the PMD stats after you start your traffic, and then measure after a short duration.<o:p></o:p></span></p>
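<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">As a rough sketch of that sequence (not a tested procedure), something like the Python helper below could be used. It assumes ovs-appctl is on the PATH and ovs-vswitchd is running. The name measure_pmd_stats, the 10 second default and the injectable run/sleep parameters are only illustrative.<o:p></o:p></span></p>
<pre>
```python
import subprocess
import time


def measure_pmd_stats(duration_s=10.0, run=None, sleep=time.sleep):
    """Clear the PMD counters, wait, then return the pmd-stats-show text."""
    if run is None:
        # Default runner: invoke the real ovs-appctl binary.
        def run(cmd):
            done = subprocess.run(cmd, check=True, capture_output=True, text=True)
            return done.stdout

    # Call this only after traffic is already flowing, so that idle or
    # ramp-up cycles are not averaged into the per-packet figures.
    run(["ovs-appctl", "dpif-netdev/pmd-stats-clear"])
    sleep(duration_s)  # short, fixed measurement window (assumed value)
    return run(["ovs-appctl", "dpif-netdev/pmd-stats-show"])
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Using the same window length for every case keeps the numbers comparable between runs.<o:p></o:p></span></p>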
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Billy. <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><a name="_MailEndCompose"><span style="font-size:11.0pt;color:#1F497D"><o:p>&nbsp;</o:p></span></a></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" align="left" style="text-align:left"><a name="_____replyseparator"></a><b><span style="font-size:11.0pt">From:</span></b><span style="font-size:11.0pt">
</span><span style="font-size:11.0pt;font-family:&quot;MS Gothic&quot;">王志克</span><span style="font-size:11.0pt"> [mailto:wangzhike@jd.com]
<br>
<b>Sent:</b> Friday, September 8, 2017 8:01 AM<br>
<b>To:</b> Jan Scheurich &lt;jan.scheurich@ericsson.com&gt;; O Mahony, Billy &lt;billy.o.mahony@intel.com&gt;; Darrell Ball &lt;dball@vmware.com&gt;; ovs-discuss@openvswitch.org; ovs-dev@openvswitch.org; Kevin Traynor &lt;ktraynor@redhat.com&gt;<br>
<b>Subject:</b> RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal" align="left" style="text-align:left"><o:p>&nbsp;</o:p></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">Hi All,<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">I tested the cases below and collected some performance data. The data shows that cross-NUMA communication has little impact, which differs from my expectation. (Previously I mentioned that crossing
 NUMA would add 60% cycles, but I can no longer reproduce that.)<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">@Jan,<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">You mentioned that cross-NUMA communication would cost many more cycles. Can you share your data? I am not sure whether I made a mistake.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">@All,<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">If you have data for similar cases, please share it. Thanks.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN">Case1: VM0-&gt;PMD0-&gt;NIC0<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN">Case2: VM1-&gt;PMD1-&gt;NIC0<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN">Case3: VM1-&gt;PMD0-&gt;NIC0<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN">Case4: NIC0-&gt;PMD0-&gt;VM0<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN">Case5: NIC0-&gt;PMD1-&gt;VM1<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN">Case6: NIC0-&gt;PMD0-&gt;VM1<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<table border="0" cellspacing="0" cellpadding="4" style="font-size:10.5pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D">
<tr><th align="left">Case</th><th align="right">VM Tx (Mpps)</th><th align="right">Host Tx (Mpps)</th><th align="right">Avg cycles per packet</th><th align="right">Avg processing cycles per packet</th></tr>
<tr><td>Case1</td><td align="right">1.4</td><td align="right">1.4</td><td align="right">512</td><td align="right">415</td></tr>
<tr><td>Case2</td><td align="right">1.3</td><td align="right">1.3</td><td align="right">537</td><td align="right">436</td></tr>
<tr><td>Case3</td><td align="right">1.35</td><td align="right">1.35</td><td align="right">514</td><td align="right">390</td></tr>
</table>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<table border="0" cellspacing="0" cellpadding="4" style="font-size:10.5pt;font-family:&quot;Calibri&quot;,sans-serif;color:#1F497D">
<tr><th align="left">Case</th><th align="right">VM Rx (Mpps)</th><th align="right">Host Rx (Mpps)</th><th align="right">Avg cycles per packet</th><th align="right">Avg processing cycles per packet</th></tr>
<tr><td>Case4</td><td align="right">1.3</td><td align="right">1.3</td><td align="right">549</td><td align="right">533</td></tr>
<tr><td>Case5</td><td align="right">1.3</td><td align="right">1.3</td><td align="right">559</td><td align="right">540</td></tr>
<tr><td>Case6</td><td align="right">1.28</td><td align="right">1.28</td><td align="right">568</td><td align="right">551</td></tr>
</table>
<p class="MsoPlainText"><span style="color:#1F497D;mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
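<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">To put the differences in proportion, here is a small Python sketch that computes how much the average cycles per packet of each case exceeds a baseline in the same direction (Case1 for VM to NIC, Case4 for NIC to VM). Picking those baselines assumes that the suffix 0 or 1 denotes the NUMA node and that Case1 and Case4 are the fully local paths. The helper name pct_over is only for illustration.<o:p></o:p></span></p>
<pre>
```python
# Average cycles per packet, copied from the measurements above.
avg_cycles = {
    "Case1": 512, "Case2": 537, "Case3": 514,  # VM to NIC0
    "Case4": 549, "Case5": 559, "Case6": 568,  # NIC0 to VM
}


def pct_over(case, baseline):
    """Percentage by which `case` exceeds `baseline` in avg cycles per packet."""
    base = avg_cycles[baseline]
    return 100.0 * (avg_cycles[case] - base) / base


# Assumed baselines: Case1 and Case4 are taken to be the NUMA-local paths.
pairs = [("Case2", "Case1"), ("Case3", "Case1"),
         ("Case5", "Case4"), ("Case6", "Case4")]
for case, base in pairs:
    print(f"{case} vs {base}: {pct_over(case, base):+.1f}%")
# Case2 vs Case1: +4.9%
# Case3 vs Case1: +0.4%
# Case5 vs Case4: +1.8%
# Case6 vs Case4: +3.5%
```
</pre>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">The largest increase is below 5%, far from the 60% I mentioned previously.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>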
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">Br,<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">Wang Zhike<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">-----Original Message-----<br>
From: Jan Scheurich [<a href="mailto:jan.scheurich@ericsson.com">mailto:jan.scheurich@ericsson.com</a>]
<br>
Sent: Wednesday, September 06, 2017 9:33 PM<br>
To: O Mahony, Billy; </span><span lang="ZH-CN" style="font-family:SimSun;mso-fareast-language:ZH-CN">王志克</span><span style="mso-fareast-language:ZH-CN">; Darrell Ball;
<a href="mailto:ovs-discuss@openvswitch.org">ovs-discuss@openvswitch.org</a>; <a href="mailto:ovs-dev@openvswitch.org">
ovs-dev@openvswitch.org</a>; Kevin Traynor<br>
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">Hi Billy,<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; You are going to have to take the hit crossing the NUMA boundary at some point if your NIC and VM are on different NUMAs.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; <o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; So are you saying that it is more expensive to cross the NUMA boundary from the PMD to the VM than to cross it from the NIC to the<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; PMD?<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">Indeed, that is the case: if the NIC crosses the QPI bus when storing packets in the remote NUMA node, there is no cost involved for the PMD. (The QPI bandwidth is typically not a bottleneck.) The PMD
 only performs local memory access.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">On the other hand, if the PMD crosses the QPI when copying packets into a remote VM, there is a huge latency penalty involved, consuming lots of PMD cycles that cannot be spent on processing packets.
 We at Ericsson have observed exactly this behavior.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">This latency penalty becomes even worse when the LLC hit rate is degraded due to LLC contention with real VNFs and/or unfavorable packet buffer re-use patterns as exhibited by real
 VNFs compared to typical synthetic benchmark apps like DPDK testpmd.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; <o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; If so then in that case you'd like to have two (for example) PMDs polling 2 queues on the same NIC, with the PMDs on each of the<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; NUMA nodes forwarding to the VMs local to that NUMA?<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; <o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; Of course your NIC would then also need to be able to know which VM (or at least which NUMA the VM is on) in order to send the frame<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">&gt; to the correct rxq.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">That would indeed be optimal but hard to realize in the general case (e.g. with VXLAN encapsulation) as the actual destination is only known after tunnel pop. Here perhaps some probabilistic steering
 of RSS hash values based on measured distribution of final destinations might help in the future.<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">But even without that in place, we need PMDs on both NUMAs anyhow (for NUMA-aware polling of vhostuser ports), so why not use them to also poll remote eth ports? We can achieve better average
 performance with fewer PMDs than with the current limitation to NUMA-local polling.<o:p></o:p></span></p>
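<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">For illustration only, a layout with one PMD per NUMA node, each polling one rx queue of the same NIC, could be written with explicit pinning roughly as sketched below. The core ids (2 on NUMA 0, 22 on NUMA 1), the port name dpdk0 and the helper names are hypothetical. The sketch only builds values for the existing other_config:pmd-cpu-mask and other_config:pmd-rxq-affinity settings. It does not implement automatic cross-NUMA assignment, and whether a cross-NUMA pin is accepted should be checked against the OVS version in use.<o:p></o:p></span></p>
<pre>
```python
def pmd_cpu_mask(cores):
    """Return a hex bitmask string with one bit set per PMD core id."""
    return hex(sum(2 ** core for core in set(cores)))


def rxq_affinity(queue_to_core):
    """Format {rxq id: core id} as a "queue:core,..." affinity string."""
    return ",".join(f"{q}:{c}" for q, c in sorted(queue_to_core.items()))


# Hypothetical layout: core 2 sits on NUMA 0, core 22 on NUMA 1,
# and the NIC port (assumed name dpdk0) is configured with two rx queues.
mask = pmd_cpu_mask([2, 22])
affinity = rxq_affinity({0: 2, 1: 22})

print(f"ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask={mask}")
print(f"ovs-vsctl set Interface dpdk0 options:n_rxq=2 "
      f'other_config:pmd-rxq-affinity="{affinity}"')
```
</pre>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">With these example values the mask is 0x400004 and the affinity string is 0:2,1:22.<o:p></o:p></span></p>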
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN">BR, Jan<o:p></o:p></span></p>
<p class="MsoPlainText"><span style="mso-fareast-language:ZH-CN"><o:p>&nbsp;</o:p></span></p>
</div>
</div>
</body>
</html>