[ovs-dev] Re: [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

Yi Yang (杨燚) - Cloud Service Group yangyi01 at inspur.com
Thu Mar 12 01:13:55 UTC 2020


Thanks William, replies inline.

-----Original Message-----
From: William Tu [mailto:u9012063 at gmail.com]
Sent: March 12, 2020, 1:51
To: Yi Yang (杨燚) - Cloud Service Group <yangyi01 at inspur.com>
Cc: yang_y_yi at 163.com; ovs-dev at openvswitch.org
Subject: Re: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for userspace datapath

On Tue, Mar 10, 2020 at 7:42 PM Yi Yang (杨燚)-云服务集团 <yangyi01 at inspur.com> wrote:
>
> Hi, William
>
> I'll fix some of your concerns in the next version; please check the other inline replies.
>
> -----Original Message-----
> From: dev [mailto:ovs-dev-bounces at openvswitch.org] On Behalf Of William Tu
> Sent: March 11, 2020, 3:43
> To: yang_y_yi <yang_y_yi at 163.com>
> Cc: ovs-dev <ovs-dev at openvswitch.org>
> Subject: Re: [ovs-dev] [PATCH v6] Use TPACKET_V3 to accelerate veth for
> userspace datapath
>
> On Fri, Mar 6, 2020 at 6:35 AM <yang_y_yi at 163.com> wrote:
> >
> > From: Yi Yang <yangyi01 at inspur.com>
> >
> > We can avoid high system call overhead by using TPACKET_V3 and using 
> > DPDK-like poll to receive and send packets (Note: send still needs 
> > to call sendto to trigger final packet transmission).
> >
> > TPACKET_V3 has been supported since Linux kernel 3.10, so every
> > Linux kernel that current OVS supports can run TPACKET_V3 without
> > any problem.
> >
> > With TPACKET_V3 I see about a 30% performance improvement for veth
> > compared to the previous recvmmsg optimization: about 1.98 Gbps
> > versus 1.47 Gbps before. (Note: send still needs to call sendto to
> > trigger final packet transmission.)
>
> On my testbed, I didn't see any performance gain.
> For a 100-second TCP iperf3 run, I see the same 1.70 Gbps with and without tpacket.
> Do you think the performance might be better if we set .is_pmd = true,
> since tpacket is ring-based?
>
> [Yi Yang] Please make sure userspace-tso-enabled is set to false for
> your test; if it is true, tpacket_v3 isn't used.
>
> Please use physical machines; the improvement isn't as noticeable
> inside VMs. Here is my data for your reference (I used a 5.5.7
> kernel, but the result basically doesn't depend on the kernel version).
>
> My physical machine is a low-end server, so the performance
> improvement isn't very obvious, but one big improvement is that the
> Retr (retransmission) value is almost 0. Setting is_pmd to true and
> using DPDK buffers is my next step to improve performance further. I
> also have a tpacket_v3 patch for tap in hand. On my previous physical
> server the improvement was very obvious; my goal is about 4 Gbps, and
> it was 3.9 Gbps on that server with is_pmd set to true and DPDK
> buffers used for dp_packet.

With the current patch, is_pmd is always false.
How do you set is_pmd to true?

[Yi Yang] I have patches in hand to do this. My goal is to use a PMD thread to handle this case, which is more scalable than ovs-vswitchd. Currently a single ovs-vswitchd thread handles all such interfaces, and I don't think that is efficient for use cases that pursue performance.


>
> No tpacket_v3
> =============
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-60.00  sec  7.90 GBytes  1.13 Gbits/sec  39672             sender
> [  4]   0.00-60.00  sec  7.90 GBytes  1.13 Gbits/sec                  receiver
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth
> [  5]   0.00-60.00  sec  0.00 Bytes  0.00 bits/sec                  sender
> [  5]   0.00-60.00  sec  7.90 GBytes  1.13 Gbits/sec                  receiver
>
<snip>
>
> iperf Done.
> [yangyi at localhost ovs-master]$ uname -a
> Linux localhost.localdomain 5.5.7-1.el7.elrepo.x86_64 #1 SMP Fri Feb 28 12:21:58 EST 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> tpacket_v3
> ==========
> tpacket_v3
> ==========
<snip>
> [ ID] Interval           Transfer     Bandwidth
> [  5]   0.00-60.02  sec  0.00 Bytes  0.00 bits/sec                  sender
> [  5]   0.00-60.02  sec  8.39 GBytes  1.20 Gbits/sec                  receiver
>

So your current result is
no tpacket: 1.13 Gbps (with some retransmissions); with tpacket: 1.20 Gbps (zero retransmissions). That is around a 7% improvement.

[Yi Yang] That is what this test result shows, but on my high-end server I did see a higher improvement. I can't use that machine now; I will recheck this once it is available.


>
>
>
> >
> > TPACKET_V3 can support TSO, but its performance isn't good because
> > of a TPACKET_V3 kernel implementation issue, so it falls back to
>
> What's the implementation issue? If we use the latest kernel, does
> the issue still exist?
>
> [Yi Yang] Per my check, the issue is that the kernel can't feed
> enough packets to the tpacket receive path: in many cases no packets
> are received, and a full batch of 32 packets is rarely available. In
> the original non-tpacket case, one recvmmsg call gets 32 packets in
> most cases, and its throughput is more than twice as high for veth
> and more than three times as high for tap. I read the kernel source
> code but couldn't find the root cause; I'll check with the tpacket
> maintainer.
>
> > recvmmsg when userspace-tso-enable is set to true. However, its
> > performance is better than recvmmsg's when userspace-tso-enable is
> > set to false, so TPACKET_V3 is used only in that case.
> >
> > Signed-off-by: Yi Yang <yangyi01 at inspur.com>
> > Co-authored-by: William Tu <u9012063 at gmail.com>
> > Signed-off-by: William Tu <u9012063 at gmail.com>
> > ---
> > diff --git a/include/linux/if_packet.h b/include/linux/if_packet.h 
> > new file mode 100644 index 0000000..e20aacc
> > --- /dev/null
> > +++ b/include/linux/if_packet.h
>
> If OVS_CHECK_LINUX_TPACKET returns false, can we simply fall back to
> recvmmsg, so that this header is not needed?
>
> [Yi Yang] As you said, OVS supports Linux kernel 3.10.0 or above, so
> that case doesn't exist, does it?

I mean: if the kernel supports it AND the if_packet.h header exists, then we enable it.
If the kernel supports it AND the if_packet.h header does not exist, then we just use recvmmsg.

[Yi Yang] I'm confused here. Ben told me it should be built even if if_packet.h isn't there; that is why I added if_packet.h as include/linux/if_packet.h. I mean the tpacket_v3 code should be built in this case.

Thanks
William

