[ovs-dev] Re: [PATCH] Use TPACKET_V1/V2/V3 to accelerate veth for DPDK datapath

Yi Yang - Cloud Service Group yangyi01 at inspur.com
Mon Feb 3 04:06:01 UTC 2020


Hi, William

Sorry for the late reply. I don't know why I can never get your comment
emails in my Outlook; Ben's comments come through fine, and your comments
aren't in the Outlook junk folder either.

About your comments in
https://mail.openvswitch.org/pipermail/ovs-dev/2020-January/367146.html, I
checked on my CentOS 7 machine, which has a 3.10.0 kernel; the TPACKET_V3
sample code works, so I'm OK with removing the V1 and V2 code.

>Hi Yiyang,
>
>Can we just implement TPACKET v3, and drop v2 and v1?
>V3 is supported since kernel 3.10,

>commit f6fb8f100b807378fda19e83e5ac6828b638603a
>Author: chetan loke <loke.chetan at gmail.com>
>Date:   Fri Aug 19 10:18:16 2011 +0000
>
>    af-packet: TPACKET_V3 flexible buffer implementation.
>
>and based on OVS release
>http://docs.openvswitch.org/en/latest/faq/releases/
>after OVS 2.12, the minimum kernel requirement is 3.10.
>
>Regards,
>William


-----Original Message-----
From: Yi Yang - Cloud Service Group
Sent: February 3, 2020 10:36
To: 'blp at ovn.org' <blp at ovn.org>; 'yang_y_yi at 163.com' <yang_y_yi at 163.com>
Cc: 'ovs-dev at openvswitch.org' <ovs-dev at openvswitch.org>;
'ian.stokes at intel.com' <ian.stokes at intel.com>
Subject: Re: [PATCH] Use TPACKET_V1/V2/V3 to accelerate veth for DPDK
datapath
Importance: High

Hi, all

Currently, tap, internal, and system interfaces aren't handled by pmd
threads, so their performance can't be pushed very high. I ran a very simple
test that just sets is_pmd to true for them; the data below is for veth
(using TPACKET_V3). Compared with my previous 1.98 Gbps result, you can see
the pmd thread is obviously much better than ovs-vswitchd. My question is
whether we can set is_pmd to true by default; I'll do so in the next version
if there is no objection.

$ sudo ip netns exec ns01 iperf3 -t 60 -i 10 -c 10.15.1.3
--get-server-output
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 59590 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  3.59 GBytes  3.09 Gbits/sec    0   3.04 MBytes
[  4]  10.00-20.00  sec  3.57 GBytes  3.06 Gbits/sec    0   3.04 MBytes
[  4]  20.00-30.00  sec  3.60 GBytes  3.09 Gbits/sec    0   3.04 MBytes
[  4]  30.00-40.00  sec  3.56 GBytes  3.06 Gbits/sec    0   3.04 MBytes
[  4]  40.00-50.00  sec  3.64 GBytes  3.12 Gbits/sec    0   3.04 MBytes
[  4]  50.00-60.00  sec  3.62 GBytes  3.11 Gbits/sec    0   3.04 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec  21.6 GBytes  3.09 Gbits/sec    0             sender
[  4]   0.00-60.00  sec  21.6 GBytes  3.09 Gbits/sec
receiver

Server output:
-----------------------------------------------------------
Accepted connection from 10.15.1.2, port 59588
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 59590
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  3.57 GBytes  3.07 Gbits/sec
[  5]  10.00-20.00  sec  3.57 GBytes  3.06 Gbits/sec
[  5]  20.00-30.00  sec  3.60 GBytes  3.09 Gbits/sec
[  5]  30.00-40.00  sec  3.56 GBytes  3.06 Gbits/sec
[  5]  40.00-50.00  sec  3.64 GBytes  3.12 Gbits/sec
[  5]  50.00-60.00  sec  3.62 GBytes  3.11 Gbits/sec


iperf Done.
eipadmin at cmp008:~$

-----Original Message-----
From: Ben Pfaff [mailto:blp at ovn.org]
Sent: January 22, 2020 3:26
To: yang_y_yi at 163.com
Cc: ovs-dev at openvswitch.org; ian.stokes at intel.com; Yi Yang - Cloud Service
Group <yangyi01 at inspur.com>
Subject: Re: [PATCH] Use TPACKET_V1/V2/V3 to accelerate veth for DPDK datapath

On Tue, Jan 21, 2020 at 02:49:47AM -0500, yang_y_yi at 163.com wrote:
> From: Yi Yang <yangyi01 at inspur.com>
> 
> We can avoid high system call overhead by using TPACKET_V1/V2/V3 and 
> use DPDK-like poll to receive and send packets (Note: send still needs 
> to call sendto to trigger final packet transmission).
> 
> I can see about 30% improvement compared to last recvmmsg optimization 
> if I use TPACKET_V3. TPACKET_V1/V2 is worse than TPACKET_V3, but it 
> still can improve about 20%.
> 
> For veth, throughput is 1.47 Gbps before this patch and about 1.98 Gbps 
> after applying it. But it is about 4.00 Gbps if we use af_packet for 
> veth; the bottleneck lies in the ovs-vswitchd thread, which handles too 
> many things in every loop (as below), so it can't work as efficiently 
> as a pmd thread.
> 
>         memory_run();
>         bridge_run();
>         unixctl_server_run(unixctl);
>         netdev_run();
> 
>         memory_wait();
>         bridge_wait();
>         unixctl_server_wait(unixctl);
>         netdev_wait();
>         poll_block();
> 
> As a next step, it would be better to let pmd_thread handle tap and 
> veth interfaces.
> 
> Signed-off-by: Yi Yang <yangyi01 at inspur.com>
> Co-authored-by: William Tu <u9012063 at gmail.com>
> Signed-off-by: William Tu <u9012063 at gmail.com>

Thanks for the patch!

I am a bit concerned about version compatibility issues here.  There are two
relevant kinds of versions.  The first is the version of the kernel/library
headers.  This patch works pretty hard to adapt to the headers that are
available at compile time, only dealing with the versions of the protocols
that are available from the headers.  This approach is sometimes fine, but
an approach that can be better is to simply declare the structures or
constants that the headers lack.  This is often pretty easy for Linux data
structures.
OVS does this for some structures that it cares about with the headers in
ovs/include/linux.
This approach has two advantages: the OVS code (outside these special
declarations) doesn't have to care whether particular structures are
declared, because they are always declared, and the OVS build always
supports a particular feature regardless of the headers of the system on
which it was built.

The second kind of version is the version of the system that OVS runs on.
Unless a given feature is one that is supported by every version that OVS
cares about, OVS needs to test at runtime whether the feature is supported
and, if not, fall back to the older feature.  I don't see that in this code.
Instead, it looks to me like it assumes that if the feature was available at
build time, then it is available at runtime.
This is not a good way to do things, since we want people to be able to get
builds from distributors such as Red Hat or Debian and then run those builds
on a diverse collection of kernels.

One specific comment I have here is that, in acinclude.m4, it would be
better to use AC_CHECK_TYPE or AC_CHECK_TYPES than OVS_GREP_IFELSE.
The latter is for testing for kernel builds only; we can't use the normal
AC_* tests for those because we often can't successfully build kernel
headers using the compiler and flags that Autoconf sets up for building OVS.
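As a rough illustration of the suggestion, an AC_CHECK_TYPES test for the userspace headers might look like the fragment below (the exact includes are an assumption, not text from the patch):

```m4
dnl Hedged sketch for acinclude.m4: let Autoconf test whether the
dnl userspace headers declare struct tpacket_req3, rather than grepping
dnl with OVS_GREP_IFELSE.  Defines HAVE_STRUCT_TPACKET_REQ3 on success.
AC_CHECK_TYPES([struct tpacket_req3], [], [],
               [[#include <linux/if_packet.h>]])
```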

Thanks,

Ben.

