[ovs-dev] Re: [PATCH v5] userspace: Add TCP Segmentation Offload support

Flavio Leitner fbl at sysclose.org
Tue Jan 28 14:43:59 UTC 2020


On Sat, Jan 25, 2020 at 05:06:43PM +0800, txfh2007 wrote:
> Hi Flavio:

Hi!

>        From userspace-tso.rst I have found this version doesn't support TSO over VxLAN,
Correct.

>   the reason is DPDK pmd driver can't add tunnel header for segmentation packet ? 
No, it's just that OvS doesn't enable that specifically in the interfaces yet.

> Also, do you have plan to backport userspace tso function into version 2.10 ?
As far as I know, upstream does not backport new features to stable
branches. In this case it is even more complicated because the
feature is experimental.

HTH,
fbl


> 
> Thanks
> Timo
> 
> 
> ------------------------------------------------------------------
> Flavio Leitner <fbl at sysclose.org>
> William Tu <u9012063 at gmail.com>
> Ben Pfaff <blp at ovn.org>; dev at openvswitch.org <dev at openvswitch.org>; Ilya Maximets <i.maximets at ovn.org>; txfh2007 <txfh2007 at aliyun.com>
> Re: [ovs-dev] [PATCH v5] userspace: Add TCP Segmentation Offload support
> 
> On Wed, Jan 22, 2020 at 10:33:59AM -0800, William Tu wrote:
> > On Wed, Jan 22, 2020 at 12:54 AM Flavio Leitner <fbl at sysclose.org> wrote:
> > >
> > >
> > > Hi Ben,
> > >
> > > Thanks for reviewing it!
> > >
> > > On Tue, Jan 21, 2020 at 01:35:39PM -0800, Ben Pfaff wrote:
> > > > On Sat, Jan 18, 2020 at 12:08:06AM +0100, Ilya Maximets wrote:
> > > > > On 18.01.2020 00:03, Stokes, Ian wrote:
> > > > > > Thanks all for review/testing, pushed to master.
> > > > >
> > > > > OK, thanks Ian.
> > > > >
> > > > > @Ben, even though this patch already merged, I'd ask you to take a look
> > > > > at the code in case you'll spot some issues especially in non-DPDK related
> > > > > parts.
> > > >
> > > > I found the name dp_packet_hwol_is_ipv4(), and similar, confusing.  The
> > > > name suggested to me "test whether the packet is IPv4" not "test whether
> > > > the packet has an offloaded IPv4 checksum".  I guess the "hwol" is
> > > > offload related but...  I like the name dp_packet_hwol_tx_l4_checksum()
> > > > much more, it makes it obvious at a glance that it's a
> > > > checksum-offloading check.
> > >
> > > hwol = hardware offloading. I hear that all the time, but maybe there is a
> > > better name. I will improve that if no one gets on it first.
> > >
> > > > In the case where we actually receive a 64 kB packet, I think that this
> > > > code is going to be relatively inefficient.  If I'm reading the code
> > > > correctly (I did it quickly), then this is what happens:
> > > >
> > > >         - The first 1500 bytes of the packet land in the first
> > > >           dp_packet.
> > > >
> > > >         - The remaining 64000ish bytes land in the second dp_packet.
> > 
> > It's not a dp_packet, it's a preallocated buffer per rxq (aux_bufs).
> > 
> > struct netdev_rxq_linux {
> >     struct netdev_rxq up;
> >     bool is_tap;
> >     int fd;
> >     char *aux_bufs[NETDEV_MAX_BURST]; /* Batch of preallocated TSO buffers. */
> > };
> > 
> > > >
> > > >         - Then we expand the first dp_packet to the needed size and copy
> > > >           the remaining 64000 bytes into it.
> > >
> > > That's correct.
> > >
> > > > An alternative would be:
> > > >
> > > >         - Set up the first dp_packet as currently.
> > > >
> > > >         - Set up the second dp_packet so that the bytes are received
> > > >           into it starting at offset (mtu + headroom).
> > > >
> > > >         - If more than mtu bytes are received, then copy those bytes
> > > >           into the headroom of the second dp_packet and return it to the
> > > >           caller instead of the first dp_packet.
> > >
> > > I wanted to avoid doing more extensive processing if it's not a TSO packet
> > > to avoid performance regressions since it' very sensitive. Right now the 64k
> > > buffer is preallocated and is static for each queue to avoid the malloc
> > > performance issue. Now for TSO case, we have more time per packet for
> > > processing.
> > 
> > Can we implement Ben's idea by
> > 1) set size of aux_buf to 64k + mtu
> > 2) create 2nd dp_packet using this aux_buf and copy first packet to
> > first mtu bytes of aux_buf
> > 3) since we steal this aux_bufs, allocate a new aux_buf by
> > rxq->aux_bufs[i] = xmalloc(64k + mtu)
> > 4) free the first dp_packet, and use the second dp_packet
> 
> I did a quick experiment while at the conference and Ben's idea is
> indeed a bit faster (2.7%) when the packet is not resized due to #1.
> 
> If the buffer gets resized to what's actually used, then it becomes
> a bit slower (1.8%).
> 
> Anyways, feel free to have a look at the code[1]. Perhaps it could
> be changed to be more efficient. Just send me a patch and I will be
> happy to test again.
> 
> [1] https://github.com/fleitner/ovs/tree/tso-cycles-ben
> Thanks,
> fbl
> 

-- 
fbl


More information about the dev mailing list