[ovs-dev] [PATCH v5] userspace: Add TCP Segmentation Offload support

William Tu u9012063 at gmail.com
Wed Jan 22 18:33:59 UTC 2020


On Wed, Jan 22, 2020 at 12:54 AM Flavio Leitner <fbl at sysclose.org> wrote:
>
>
> Hi Ben,
>
> Thanks for reviewing it!
>
> On Tue, Jan 21, 2020 at 01:35:39PM -0800, Ben Pfaff wrote:
> > On Sat, Jan 18, 2020 at 12:08:06AM +0100, Ilya Maximets wrote:
> > > On 18.01.2020 00:03, Stokes, Ian wrote:
> > > > Thanks all for review/testing, pushed to master.
> > >
> > > OK, thanks Ian.
> > >
> > > @Ben, even though this patch already merged, I'd ask you to take a look
> > > at the code in case you'll spot some issues especially in non-DPDK related
> > > parts.
> >
> > I found the name dp_packet_hwol_is_ipv4(), and similar, confusing.  The
> > name suggested to me "test whether the packet is IPv4" not "test whether
> > the packet has an offloaded IPv4 checksum".  I guess the "hwol" is
> > offload related but...  I like the name dp_packet_hwol_tx_l4_checksum()
> > much more, it makes it obvious at a glance that it's a
> > checksum-offloading check.
>
> hwol = hardware offloading. I hear that all the time, but maybe there is a
> better name. I will improve that if no one gets on it first.
>
> > In the case where we actually receive a 64 kB packet, I think that this
> > code is going to be relatively inefficient.  If I'm reading the code
> > correctly (I did it quickly), then this is what happens:
> >
> >         - The first 1500 bytes of the packet land in the first
> >           dp_packet.
> >
> >         - The remaining 64000ish bytes land in the second dp_packet.

It's not a dp_packet, it's a preallocated buffer per rxq (aux_bufs).

struct netdev_rxq_linux {
    struct netdev_rxq up;
    bool is_tap;
    int fd;
    char *aux_bufs[NETDEV_MAX_BURST]; /* Batch of preallocated TSO buffers. */
};

> >
> >         - Then we expand the first dp_packet to the needed size and copy
> >           the remaining 64000 bytes into it.
>
> That's correct.
>
> > An alternative would be:
> >
> >         - Set up the first dp_packet as currently.
> >
> >         - Set up the second dp_packet so that the bytes are received
> >           into it starting at offset (mtu + headroom).
> >
> >         - If more than mtu bytes are received, then copy those bytes
> >           into the headroom of the second dp_packet and return it to the
> >           caller instead of the first dp_packet.
>
> I wanted to avoid doing more extensive processing if it's not a TSO packet
> to avoid performance regressions since it' very sensitive. Right now the 64k
> buffer is preallocated and is static for each queue to avoid the malloc
> performance issue. Now for TSO case, we have more time per packet for
> processing.

Can we implement Ben's idea by
1) set size of aux_buf to 64k + mtu
2) create 2nd dp_packet using this aux_buf and copy first packet to
first mtu bytes of aux_buf
3) since we steal this aux_bufs, allocate a new aux_buf by
rxq->aux_bufs[i] = xmalloc(64k + mtu)
4) free the first dp_packet, and use the second dp_packet

Regards,
William
>
> > The advantage is that we do a 1500-byte copy instead of a 64000-byte
> > copy.  The disadvantage is that we waste any memory leftover in the
> > second dp_packet, e.g. 32 kB if it's only a 32 kB packet instead of 64
> > kB.  Also we need slightly more sophisticated dp_packet allocation (we
> > will need to replenish the supply of aux_bufs).
>
> I also tried to avoid waste of memory, which becomes a problem with multiple
> ports and queues working in parallel.
>
> --
> fbl
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list