[ovs-dev] [PATCH] netdev-linux: Prepend the std packet in the TSO packet

Flavio Leitner fbl at sysclose.org
Fri Jan 31 18:10:23 UTC 2020


Hi Ben,

On Fri, Jan 31, 2020 at 08:54:23AM -0800, Ben Pfaff wrote:
> On Wed, Jan 29, 2020 at 08:15:11PM -0300, Flavio Leitner wrote:
> > Usually TSO packets are close to 50k, 60k bytes long, so to
> > copy fewer bytes when receiving a packet from the kernel
> > change the approach. Instead of extending the MTU sized
> > packet received and append with remaining TSO data from
> > the TSO buffer, allocate a TSO packet with enough headroom
> > to prepend the std packet data.
> > 
> > Suggested-by: Ben Pfaff <blp at ovn.org>
> > Signed-off-by: Flavio Leitner <fbl at sysclose.org>
> 
> Did you test this with TSO packets?  I think I see an inconsistency.
> netdev_linux_rxq_recv() constructs dp_packets like this with a size of 0
> (and a tailroom of data_len):
> 
> > +            rx->aux_bufs[i] = dp_packet_new_with_headroom(data_len, std_len);
> 
> and then later on dp_packet_size() on the aux_bufs should report 0,
> which won't work properly:
> 
> > +         if (iovlen == IOV_TSO_SIZE) {
> > +             iovs[i][IOV_AUXBUF].iov_base = dp_packet_data(rx->aux_bufs[i]);
> > +             iovs[i][IOV_AUXBUF].iov_len = dp_packet_size(rx->aux_bufs[i]);
> > +         }
> 
> I think that the above should use dp_packet_tailroom() instead, and the
> inconsistency makes me nervous.

You're right. I did test, and the short log is below. However, iperf3
does not seem to be the best tool to verify this feature, so I switched
to 'scp' and noticed an issue.

I will fix the test script to run all tests (vm-vm, vm-ns, ns-ns...)
with scp to copy data and sha256 to verify it, and then follow up with
a V2 of this patch.
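
To illustrate the size-versus-tailroom point from the review above, here
is a minimal sketch. It does not use the real OVS code: struct toy_packet
and all toy_* functions are hypothetical stand-ins that only mimic the
behaviour described in the review (a buffer created with headroom starts
with size 0 and a tailroom equal to the requested data length). Under
that assumption, the iovec length for the aux buffer has to come from the
tailroom, since the size is still 0 before anything is received.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/uio.h>

/* Toy stand-in for a packet buffer; NOT the real struct dp_packet. */
struct toy_packet {
    unsigned char *base;   /* Start of the allocation. */
    size_t allocated;      /* Total bytes allocated. */
    size_t headroom;       /* Unused bytes in front of the data. */
    size_t size;           /* Bytes of valid data currently stored. */
};

/* Allocates room for 'size' bytes of data plus 'headroom' bytes in front.
 * The new buffer holds no data yet, so its size is 0. */
static struct toy_packet *
toy_packet_new_with_headroom(size_t size, size_t headroom)
{
    struct toy_packet *p = malloc(sizeof *p);
    if (!p) {
        return NULL;
    }
    p->base = malloc(size + headroom);
    if (!p->base) {
        free(p);
        return NULL;
    }
    p->allocated = size + headroom;
    p->headroom = headroom;
    p->size = 0;
    return p;
}

/* Returns a pointer to where the data starts (just past the headroom). */
static void *
toy_packet_data(const struct toy_packet *p)
{
    return p->base + p->headroom;
}

/* Returns the number of valid data bytes. */
static size_t
toy_packet_size(const struct toy_packet *p)
{
    return p->size;
}

/* Returns the free bytes available after the data. */
static size_t
toy_packet_tailroom(const struct toy_packet *p)
{
    return p->allocated - p->headroom - p->size;
}

static void
toy_packet_delete(struct toy_packet *p)
{
    if (p) {
        free(p->base);
        free(p);
    }
}

/* Builds the receive iovec for an empty aux buffer.  The length is the
 * tailroom; using toy_packet_size() here would yield a zero-length iovec. */
static struct iovec
toy_aux_iov(const struct toy_packet *p)
{
    struct iovec iov;
    iov.iov_base = toy_packet_data(p);
    iov.iov_len = toy_packet_tailroom(p);
    return iov;
}
```

With a buffer created as toy_packet_new_with_headroom(data_len, std_len),
toy_packet_size() is 0 while toy_aux_iov() reports data_len bytes of
space for the receive call to fill.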

Short log:
ns1 TX side:
# iperf3 -c 192.168.100.32 -t 10
Connecting to host 192.168.100.32, port 5201
[  5] local 192.168.100.31 port 41354 connected to 192.168.100.32 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.01 GBytes  8.67 Gbits/sec  7339    165 KBytes       
[  5]   1.00-2.00   sec  1.04 GBytes  8.91 Gbits/sec  8331    216 KBytes       
[  5]   2.00-3.00   sec  1006 MBytes  8.44 Gbits/sec  5957    291 KBytes       
[  5]   3.00-4.00   sec  1.02 GBytes  8.79 Gbits/sec  5534    245 KBytes       
[  5]   4.00-5.00   sec  1024 MBytes  8.59 Gbits/sec  8400    219 KBytes       
[  5]   5.00-6.00   sec  1.01 GBytes  8.67 Gbits/sec  7792    223 KBytes       
[  5]   6.00-7.00   sec  1.01 GBytes  8.68 Gbits/sec  7860    170 KBytes       
[  5]   7.00-8.00   sec  1.00 GBytes  8.60 Gbits/sec  7687    246 KBytes       
[  5]   8.00-9.00   sec  1.02 GBytes  8.72 Gbits/sec  8168    182 KBytes       
[  5]   9.00-10.00  sec  1.03 GBytes  8.86 Gbits/sec  9886    225 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.1 GBytes  8.69 Gbits/sec  76954             sender
[  5]   0.00-10.00  sec  10.1 GBytes  8.69 Gbits/sec                  receiver


ns2 RX side:
12:50:53.307719 IP 192.168.100.31.41354 > 192.168.100.32.5201: Flags [P.], seq 4235400:4300560, ack 1, win 502, options [nop,nop,TS val 3623839079 ecr 3548400447], length 65160
12:50:53.307743 IP 192.168.100.31.41354 > 192.168.100.32.5201: Flags [P.], seq 4300560:4365720, ack 1, win 502, options [nop,nop,TS val 3623839079 ecr 3548400447], length 65160
12:50:53.307762 IP 192.168.100.31.41354 > 192.168.100.32.5201: Flags [P.], seq 4365720:4419296, ack 1, win 502, options [nop,nop,TS val 3623839079 ecr 3548400447], length 53576
12:50:53.307759 IP 192.168.100.32.5201 > 192.168.100.31.41354: Flags [.], ack 4300560, win 24317, options [nop,nop,TS val 3548400447 ecr 3623839079], length 0
[...]

Thanks for the review, nice catch!
-- 
fbl

