[ovs-dev] [PATCH] netdev-linux: Prepend the std packet in the TSO packet
Flavio Leitner
fbl at sysclose.org
Fri Jan 31 18:10:23 UTC 2020
Hi Ben,
On Fri, Jan 31, 2020 at 08:54:23AM -0800, Ben Pfaff wrote:
> On Wed, Jan 29, 2020 at 08:15:11PM -0300, Flavio Leitner wrote:
> > Usually TSO packets are close to 50k or 60k bytes long, so
> > to copy fewer bytes when receiving a packet from the kernel,
> > change the approach. Instead of extending the MTU-sized
> > packet received and appending the remaining TSO data from
> > the TSO buffer, allocate a TSO packet with enough headroom
> > to prepend the std packet data.
> >
> > Suggested-by: Ben Pfaff <blp at ovn.org>
> > Signed-off-by: Flavio Leitner <fbl at sysclose.org>
>
> Did you test this with TSO packets? I think I see an inconsistency.
> netdev_linux_rxq_recv() constructs dp_packets like this with a size of 0
> (and a tailroom of data_len):
>
> > + rx->aux_bufs[i] = dp_packet_new_with_headroom(data_len, std_len);
>
> and then later on dp_packet_size() on the aux_bufs should report 0,
> which won't work properly:
>
> > + if (iovlen == IOV_TSO_SIZE) {
> > + iovs[i][IOV_AUXBUF].iov_base = dp_packet_data(rx->aux_bufs[i]);
> > + iovs[i][IOV_AUXBUF].iov_len = dp_packet_size(rx->aux_bufs[i]);
> > + }
>
> I think that the above should use dp_packet_tailroom() instead, and the
> inconsistency makes me nervous.
You're right. I did test, and the short log is below. It seems that
iperf3 is not the best tool to verify this feature, so I switched to
'scp' and noticed an issue.
I will fix the test script to run all tests (vm-vm, vm-ns, ns-ns...)
with scp to copy data and sha256 to verify it, and then follow up with
a V2 of this patch.
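To make the size versus tailroom point from the review concrete, here is a
minimal sketch using a toy buffer. It is not the real struct dp_packet or
the OVS API: the toy_* names and the field layout are hypothetical, and it
only assumes what the review states, namely that a buffer created with
headroom starts with a size of 0 and a tailroom equal to the requested
data length.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a packet buffer (hypothetical, NOT struct dp_packet). */
struct toy_buf {
    unsigned char *base;  /* Start of the allocation. */
    size_t allocated;     /* Total bytes allocated. */
    size_t data_ofs;      /* Offset of the first data byte (the headroom). */
    size_t size;          /* Bytes of valid data currently stored. */
};

/* Allocates room for 'size' data bytes plus 'headroom' bytes in front.
 * No data is stored yet, so the reported size starts at 0. */
static inline struct toy_buf *
toy_new_with_headroom(size_t size, size_t headroom)
{
    struct toy_buf *b = malloc(sizeof *b);
    if (!b) {
        return NULL;
    }
    b->allocated = size + headroom;
    b->base = malloc(b->allocated ? b->allocated : 1);
    if (!b->base) {
        free(b);
        return NULL;
    }
    b->data_ofs = headroom;
    b->size = 0;
    return b;
}

/* Pointer to where data starts (just past the headroom). */
static inline void *
toy_data(const struct toy_buf *b)
{
    return b->base + b->data_ofs;
}

/* Bytes of valid data: 0 for a freshly created buffer. */
static inline size_t
toy_size(const struct toy_buf *b)
{
    return b->size;
}

/* Free bytes in front of the data. */
static inline size_t
toy_headroom(const struct toy_buf *b)
{
    return b->data_ofs;
}

/* Free bytes after the data: this is the space a receive can fill. */
static inline size_t
toy_tailroom(const struct toy_buf *b)
{
    return b->allocated - b->data_ofs - b->size;
}

static inline void
toy_free(struct toy_buf *b)
{
    if (b) {
        free(b->base);
        free(b);
    }
}
```

With illustrative values such as toy_new_with_headroom(65536, 1500),
toy_size() returns 0 while toy_tailroom() returns 65536. An iov_len taken
from the size would therefore give the kernel a zero-length buffer, whereas
the tailroom describes the space that is actually available for the
remaining TSO data.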
Short log:
ns1 TX side:
# iperf3 -c 192.168.100.32 -t 10
Connecting to host 192.168.100.32, port 5201
[ 5] local 192.168.100.31 port 41354 connected to 192.168.100.32 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 1.01 GBytes 8.67 Gbits/sec 7339 165 KBytes
[ 5] 1.00-2.00 sec 1.04 GBytes 8.91 Gbits/sec 8331 216 KBytes
[ 5] 2.00-3.00 sec 1006 MBytes 8.44 Gbits/sec 5957 291 KBytes
[ 5] 3.00-4.00 sec 1.02 GBytes 8.79 Gbits/sec 5534 245 KBytes
[ 5] 4.00-5.00 sec 1024 MBytes 8.59 Gbits/sec 8400 219 KBytes
[ 5] 5.00-6.00 sec 1.01 GBytes 8.67 Gbits/sec 7792 223 KBytes
[ 5] 6.00-7.00 sec 1.01 GBytes 8.68 Gbits/sec 7860 170 KBytes
[ 5] 7.00-8.00 sec 1.00 GBytes 8.60 Gbits/sec 7687 246 KBytes
[ 5] 8.00-9.00 sec 1.02 GBytes 8.72 Gbits/sec 8168 182 KBytes
[ 5] 9.00-10.00 sec 1.03 GBytes 8.86 Gbits/sec 9886 225 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 10.1 GBytes 8.69 Gbits/sec 76954 sender
[ 5] 0.00-10.00 sec 10.1 GBytes 8.69 Gbits/sec receiver
ns2 RX side:
12:50:53.307719 IP 192.168.100.31.41354 > 192.168.100.32.5201: Flags [P.], seq 4235400:4300560, ack 1, win 502, options [nop,nop,TS val 3623839079 ecr 3548400447], length 65160
12:50:53.307743 IP 192.168.100.31.41354 > 192.168.100.32.5201: Flags [P.], seq 4300560:4365720, ack 1, win 502, options [nop,nop,TS val 3623839079 ecr 3548400447], length 65160
12:50:53.307762 IP 192.168.100.31.41354 > 192.168.100.32.5201: Flags [P.], seq 4365720:4419296, ack 1, win 502, options [nop,nop,TS val 3623839079 ecr 3548400447], length 53576
12:50:53.307759 IP 192.168.100.32.5201 > 192.168.100.31.41354: Flags [.], ack 4300560, win 24317, options [nop,nop,TS val 3548400447 ecr 3623839079], length 0
[...]
Thanks for the review, nice catch!
--
fbl