[ovs-dev] [PATCH] afxdp: Reduce afxdp's batch size to match kernel's xdp batch size
Yifeng Sun
pkusunyifeng at gmail.com
Mon Dec 23 19:03:01 UTC 2019
Thanks Ilya. This patch is actually a quick fix.
Sure, I will check generic mode later.
Thanks,
Yifeng
On Mon, Dec 23, 2019 at 12:22 AM Ilya Maximets <i.maximets at ovn.org> wrote:
>
> On Sat, Dec 21, 2019 at 2:03 AM Yifeng Sun <pkusunyifeng at gmail.com> wrote:
> >
> > William reported that there is an iperf TCP issue between two afxdp ports:
> >
> > [ 3] local 10.1.1.2 port 40384 connected with 10.1.1.1 port 5001
> > [ ID] Interval Transfer Bandwidth
> > [ 3] 0.0- 1.0 sec 17.0 MBytes 143 Mbits/sec
> > [ 3] 1.0- 2.0 sec 9.62 MBytes 80.7 Mbits/sec
> > [ 3] 2.0- 3.0 sec 6.75 MBytes 56.6 Mbits/sec
> > [ 3] 3.0- 4.0 sec 11.0 MBytes 92.3 Mbits/sec
> > [ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec
> > [ 3] 6.0- 7.0 sec 0.00 Bytes 0.00 bits/sec
> > [ 3] 7.0- 8.0 sec 0.00 Bytes 0.00 bits/sec
> > [ 3] 8.0- 9.0 sec 0.00 Bytes 0.00 bits/sec
> > [ 3] 9.0-10.0 sec 0.00 Bytes 0.00 bits/sec
> > [ 3] 10.0-11.0 sec 0.00 Bytes 0.00 bits/sec
> >
> > The reason is that netdev-afxdp's batch size is currently 32, while the
> > kernel's xdp batch size is only 16. This can exhaust the sock wmem if
> > netdev-afxdp keeps sending a large number of packets. Later on, when ARP
> > expires at one side of the TCP connection, ARP packets can be delayed or
> > even dropped because the sock wmem is already full.
> >
> > This patch fixes the issue by reducing netdev-afxdp's batch size to
> > match the kernel's xdp batch size. Now iperf TCP works correctly.
>
> I haven't looked at the veth driver implementation yet, but if your issue
> analysis is correct, the driver doesn't process all the packets we're
> trying to send. In that case, changing the batch size should not fully
> fix the issue, since we could still push packets fast enough to fill
> queues that will not be drained by the kernel, or some packets could get
> stuck inside the queues if we don't send other packets. This sounds more
> like a missing napi rescheduling or incorrect handling of the need-wakeup
> feature inside the veth driver. I could look at it next week
> (travelling now).
>
> Anyway, we should not ultimately change the batch size, because it will
> affect performance in all modes and on all drivers. Since your
> workaround fixes the issue at least partially, the same multi-kick
> workaround that we have for generic mode might work for this case
> too. Could you please check?
>
> Best regards, Ilya Maximets.
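
To make the batch-size argument in the quoted thread concrete, here is a toy model in C. It is only a sketch under stated assumptions, not OVS or kernel code: it assumes the sender enqueues one batch per round, that each kick lets the consumer drain at most a fixed budget of packets (16 in the patch description), and that the TX queue has a fixed capacity standing in for sock wmem. All names (tx_model, tx_result, simulate) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only. Every name here is hypothetical and nothing below
 * corresponds to real OVS or kernel data structures. */
struct tx_model {
    size_t ring_cap;        /* queue capacity (stand-in for sock wmem) */
    size_t tx_batch;        /* packets the sender enqueues per round */
    size_t kick_budget;     /* packets the consumer drains per kick */
    size_t kicks_per_round; /* kicks issued after each batch */
};

struct tx_result {
    size_t backlog;         /* packets still queued after the last round */
    size_t dropped;         /* packets rejected because the queue was full */
};

static struct tx_result simulate(const struct tx_model *m, size_t rounds)
{
    struct tx_result r = {0, 0};

    assert(m->tx_batch > 0 && m->kick_budget > 0);

    for (size_t i = 0; i < rounds; i++) {
        /* Enqueue as much of the batch as the queue can take. */
        size_t room = m->ring_cap - r.backlog;
        size_t accepted = m->tx_batch < room ? m->tx_batch : room;

        r.backlog += accepted;
        r.dropped += m->tx_batch - accepted;

        /* Each kick drains at most kick_budget packets. */
        for (size_t k = 0; k < m->kicks_per_round; k++) {
            size_t drained = r.backlog < m->kick_budget
                             ? r.backlog : m->kick_budget;
            r.backlog -= drained;
        }
    }
    return r;
}
```

In this model, a 64-slot queue with a batch of 32, a budget of 16 and one kick per round settles at a backlog of 48 and rejects 16 packets in every later round. Either a batch of 16 or two kicks per round keeps the queue empty, which is why a multi-kick workaround could have the same effect as the smaller batch. The model says nothing about why the real driver stops draining, which is the open question raised in the review.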