[ovs-dev] [PATCHv15 2/2] netdev-afxdp: add new netdev type for AF_XDP.
u9012063 at gmail.com
Fri Jul 12 23:15:24 UTC 2019
On Thu, Jul 11, 2019 at 6:42 AM Ilya Maximets <i.maximets at samsung.com> wrote:
> On 09.07.2019 22:35, William Tu wrote:
> > The patch introduces experimental AF_XDP support for OVS netdev.
> > AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
> > type built upon the eBPF and XDP technology. It aims to have comparable
> > performance to DPDK but cooperate better with the existing kernel networking
> > stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program
> > attached to the netdev, bypassing a couple of the Linux kernel's subsystems.
> > As a result, an AF_XDP socket shows much better performance than AF_PACKET.
> > For more details about AF_XDP, please see the Linux kernel's
> > Documentation/networking/af_xdp.rst. Note that by default, this feature is
> > not compiled in.
> > Signed-off-by: William Tu <u9012063 at gmail.com>
> > ---
> > v14:
> > * Mainly address issue reported by Ilya
> > https://patchwork.ozlabs.org/patch/1118972/
> > when doing 'make check-afxdp'
> > * Fix xdp frame headroom issue
> > * Fix vlan test cases by disabling txvlan offload
> > * Skip cvlan
> > * Document TCP limitation (currently all TCP tests fail due to
> > kernel veth driver)
> > * Fix tunnel test cases due to --disable-system (another patch)
> > * Switch to use pthread_spin_lock, suggested by Ben
> > * Add coverage counter for debugging
> > * Fix buffer starvation issue at batch_send reported by Eelco
> > when using tap device with type=afxdp
> > v15:
> > * address review feedback from Ilya
> > https://patchwork.ozlabs.org/patch/1125476/
> > * skip TCP related test cases
> > * reclaim all CONS_NUM_DESC at complete tx
> > * add retries to kick_tx
> > * increase memory pool size
> > * remove redundant xdp flag and bind flag
> > * remove unused rx_dropped var
> > * make tx_dropped counter atomic
> > * refactor dp_packet_init_afxdp using dp_packet_init__
> > * rebase to ovs master, test with latest bpf-next kernel commit b14a260e33ddb4
> > Ilya's kernel patches are required
> > commit 455302d1c9ae ("xdp: fix hang while unregistering device bound to xdp socket")
> > commit 162c820ed896 ("xdp: hold device for umem regardless of zero-copy mode")
> > Possible issues:
> > * still lots of afxdp_cq_skip (ovs-appctl coverage/show)
> > afxdp_cq_skip 44325273.6/sec 34362312.683/sec 572705.2114/sec total: 2106010377
> > * TODO:
> > 'make check-afxdp' still does not fully pass
> > IP fragmentation expiry test not fixed yet; need to implement
> > deferred memory free, something like dpdk_mp_sweep. Currently hit
> > some missing umem descs when reclaiming.
> Hi. Regarding this issue: we don't need to reclaim everything from the rings.
> We only need to count the number of descriptors that are currently in the rings.
> When we're closing the xdp socket, the kernel stops processing the rings; also, all
> the buffers in the rings are buffers from the current umem. So, we could just count
> them and wait for the number of elements in the umem pool to become
> (size - n_packets_in_rings).
> 'outstanding_tx' already counts all the packets that are in TX and CQ or in the
> middle of processing in the kernel. If we count the number of packets in RX and FQ
> the same way, we'll know the total number of buffers currently in the kernel.
I think this idea is great.
I tried to reclaim descriptors from rx, tx, cq, fq but did not
get a consistent number. I will apply your diff below.
> It might be hard or even impossible to reclaim all the packets from the rings
> because the kernel does not update the consumer/producer heads for every packet,
> and the state the rings will be in after the socket is closed depends on the
> kernel implementation.
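
The counting approach suggested above can be sketched roughly as follows. This
is only an illustration, not code from the patch: apart from 'outstanding_tx',
every name here (struct umem_acct, outstanding_rx, the acct_* helpers) is
hypothetical, and the real rings, locking and atomics are left out. The idea is
to track how many buffers the kernel side may still hold, then wait until the
pool holds everything else, instead of draining the rings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-socket bookkeeping. Only 'outstanding_tx' is a name from
 * the thread; every other identifier is made up for illustration. */
struct umem_acct {
    uint32_t pool_size;       /* Total buffers owned by the umem pool. */
    uint32_t pool_free;       /* Buffers currently sitting in the pool. */
    uint32_t outstanding_tx;  /* In TX or CQ, or being processed by kernel. */
    uint32_t outstanding_rx;  /* In FQ or RX, or being processed by kernel. */
};

/* Userspace takes n buffers from the pool and posts them to the fill queue. */
static void acct_fq_post(struct umem_acct *a, uint32_t n)
{
    assert(n <= a->pool_free);
    a->pool_free -= n;
    a->outstanding_rx += n;
}

/* Userspace consumes n received packets from the RX ring and now holds them. */
static void acct_rx_take(struct umem_acct *a, uint32_t n)
{
    assert(n <= a->outstanding_rx);
    a->outstanding_rx -= n;
}

/* Userspace posts n buffers it holds to the TX ring. */
static void acct_tx_post(struct umem_acct *a, uint32_t n)
{
    a->outstanding_tx += n;
}

/* Userspace consumes n completions from CQ and returns them to the pool. */
static void acct_cq_reclaim(struct umem_acct *a, uint32_t n)
{
    assert(n <= a->outstanding_tx);
    a->outstanding_tx -= n;
    a->pool_free += n;
}

/* Userspace returns n buffers it holds (e.g. processed packets) to the pool. */
static void acct_free(struct umem_acct *a, uint32_t n)
{
    a->pool_free += n;
    assert(a->pool_free <= a->pool_size);
}

/* True once every buffer not accounted to the rings is back in the pool,
 * i.e. pool_free == size - n_packets_in_rings. */
static bool acct_pool_drained(const struct umem_acct *a)
{
    uint32_t in_rings = a->outstanding_tx + a->outstanding_rx;

    return a->pool_free == a->pool_size - in_rings;
}
```

On teardown, a caller would close the socket first and then poll
acct_pool_drained() (or sleep on it) before destroying the umem; buffers still
counted in outstanding_tx and outstanding_rx are simply written off with the
rings rather than reclaimed one by one.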