[ovs-dev] [PATCHv18] netdev-afxdp: add new netdev type for AF_XDP.

Eelco Chaudron echaudro at redhat.com
Thu Aug 8 11:42:12 UTC 2019



On 19 Jul 2019, at 16:54, Ilya Maximets wrote:

> On 18.07.2019 23:11, William Tu wrote:
>> The patch introduces experimental AF_XDP support for OVS netdev.
>> AF_XDP, the Address Family of the eXpress Data Path, is a new Linux
>> socket type built upon the eBPF and XDP technology.  It aims to have
>> performance comparable to DPDK but cooperate better with the existing
>> kernel networking stack.  An AF_XDP socket receives and sends packets
>> from an eBPF/XDP program attached to the netdev, bypassing a couple
>> of the Linux kernel's subsystems.  As a result, an AF_XDP socket shows
>> much better performance than AF_PACKET.  For more details about
>> AF_XDP, please see the Linux kernel's
>> Documentation/networking/af_xdp.rst.  Note that by default, this
>> feature is not compiled in.
>>
>> Signed-off-by: William Tu <u9012063 at gmail.com>
>
>
> Thanks, William, Eelco and Ben!
>
> I fixed a couple of things and applied it to master!

Good to see this got merged into master while I was on PTO. However,
when I got back I decided to test it once more…

When testing PVP, I got a couple of packets through, and then it would
stall. I thought it might be my kernel, so I updated to yesterday's
latest, but no luck…

I did see a bunch of “eno1: send failed due to exhausted memory pool.”
messages in the log. Going back to patch v14 made my problems go away…

After some debugging, I noticed the problem was with the “continue” 
case in the afxdp_complete_tx() function.
Applying the following patch made it work again:

diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index b7cc0d988..9b335ddf0 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -823,16 +823,21 @@ afxdp_complete_tx(struct xsk_socket_info *xsk_info)

          if (tx_to_free == BATCH_SIZE || j == tx_done - 1) {
              umem_elem_push_n(&umem->mpool, tx_to_free, elems_push);
              xsk_info->outstanding_tx -= tx_to_free;
              tx_to_free = 0;
          }
      }

+    if (tx_to_free) {
+        umem_elem_push_n(&umem->mpool, tx_to_free, elems_push);
+        xsk_info->outstanding_tx -= tx_to_free;
+    }
+
      if (tx_done > 0) {
          xsk_ring_cons__release(&umem->cq, tx_done);
      } else {
          COVERAGE_INC(afxdp_cq_empty);
      }
  }
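
The effect of the patch can be illustrated with a small stand-alone
sketch. It assumes that the loop skips some completion entries with
“continue” before reaching the batch flush check, so a skipped last
entry leaves elements stranded in the local batch. All names here
(toy_pool, toy_complete_tx, the BATCH_SIZE value) are hypothetical and
only model the pattern; this is not the OVS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 4    /* Hypothetical batch size for this sketch. */

/* Hypothetical stand-in for the umem pool: it only counts free elements. */
struct toy_pool {
    size_t n_free;
    size_t capacity;
};

static void
toy_pool_push_n(struct toy_pool *pool, size_t n)
{
    assert(pool->n_free + n <= pool->capacity);
    pool->n_free += n;
}

/* Processes 'tx_done' completions.  skip[j] marks entries that take the
 * "continue" path.  Returns how many elements went back to the pool. */
static size_t
toy_complete_tx(struct toy_pool *pool, const bool *skip, size_t tx_done,
                bool flush_after_loop)
{
    size_t tx_to_free = 0;
    size_t freed = 0;

    for (size_t j = 0; j < tx_done; j++) {
        if (skip[j]) {
            /* Bypasses the flush check below, even for the last entry. */
            continue;
        }
        tx_to_free++;

        if (tx_to_free == BATCH_SIZE || j == tx_done - 1) {
            toy_pool_push_n(pool, tx_to_free);
            freed += tx_to_free;
            tx_to_free = 0;
        }
    }

    /* The extra flush: return whatever is still batched locally. */
    if (flush_after_loop && tx_to_free) {
        toy_pool_push_n(pool, tx_to_free);
        freed += tx_to_free;
    }
    return freed;
}
```

With three completions where only the last one is skipped, calling
toy_complete_tx() with flush_after_loop set to false returns nothing to
the pool, while setting it to true returns the two batched elements.
Repeated over many calls, the first variant would slowly drain the
pool, which is consistent with the “exhausted memory pool” messages.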


This made me wonder why we mark elements as being used at all. To my
knowledge (and from looking at some of the code and examples), once
xsk_ring_cons__release() has been called, a later
xsk_ring_cons__peek() should not return any duplicate slots.
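
The expectation above can be sketched with a toy consumer ring. It
assumes that peek hands out entries by advancing a private cursor and
that release advances the shared consumer index so the producer can
reuse the slots. The struct and function names are hypothetical; this
is a simplified model, not the libbpf implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy consumer ring for illustration; NOT the libbpf structure. */
struct toy_ring {
    uint32_t producer;      /* Advanced by the producing side. */
    uint32_t consumer;      /* Shared: slots before this may be reused. */
    uint32_t cached_cons;   /* Private: next entry not yet handed out. */
};

/* Hands out up to 'nb' unread entries, starting at index '*idx'.
 * Returns the number of entries handed out. */
static size_t
toy_ring_peek(struct toy_ring *r, size_t nb, uint32_t *idx)
{
    size_t avail = (uint32_t) (r->producer - r->cached_cons);
    size_t n = avail < nb ? avail : nb;

    if (n > 0) {
        *idx = r->cached_cons;
        r->cached_cons += (uint32_t) n;
    }
    return n;
}

/* Gives 'nb' previously peeked entries back to the producer. */
static void
toy_ring_release(struct toy_ring *r, size_t nb)
{
    /* Only entries that were handed out by peek can be released. */
    assert((uint32_t) (r->cached_cons - r->consumer) >= nb);
    r->consumer += (uint32_t) nb;
}
```

In this model, with five entries produced, a peek for three returns
indexes 0 to 2, a second peek returns indexes 3 and 4, and a third
returns nothing. No index is handed out twice, whether or not
toy_ring_release() has been called in between.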

I see a rather high afxdp_cq_skip count, which to my knowledge should
never happen?

$ ovs-appctl coverage/show | grep xdp
afxdp_cq_empty             0.0/sec      339.600/sec        5.6606/sec   total: 20378
afxdp_tx_full              0.0/sec       29.967/sec        0.4994/sec   total: 1798
afxdp_cq_skip              0.0/sec 61884770.167/sec  1174238.3644/sec   total: 4227258112


You mentioned in your v15 change notes that you saw this high number.
Did you look into why?

Cheers,

Eelco


