[ovs-dev] [PATCHv11] netdev-afxdp: add new netdev type for AF_XDP.
Eelco Chaudron
echaudro at redhat.com
Tue Jun 18 09:45:54 UTC 2019
On 17 Jun 2019, at 20:23, William Tu wrote:
> Hi Eelco,
>
> On Mon, Jun 17, 2019 at 3:12 AM Eelco Chaudron <echaudro at redhat.com>
> wrote:
>>
>> Hi William,
>>
>> Below are parts of an offline email discussion I had with Magnus
>> earlier, and some research I did afterwards, which explains why, by
>> design, you might not get all the descriptors ready.
>
> I think these are different issues. The behavior you described is a
> hiccup while waiting for 16 rx packets to be queued. Here, at
> afxdp_complete_tx, xsk_ring_cons__peek returns descriptors that have
> already been released, causing OVS to push more elements and thus
> crash.
You are right, I did not read it thoroughly… Looks like a bug to me;
after __release() I would assume it will not return the same elements in
__peek().
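For illustration, here is a minimal sketch of a completion-ring drain
along the lines of afxdp_complete_tx, using the libbpf xsk.h helpers.
This is not the actual OVS code; BATCH_SIZE and recycle_umem_frame()
are made-up placeholders:

#include <stdint.h>
#include <bpf/xsk.h>            /* libbpf AF_XDP ring helpers */

#define BATCH_SIZE 32           /* illustrative batch size */

/* Hypothetical helper: returns a umem frame to the free pool. */
extern void recycle_umem_frame(uint64_t addr);

static void
complete_tx_sketch(struct xsk_ring_cons *cq)
{
    uint32_t idx_cq = 0;
    uint32_t rcvd = xsk_ring_cons__peek(cq, BATCH_SIZE, &idx_cq);

    for (uint32_t i = 0; i < rcvd; i++) {
        /* Each completion entry is the umem address of a frame the
         * kernel has finished transmitting; it can be recycled now. */
        recycle_umem_frame(*xsk_ring_cons__comp_addr(cq, idx_cq++));
    }

    /* Hand the consumed slots back to the kernel.  If a later
     * __peek() could return these same entries again, each frame
     * would be recycled twice -- the overrun/crash described above. */
    xsk_ring_cons__release(cq, rcvd);
}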
>
>> Hope this helps change your design…
>>
>> In addition, the Point to Point test is working with your change,
>> however, the PVP test is still failing due to buffer starvation (see
>> my
>> comments in Patchv8 for a possible cause).
>>
> Thanks, looking back at v8:
> https://patchwork.ozlabs.org/patch/1097740/
> Hopefully next version will fix this issue.
>
>> Also on OVS restart system crashes in the following part:
>>
>> #0 netdev_afxdp_rxq_recv (rxq_=0x173c080, batch=0x7fe1397f80d0,
>> qfill=0x0) at lib/netdev-afxdp.c:583
>> #1 0x0000000000907f21 in netdev_rxq_recv (rx=<optimized out>,
>> batch=batch@entry=0x7fe1397f80d0, qfill=<optimized out>) at
>> lib/netdev.c:710
>> #2 0x00000000008dd1c3 in dp_netdev_process_rxq_port
>> (pmd=pmd@entry=0x175d990, rxq=0x175a460, port_no=2) at
>> lib/dpif-netdev.c:4257
>> #3 0x00000000008dd63d in pmd_thread_main (f_=<optimized out>) at
>> lib/dpif-netdev.c:5449
>> #4 0x000000000095e94d in ovsthread_wrapper (aux_=<optimized out>) at
>> lib/ovs-thread.c:352
>> #5 0x00007fe1633872de in start_thread () from /lib64/libpthread.so.0
>> #6 0x00007fe162b2ca63 in clone () from /lib64/libc.so.6
>>
> How do you restart the system? I have two afxdp ports:
> Port "eth3"
> Interface "eth3"
> type: afxdp
> options: {n_rxq="1", xdpmode=drv}
> Port "eth5"
> Interface "eth5"
> type: afxdp
> options: {n_rxq="1", xdpmode=drv}
>
> I tested using
> # ovs-vsctl del-port eth3
> # ovs-vsctl del-port eth5
> # ovs-vsctl del-br br0
> # ovs-appctl -t ovs-vswitchd exit
> Looks ok.
I’m using an RHEL7 instance and use systemd to restart openvswitch
with “systemctl restart openvswitch”.
It uses ovs-ctl to start/stop; see here for some details:
https://github.com/openvswitch/ovs/blob/master/rhel/usr_lib_systemd_system_ovs-vswitchd.service.in
> <snip>
>
>>> This means that if you rely on the (naive :-)) code in the sample
>>> application, you can end up in a situation where you can receive
>>> from the Rx ring, but not post to the fill ring.
>>>
>>> So, the reason for the 16 packet hiccup is as follows:
>>>
>>> 1. Userland: The fill ring is completely filled.
>>> 2. Kernel: One packet is received, one entry is picked from the
>>>    fill ring, but the consumer pointer is not bumped, and the
>>>    packet is placed on the Rx ring.
>>> 3. Userland: One packet is picked from the Rx ring.
>>> 4. Userland: Tries to put an entry on the fill ring. The fill
>>>    ring is full, so userland spins.
>>> 5. Kernel: When 16 packets have been picked from the fill ring,
>>>    the consumer pointer is released.
>>> 6. Userland: Exits the while loop.
>
> Based on the above, there is no starvation problem here if there are
> more than 16 packets, correct? And at step 4, we can skip spinning and
> try to process more of the rx ring.
>
> For the next version, I will first check the fill ring using
> xsk_prod_nb_free(), to avoid step 4.
Yes, a __free() check here will avoid this problem. I was running a
single-ping-only test and it would spin forever…
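To make the two approaches concrete, here is a minimal sketch of the
refill path under discussion (illustrative names only, not the actual
OVS or sample code; fq is the fill ring and addr a umem frame address):

#include <stdbool.h>
#include <stdint.h>
#include <bpf/xsk.h>

/* Naive refill, in the style of the kernel sample application: spins
 * when the fill ring looks full, which with the lazily updated
 * consumer pointer can last until 16 packets have been received
 * (step 4 above). */
static void
refill_naive(struct xsk_ring_prod *fq, uint64_t addr)
{
    uint32_t idx;

    while (xsk_ring_prod__reserve(fq, 1, &idx) != 1) {
        /* Busy-wait until the kernel bumps the consumer pointer. */
    }
    *xsk_ring_prod__fill_addr(fq, idx) = addr;
    xsk_ring_prod__submit(fq, 1);
}

/* Non-blocking variant along the lines William proposes: check for a
 * free slot first and let the caller go service the Rx ring instead
 * of spinning. */
static bool
refill_try(struct xsk_ring_prod *fq, uint64_t addr)
{
    uint32_t idx;

    if (xsk_prod_nb_free(fq, 1) < 1) {
        return false;           /* fill ring full; retry later */
    }
    xsk_ring_prod__reserve(fq, 1, &idx);
    *xsk_ring_prod__fill_addr(fq, idx) = addr;
    xsk_ring_prod__submit(fq, 1);
    return true;
}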
> Thanks
> William