[ovs-dev] [PATCHv13] netdev-afxdp: add new netdev type for AF_XDP.

Ilya Maximets i.maximets at samsung.com
Mon Jun 24 17:26:30 UTC 2019


On 24.06.2019 20:23, Ilya Maximets wrote:
> On 22.06.2019 9:18, William Tu wrote:
>> Hi Ilya,
>>
>> Thanks for such a detailed review!
>>
>> I wasn't thinking about making all "make check-afxdp" test cases
>> passed because there are some errors not related to XDP.
>> But since you've done lots of investigation, let's fix all and make it passed.
>>
>>> Hi!
>>> I finally managed to successfully run 'make check-afxdp' with the
>>> following results:
>>>
>>>   ERROR: 76 tests were run,
>>>   6 failed unexpectedly.
>>>   48 tests were skipped.
>>>
>>> Failed tests are IP fragmentation expiry for conntrack (I'll send a
>>> separate e-mail about this issue) and NSH tests which are broken for
>>> a while now (not related to XDP).
>>>
>>> However, here is the list of issues I faced and had to fix/workaround
>>> to make in work:
>>>
>>> 1. Abort while trying to push any header:
>>>
>>>   Thread 10 "pmd8" received signal SIGABRT, Aborted.
>>>   [Switching to Thread 0x7ff3ab9d9700 (LWP 12151)]
>>>   0x00007ff3b311793f in raise () from /lib64/libc.so.6
>>>   (gdb) bt
>>>   #0  0x00007ff3b311793f in raise () from /lib64/libc.so.6
>>>   #1  0x00007ff3b3101c95 in abort () from /lib64/libc.so.6
>>>   #2  0x0000000000683930 in dp_packet_resize__ (b=0x7ff3a9d346b0, new_headroom=64, new_tailroom=<out>) at ./lib/dp-packet.h:613
>>>   #3  0x000000000068547c in dp_packet_prealloc_headroom (b=<out>, size=4) at lib/dp-packet.c:315
>>>   #4  dp_packet_push_uninit (b=<out>, size=4) at lib/dp-packet.c:427
>>>   #5  dp_packet_resize_l2_5 (b=0x7ff3a9d346b0, increment=4) at lib/dp-packet.c:493
>>>   #6  0x000000000089a841 in push_mpls (packet=0x2, ethtype=<>, lse=1076953088) at lib/packets.c:391
>>>   #7  0x0000000000762b5d in odp_execute_actions  at lib/odp-execute.c:875
>>>   #8  0x00000000006a4493 in dp_netdev_execute_actions  at lib/dpif-netdev.c:7264
>>>   #9  handle_packet_upcall  at lib/dpif-netdev.c:6545
>>>   #10 fast_path_processing  at lib/dpif-netdev.c:6641
>>>   #11 0x00000000006a2b6f in dp_netdev_input__  at lib/dpif-netdev.c:6729
>>>   #12 0x000000000069f973 in dp_netdev_input  at lib/dpif-netdev.c:6767
>>>   #13 dp_netdev_process_rxq_port  at lib/dpif-netdev.c:4277
>>>   #14 0x000000000069ba0b in pmd_thread_main  at lib/dpif-netdev.c:5451
>>>   #15 0x000000000085f6b0 in ovsthread_wrapper  at lib/ovs-thread.c:352
>>>   #16 0x00007ff3b3e532de in start_thread () from /lib64/libpthread.so.0
>>>   #17 0x00007ff3b31dca63 in clone () from /lib64/libc.so.6
>>>
>>> Reason: Wrong headroom management. In current implementation headroom of
>>> the dp-packet is always zero. This leads to resize attempts failures on
>>> OVS_NOT_REACHED().
>>>
>>> Here is the patch that could solve the issue:
>>>
>>> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
>>> index e6a794707..c9593515a 100644
>>> --- a/lib/dp-packet.c
>>> +++ b/lib/dp-packet.c
>>> @@ -65,10 +65,11 @@ dp_packet_use(struct dp_packet *b, void *base, size_t allocated)
>>>   * memory starting at AF_XDP umem base.
>>>   */
>>>  void
>>> -dp_packet_use_afxdp(struct dp_packet *b, void *base, size_t allocated)
>>> +dp_packet_use_afxdp(struct dp_packet *b, void *data,
>>> +                    size_t allocated, size_t headroom)
>>>  {
>>> -    dp_packet_set_base(b, base);
>>> -    dp_packet_set_data(b, base);
>>> +    dp_packet_set_base(b, (char *) data - headroom);
>>> +    dp_packet_set_data(b, data);
>>>      dp_packet_set_size(b, 0);
>>>
>>>      dp_packet_set_allocated(b, allocated);
>>> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
>>> index e3438226e..47ea14b94 100644
>>> --- a/lib/dp-packet.h
>>> +++ b/lib/dp-packet.h
>>> @@ -132,7 +132,7 @@ void dp_packet_use(struct dp_packet *, void *, size_t);
>>>  void dp_packet_use_stub(struct dp_packet *, void *, size_t);
>>>  void dp_packet_use_const(struct dp_packet *, const void *, size_t);
>>>  #if HAVE_AF_XDP
>>> -void dp_packet_use_afxdp(struct dp_packet *, void *, size_t);
>>> +void dp_packet_use_afxdp(struct dp_packet *, void *, size_t, size_t);
>>>  #endif
>>>  void dp_packet_init_dpdk(struct dp_packet *);
>>>
>>> diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
>>> index 33d861215..518389a58 100644
>>> --- a/lib/netdev-afxdp.c
>>> +++ b/lib/netdev-afxdp.c
>>> @@ -591,7 +591,7 @@ netdev_afxdp_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
>>>          packet = &xpacket->packet;
>>>
>>>          /* Initialize the struct dp_packet */
>>> -        dp_packet_use_afxdp(packet, pkt, FRAME_SIZE - FRAME_HEADROOM);
>>> +        dp_packet_use_afxdp(packet, pkt, FRAME_SIZE, FRAME_HEADROOM);
>>
>> I have to double check.
>> The FRAME_HEADROOM is actually the XDP_PACKET_HEADROOM, which is
>> reserved for driver to put metadata. I'm afraid we will over-write
>> something above.
>> Maybe we should reserve our own headroom by calling umem api, setting the
>> frame_headroom below in xsk.h
>> struct xsk_umem_config {
>>     __u32 fill_size;
>>     __u32 comp_size;
>>     __u32 frame_size;
>>     __u32 frame_headroom;
>> };
> 
> Yes. You're right, we only guaranteed to have 'umem->headroom' bytes before the
> 'addr' read from the rx ring. So, we should set 'frame_headroom' to some value
> (128 at least) and perform same calculations as I made, but with 'umem->headroom'
> instead of FRAME_HEADROOM. Like:
>     dp_packet_use_afxdp(packet, pkt, FRAME_SIZE, umem->headroom);

      dp_packet_use_afxdp(packet, pkt,
                          FRAME_SIZE - umem->headroom - FRAME_HEADROOM,
                          umem->headroom);

> 
> Looks like 2 chunks below are not needed in this case.
> 
>>
>>>          dp_packet_set_size(packet, len);
>>>
>>>          /* Add packet into batch, increase batch->count */
>>> @@ -646,7 +646,7 @@ free_afxdp_buf(struct dp_packet *p)
>>>      if (xpacket->mpool) {
>>>          void *base = dp_packet_base(p);
>>>
>>> -        addr = (uintptr_t)base & (~FRAME_SHIFT_MASK);
>>> +        addr = ((uintptr_t)base + FRAME_HEADROOM) & (~FRAME_SHIFT_MASK);
>>>          umem_elem_push(xpacket->mpool, (void *)addr);
>>>      }
>>>  }
>>> @@ -664,7 +664,7 @@ free_afxdp_buf_batch(struct dp_packet_batch *batch)
>>>          if (xpacket->mpool) {
>>>              void *base = dp_packet_base(packet);
>>>
>>> -            addr = (uintptr_t)base & (~FRAME_SHIFT_MASK);
>>> +            addr = ((uintptr_t)base + FRAME_HEADROOM) & (~FRAME_SHIFT_MASK);
>>>              elems[i] = (void *)addr;
>>>          }
>>>      }
>>> ---
>>>
>>> Works for me. Please, re-check. I might miss something.
>>
>> Thanks, I will do it.
>>
>>>
>>>
>>>
>>> 2. vlan tests fails due to kernel issues. We need to disable vlan offloading
>>> along with tx offloading, otherwise packets dissapears somewhere inside the
>>> kernel. I didn't find the root cause. In v8 of this patch you disabled 'rxvlan'
>>> and 'txvlan' for tests, but these bits are missing in newer versions.
>>> Most probably we only need to disable 'txvlan'.
>>>
>> I will test it, thanks.
>>>
>>>
>>> 3. cvlan tests fails due to kernel issues. I didn't found a workaround for this.
>>> Disabling the vlan offloading doesn't work. I had to force afxdp testsuite to
>>> skip these tests:
>>>
>>> diff --git a/tests/system-afxdp-macros.at b/tests/system-afxdp-macros.at
>>> index 1e6f7a46b..91f2ef91c 100644
>>> --- a/tests/system-afxdp-macros.at
>>> +++ b/tests/system-afxdp-macros.at
>>> @@ -18,3 +18,6 @@ m4_define([ADD_VETH],
>>>        on_exit 'ip link del ovs-$1'
>>>      ]
>>>  )
>>> +
>>> +m4_define([OVS_CHECK_8021AD],
>>> +    [AT_SKIP_IF([:])])
>>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>>> index d23ee897b..22f814fba 100644
>>> --- a/tests/system-traffic.at
>>> +++ b/tests/system-traffic.at
>>> @@ -71,6 +71,7 @@ AT_CLEANUP
>>>
>>>  AT_SETUP([datapath - ping between two ports on cvlan])
>>>  OVS_TRAFFIC_VSWITCHD_START()
>>> +OVS_CHECK_8021AD()
>>>
>>>  AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
>>>
>>> @@ -161,6 +162,7 @@ AT_CLEANUP
>>>
>>>  AT_SETUP([datapath - ping6 between two ports on cvlan])
>>>  OVS_TRAFFIC_VSWITCHD_START()
>>> +OVS_CHECK_8021AD()
>>>
>>>  AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
>>>
>>> ---
>>>
>>>
>>>
>>> 4. 'system-afxdp-testsuite' doesn't re-build on 'system-userspace-macros.at'
>>> changes. Missing file dependencies in automake:
>>>
>>> diff --git a/tests/automake.mk b/tests/automake.mk
>>> index 131564bb0..f0449c395 100644
>>> --- a/tests/automake.mk
>>> +++ b/tests/automake.mk
>>> @@ -163,6 +163,7 @@ SYSTEM_USERSPACE_TESTSUITE_AT = \
>>>         tests/system-userspace-packet-type-aware.at
>>>
>>>  SYSTEM_AFXDP_TESTSUITE_AT = \
>>> +       tests/system-userspace-macros.at \
>>>         tests/system-afxdp-testsuite.at \
>>>         tests/system-afxdp-macros.at
>>>
>> Thanks!
>>
>>> ---
>>>
>>>
>>> 5. 'make check-afxdp' executes 'make install' which is unwanted:
>>>
>>> diff --git a/tests/automake.mk b/tests/automake.mk
>>> index 131564bb0..d6ab51732 100644
>>> --- a/tests/automake.mk
>>> +++ b/tests/automake.mk
>>> @@ -325,7 +326,6 @@ check-system-userspace: all
>>>         "$$@" $(TESTSUITEFLAGS) -j1 || (test X'$(RECHECK)' = Xyes && "$$@" --recheck)
>>>
>>>  check-afxdp: all
>>> -       $(MAKE) install
>>>         set $(SHELL) '$(SYSTEM_AFXDP_TESTSUITE)' -C tests  AUTOTEST_PATH='$(AUTOTEST_PATH)' $(TESTSUITEFLAGS) -j1; \
>>>         "$$@" || (test X'$(RECHECK)' = Xyes && "$$@" --recheck)
>>>
>> Thanks!
>>
>>> ---
>>>
>>>
>>> 6. TCP doesn't work with XDP over veth interfaces. This is a known kernel
>>> issue that could be workarounded by the following patch to a kernel:
>>>     https://protect2.fireeye.com/url?k=9dce561be343bd6d.9dcfdd54-4c63fa8a4d3dbe51&u=https://github.com/cilium/cilium/issues/3077#issuecomment-430801467 .
>>> Until proper solution implemented in kernel we probably should skip all the
>>> tests that involves TCP.
>>
>> OK, I will do it in next version.
>>
>>>
>>>
>>> 7. It's a known issue that tunneling is not working right now in system-traffic
>>> userspace tests. Could be workarounded by removing '--disable-system' from
>>> OVS_TRAFFIC_VSWITCHD_START in tests/system-userspace-macros.at. I'm going to
>>> prepare a patch for this issue in a near future.
>>>
>>
>> Right, we also try to fix this issue before. But after removing
>> '--disable-system',
>> we hit another issue related to revalidator. I will provide more
>> details next week.
>>
>>>
>>> 8. As I said, I'll send a separate main about IP fragmented conntrack issues.
>>>
>>>
>>> P.S. We probably should mention TCP, vlan and 8021ad issues on veth interfaces
>>>      somewhere in docs, so users will be aware of them.
>>>
>>> Best regards, Ilya Maximets.
>>
>> btw, do you know any CI system, such as travis, so we can run
>> make check-system-userspace , or make check-afxdp?
> 
> Since travis migrated to VM based workloads we could try to run 'make check-system-userspace'
> there. Regarding 'make check-afxdp', I don't know the public CI with recent enough
> kernel or where we could use our custom kernel.
> 
>>
>> Regards,
>> William
>>
>>
> 
> 


More information about the dev mailing list