[ovs-dev] [PATCHv4] netdev-afxdp: Enable loading XDP program.

Eelco Chaudron echaudro at redhat.com
Thu Nov 21 10:40:13 UTC 2019



On 20 Nov 2019, at 21:17, William Tu wrote:

> On Tue, Nov 19, 2019 at 10:51:01AM -0800, William Tu wrote:
>> Hi Eelco,
>>
>> Thanks for your testing.
>>
>> On Tue, Nov 12, 2019 at 11:25:55AM +0100, Eelco Chaudron wrote:
>>> See one remark below, however when I did a quick test with a program 
>>> that
>>> would not load it goes into some re-try loop:
>>>
>>> 2019-11-12T10:13:21.658Z|01609|netdev_afxdp|INFO|eno1: Removing xdp 
>>> program.
>>> 2019-11-12T10:13:21.658Z|01610|netdev_afxdp|INFO|Removed program ID: 
>>> 0, fd:
>>> 0
>>> 2019-11-12T10:13:21.658Z|01611|netdev_afxdp|INFO|eno1: Setting XDP 
>>> mode to
>>> DRV.
>>> 2019-11-12T10:13:22.224Z|01612|netdev_afxdp|INFO|eno1: Load custom 
>>> XDP
>>> program at /root/xdp_pass_kern.o.
>>> 2019-11-12T10:13:22.274Z|01613|netdev_afxdp|ERR|xsk_socket__create 
>>> failed
>>> (Resource temporarily unavailable) mode: DRV use-need-wakeup: true 
>>> qid: 0
>>
>> I couldn't reproduce this issue.
>> Also looking at xsk_socket__create, I couln't figure out how it 
>> returns
>> EAGAIN (Resource temporarily unavailable) in your case.
>>
>>> 2019-11-12T10:13:22.300Z|01614|netdev_afxdp|ERR|Failed to create 
>>> AF_XDP
>>> socket on queue 0.
>>> 2019-11-12T10:13:22.300Z|01615|netdev_afxdp|INFO|eno1: Removing xdp 
>>> program.
>>> 2019-11-12T10:13:22.658Z|00047|ovs_rcu(urcu2)|WARN|blocked 1000 ms 
>>> waiting
>>> for main to quiesce
>>> 2019-11-12T10:13:22.735Z|01616|netdev_afxdp|INFO|Removed program ID: 
>>> 320,
>>> fd: 0
>>> 2019-11-12T10:13:22.735Z|01617|netdev_afxdp|ERR|AF_XDP device eno1 
>>> reconfig
>>> failed.
>>> 2019-11-12T10:13:22.735Z|01618|dpif_netdev|ERR|Failed to set 
>>> interface eno1
>>> new configuration
>>> 2019-11-12T10:13:22.735Z|01619|netdev_afxdp|INFO|FIXME: Device tapVM 
>>> always
>>> use numa id 0.
>>> 2019-11-12T10:13:22.735Z|01620|dpif_netdev|INFO|Core 1 on numa node 
>>> 0
>>> assigned port 'tapVM' rx queue 0 (measured processing cycles 0).
>>> 2019-11-12T10:13:22.735Z|01621|dpif|WARN|netdev at ovs-netdev: failed 
>>> to add
>>> eno1 as port: Invalid argument
>>> 2019-11-12T10:13:22.735Z|01622|netdev_afxdp|INFO|eno1: Removing xdp 
>>> program.
>>> 2019-11-12T10:13:22.736Z|01623|netdev_afxdp|INFO|Removed program ID: 
>>> 0, fd:
>>> 0
>>> 2019-11-12T10:13:22.736Z|01624|timeval|WARN|Unreasonably long 1079ms 
>>> poll
>>> interval (7ms user, 80ms system)
>>> 2019-11-12T10:13:22.736Z|01625|timeval|WARN|faults: 16387 minor, 0 
>>> major
>>> 2019-11-12T10:13:22.736Z|01626|timeval|WARN|context switches: 327 
>>> voluntary,
>>> 337 involuntary
>>> 2019-11-12T10:13:22.738Z|01627|netdev_afxdp|INFO|eno1: Removing xdp 
>>> program.
>>> 2019-11-12T10:13:22.738Z|01628|netdev_afxdp|INFO|Removed program ID: 
>>> 0, fd:
>>> 0
>>> 2019-11-12T10:13:22.739Z|01629|netdev_afxdp|INFO|eno1: Setting XDP 
>>> mode to
>>> DRV.
>>> 2019-11-12T10:13:23.312Z|01630|netdev_afxdp|INFO|eno1: Load custom 
>>> XDP
>>> program at /root/xdp_pass_kern.o.
>>> 2019-11-12T10:13:23.363Z|01631|netdev_afxdp|ERR|xsk_socket__create 
>>> failed
>>> (Resource temporarily unavailable) mode: DRV use-need-wakeup: true 
>>> qid: 0
>>> 2019-11-12T10:13:23.392Z|01632|netdev_afxdp|ERR|Failed to create 
>>> AF_XDP
>>> socket on queue 0.
>>> 2019-11-12T10:13:23.392Z|01633|netdev_afxdp|INFO|eno1: Removing xdp 
>>> program.
>>> 2019-11-12T10:13:23.738Z|00048|ovs_rcu(urcu2)|WARN|blocked 1000 ms 
>>> waiting
>>> for main to quiesce
>>> 2019-11-12T10:13:23.823Z|01634|netdev_afxdp|INFO|Removed program ID: 
>>> 324,
>>> fd: 0
>>> 2019-11-12T10:13:23.823Z|01635|netdev_afxdp|ERR|AF_XDP device eno1 
>>> reconfig
>>> failed.
>>> 2019-11-12T10:13:23.823Z|01636|dpif_netdev|ERR|Failed to set 
>>> interface eno1
>>> new configuration
>>> 2019-11-12T10:13:23.823Z|01637|netdev_afxdp|INFO|FIXME: Device tapVM 
>>> always
>>> use numa id 0.
>>> 2019-11-12T10:13:23.823Z|01638|dpif_netdev|INFO|Core 1 on numa node 
>>> 0
>>> assigned port 'tapVM' rx queue 0 (measured processing cycles 0).
>>> 2019-11-12T10:13:23.823Z|01639|dpif|WARN|netdev at ovs-netdev: failed 
>>> to add
>>> eno1 as port: Invalid argument
>>>
>>> But in addition during this loop it’s not freeing it’s 
>>> resources:
>>>
>>> $  bpftool prog list && bpftool map
>>> 4: xdp  tag 80b55d8a76303785
>>> 	loaded_at 2019-11-12T05:11:15-0500  uid 0
>>> 	xlated 136B  jited 96B  memlock 4096B  map_ids 4
>>> 12: xdp  name xdp_prog_simple  tag 3b185187f1855c4c  gpl
>>> 	loaded_at 2019-11-12T05:11:58-0500  uid 0
>>> 	xlated 16B  jited 35B  memlock 4096B
>>> 16: xdp  name xdp_prog_simple  tag 3b185187f1855c4c  gpl
>>> 	loaded_at 2019-11-12T05:11:59-0500  uid 0
>>> 	xlated 16B  jited 35B  memlock 4096B
>>> 20: xdp  name xdp_prog_simple  tag 3b185187f1855c4c  gpl
>>> 	loaded_at 2019-11-12T05:12:00-0500  uid 0
>>> 	xlated 16B  jited 35B  memlock 4096B
>>>>>>>>> 120: (null)  name xsks_map  flags 0x0
>>> 	key 4B  value 4B  max_entries 32  memlock 4096B
>>> 122: (null)  name xsks_map  flags 0x0
>>> 	key 4B  value 4B  max_entries 32  memlock 4096B
>>> 124: (null)  name xsks_map  flags 0x0
>>> 	key 4B  value 4B  max_entries 32  memlock 4096B
>>> 126: (null)  name xsks_map  flags 0x0
>>> 	key 4B  value 4B  max_entries 32  memlock 4096B
>>> 128: (null)  name xsks_map  flags 0x0
>>> 	key 4B  value 4B  max_entries 32  memlock 4096B
>>> 130: (null)  name xsks_map  flags 0x0
>>> 	key 4B  value 4B  max_entries 32  memlock 4096B
>>> 132: (null)  name xsks_map  flags 0x0
>>> 	key 4B  value 4B  max_entries 32  memlock 4096B
>>>
>>
>> And base on the above, the XDP program and map are already
>> loaded, so in xsk_socket__create(), it already reaches the
>> end:
>> 708     if (!(xsk->config.libbpf_flags & 
>> XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
>> 709         err = xsk_setup_xdp_prog(xsk);
>> 710         if (err)
>> 711             goto out_mmap_tx;
>>
>> So I assume something wrong with xsk_setup_xdp_prog()?
>> I suspect bpf_get_link_xdp_id() returns EAGAIN?
>>
>> I will keep debugging but if you have time, could you help
>> finding out why EAGAIN is returned? Thanks!
>>
>> --William
>
> Thanks for your help. I've found the root cause.
> At the error handling path, the prog_fd and map_fd is not
> saved, so xsk_destroy_all() does not clean them properly.
>
> Fixed it by diff below, I will send out new version.
> William
>
> diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
> index e079cd33bc9a..e9b8a43bb7c0 100644
> --- a/lib/netdev-afxdp.c
> +++ b/lib/netdev-afxdp.c
> @@ -484,6 +484,8 @@ xsk_configure_all(struct netdev *netdev)
>
>              VLOG_INFO("%s: Load custom XDP program at %s.",
>                        netdev_get_name(netdev), dev->xdpobj);
> +            dev->prog_fd = prog_fd;
> +            dev->map_fd = map_fd;
>          }
>
>          VLOG_DBG("%s: configure queue %d mode %s use-need-wakeup 
> %s.",
> @@ -502,8 +504,6 @@ xsk_configure_all(struct netdev *netdev)
>          atomic_init(&xsk_info->tx_dropped, 0);
>          xsk_info->outstanding_tx = 0;
>          xsk_info->available_rx = PROD_NUM_DESCS;
> -        dev->prog_fd = prog_fd;
> -        dev->map_fd = map_fd;
>      }
>
>      n_txq = netdev_n_txq(netdev);

Cool, thanks!

//Eelco



More information about the dev mailing list