[ovs-dev] [PATCHv14 2/2] netdev-afxdp: add new netdev type for AF_XDP.

Ilya Maximets i.maximets at samsung.com
Fri Jul 5 17:48:23 UTC 2019


Few more comments inline.

On 01.07.2019 19:08, William Tu wrote:
> The patch introduces experimental AF_XDP support for OVS netdev.
> AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
> type built upon the eBPF and XDP technology.  It is aims to have comparable
> performance to DPDK but cooperate better with existing kernel's networking
> stack.  An AF_XDP socket receives and sends packets from an eBPF/XDP program
> attached to the netdev, by-passing a couple of Linux kernel's subsystems
> As a result, AF_XDP socket shows much better performance than AF_PACKET
> For more details about AF_XDP, please see linux kernel's
> Documentation/networking/af_xdp.rst. Note that by default, this feature is
> not compiled in.
> 
> Signed-off-by: William Tu <u9012063 at gmail.com>
> ---

<snip>

> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index 0976a35e758b..cc746b70af89 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -19,6 +19,7 @@
>  #include <string.h>
>  
>  #include "dp-packet.h"
> +#include "netdev-afxdp.h"
>  #include "netdev-dpdk.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "util.h"
> @@ -59,6 +60,28 @@ dp_packet_use(struct dp_packet *b, void *base, size_t allocated)
>      dp_packet_use__(b, base, allocated, DPBUF_MALLOC);
>  }
>  
> +#if HAVE_AF_XDP
> +/* Initialize 'b' as an empty dp_packet that contains
> + * memory starting at AF_XDP umem base.
> + */
> +void
> +dp_packet_use_afxdp(struct dp_packet *b, void *data, size_t allocated,
> +                    size_t headroom)
> +{
> +    dp_packet_set_base(b, (char *)data - headroom);
> +    dp_packet_set_data(b, data);
> +    dp_packet_set_size(b, 0);
> +
> +    dp_packet_set_allocated(b, allocated);
> +    b->source = DPBUF_AFXDP;
> +    dp_packet_reset_offsets(b);
> +    pkt_metadata_init(&b->md, 0);
> +    dp_packet_reset_cutlen(b);
> +    dp_packet_reset_offload(b);
> +    b->packet_type = htonl(PT_ETH);

Above block could be replaced with:

    dp_packet_init__(b, allocated, DPBUF_AFXDP);

At least it misses init_specific().

> +}
> +#endif
> +
>  /* Initializes 'b' as an empty dp_packet that contains the 'allocated' bytes of
>   * memory starting at 'base'.  'base' should point to a buffer on the stack.
>   * (Nothing actually relies on 'base' being allocated on the stack.  It could
> @@ -122,6 +145,8 @@ dp_packet_uninit(struct dp_packet *b)
>               * created as a dp_packet */
>              free_dpdk_buf((struct dp_packet*) b);
>  #endif
> +        } else if (b->source == DPBUF_AFXDP) {
> +            free_afxdp_buf(b);
>          }
>      }
>  }

<snip>

> +int
> +netdev_afxdp_get_stats(const struct netdev *netdev,
> +                       struct netdev_stats *stats)
> +{
> +    struct netdev_linux *dev = netdev_linux_cast(netdev);
> +    struct xsk_socket_info *xsk_info;
> +    struct netdev_stats dev_stats;
> +    int error, i;
> +
> +    ovs_mutex_lock(&dev->mutex);
> +
> +    error = get_stats_via_netlink(netdev, &dev_stats);
> +    if (error) {
> +        VLOG_WARN_RL(&rl, "Error getting AF_XDP statistics");
> +    } else {
> +        /* Use kernel netdev's packet and byte counts */
> +        stats->rx_packets = dev_stats.rx_packets;
> +        stats->rx_bytes = dev_stats.rx_bytes;
> +        stats->tx_packets = dev_stats.tx_packets;
> +        stats->tx_bytes = dev_stats.tx_bytes;
> +
> +        stats->rx_errors           += dev_stats.rx_errors;
> +        stats->tx_errors           += dev_stats.tx_errors;
> +        stats->rx_dropped          += dev_stats.rx_dropped;
> +        stats->tx_dropped          += dev_stats.tx_dropped;
> +        stats->multicast           += dev_stats.multicast;
> +        stats->collisions          += dev_stats.collisions;
> +        stats->rx_length_errors    += dev_stats.rx_length_errors;
> +        stats->rx_over_errors      += dev_stats.rx_over_errors;
> +        stats->rx_crc_errors       += dev_stats.rx_crc_errors;
> +        stats->rx_frame_errors     += dev_stats.rx_frame_errors;
> +        stats->rx_fifo_errors      += dev_stats.rx_fifo_errors;
> +        stats->rx_missed_errors    += dev_stats.rx_missed_errors;
> +        stats->tx_aborted_errors   += dev_stats.tx_aborted_errors;
> +        stats->tx_carrier_errors   += dev_stats.tx_carrier_errors;
> +        stats->tx_fifo_errors      += dev_stats.tx_fifo_errors;
> +        stats->tx_heartbeat_errors += dev_stats.tx_heartbeat_errors;
> +        stats->tx_window_errors    += dev_stats.tx_window_errors;
> +
> +        /* Account the dropped in each xsk */
> +        for (i = 0; i < netdev_n_rxq(netdev); i++) {
> +            xsk_info = dev->xsks[i];
> +            if (xsk_info) {
> +                stats->rx_dropped += xsk_info->rx_dropped;
> +                stats->tx_dropped += xsk_info->tx_dropped;

'xsk_info->rx_dropped' is never increased. I'm not sure if we'll ever
need this field. Maybe it's better to just drop it?

Also, 'xsk_info->tx_dropped' updated concurrently. It should be under
special 'stats_lock' like we do in netdev-dpdk. Another option is to
make it atomic, since this is the only stat variable, this might be
a better solution.


Best regards, Ilya Maximets.


More information about the dev mailing list