[ovs-dev] [PATCH v2] dpif-netlink: don't allocate per thread netlink sockets

Guru Shetty <guru@ovn.org>
Fri Oct 5 23:27:25 UTC 2018


On Tue, 25 Sep 2018 at 01:51, Matteo Croce <mcroce@redhat.com> wrote:

> When using the kernel datapath, OVS allocates a pool of sockets to handle
> netlink events. The number of sockets is ports * n-handler-threads, where
> n-handler-threads is user configurable and defaults to 3/4 of the number
> of cores.
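>
> As a purely illustrative example (numbers assumed, not measured): with
> 2000 ports and the default 42 handler threads of a 56 core machine
> (3/4 * 56), that is 2000 * 42 = 84000 netlink sockets for one switch.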
>
> This is because vswitchd starts n-handler-threads threads, each with a
> netlink socket for every port of the switch. Each thread then starts
> listening for events on its set of sockets with epoll().
>
> On setups with many CPUs and ports, the number of sockets easily hits
> the process file descriptor limit, and ovs-vswitchd exits with -EMFILE.
>
> Change the number of allocated sockets to just one per port by moving
> the socket array from a per-handler structure to a per-datapath one, and
> let all the handlers share the same sockets by using the EPOLLEXCLUSIVE
> epoll flag, which avoids duplicate events on systems that support it.
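>
> For illustration only (placeholder identifiers, not the patch code): each
> handler thread adds the same socket fd to its own epoll instance, and
> EPOLLEXCLUSIVE makes the kernel wake up only one of those instances per
> event. Kernels older than 4.5 don't know the flag and keep waking all of
> them, hence "on systems that support it". A minimal sketch:
>
>   #include <string.h>
>   #include <sys/epoll.h>
>
>   struct epoll_event event;
>
>   memset(&event, 0, sizeof event);
>   event.events = EPOLLIN | EPOLLEXCLUSIVE;
>   event.data.u32 = port_idx;  /* hypothetical index of the port */
>   if (epoll_ctl(handler_epoll_fd, EPOLL_CTL_ADD, sock_fd, &event) < 0) {
>       /* handle errno */
>   }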
>
> The patch was tested on a 56 core machine running Linux 4.18 and the
> latest Open vSwitch. A bridge was created with 2000+ ports, some of them
> veth interfaces with the peer outside the bridge. The latency of the
> upcall is measured by setting a single 'action=controller,local' OpenFlow
> rule to force all packets through the slow path and then to the local
> port. A tool[1] injects packets into the veth peers outside the bridge
> and measures the delay until the packet is captured on the local port.
> The rx timestamp is taken from the SO_TIMESTAMPNS socket ancillary data,
> to keep the scheduler delay out of the measured time.
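>
> A minimal sketch of that measurement (assumed here, not necessarily how
> the tool does it): enable SO_TIMESTAMPNS on the capture socket, then pull
> the kernel rx time out of the ancillary data returned by recvmsg():
>
>   /* needs <string.h>, <time.h>, <sys/socket.h>; 'fd' is the rx socket */
>   int on = 1;
>   char pkt[2048], ctrl[CMSG_SPACE(sizeof(struct timespec))];
>   struct iovec iov = { .iov_base = pkt, .iov_len = sizeof pkt };
>   struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
>                         .msg_control = ctrl, .msg_controllen = sizeof ctrl };
>   struct cmsghdr *cmsg;
>
>   setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof on);
>   if (recvmsg(fd, &msg, 0) >= 0) {
>       for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
>           if (cmsg->cmsg_level == SOL_SOCKET
>               && cmsg->cmsg_type == SCM_TIMESTAMPNS) {
>               struct timespec *ts = (struct timespec *) CMSG_DATA(cmsg);
>               /* 'ts' is the kernel rx time, free of scheduler delay */
>           }
>       }
>   }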
>
> The first test measures the average latency for an upcall generated from
> a single port. To measure it, 100k packets, one per millisecond, are sent
> to a single port and the latencies are measured.
>
> The second test is meant to check latency fairness among ports, namely
> whether latency is equal between ports or whether some ports have lower
> priority. The previous test is repeated for every port; the average of
> the per-port average latencies and the standard deviation between those
> averages are measured.
>
> The third test serves to measure responsiveness under load. Heavy traffic
> is sent through all ports, and latency and packet loss are measured on a
> single loaded port.
>
> The fourth test is all about fairness. Heavy traffic is injected into all
> ports but one, and latency and packet loss are measured on the single
> idle port.
>
> This is the test setup:
>
>   # nproc
>   56
>   # ovs-vsctl show |grep -c Port
>   2223
>   # ovs-ofctl dump-flows ovs_upc_br
>    cookie=0x0, duration=4.827s, table=0, n_packets=0, n_bytes=0, actions=CONTROLLER:65535,LOCAL
>   # uname -a
>   Linux fc28 4.18.7-200.fc28.x86_64 #1 SMP Mon Sep 10 15:44:45 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
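>
>   (For reference, a catch-all rule like the one in the dump-flows output
>   above can be installed with something along the lines of
>   "ovs-ofctl add-flow ovs_upc_br 'actions=CONTROLLER:65535,LOCAL'";
>   an illustration only, not taken from the test scripts.)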
>
> And these are the results of the tests:
>
>                                           Stock OVS                 Patched
>   netlink sockets
>   in use by vswitchd
>   lsof -p $(pidof ovs-vswitchd) \
>       |grep -c GENERIC                        91187                    2227
>
>   Test 1
>   one port latency
>   min/avg/max/mdev (us)           2.7/6.6/238.7/1.8       1.6/6.8/160.6/1.7
>
>   Test 2
>   all port
>   avg latency/mdev (us)                   6.51/0.97               6.86/0.17
>
>   Test 3
>   single port latency
>   under load
>   avg/mdev (us)                             7.5/5.9                 3.8/4.8
>   packet loss                                  95 %                    62 %
>
>   Test 4
>   idle port latency
>   under load
>   min/avg/max/mdev (us)           0.8/1.5/210.5/0.9       1.0/2.1/344.5/1.2
>   packet loss                                  94 %                     4 %
>
> CPU and RAM usage seem not to be affected; the resource usage of an idle
> vswitchd with 2000+ ports is unchanged:
>
>   # ps u $(pidof ovs-vswitchd)
>   USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>   openvsw+  5430 54.3  0.3 4263964 510968 pts/1  RLl+ 16:20   0:50 ovs-vswitchd
>
> Additionally, to check that vswitchd is thread safe with this patch, the
> following test was run for roughly 48 hours: on a 56 core machine, a
> bridge with the kernel datapath was filled with 2200 dummy interfaces and
> 22 veths, then 22 traffic generators were run in parallel, piping traffic
> into the veth peers outside the bridge.
> To generate as many upcalls as possible, all packets were forced to the
> slow path with an OpenFlow rule like 'action=controller,local' and the
> packet size was set to 64 bytes. Also, to avoid overflowing the FDB early
> and slowing down the upcall processing, the generated MAC addresses were
> restricted to a small interval. vswitchd ran without problems for 48+
> hours, with all the handler threads at almost 99% CPU usage, as expected.
>
> [1] https://github.com/teknoraver/network-tools/blob/master/weed.c
>
> Signed-off-by: Matteo Croce <mcroce@redhat.com>
>

I get a segfault with this patch, with the following backtrace. I have not
investigated.

Program received signal SIGSEGV, Segmentation fault.
nl_sock_pid (sock=0x0) at lib/netlink-socket.c:1424
1424     return sock->pid;
(gdb) where
#0  nl_sock_pid (sock=0x0) at lib/netlink-socket.c:1424
#1  0x000000000052fb79 in dpif_netlink_port_add__ (dpif=dpif@entry=0x89de90, name=name@entry=0x7fffffffdf20 "genev_sys_6081", type=type@entry=OVS_VPORT_TYPE_GENEVE, options=0x7fffffffdee0, port_nop=0x7fffffffdf9c) at lib/dpif-netlink.c:731
#2  0x000000000052fdc7 in dpif_netlink_port_add_compat (port_nop=0x7fffffffdf9c, netdev=<optimized out>, dpif=0x89de90) at lib/dpif-netlink.c:836
#3  dpif_netlink_port_add (dpif_=0x89de90, netdev=<optimized out>, port_nop=0x7fffffffdf9c) at lib/dpif-netlink.c:882
#4  0x00000000004778be in dpif_port_add (dpif=0x89de90, netdev=netdev@entry=0x911830, port_nop=port_nop@entry=0x7fffffffdffc) at lib/dpif.c:579
#5  0x0000000000426efe in port_add (ofproto_=0x8a97f0, netdev=0x911830) at ofproto/ofproto-dpif.c:3689
#6  0x000000000041d951 in ofproto_port_add (ofproto=0x8a97f0, netdev=0x911830, ofp_portp=ofp_portp@entry=0x7fffffffe0d8) at ofproto/ofproto.c:2008
#7  0x000000000040c709 in iface_do_create (errp=0x7fffffffe0e8, netdevp=0x7fffffffe0e0, ofp_portp=0x7fffffffe0d8, iface_cfg=0x8a0620, br=0x8a8bb0) at vswitchd/bridge.c:1803
#8  iface_create (port_cfg=0x8a4e60, iface_cfg=0x8a0620, br=0x8a8bb0) at vswitchd/bridge.c:1841
#9  bridge_add_ports__ (br=br@entry=0x8a8bb0, wanted_ports=wanted_ports@entry=0x8a8c90, with_requested_port=with_requested_port@entry=false) at vswitchd/bridge.c:935
#10 0x000000000040e02f in bridge_add_ports (wanted_ports=0x8a8c90, br=0x8a8bb0) at vswitchd/bridge.c:951
#11 bridge_reconfigure (ovs_cfg=ovs_cfg@entry=0x8a7fc0) at vswitchd/bridge.c:665
#12 0x00000000004114b9 in bridge_run () at vswitchd/bridge.c:3023
#13 0x00000000004080c5 in main (argc=2, argv=0x7fffffffe5e8) at vswitchd/ovs-vswitchd.c:125
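
Judging only from the backtrace and the diff below (I have not dug further),
one plausible culprit: in the patched dpif_netlink_port_add__(), 'socksp' is
created only when dpif->handlers is non-NULL, yet nl_sock_pid(socksp) is now
called unconditionally, so adding a port before the handlers exist passes a
NULL sock. A hypothetical, untested guard would be:

    uint32_t upcall_pids = 0;

    /* Only ask for the pid when the socket was actually created. */
    if (socksp) {
        upcall_pids = nl_sock_pid(socksp);
    }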

> ---
> v1 -> v2:
>   - define EPOLLEXCLUSIVE on systems with older kernel headers
>   - explain the thread safety test in the commit message
>
>  lib/dpif-netlink.c | 311 ++++++++++++---------------------------------
>  1 file changed, 82 insertions(+), 229 deletions(-)
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index e6d5a6ec5..bb565ffee 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -78,6 +78,10 @@ enum { MAX_PORTS = USHRT_MAX };
>  #define FLOW_DUMP_MAX_BATCH 50
>  #define OPERATE_MAX_OPS 50
>
> +#ifndef EPOLLEXCLUSIVE
> +#define EPOLLEXCLUSIVE (1u << 28)
> +#endif
> +
>  struct dpif_netlink_dp {
>      /* Generic Netlink header. */
>      uint8_t cmd;
> @@ -170,7 +174,6 @@ struct dpif_windows_vport_sock {
>  #endif
>
>  struct dpif_handler {
> -    struct dpif_channel *channels;/* Array of channels for each handler. */
>      struct epoll_event *epoll_events;
>      int epoll_fd;                 /* epoll fd that includes channel socks. */
>      int n_events;                 /* Num events returned by epoll_wait(). */
> @@ -193,6 +196,7 @@ struct dpif_netlink {
>      struct fat_rwlock upcall_lock;
>      struct dpif_handler *handlers;
>      uint32_t n_handlers;           /* Num of upcall handlers. */
> +    struct dpif_channel *channels; /* Array of channels for each port. */
>      int uc_array_size;             /* Size of 'handler->channels' and */
>                                     /* 'handler->epoll_events'. */
>
> @@ -331,43 +335,6 @@ open_dpif(const struct dpif_netlink_dp *dp, struct dpif **dpifp)
>      return 0;
>  }
>
> -/* Destroys the netlink sockets pointed by the elements in 'socksp'
> - * and frees the 'socksp'.  */
> -static void
> -vport_del_socksp__(struct nl_sock **socksp, uint32_t n_socks)
> -{
> -    size_t i;
> -
> -    for (i = 0; i < n_socks; i++) {
> -        nl_sock_destroy(socksp[i]);
> -    }
> -
> -    free(socksp);
> -}
> -
> -/* Creates an array of netlink sockets.  Returns an array of the
> - * corresponding pointers.  Records the error in 'error'. */
> -static struct nl_sock **
> -vport_create_socksp__(uint32_t n_socks, int *error)
> -{
> -    struct nl_sock **socksp = xzalloc(n_socks * sizeof *socksp);
> -    size_t i;
> -
> -    for (i = 0; i < n_socks; i++) {
> -        *error = nl_sock_create(NETLINK_GENERIC, &socksp[i]);
> -        if (*error) {
> -            goto error;
> -        }
> -    }
> -
> -    return socksp;
> -
> -error:
> -    vport_del_socksp__(socksp, n_socks);
> -
> -    return NULL;
> -}
> -
>  #ifdef _WIN32
>  static void
>  vport_delete_sock_pool(struct dpif_handler *handler)
> @@ -422,129 +389,34 @@ error:
>      vport_delete_sock_pool(handler);
>      return error;
>  }
> -
> -/* Returns an array pointers to netlink sockets.  The sockets are picked from a
> - * pool. Records the error in 'error'. */
> -static struct nl_sock **
> -vport_create_socksp_windows(struct dpif_netlink *dpif, int *error)
> -    OVS_REQ_WRLOCK(dpif->upcall_lock)
> -{
> -    uint32_t n_socks = dpif->n_handlers;
> -    struct nl_sock **socksp;
> -    size_t i;
> -
> -    ovs_assert(n_socks <= 1);
> -    socksp = xzalloc(n_socks * sizeof *socksp);
> -
> -    /* Pick netlink sockets to use in a round-robin fashion from each
> -     * handler's pool of sockets. */
> -    for (i = 0; i < n_socks; i++) {
> -        struct dpif_handler *handler = &dpif->handlers[i];
> -        struct dpif_windows_vport_sock *sock_pool = handler->vport_sock_pool;
> -        size_t index = handler->last_used_pool_idx;
> -
> -        /* A pool of sockets is allocated when the handler is initialized. */
> -        if (sock_pool == NULL) {
> -            free(socksp);
> -            *error = EINVAL;
> -            return NULL;
> -        }
> -
> -        ovs_assert(index < VPORT_SOCK_POOL_SIZE);
> -        socksp[i] = sock_pool[index].nl_sock;
> -        socksp[i] = sock_pool[index].nl_sock;
> -        ovs_assert(socksp[i]);
> -        index = (index == VPORT_SOCK_POOL_SIZE - 1) ? 0 : index + 1;
> -        handler->last_used_pool_idx = index;
> -    }
> -
> -    return socksp;
> -}
> -
> -static void
> -vport_del_socksp_windows(struct dpif_netlink *dpif, struct nl_sock **socksp)
> -{
> -    free(socksp);
> -}
>  #endif /* _WIN32 */
>
> -static struct nl_sock **
> -vport_create_socksp(struct dpif_netlink *dpif, int *error)
> -{
> -#ifdef _WIN32
> -    return vport_create_socksp_windows(dpif, error);
> -#else
> -    return vport_create_socksp__(dpif->n_handlers, error);
> -#endif
> -}
> -
> -static void
> -vport_del_socksp(struct dpif_netlink *dpif, struct nl_sock **socksp)
> -{
> -#ifdef _WIN32
> -    vport_del_socksp_windows(dpif, socksp);
> -#else
> -    vport_del_socksp__(socksp, dpif->n_handlers);
> -#endif
> -}
> -
> -/* Given the array of pointers to netlink sockets 'socksp', returns
> - * the array of corresponding pids. If the 'socksp' is NULL, returns
> - * a single-element array of value 0. */
> -static uint32_t *
> -vport_socksp_to_pids(struct nl_sock **socksp, uint32_t n_socks)
> -{
> -    uint32_t *pids;
> -
> -    if (!socksp) {
> -        pids = xzalloc(sizeof *pids);
> -    } else {
> -        size_t i;
> -
> -        pids = xzalloc(n_socks * sizeof *pids);
> -        for (i = 0; i < n_socks; i++) {
> -            pids[i] = nl_sock_pid(socksp[i]);
> -        }
> -    }
> -
> -    return pids;
> -}
> -
> -/* Given the port number 'port_idx', extracts the pids of netlink sockets
> - * associated to the port and assigns it to 'upcall_pids'. */
> +/* Given the port number 'port_idx', extracts the pid of netlink socket
> + * associated to the port and assigns it to 'upcall_pid'. */
>  static bool
> -vport_get_pids(struct dpif_netlink *dpif, uint32_t port_idx,
> -               uint32_t **upcall_pids)
> +vport_get_pid(struct dpif_netlink *dpif, uint32_t port_idx,
> +              uint32_t *upcall_pid)
>  {
> -    uint32_t *pids;
> -    size_t i;
> -
>      /* Since the nl_sock can only be assigned in either all
> -     * or none "dpif->handlers" channels, the following check
> +     * or none "dpif" channels, the following check
>       * would suffice. */
> -    if (!dpif->handlers[0].channels[port_idx].sock) {
> +    if (!dpif->channels[port_idx].sock) {
>          return false;
>      }
>      ovs_assert(!WINDOWS || dpif->n_handlers <= 1);
>
> -    pids = xzalloc(dpif->n_handlers * sizeof *pids);
> -
> -    for (i = 0; i < dpif->n_handlers; i++) {
> -        pids[i] = nl_sock_pid(dpif->handlers[i].channels[port_idx].sock);
> -    }
> -
> -    *upcall_pids = pids;
> +    *upcall_pid = nl_sock_pid(dpif->channels[port_idx].sock);
>
>      return true;
>  }
>
>  static int
> -vport_add_channels(struct dpif_netlink *dpif, odp_port_t port_no,
> -                   struct nl_sock **socksp)
> +vport_add_channel(struct dpif_netlink *dpif, odp_port_t port_no,
> +                  struct nl_sock *socksp)
>  {
>      struct epoll_event event;
>      uint32_t port_idx = odp_to_u32(port_no);
> -    size_t i, j;
> +    size_t i;
>      int error;
>
>      if (dpif->handlers == NULL) {
> @@ -553,7 +425,7 @@ vport_add_channels(struct dpif_netlink *dpif, odp_port_t port_no,
>
>      /* We assume that the datapath densely chooses port numbers, which can
>       * therefore be used as an index into 'channels' and 'epoll_events' of
> -     * 'dpif->handler'. */
> +     * 'dpif'. */
>      if (port_idx >= dpif->uc_array_size) {
>          uint32_t new_size = port_idx + 1;
>
> @@ -563,15 +435,15 @@ vport_add_channels(struct dpif_netlink *dpif, odp_port_t port_no,
>              return EFBIG;
>          }
>
> -        for (i = 0; i < dpif->n_handlers; i++) {
> -            struct dpif_handler *handler = &dpif->handlers[i];
> +        dpif->channels = xrealloc(dpif->channels,
> +                                  new_size * sizeof *dpif->channels);
>
> -            handler->channels = xrealloc(handler->channels,
> -                                         new_size * sizeof *handler->channels);
> +        for (i = dpif->uc_array_size; i < new_size; i++) {
> +            dpif->channels[i].sock = NULL;
> +        }
>
> -            for (j = dpif->uc_array_size; j < new_size; j++) {
> -                handler->channels[j].sock = NULL;
> -            }
> +        for (i = 0; i < dpif->n_handlers; i++) {
> +            struct dpif_handler *handler = &dpif->handlers[i];
>
>              handler->epoll_events = xrealloc(handler->epoll_events,
>                  new_size * sizeof *handler->epoll_events);
> @@ -581,33 +453,33 @@ vport_add_channels(struct dpif_netlink *dpif, odp_port_t port_no,
>      }
>
>      memset(&event, 0, sizeof event);
> -    event.events = EPOLLIN;
> +    event.events = EPOLLIN | EPOLLEXCLUSIVE;
>      event.data.u32 = port_idx;
>
>      for (i = 0; i < dpif->n_handlers; i++) {
>          struct dpif_handler *handler = &dpif->handlers[i];
>
>  #ifndef _WIN32
> -        if (epoll_ctl(handler->epoll_fd, EPOLL_CTL_ADD, nl_sock_fd(socksp[i]),
> +        if (epoll_ctl(handler->epoll_fd, EPOLL_CTL_ADD, nl_sock_fd(socksp),
>                        &event) < 0) {
>              error = errno;
>              goto error;
>          }
>  #endif
> -        dpif->handlers[i].channels[port_idx].sock = socksp[i];
> -        dpif->handlers[i].channels[port_idx].last_poll = LLONG_MIN;
>      }
> +    dpif->channels[port_idx].sock = socksp;
> +    dpif->channels[port_idx].last_poll = LLONG_MIN;
>
>      return 0;
>
>  error:
> -    for (j = 0; j < i; j++) {
>  #ifndef _WIN32
> -        epoll_ctl(dpif->handlers[j].epoll_fd, EPOLL_CTL_DEL,
> -                  nl_sock_fd(socksp[j]), NULL);
> -#endif
> -        dpif->handlers[j].channels[port_idx].sock = NULL;
> +    while (i--) {
> +        epoll_ctl(dpif->handlers[i].epoll_fd, EPOLL_CTL_DEL,
> +                  nl_sock_fd(socksp), NULL);
>      }
> +#endif
> +    dpif->channels[port_idx].sock = NULL;
>
>      return error;
>  }
> @@ -618,14 +490,8 @@ vport_del_channels(struct dpif_netlink *dpif, odp_port_t port_no)
>      uint32_t port_idx = odp_to_u32(port_no);
>      size_t i;
>
> -    if (!dpif->handlers || port_idx >= dpif->uc_array_size) {
> -        return;
> -    }
> -
> -    /* Since the sock can only be assigned in either all or none
> -     * of "dpif->handlers" channels, the following check would
> -     * suffice. */
> -    if (!dpif->handlers[0].channels[port_idx].sock) {
> +    if (!dpif->handlers || port_idx >= dpif->uc_array_size
> +        || !dpif->channels[port_idx].sock) {
>          return;
>      }
>
> @@ -633,12 +499,14 @@ vport_del_channels(struct dpif_netlink *dpif, odp_port_t port_no)
>          struct dpif_handler *handler = &dpif->handlers[i];
>  #ifndef _WIN32
>          epoll_ctl(handler->epoll_fd, EPOLL_CTL_DEL,
> -                  nl_sock_fd(handler->channels[port_idx].sock), NULL);
> -        nl_sock_destroy(handler->channels[port_idx].sock);
> +                  nl_sock_fd(dpif->channels[port_idx].sock), NULL);
>  #endif
> -        handler->channels[port_idx].sock = NULL;
>          handler->event_offset = handler->n_events = 0;
>      }
> +#ifndef _WIN32
> +    nl_sock_destroy(dpif->channels[port_idx].sock);
> +#endif
> +    dpif->channels[port_idx].sock = NULL;
>  }
>
>  static void
> @@ -655,10 +523,7 @@ destroy_all_channels(struct dpif_netlink *dpif)
>          struct dpif_netlink_vport vport_request;
>          uint32_t upcall_pids = 0;
>
> -        /* Since the sock can only be assigned in either all or none
> -         * of "dpif->handlers" channels, the following check would
> -         * suffice. */
> -        if (!dpif->handlers[0].channels[i].sock) {
> +        if (!dpif->channels[i].sock) {
>              continue;
>          }
>
> @@ -679,11 +544,11 @@ destroy_all_channels(struct dpif_netlink *dpif)
>
>          dpif_netlink_handler_uninit(handler);
>          free(handler->epoll_events);
> -        free(handler->channels);
>      }
> -
> +    free(dpif->channels);
>      free(dpif->handlers);
>      dpif->handlers = NULL;
> +    dpif->channels = NULL;
>      dpif->n_handlers = 0;
>      dpif->uc_array_size = 0;
>  }
> @@ -846,13 +711,12 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, const char *name,
>  {
>      struct dpif_netlink_vport request, reply;
>      struct ofpbuf *buf;
> -    struct nl_sock **socksp = NULL;
> -    uint32_t *upcall_pids;
> +    struct nl_sock *socksp = NULL;
> +    uint32_t upcall_pids;
>      int error = 0;
>
>      if (dpif->handlers) {
> -        socksp = vport_create_socksp(dpif, &error);
> -        if (!socksp) {
> +        if (nl_sock_create(NETLINK_GENERIC, &socksp)) {
>              return error;
>          }
>      }
> @@ -864,9 +728,9 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, const char *name,
>      request.name = name;
>
>      request.port_no = *port_nop;
> -    upcall_pids = vport_socksp_to_pids(socksp, dpif->n_handlers);
> -    request.n_upcall_pids = socksp ? dpif->n_handlers : 1;
> -    request.upcall_pids = upcall_pids;
> +    upcall_pids = nl_sock_pid(socksp);
> +    request.n_upcall_pids = 1;
> +    request.upcall_pids = &upcall_pids;
>
>      if (options) {
>          request.options = options->data;
> @@ -882,31 +746,27 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, const char *name,
>                        dpif_name(&dpif->dpif), *port_nop);
>          }
>
> -        vport_del_socksp(dpif, socksp);
> +        nl_sock_destroy(socksp);
>          goto exit;
>      }
>
> -    if (socksp) {
> -        error = vport_add_channels(dpif, *port_nop, socksp);
> -        if (error) {
> -            VLOG_INFO("%s: could not add channel for port %s",
> -                      dpif_name(&dpif->dpif), name);
> -
> -            /* Delete the port. */
> -            dpif_netlink_vport_init(&request);
> -            request.cmd = OVS_VPORT_CMD_DEL;
> -            request.dp_ifindex = dpif->dp_ifindex;
> -            request.port_no = *port_nop;
> -            dpif_netlink_vport_transact(&request, NULL, NULL);
> -            vport_del_socksp(dpif, socksp);
> -            goto exit;
> -        }
> +    error = vport_add_channel(dpif, *port_nop, socksp);
> +    if (error) {
> +        VLOG_INFO("%s: could not add channel for port %s",
> +                    dpif_name(&dpif->dpif), name);
> +
> +        /* Delete the port. */
> +        dpif_netlink_vport_init(&request);
> +        request.cmd = OVS_VPORT_CMD_DEL;
> +        request.dp_ifindex = dpif->dp_ifindex;
> +        request.port_no = *port_nop;
> +        dpif_netlink_vport_transact(&request, NULL, NULL);
> +        nl_sock_destroy(socksp);
> +        goto exit;
>      }
> -    free(socksp);
>
>  exit:
>      ofpbuf_delete(buf);
> -    free(upcall_pids);
>
>      return error;
>  }
> @@ -1131,7 +991,7 @@ dpif_netlink_port_query_by_name(const struct dpif *dpif_, const char *devname,
>
>  static uint32_t
>  dpif_netlink_port_get_pid__(const struct dpif_netlink *dpif,
> -                            odp_port_t port_no, uint32_t hash)
> +                            odp_port_t port_no, uint32_t hash OVS_UNUSED)
>      OVS_REQ_RDLOCK(dpif->upcall_lock)
>  {
>      uint32_t port_idx = odp_to_u32(port_no);
> @@ -1141,14 +1001,13 @@ dpif_netlink_port_get_pid__(const struct dpif_netlink *dpif,
>          /* The ODPP_NONE "reserved" port number uses the "ovs-system"'s
>           * channel, since it is not heavily loaded. */
>          uint32_t idx = port_idx >= dpif->uc_array_size ? 0 : port_idx;
> -        struct dpif_handler *h = &dpif->handlers[hash % dpif->n_handlers];
>
>          /* Needs to check in case the socket pointer is changed in between
>           * the holding of upcall_lock.  A known case happens when the main
>           * thread deletes the vport while the handler thread is handling
>           * the upcall from that port. */
> -        if (h->channels[idx].sock) {
> -            pid = nl_sock_pid(h->channels[idx].sock);
> +        if (dpif->channels[idx].sock) {
> +            pid = nl_sock_pid(dpif->channels[idx].sock);
>          }
>      }
>
> @@ -2382,42 +2241,40 @@ dpif_netlink_refresh_channels(struct dpif_netlink *dpif, uint32_t n_handlers)
>      dpif_netlink_port_dump_start__(dpif, &dump);
>      while (!dpif_netlink_port_dump_next__(dpif, &dump, &vport, &buf)) {
>          uint32_t port_no = odp_to_u32(vport.port_no);
> -        uint32_t *upcall_pids = NULL;
> +        uint32_t upcall_pid;
>          int error;
>
>          if (port_no >= dpif->uc_array_size
> -            || !vport_get_pids(dpif, port_no, &upcall_pids)) {
> -            struct nl_sock **socksp = vport_create_socksp(dpif, &error);
> +            || !vport_get_pid(dpif, port_no, &upcall_pid)) {
> +            struct nl_sock *socksp;
>
> -            if (!socksp) {
> +            if (nl_sock_create(NETLINK_GENERIC, &socksp)) {
>                  goto error;
>              }
>
> -            error = vport_add_channels(dpif, vport.port_no, socksp);
> +            error = vport_add_channel(dpif, vport.port_no, socksp);
>              if (error) {
>                  VLOG_INFO("%s: could not add channels for port %s",
>                            dpif_name(&dpif->dpif), vport.name);
> -                vport_del_socksp(dpif, socksp);
> +                nl_sock_destroy(socksp);
>                  retval = error;
>                  goto error;
>              }
> -            upcall_pids = vport_socksp_to_pids(socksp, dpif->n_handlers);
> -            free(socksp);
> +            upcall_pid = nl_sock_pid(socksp);
>          }
>
>          /* Configure the vport to deliver misses to 'sock'. */
>          if (vport.upcall_pids[0] == 0
> -            || vport.n_upcall_pids != dpif->n_handlers
> -            || memcmp(upcall_pids, vport.upcall_pids, n_handlers * sizeof
> -                      *upcall_pids)) {
> +            || vport.n_upcall_pids != 1
> +            || upcall_pid != vport.upcall_pids[0]) {
>              struct dpif_netlink_vport vport_request;
>
>              dpif_netlink_vport_init(&vport_request);
>              vport_request.cmd = OVS_VPORT_CMD_SET;
>              vport_request.dp_ifindex = dpif->dp_ifindex;
>              vport_request.port_no = vport.port_no;
> -            vport_request.n_upcall_pids = dpif->n_handlers;
> -            vport_request.upcall_pids = upcall_pids;
> +            vport_request.n_upcall_pids = 1;
> +            vport_request.upcall_pids = &upcall_pid;
>              error = dpif_netlink_vport_transact(&vport_request, NULL, NULL);
>              if (error) {
>                  VLOG_WARN_RL(&error_rl,
> @@ -2438,11 +2295,9 @@ dpif_netlink_refresh_channels(struct dpif_netlink *dpif, uint32_t n_handlers)
>          if (port_no < keep_channels_nbits) {
>              bitmap_set1(keep_channels, port_no);
>          }
> -        free(upcall_pids);
>          continue;
>
>      error:
> -        free(upcall_pids);
>          vport_del_channels(dpif, vport.port_no);
>      }
>      nl_dump_done(&dump);
> @@ -2701,7 +2556,7 @@ dpif_netlink_recv__(struct dpif_netlink *dpif, uint32_t handler_id,
>
>      while (handler->event_offset < handler->n_events) {
>          int idx = handler->epoll_events[handler->event_offset].data.u32;
> -        struct dpif_channel *ch = &dpif->handlers[handler_id].channels[idx];
> +        struct dpif_channel *ch = &dpif->channels[idx];
>
>          handler->event_offset++;
>
> @@ -2803,16 +2658,14 @@ dpif_netlink_recv_purge__(struct dpif_netlink *dpif)
>      OVS_REQ_WRLOCK(dpif->upcall_lock)
>  {
>      if (dpif->handlers) {
> -        size_t i, j;
> +        size_t i;
>
> +        if (!dpif->channels[0].sock) {
> +            return;
> +        }
>          for (i = 0; i < dpif->uc_array_size; i++ ) {
> -            if (!dpif->handlers[0].channels[i].sock) {
> -                continue;
> -            }
>
> -            for (j = 0; j < dpif->n_handlers; j++) {
> -                nl_sock_drain(dpif->handlers[j].channels[i].sock);
> -            }
> +            nl_sock_drain(dpif->channels[i].sock);
>          }
>      }
>  }
> --
> 2.17.1
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

