[ovs-discuss] Event on vhostuser file descriptor 1024 terminates vswitchd thread

Loftus, Ciara ciara.loftus at intel.com
Thu Mar 3 15:56:53 UTC 2016


> Hello,

Hi Patrik,

The majority of your concerns seem to relate to the DPDK code itself, so I would suggest raising them on the dev at dpdk.org mailing list, especially questions 2 & 3. Either way, some comments/questions below.

Thanks,
Ciara

> 
> while doing robustness tests on OVS with DPDK datapath, I'm experiencing
> vswitchd terminating due to a buffer overrun in DPDK
> fdset_event_dispatch().
> The underlying root cause for this, I'm quite sure, lies in our modified
> version
> of OpenStack or tests cases where some vhostuser ports happen to be left
> behind when VMs with more than 16 vNICs are deleted using "nova delete".
How does "nova delete" bring down VMs? Graceful shutdown or another way? Are the ports explicitly deleted after the VM is deleted, e.g. with ovs-vsctl del-port?

> 
> Thus, currently, the test leads to an accumulation of vhostuser sockets, and
> the termination is triggered when more than roughly 500 sockets are open at
> the same time (accumulated leaked sockets, plus 4 VMs x 24 vNICs/VM = 96
> sockets for the live VMs).
> 
> Neither OVS (2.4.0, with corrections from the 2.4 branch) nor DPDK (2.1.0)
> properly protects against a situation where file descriptor numbers exceed
> what can be held by the fd_set bit mask, as far as I can determine. The
> question then is whether this is a bug or whether I'm missing something.
> I'm well aware that this is an unlikely scenario but I do prefer graceful
> handling if possible.
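For background: fd_set is a fixed-size bit mask of FD_SETSIZE bits (1024 with glibc), and POSIX leaves FD_SET/FD_ISSET undefined for descriptors outside that range; with glibc they simply index past the end of the mask, which matches the overrun you describe. Any select()-based loop therefore has to reject such descriptors itself. A minimal sketch of such a guard (my own illustration, not code from either project):

#include <sys/select.h>

/* Sketch: add an fd to an fd_set only if select() can represent it.
 * fd_set holds exactly FD_SETSIZE bits, so FD_SET() with a larger fd
 * would write past the end of the bit mask. */
static int
fd_set_add_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;              /* caller must fail the registration */
    FD_SET(fd, set);
    return 0;
}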
> 
>   1) should there be a limit to the number of vhostuser ports that can be
>       created in OVS or should the limit be dictated by DPDK only?
I think we first need to figure out what that limit is (if there is one) and then whether it can be avoided. In practice the hard ceiling here looks like min(MAX_VHOST_SERVER, FD_SETSIZE) = 1024, but the usable number of ports is lower and not fixed, since every other descriptor the process holds (stdio, control sockets, log files, ...) draws from the same number space that select() can address.

> 
>   2) in rte_vhost_driver_register() there seems to be an implicit assumption
>       that the file descriptor number and the vserver_cnt are related, which,
>       in general, does not hold. Can this check be improved?
This would be a question for the DPDK mailing list.

> 
>   3) should fdset_event_dispatch() protect against events from file
> descriptors
>       greater than or equal to FD_SETSIZE (=__FD_SETSIZE=1024)?
> 
Likewise

> 
> What is your opinion?
> 
> 
> ---
> 
> 
> In short, what happens is roughly that a vhostuser port is registered in
> (OVS, netdev-dpdk.c):
> 
> static int
> netdev_dpdk_vhost_user_construct(struct netdev *netdev_)
> {
>      struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_);
>      int err;
> 
>      ovs_mutex_lock(&dpdk_mutex);
>      /* Take the name of the vhost-user port and append it to the
> location where
>       * the socket is to be created, then register the socket.
>       */
>      snprintf(netdev->vhost_id, sizeof(netdev->vhost_id), "%s/%s",
>              vhost_sock_dir, netdev_->name);
>      err = rte_vhost_driver_register(netdev->vhost_id);
> ....
> }
> 
> 
> The call to rte_vhost_driver_register() ends up in (DPDK,
> vhost-net-user.c):
> 
> int
> rte_vhost_driver_register(const char *path)
> {
>      struct vhost_server *vserver;
> 
>      pthread_mutex_lock(&g_vhost_server.server_mutex);
>      if (ops == NULL)
>          ops = get_virtio_net_callbacks();
> 
>      if (g_vhost_server.vserver_cnt == MAX_VHOST_SERVER) {
>          RTE_LOG(ERR, VHOST_CONFIG,
>              "error: the number of servers reaches maximum\n");
>          pthread_mutex_unlock(&g_vhost_server.server_mutex);
>          return -1;
>      }
> ...
> }
> 
> Here a check is done against MAX_VHOST_SERVER (=1024); the number of
> vhostuser ports is not yet at its maximum. But the process can still
> have more file descriptors open than it has vhostuser sockets.
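Right: descriptor numbers are allocated process-wide, so counting registered servers bounds nothing; the next socket can get any free number, including one >= FD_SETSIZE, as long as RLIMIT_NOFILE allows it. Checking the descriptor itself, where the listen socket is created, would catch this. A rough sketch (untested; 'listenfd' is a placeholder for whatever the socket-creation helper returns, and the cleanup mirrors the existing error path):

/* Sketch only: reject descriptors that select() cannot represent,
 * right after the listen socket is created.  'listenfd' stands in
 * for the fd returned by the socket-creation helper. */
if (listenfd < 0 || listenfd >= FD_SETSIZE) {
    RTE_LOG(ERR, VHOST_CONFIG,
        "error: fd %d is out of range for select()\n", listenfd);
    if (listenfd >= 0)
        close(listenfd);
    pthread_mutex_unlock(&g_vhost_server.server_mutex);
    return -1;
}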
> 
> 
> Then we come to the actual failure, where an event on a file
> descriptor (fd=1024) leads to a memory access outside of the
> fd_set bit mask, in (DPDK, fd_man.c; the FD_ISSET line is marked "---->"):
> 
> void
> fdset_event_dispatch(struct fdset *pfdset)
> {
>      fd_set rfds, wfds;
>      int i, maxfds;
>      struct fdentry *pfdentry;
>      int num = MAX_FDS;
>      fd_cb rcb, wcb;
>      void *dat;
>      int fd;
>      int remove1, remove2;
>      int ret;
> 
>      if (pfdset == NULL)
>          return;
> 
>      while (1) {
>          struct timeval tv;
>          tv.tv_sec = 1;
>          tv.tv_usec = 0;
>          FD_ZERO(&rfds);
>          FD_ZERO(&wfds);
>          pthread_mutex_lock(&pfdset->fd_mutex);
> 
>          maxfds = fdset_fill(&rfds, &wfds, pfdset);
> 
>          pthread_mutex_unlock(&pfdset->fd_mutex);
> 
>          /*
>           * When select is blocked, other threads might unregister
>           * listenfds from and register new listenfds into fdset.
>           * When select returns, the entries for listenfds in the fdset
>           * might have been updated. It is ok if there is unwanted call
>           * for new listenfds.
>           */
>          ret = select(maxfds + 1, &rfds, &wfds, NULL, &tv);
>          if (ret <= 0)
>              continue;
> 
>          for (i = 0; i < num; i++) {
>              remove1 = remove2 = 0;
>              pthread_mutex_lock(&pfdset->fd_mutex);
>              pfdentry = &pfdset->fd[i];
>              fd = pfdentry->fd;
>              rcb = pfdentry->rcb;
>              wcb = pfdentry->wcb;
>              dat = pfdentry->dat;
>              pfdentry->busy = 1;
>              pthread_mutex_unlock(&pfdset->fd_mutex);
> 
> ---->      if (fd >= 0 && FD_ISSET(fd, &rfds) && rcb)
>                  rcb(fd, dat, &remove1);
> 
>              if (fd >= 0 && FD_ISSET(fd, &wfds) && wcb)
>                  wcb(fd, dat, &remove2);
> ...
>          }
>      }
> }
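This loop is indeed unprotected: FD_ISSET(1024, &rfds) reads one bit beyond a 1024-bit mask. As an immediate mitigation, the loop could skip any descriptor the mask cannot hold. A sketch of such a guard, placed just before the FD_ISSET tests (untested, and assuming the elided tail of the loop clears 'busy' the same way on its other paths):

/* Sketch: select() can never report events for fd >= FD_SETSIZE,
 * and testing such an fd with FD_ISSET() reads past the bit mask,
 * so drop the entry from this iteration.  'busy' is cleared again
 * so the entry is not left locked out. */
if (fd >= FD_SETSIZE) {
    pthread_mutex_lock(&pfdset->fd_mutex);
    pfdentry->busy = 0;
    pthread_mutex_unlock(&pfdset->fd_mutex);
    continue;
}

The cleaner long-term fix is probably to move from select() to poll(), which takes an array of pollfd entries and has no FD_SETSIZE ceiling, but again that is a change for the DPDK list to weigh in on.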
> 
> 
> Regards,
> 
> Patrik
> _______________________________________________
> discuss mailing list
> discuss at openvswitch.org
> http://openvswitch.org/mailman/listinfo/discuss

