[ovs-dev] [PATCH v7 1/3] netdev: Add rxq callback function rxq_length()

Fri Jan 19 09:53:30 UTC 2018


> -----Original Message-----
> From: Jan Scheurich [mailto:jan.scheurich at ericsson.com]
> Sent: Thursday, January 18, 2018 4:59 PM
> To: Ilya Maximets <i.maximets at samsung.com>; O Mahony, Billy
> <billy.o.mahony at intel.com>; dev at openvswitch.org
> Cc: ktraynor at redhat.com; Stokes, Ian <ian.stokes at intel.com>
> Subject: RE: [PATCH v7 1/3] netdev: Add rxq callback function rxq_length()
> 
> > >>>
> > >>> OK. Not necessary for our use case, as it will only be called by
> > >>> the PMD
> > >> after having received a full batch of 32 packets, but in general I
> > >> agree those checks are needed.
> > >>
> > >> It's necessary because vhost device could be disconnected between
> > >> rxq_recv() and rxq_length(). In this case we will call
> > >> rte_vhost_rx_queue_count() with vid == -1. This will produce access
> > >> to the random memory inside dpdk and likely a segmentation fault.
> > >>
> > >> See commit daf22bf7a826 ("netdev-dpdk: Fix calling vhost API with
> > >> negative
> > >> vid.") for a example of a similar issue. And I'm taking this
> > >> opportunity to recall that you should retrieve the vid only once.
> > >
> > >  [[BO'M]] Is there not also the possibility that the vhost device gets
> disconnected between the call to get_vid() and rxq_recv()?
> >
> > You mean disconnect between netdev_dpdk_get_vid(dev) and
> rte_vhost_dequeue_burst(vid) ?
> > There is no issue in this case, because 'destroy_device()' will wait
> > for other threads to quiesce. This means that device structure inside
> > dpdk will not be freed while we're inside netdev_rxq_recv(). We can
> > safely call any rte_vhost API for the old vid until device not freed inside
> dpdk.
> >
> > >
> > > Also, given these required calls to get_vid (which afaik requires
> > > some slow memory fencing) wouldn't that argue for the original
> > approach where the rxq len is returned from rxq_recv(). As the call to
> > rxq_length()  would be made once per batch once the queue is not being
> drained rxq_recv() the overhead could be significant.
> >
> > I'm not sure (I hope that Jan tested the performance of this version),
> > but I feel that 'rte_vhost_rx_queue_count()' is more heavy operation.
> 
> I have not done any performance tests yet with the new
> netdev_rxq_length() call after adding the vid and other checks. The actual
> 'rte_vhost_rx_queue_count()' is very lightweight. So if the memory fencing
> for get_vid() is expensive it might mean a hit. (Note: The
> rte_eth_queue_count() functions for physical ixgbe queues was much more
> costly).
> 
> Thinking about it, I would tend to agree with Billy that it seems simpler as well
> as more accurate and efficient to let the caller provide an optional output
> parameter "uint32_t *rxq_len" in rqx_recv() if they are interested in the
> queue fill level and retrieve both in one atomic operation, so that we avoid
> the duplicate vid checks.
[[BO'M]] I like the idea of supplying the pointer which makes it optional and so gets around other issues we discussed - such as a client may only be interested in rxq len for certain netdev types or may know a priori that it's an expensive operation for other netdev types. 
> 
> Can we agree on that?
> 
> BR, Jan