[ovs-dev] [PATCH v4 2/5] netdev-dpdk: Add netdev_dpdk_vhost_txq_flush function.
Bodireddy, Bhanuprakash
bhanuprakash.bodireddy at intel.com
Fri Aug 11 13:11:09 UTC 2017
>On 09.08.2017 15:35, Bodireddy, Bhanuprakash wrote:
>>>>
>>>> +static int
>>>> +netdev_dpdk_vhost_tx_burst(struct netdev_dpdk *dev, int qid)
>>>> +{
>>>> +    struct dpdk_tx_queue *txq = &dev->tx_q[qid];
>>>> +    struct rte_mbuf **cur_pkts = (struct rte_mbuf **)txq->vhost_burst_pkts;
>>>> +
>>>> +    int tx_vid = netdev_dpdk_get_vid(dev);
>>>> +    int tx_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
>>>> +    uint32_t sent = 0;
>>>> +    uint32_t retries = 0;
>>>> +    uint32_t sum, total_pkts;
>>>> +
>>>> +    total_pkts = sum = txq->vhost_pkt_cnt;
>>>> +    do {
>>>> +        uint32_t ret;
>>>> +        ret = rte_vhost_enqueue_burst(tx_vid, tx_qid, &cur_pkts[sent], sum);
>>>> +        if (OVS_UNLIKELY(!ret)) {
>>>> +            /* No packets enqueued - do not retry. */
>>>> +            break;
>>>> +        } else {
>>>> +            /* Packets have been sent. */
>>>> +            sent += ret;
>>>> +
>>>> +            /* 'sum' packets have to be retransmitted. */
>>>> +            sum -= ret;
>>>> +        }
>>>> +    } while (sum && (retries++ < VHOST_ENQ_RETRY_NUM));
>>>> +
>>>> +    for (int i = 0; i < total_pkts; i++) {
>>>> +        dp_packet_delete(txq->vhost_burst_pkts[i]);
>>>> +    }
>>>> +
>>>> +    /* Reset pkt count. */
>>>> +    txq->vhost_pkt_cnt = 0;
>>>> +
>>>> +    /* 'sum' refers to packets dropped. */
>>>> +    return sum;
>>>> +}
>>>> +
>>>> +/* Flush the txq if there are any packets available. */
>>>> +static int
>>>> +netdev_dpdk_vhost_txq_flush(struct netdev *netdev, int qid,
>>>> +                            bool concurrent_txq OVS_UNUSED)
>>>> +{
>>>> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>> +    struct dpdk_tx_queue *txq;
>>>> +
>>>> +    qid = dev->tx_q[qid % netdev->n_txq].map;
>>>> +
>>>> +    /* The qid may be disabled in the guest and has been set to
>>>> +     * OVS_VHOST_QUEUE_DISABLED.
>>>> +     */
>>>> +    if (OVS_UNLIKELY(qid < 0)) {
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    txq = &dev->tx_q[qid];
>>>> +    /* Increment the drop count and free the memory. */
>>>> +    if (OVS_UNLIKELY(!is_vhost_running(dev) ||
>>>> +                     !(dev->flags & NETDEV_UP))) {
>>>> +
>>>> +        if (txq->vhost_pkt_cnt) {
>>>> +            rte_spinlock_lock(&dev->stats_lock);
>>>> +            dev->stats.tx_dropped += txq->vhost_pkt_cnt;
>>>> +            rte_spinlock_unlock(&dev->stats_lock);
>>>> +
>>>> +            for (int i = 0; i < txq->vhost_pkt_cnt; i++) {
>>>> +                dp_packet_delete(txq->vhost_burst_pkts[i]);
>>>
>>> Spinlock (tx_lock) must be held here to avoid queue and mempool breakage.
>>
>> I think you are right. tx_lock might be acquired for freeing the packets.
>
>I think that 'vhost_pkt_cnt' reads and updates also should be protected to
>avoid races.
From the discussion in the thread https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html,
we are going to acquire tx_lock when updating the map and flushing the queue inside vring_state_changed().
That triggers a deadlock in the flushing function, as the same lock is already acquired in netdev_dpdk_vhost_txq_flush().
The same problem applies to freeing the packets and protecting the updates to vhost_pkt_cnt.
    if (OVS_LIKELY(txq->vhost_pkt_cnt)) {
        rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
        netdev_dpdk_vhost_tx_burst(dev, qid);
        rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
    }
The problem is triggered when the guest queues are enabled/disabled, with a small race window where packets can get enqueued just after the flush and before the map value is updated in the callback function (vring_state_changed()). How about this?
Technically, as the queues are disabled there is no point in flushing the packets, so let's free the packets and reset txq->vhost_pkt_cnt in vring_state_changed() itself instead of calling flush().
vring_state_changed().
------------------------------------------------------
    rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
    mapped_qid = dev->tx_q[qid].map;
    if (OVS_UNLIKELY(qid != mapped_qid)) {
        rte_spinlock_lock(&dev->tx_q[mapped_qid].tx_lock);
    }

    if (enable) {
        dev->tx_q[qid].map = qid;
    } else {
        struct dpdk_tx_queue *txq = &dev->tx_q[qid];

        if (txq->vhost_pkt_cnt) {
            rte_spinlock_lock(&dev->stats_lock);
            dev->stats.tx_dropped += txq->vhost_pkt_cnt;
            rte_spinlock_unlock(&dev->stats_lock);

            for (int i = 0; i < txq->vhost_pkt_cnt; i++) {
                dp_packet_delete(txq->vhost_burst_pkts[i]);
            }
            txq->vhost_pkt_cnt = 0;
        }
        dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
    }
-------------------------------------------------------------------------
Regards,
Bhanuprakash.
>
>> ---------------------------------------------------------------------------
>> rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>> for (int i = 0; i < txq->vhost_pkt_cnt; i++) {
>>     dp_packet_delete(txq->vhost_burst_pkts[i]);
>> }
>> rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>>
>> - Bhanuprakash
>>