[ovs-dev] [RFC v7 11/13] netdev-dpdk: copy large packet to multi-seg. mbufs

Lam, Tiago tiago.lam at intel.com
Fri Jun 8 09:10:12 UTC 2018

On 05/06/2018 13:03, Ilya Maximets wrote:
> On 23.05.2018 19:47, Tiago Lam wrote:
>> From: Mark Kavanagh <mark.b.kavanagh at intel.com>
>> Currently, packets are only copied to a single segment in
>> the function dpdk_do_tx_copy(). This could be an issue in
>> the case of jumbo frames, particularly when multi-segment
>> mbufs are involved.
>> This patch calculates the number of segments needed by a
>> packet and copies the data to each segment.
>> Co-authored-by: Michael Qiu <qiudayu at chinac.com>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
>> Signed-off-by: Michael Qiu <qiudayu at chinac.com>
>> ---
>>  lib/netdev-dpdk.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++-------
>>  1 file changed, 68 insertions(+), 10 deletions(-)
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index d3abdde..c6dfe6d 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -2178,6 +2178,71 @@ out:
>>      }
>>  }
>> +static int
>> +dpdk_prep_tx_buf(struct dp_packet *packet, struct rte_mbuf **head,
>> +                 struct rte_mempool *mp)
>> +{
>> +    struct rte_mbuf *temp;
>> +    uint32_t size = dp_packet_size(packet);
>> +    uint16_t max_data_len, data_len;
>> +    uint32_t nb_segs = 0;
>> +    int i;
>> +
>> +    temp = *head = rte_pktmbuf_alloc(mp);
> All the functions around OVS code *must* use 'dpdk_buf_alloc()' instead
> of 'rte_pktmbuf_alloc()'.  Otherwise, the introduced 'nonpmd_mp_mutex'
> is useless.
> Also, all the calls to 'rte_pktmbuf_free()' must be replaced with
> 'free_dpdk_buf()' for the same reason.
> And it's better to do this right in the patch 08/13 and maintain in
> the subsequent patches.

Hi Ilya,

Thanks for pointing that out. I did miss the call to `free_dpdk_buf()`
in `dp_packet_set_size()` in patch 04/13. However, from what I can see
allocating mbufs directly from 'rte_pktmbuf_free()' was already
happening before this patch, in `dpdk_do_tx_copy()`. This patch just
happens to allocate more mbufs if multi-segment mbufs are being used.

Now, this could have been introduced in the past and been overlooked,
but it seems `dpdk_do_tx_copy()` is only called from
`netdev_dpdk_vhost_send()` and `netdev_dpdk_send__()`, and that seems to
be called with the `non_pmd_mutex` mutex held already (in
dpif_netdev.c). Am I missing other call path here? Otherwise it seems to
be safe.

I'm just confirming that this is the case though, as your suggestions do
make sense to keep the code uniform, since we now have the wrappers
around the `nonpm_mp_mutex` mutex (and because it shouldn't increase
contention, since the `dp->non_pmd_mutex` should already be held for
non-pmds for this call path). I'll include it for the next iteration.


More information about the dev mailing list