[ovs-dev] [PATCH v8 04/13] netdev-dpdk: Serialise non-pmds mbufs' alloc/free.

Eelco Chaudron echaudro at redhat.com
Tue Jun 26 12:32:18 UTC 2018



On 26 Jun 2018, at 12:02, Ilya Maximets wrote:

> On 26.06.2018 12:19, Eelco Chaudron wrote:
>>
>>
>> On 22 Jun 2018, at 21:03, Lam, Tiago wrote:
>>
>>> On 18/06/2018 12:28, Eelco Chaudron wrote:
>>>>
>>>>
>>>> On 11 Jun 2018, at 18:21, Tiago Lam wrote:
>>>>
>>>>> A new mutex, 'nonpmd_mp_mutex', has been introduced to serialise
>>>>> allocation and free operations by non-pmd threads on a given 
>>>>> mempool.
>>>>>
>>>>
>>>> Can you explain why we need the mutex here? Can't see any reason 
>>>> why
>>>> rte_pktmbuf_free() needs to be protected for non-pmd threads?
>>>
>>> My understanding is that each PMD has a thread-local cache of its 
>>> own,
>>> which it can access without locking. However, all non-PMD threads 
>>> have
>>> direct access to the mempool, and thus need to lock in order to 
>>> access
>>> it. Otherwise memory corruption could ensue.
>>>
>>> For OvS this means all these functions which can be called outside 
>>> of a
>>> PMD context, or without holding the `non_pmd_mutex` mutex, in struct
>>> dp_netdev, need to lock before issue operations that modify a 
>>> mempool.
>>>
>>> Hopefully I've put it succinctly enough without botching up the 
>>> concept
>>> too much.
>>
>> I do not think this is a problem, as DPDK uses 
>> rte_mempool_default_cache() to figure out which cache to use.
>> As the non PMD threads have a higher than RTE_MAX_LCORE id the cache 
>> will not be used.
>
>
> It's not the problem. The main reason is that implementation of ring
> buffers on which the mempools based is not preemptible:
>
>   lib/librte_mempool/rte_mempool.h:
>   * Note: the mempool implementation is not preemptible. An lcore must 
> not be
>   * interrupted by another task that uses the same mempool (because it 
> uses a
>   * ring which is not preemptible).
>
> As soon as non-PMD threads are not pinned, they could be scheduled on 
> the same
> core and preempted while executing mempool operations.

Thanks for the additional clarification! Tiago, maybe its good to add 
this info some where in the code as a comment?

>>
>>>>
>>>>> free_dpdk_buf() has been modified to make use of the introduced 
>>>>> mutex.
>>>>>
>>>>> Signed-off-by: Tiago Lam <tiago.lam at intel.com>
>>>>> ---
>>>>>  lib/netdev-dpdk.c | 21 ++++++++++++++++++++-
>>>>>  1 file changed, 20 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>>>> index f546507..efd7c20 100644
>>>>> --- a/lib/netdev-dpdk.c
>>>>> +++ b/lib/netdev-dpdk.c
>>>>> @@ -294,6 +294,10 @@ static struct ovs_mutex dpdk_mp_mutex
>>>>> OVS_ACQ_AFTER(dpdk_mutex)
>>>>>  static struct ovs_list dpdk_mp_free_list
>>>>> OVS_GUARDED_BY(dpdk_mp_mutex)
>>>>>      = OVS_LIST_INITIALIZER(&dpdk_mp_free_list);
>>>>>
>>>>> +/* This mutex must be used by non pmd threads when allocating or
>>>>> freeing
>>>>> + * mbufs through mempools. */
>>>>> +static struct ovs_mutex nonpmd_mp_mutex = OVS_MUTEX_INITIALIZER;
>>>>> +
>>>>>  /* Wrapper for a mempool released but not yet freed. */
>>>>>  struct dpdk_mp {
>>>>>       struct rte_mempool *mp;
>>>>> @@ -461,6 +465,8 @@ struct netdev_rxq_dpdk {
>>>>>      dpdk_port_t port_id;
>>>>>  };
>>>>>
>>>>> +static bool dpdk_thread_is_pmd(void);
>>>>> +
>>>>>  static void netdev_dpdk_destruct(struct netdev *netdev);
>>>>>  static void netdev_dpdk_vhost_destruct(struct netdev *netdev);
>>>>>
>>>>> @@ -494,6 +500,12 @@ dpdk_buf_size(int mtu)
>>>>>                       
>>>>> NETDEV_DPDK_MBUF_ALIGN);
>>>>>  }
>>>>>
>>>>> +static bool
>>>>> +dpdk_thread_is_pmd(void)
>>>>> +{
>>>>> +     return rte_lcore_id() != NON_PMD_CORE_ID;
>>>>
>>>> Will this continue to work with newer DPDK versions? I've not 
>>>> looked at
>>>> it in detail, but I did notice that the vhost threads in the newer 
>>>> DPDK
>>>> now get created with rte_ctrl_thread_create() and does some lcore
>>>> mangling.
>>>>
>>>
>>> I had a look.  This seems to be a wrapper for creating management
>>> threads only, to standardize pthread creation around that in DPDK. 
>>> Usual
>>> EAL thread creation and assigning of LCORE_ID_ANY to non-EAL threads
>>> seems to still be the same.
>>
>> Thanks for checking.
>>>
>>>>> +}
>>>>> +
>>>>>  /* Allocates an area of 'sz' bytes from DPDK.  The memory is 
>>>>> zero'ed.
>>>>>   *
>>>>>   * Unlike xmalloc(), this function can return NULL on failure. 
>>>>> */
>>>>> @@ -506,9 +518,16 @@ dpdk_rte_mzalloc(size_t sz)
>>>>>  void
>>>>>  free_dpdk_buf(struct dp_packet *p)
>>>>>  {
>>>>> -    struct rte_mbuf *pkt = (struct rte_mbuf *) p;
>>>>> +    if (!dpdk_thread_is_pmd()) {
>>>>> +        ovs_mutex_lock(&nonpmd_mp_mutex);
>>>>> +    }
>>>>>
>>>>> +    struct rte_mbuf *pkt = (struct rte_mbuf *) p;
>>>>
>>>> Should we not use a container_of macro here?
>>>>
>>>
>>> I'll use it for the next iteration.
>>>
>>> Tiago.
>>
>>
>>


More information about the dev mailing list