[ovs-dev] [PATCH v5 01/14] netdev-dpdk: fix mbuf sizing

Lam, Tiago tiago.lam at intel.com
Thu Jul 12 16:22:47 UTC 2018


On 12/07/2018 17:07, Ian Stokes wrote:
> On 7/12/2018 4:40 PM, Lam, Tiago wrote:
>> On 12/07/2018 14:37, Ian Stokes wrote:
>>> On 7/11/2018 7:23 PM, Tiago Lam wrote:
>>>> From: Mark Kavanagh <mark.b.kavanagh at intel.com>
>>>>
>>>> There are numerous factors that must be considered when calculating
>>>> the size of an mbuf:
>>>> - the data portion of the mbuf must be sized in accordance with Rx
>>>>     buffer alignment (typically 1024B). So, for example, in order to
>>>>     successfully receive and capture a 1500B packet, mbufs with a
>>>>     data portion of size 2048B must be used.
>>>> - in OvS, the elements that comprise an mbuf are:
>>>>     * the dp_packet, which includes a struct rte_mbuf (704B)
>>>>     * RTE_PKTMBUF_HEADROOM (128B)
>>>>     * packet data (aligned to 1k, as previously described)
>>>>     * RTE_PKTMBUF_TAILROOM (typically 0)
>>>>
>>>> Some PMDs require that the total mbuf size (i.e. the total sum of all
>>>> of the above-listed components' lengths) is cache-aligned. To satisfy
>>>> this requirement, it may be necessary to round up the total mbuf size
>>>> with respect to cacheline size. In doing so, it's possible that the
>>>> dp_packet's data portion is inadvertently increased in size, such that
>>>> it no longer adheres to Rx buffer alignment. Consequently, the
>>>> following property of the mbuf no longer holds true:
>>>>
>>>>       mbuf.data_len == mbuf.buf_len - mbuf.data_off
>>>>
>>>> This creates a problem in the case of multi-segment mbufs, where that
>>>> property is assumed to hold for all but the final segment in an
>>>> mbuf chain. Resolve this issue by adjusting the size of the mbuf's
>>>> private data area, rather than the packet data portion, when
>>>> aligning the total mbuf size to cachelines.
>>>
>>> Hi Tiago,
>>>
>>> With this patch I still don't see mbuf.data_len == mbuf.buf_len -
>>> mbuf.data_off holding true.
>>>
>>> I've tested with both jumbo frames and non-jumbo packets by examining
>>> the mbufs on both Tx and Rx. mbuf.data_len is always smaller than
>>> mbuf.buf_len - mbuf.data_off.
>>>
>>> Maybe I've missed something here?
>>>
>>
>> Thanks for looking into this, Ian.
>>
>> `mbuf.data_len == mbuf.buf_len - mbuf.data_off` isn't always true.
>> Actually, `mbuf.data_len <= mbuf.buf_len - mbuf.data_off` would be a
>> better representation.
>>
>> If there's a chain of mbufs that are linked together, the expectation is
>> that `mbuf.data_len == mbuf.buf_len - mbuf.data_off` holds true for all
>> of them, except maybe for the last in the chain, since there may not be
>> enough data to fill the whole mbuf.
>>
>> So, for non-jumbo frames I would expect `data_len < buf_len -
>> data_off`, but for jumbo frames I'd expect that to happen only on the
>> last mbuf in the chain, while for the rest we should see `data_len ==
>> buf_len - data_off` hold true. Is that what you're seeing here?
>>
> 
> I had tested this with jumbo frames (MTU 9000) with just this patch
> applied, and in that case I was seeing 'data_len < buf_len - data_off'.
> 
> Is this because it's 1 mbuf for the entire 9000B? If it were segmented,
> then the segments would be equal except for the final mbuf.
> 
> Ian
> 

Ah, yes. If you had just this patch applied then you wouldn't have
multi-segment mbufs enabled (that only comes in patch 11/14). In
that case the mbufs are still sized to fit the entire packet (and most
likely the data you're sending is < 9000B, hence `data_len < buf_len -
data_off`).

In other words, this patch doesn't change any functionality, as mbufs
are still sized to hold the maximum configured MTU.

Tiago.
