[ovs-dev] [PATCH v11 05/14] dp-packet: Fix data_len handling multi-seg mbufs.

Stokes, Ian ian.stokes at intel.com
Mon Oct 22 10:31:10 UTC 2018


> When a dp_packet is from a DPDK source, and it contains multi-segment
> mbufs, the data_len is not equal to the packet size, pkt_len. Instead, the
> data_len of each mbuf in the chain should be considered while distributing
> the new (provided) size.
> 
> To account for the above dp_packet_set_size() has been changed so that, in
> the multi-segment mbufs case, only the data_len on the last mbuf of the
> chain and the total size of the packet, pkt_len, are changed. The data_len
> on the intermediate mbufs preceeding the last mbuf is not changed by
> dp_packet_set_size(). Furthermore, in some cases
> dp_packet_set_size() may be used to set a smaller size than the current
> packet size, thus effectively trimming the end of the packet. In the
> multi-segment mbufs case this may lead to lingering mbufs that may need
> freeing.
> 
> __dp_packet_set_data() now also updates an mbufs' data_len after setting
> the data offset. This is so that both fields are always in sync for each
> mbuf in a chain.
> 
> Co-authored-by: Michael Qiu <qiudayu at chinac.com>
> Co-authored-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
> Co-authored-by: Przemyslaw Lal <przemyslawx.lal at intel.com>
> Co-authored-by: Marcin Ksiadz <mksiadz at gmail.com>
> Co-authored-by: Yuanhan Liu <yliu at fridaylinux.org>
> 
> Signed-off-by: Michael Qiu <qiudayu at chinac.com>
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
> Signed-off-by: Przemyslaw Lal <przemyslawx.lal at intel.com>
> Signed-off-by: Marcin Ksiadz <mksiadz at gmail.com>
> Signed-off-by: Yuanhan Liu <yliu at fridaylinux.org>
> Signed-off-by: Tiago Lam <tiago.lam at intel.com>
> Acked-by: Eelco Chaudron <echaudro at redhat.com>
> ---
>  lib/dp-packet.h | 83 ++++++++++++++++++++++++++++++++++++++++++++++++----
> -----
>  1 file changed, 71 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h index 6376039..223efe2
> 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -426,20 +426,60 @@ dp_packet_size(const struct dp_packet *b)
>      return b->mbuf.pkt_len;
>  }
> 
> +/* Sets the size of the packet 'b' to 'v'. For non-DPDK packets this
> +only means
> + * setting b->size_, but if used in a DPDK packet it means adjusting
> +the first
> + * mbuf pkt_len and last mbuf data_len, to reflect the real size, which
> +can
> + * lead to free'ing tail mbufs that are no longer used.
> + *
> + * This function should be used for setting the size only, and if
> +there's an
> + * assumption that the tail end of 'b' will be trimmed. For adjustng

Minor typo above 'adjustng' -> 'adjusting'.

> +the head
> + * 'end' of 'b', dp_packet_pull() should be used instead. */
>  static inline void
>  dp_packet_set_size(struct dp_packet *b, uint32_t v)  {
> -    /* netdev-dpdk does not currently support segmentation; consequently,
> for
> -     * all intents and purposes, 'data_len' (16 bit) and 'pkt_len' (32
> bit) may
> -     * be used interchangably.
> -     *
> -     * On the datapath, it is expected that the size of packets
> -     * (and thus 'v') will always be <= UINT16_MAX; this means that there
> is no
> -     * loss of accuracy in assigning 'v' to 'data_len'.
> -     */
> -    b->mbuf.data_len = (uint16_t)v;  /* Current seg length. */
> -    b->mbuf.pkt_len = v;             /* Total length of all segments
> linked to
> -                                      * this segment. */
> +    if (b->source == DPBUF_DPDK) {
> +        struct rte_mbuf *mbuf = &b->mbuf;
> +        uint16_t new_len = v;
> +        uint16_t data_len;
> +        uint16_t nb_segs = 0;
> +        uint16_t pkt_len = 0;
> +
> +        /* Trim 'v' length bytes from the end of the chained buffers,
> freeing

A general comment on the code block below, there is a number of conditions at play here. In reviewing I had to go pen to paper to keep track of the cases and see what would be the outcome in each case.

It might be worth outlining these cases in a comment first before the block i.e. what happens when v is larger/smaller than mbuf->data len. What happens when mbuf->next is already null. The relationship between mbuf->buf_len and mbuf->data_len would be useful also to give context when assigning more mbufs.

> +           any buffers that may be left floating */
> +        while (mbuf) {
> +            data_len = MIN(new_len, mbuf->data_len);
> +            mbuf->data_len = data_len;
> +
> +            if (new_len - data_len <= 0) {
> +                /* Free the rest of chained mbufs */
> +                free_dpdk_buf(CONTAINER_OF(mbuf->next, struct dp_packet,
> +                                           mbuf));
> +                mbuf->next = NULL;
> +            } else if (!mbuf->next) {
> +                /* Don't assign more than what we have available */
> +                mbuf->data_len = MIN(new_len,
> +                                     mbuf->buf_len - mbuf->data_off);
> +            }
> +
> +            new_len -= data_len;
> +            nb_segs += 1;
> +            pkt_len += mbuf->data_len;
> +            mbuf = mbuf->next;
> +        }
> +
> +        /* pkt_len != v would effectively mean that pkt_len < than 'v'
> (as
> +         * being bigger is logically impossible). Being < than 'v' would
> mean
> +         * the 'v' provided was bigger than the available room, which is
> the
> +         * responsibility of the caller to make sure there is enough room
> */
> +        ovs_assert(pkt_len == v);
> +
> +        b->mbuf.nb_segs = nb_segs;
> +        b->mbuf.pkt_len = pkt_len;
> +    } else {
> +        b->mbuf.data_len = v;
> +        /* Total length of all segments linked to this segment. */
> +        b->mbuf.pkt_len = v;
> +    }
>  }
> 
>  static inline uint16_t
> @@ -451,7 +491,26 @@ __packet_data(const struct dp_packet *b)  static
> inline void  __packet_set_data(struct dp_packet *b, uint16_t v)  {
> -    b->mbuf.data_off = v;
> +    if (b->source == DPBUF_DPDK) {
> +        /* Moving data_off away from the first mbuf in the chain is not a
> +         * possibility using DPBUF_DPDK dp_packets */

Can you clarify the comment above? On first reading I though it meant the data_off cannot be changed in the first mbuf but  we change it below then. I think it means it can't be changed beyond the first mbuf?

Ian
> +        ovs_assert(v == UINT16_MAX || v <= b->mbuf.buf_len);
> +
> +        uint16_t prev_ofs = b->mbuf.data_off;
> +        b->mbuf.data_off = v;
> +        int16_t ofs_diff = prev_ofs - b->mbuf.data_off;
> +
> +        /* When dealing with DPDK mbufs, keep data_off and data_len in
> sync.
> +         * Thus, update data_len if the length changes with the move of
> +         * data_off. However, if data_len is 0, there's no data to move
> and
> +         * data_len should remain 0. */
> +
> +        if (b->mbuf.data_len != 0) {
> +            b->mbuf.data_len += ofs_diff;
> +        }
> +    } else {
> +        b->mbuf.data_off = v;
> +    }
>  }
> 
>  static inline uint16_t
> --
> 2.7.4



More information about the dev mailing list