[ovs-dev] 答复: 答复: [PATCH V3 2/4] Add GSO support for DPDK data path

William Tu u9012063 at gmail.com
Mon Feb 8 19:54:05 UTC 2021


On Mon, Feb 8, 2021 at 6:36 AM Ilya Maximets <i.maximets at ovn.org> wrote:
>
> On 2/8/21 1:56 AM, Yi Yang (杨燚)-云服务集团 wrote:
> > Yes, GSO  is ok, but GRO may be have that issue, I didn't see that issue in my openstack environment, so maybe it will be great if we can have a test case to trigger that issue.
> >
> > -----邮件原件-----
> > 发件人: William Tu [mailto:u9012063 at gmail.com]
> > 发送时间: 2021年2月7日 23:46
> > 收件人: Yi Yang (杨燚)-云服务集团 <yangyi01 at inspur.com>
> > 抄送: i.maximets at ovn.org; yang_y_yi at 163.com; ovs-dev at openvswitch.org; fbl at sysclose.org
> > 主题: Re: [ovs-dev] 答复: [PATCH V3 2/4] Add GSO support for DPDK data path
> >
> > On Tue, Oct 27, 2020 at 6:02 PM Yi Yang (杨燚)-云服务集团 <yangyi01 at inspur.com> wrote:
> >>
> >> -----邮件原件-----
> >> 发件人: dev [mailto:ovs-dev-bounces at openvswitch.org] 代表 Ilya Maximets
> >> 发送时间: 2020年10月27日 21:03
> >> 收件人: yang_y_yi at 163.com; ovs-dev at openvswitch.org
> >> 抄送: fbl at sysclose.org; i.maximets at ovn.org
> >> 主题: Re: [ovs-dev] [PATCH V3 2/4] Add GSO support for DPDK data path
> >>
> >> On 8/7/20 12:56 PM, yang_y_yi at 163.com wrote:
> >>> From: Yi Yang <yangyi01 at inspur.com>
> >>>
> >>> GSO(Generic Segment Offload) can segment large UDP  and TCP packet
> >>> to small packets per MTU of destination , especially for the case
> >>> that physical NIC can't do hardware offload VXLAN TSO and VXLAN UFO,
> >>> GSO can make sure userspace TSO can still work but not drop.
> >>>
> >>> In addition, GSO can help improve UDP performane when UFO is enabled
> >>> in VM.
> >>>
> >>> GSO can support TCP, UDP, VXLAN TCP, VXLAN UDP, it is done in Tx
> >>> function of physical NIC.
> >>>
> >>> Signed-off-by: Yi Yang <yangyi01 at inspur.com>
> >>> ---
> >>>  lib/dp-packet.h    |  21 +++-
> >>>  lib/netdev-dpdk.c  | 358
> >>> +++++++++++++++++++++++++++++++++++++++++++++++++----
> >>>  lib/netdev-linux.c |  17 ++-
> >>>  lib/netdev.c       |  67 +++++++---
> >>>  4 files changed, 417 insertions(+), 46 deletions(-)
> >
> > snip
> >
> >>>
> >>> @@ -2339,24 +2428,19 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
> >>>      return cnt;
> >>>  }
> >>>
> >>> -/* Tries to transmit 'pkts' to txq 'qid' of device 'dev'.  Takes
> >>> ownership of
> >>> - * 'pkts', even in case of failure.
> >>> - *
> >>> - * Returns the number of packets that weren't transmitted. */
> >>> static inline int -netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
> >>> -                         struct rte_mbuf **pkts, int cnt)
> >>> +__netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
> >>> +                           struct rte_mbuf **pkts, int cnt)
> >>>  {
> >>>      uint32_t nb_tx = 0;
> >>> -    uint16_t nb_tx_prep = cnt;
> >>> +    uint32_t nb_tx_prep;
> >>>
> >>> -    if (userspace_tso_enabled()) {
> >>> -        nb_tx_prep = rte_eth_tx_prepare(dev->port_id, qid, pkts, cnt);
> >>> -        if (nb_tx_prep != cnt) {
> >>> -            VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. "
> >>> -                         "Only %u/%u are valid: %s", dev->up.name, nb_tx_prep,
> >>> -                         cnt, rte_strerror(rte_errno));
> >>> -        }
> >>> +    nb_tx_prep = rte_eth_tx_prepare(dev->port_id, qid, pkts, cnt);
> >>> +    if (nb_tx_prep != cnt) {
> >>> +        VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. "
> >>> +                          "Only %u/%u are valid: %s",
> >>> +                     dev->up.name, nb_tx_prep,
> >>> +                     cnt, rte_strerror(rte_errno));
> >>>      }
> >>>
> >>>      while (nb_tx != nb_tx_prep) {
> >>> @@ -2384,6 +2468,200 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
> >>>      return cnt - nb_tx;
> >>>  }
> >>>
> >>> +static inline void
> >>> +set_multiseg_udptcp_cksum(struct rte_mbuf *mbuf)
> >>
> >> I didn't review the patch, only had a quick glance, but this part bothers me.  OVS doesn't support multi-segment mbufs, so it should not be possible for such mbufs being transmitted by OVS.  So, I do not understand why this function needs to work with such mbufs.
> >>
> >> [Yi Yang] Only DPDK driver/Tx function will use it, not OVS, set_multiseg_udptcp_cksum is called in GSO part, it is last step before Tx function, it is a big external mbuf before rte_gso_segment, that isn't a multi-segmented mbuf.
> >>
> >
> > Hi Ilya,
> >
> > Now I understand Yi Yang's point better and I agree with him.
> > Looks like the patch does the GSO at the DPDK TX function.
> > It creates multi-seg mbuf after rte_gso_segment(), but will immediately send out the multi-seg mbuf to DPDK port, without traversing inside other part of OVS code. I guess this case it should work OK?
>
> Hi.  GSO itself should be possible to implement, but we should
> not enable support for multi-segment mbufs in the NIC, since this
> might end up in multi-segment packets being received by OVS
> causing lots of trouble (for example dp_packet_clone() uses simple
> memcpy() that will result in invalid memory reads.)

I see your point, thanks!
I guess you're saying that we have to do s.t like:
nic->tx_offloads & DEV_TX_OFFLOAD_MULTI_SEGS
to enable NIC's multi-segment support for GSO.
And once enabling it, OVS might receive multi-segment
mbuf from the NIC, and OVS doesn't support it.

>
> GSO should be implemented in a same way TSO implemented.  Right now
> TSO implementation has no software fallback to segment large packets
> if netdev doesn't support - it it simply drops them.  However,
> software fallback should be implemented someday and it should be
> implemented in a generic way for all the netdev types instead of
> doing it only for one (netdev-dpdk).  Brief look at the patch tells
> me that the generic check from netdev_send_prepare_packet() was
> removed.  This means that huge packets could reach e.g. netdev-afxdp
> that will not behave correctly.
> Generic software fallback should produce sequence of 'dp_packet's
> instead of multi-segment mbufs, so there should be no need to enable
> support for multi-segment mbufs in DPDK PMD.
>
I agree, thanks!
So which one do we need, for dp_packet:
1) GSO 2) Tunneled GSO, ex: vxlan and geneve, or 3) Both?

I feel we can assume all HW nic today support TSO, but not
tunnel packet's TSO. So we only need to implement (2).
William


More information about the dev mailing list