[ovs-dev] [PATCH V3 18/19] netdev-offload-dpdk: Support offload of clone tnl_push/output actions

Sathya Perla sathya.perla at broadcom.com
Tue Dec 10 16:29:56 UTC 2019


On Tue, Dec 10, 2019 at 7:56 PM Eli Britstein <elibr at mellanox.com> wrote:
>
>
> On 12/10/2019 2:53 PM, Sathya Perla wrote:
> > On Tue, Dec 10, 2019 at 5:25 PM Eli Britstein <elibr at mellanox.com> wrote:
> >>
> >> On 12/10/2019 12:09 PM, Sathya Perla wrote:
> >>> On Tue, Dec 10, 2019 at 1:21 PM Eli Britstein <elibr at mellanox.com> wrote:
> >>>> On 12/10/2019 8:52 AM, Sathya Perla wrote:
> >>>>> On Sun, Dec 8, 2019 at 6:53 PM Eli Britstein <elibr at mellanox.com> wrote:
> >>>>> ...
> >>>>>> +static int
> >>>>>> +parse_clone_actions(struct netdev *netdev,
> >>>>>> +                    struct flow_actions *actions,
> >>>>>> +                    const struct nlattr *clone_actions,
> >>>>>> +                    const size_t clone_actions_len,
> >>>>>> +                    struct offload_info *info)
> >>>>>> +{
> >>>>>> +    const struct nlattr *ca;
> >>>>>> +    unsigned int cleft;
> >>>>>> +
> >>>>>> +    NL_ATTR_FOR_EACH_UNSAFE (ca, cleft, clone_actions, clone_actions_len) {
> >>>>>> +        int clone_type = nl_attr_type(ca);
> >>>>>> +
> >>>>>> +        if (clone_type == OVS_ACTION_ATTR_TUNNEL_PUSH) {
> >>>>>> +            const struct ovs_action_push_tnl *tnl_push = nl_attr_get(ca);
> >>>>>> +            struct rte_flow_action_raw_encap *raw_encap =
> >>>>>> +                xzalloc(sizeof *raw_encap);
> >>>>>> +
> >>>>>> +            raw_encap->data = (uint8_t *)tnl_push->header;
> >>>>>> +            raw_encap->preserve = NULL;
> >>>>>> +            raw_encap->size = tnl_push->header_len;
> >>>>>> +
> >>>>>> +            add_flow_action(actions, RTE_FLOW_ACTION_TYPE_RAW_ENCAP,
> >>>>>> +                            raw_encap);
> >>>>> Hi, converting OVS_TUNNLE_PUSH into RTE_RAW_ENCAP loses the 'tnl_type'
> >>>>> information provided by OVS. RAW_ENCAP provides the tunnel header just
> >>>>> as a blob of bits which may not be ideal for NIC HW to offload.
> >>>>>
> >>>>> How about using tnl_push->tnl_type field to parse the header and
> >>>>> compose specific tunnel encap actions like RTE_VXLAN_ENCAP,
> >>>>> RTE_NVGRE_ENCAP etc.
> >>>>> This would be also be more inline with how it's done with TC-offload.
> >>>> I see your point. However, struct ovs_action_push_tnl has the "header"
> >>>> field just as a raw binary buffer, and not detailed like in TC.
> >>>> "tnl_type" has a comment /* For logging. */. It is indeed used for
> >>>> logging, as in lib/odp-util.c, in function format_odp_tnl_push_header.
> >>> This is not entirely true. Checkout propagate_tunnel_data_to_flow()
> >>> where tnl_type is being used
> >>> to figure out nw_proto.
> >>>
> >>>> using VXLAN_ENCAP instead of RAW_ENCAP will make the push less generic,
> >>>> as I will have to parse the header to build the vxlan_encap fields, just
> >>>> so they can re-build as a raw header in the PMD level, so I don't see
> >>>> the offload benefit of it.
> >>> Why are you assuming that all PMDs will rebuild the tunnel header
> >>> fields as a 'raw header'?
> >> As far as I saw in DPDK code, if I'm not wrong, all PMDs that support it
> >> do it like this. Anyway, it is indeed not relevant.
> >>> What if the NIC HW needs tunnel specific headers to be programmed
> >>> separately for each tunnel type?
> >>>
> >>>> Another point is that this way it will support any header, not just
> >>>> VXLAN (as an example).
> >>> Given that VXLAN and NVGRE ENCAP and DECAP actions are already defined
> >>> in rte_flow, how about
> >>> we use them for vxlan and nvgre and use RAW for the rest of the tunnel types?
> >>> We can support all tunnel headers even this way.....
> >> Again, I see your point.
> >>
> >> However, in OVS level, we have the encap as a raw buffer and DPDK
> >> supports it natively using RAW_ENCAP. For that reason I think we should
> >> use the straight forward method.
> >>
> >> I think that if there is a PMD that requires the fields separately, it
> >> is under its responsibility to parse it, and not forcing the application
> >> to do it.
> > How should a PMD parse the header buffer? For e.g., for vxlan, should the PMD
> > look for dst UDP port 4789? What if a different port number is used
> > for vxlan as it's the case with some deployments?
> > I think it's important to deal with this in OVS because the 'tnl_type'
> > info is available in OVS and it should be relayed to the PMD
> > is some form.
> >
> >> As it's a valid attribute in DPDK, and it's the natural way for OVS to
> >> use, I think we should use it. If it is from some reason not valid, in
> >> the future, DPDK will deprecate it.
> > I agree RAW_ENCAP is a valid RTE action. But I feel we must choose the
> > action that conveys/translates as much
> > data as possible from the OVS layer to the PMD. This will enable
> > offload support from more number of NIC HWs.
>
> AFAIU, currently there is no such NIC HW that actually uses the
> additional info in VXLAN_ENCAP vs RAW_ENCAP, but they would just
> re-build the header as if it would been received with RAW, so for now I
> don't see any benefit of doing so. In the future, if there is such HW as
> you suggest, maybe it can be enhanced.
>
> What do you think?

If you see TC-offload for bnxt_en driver (bnxt_tc.c), it programs vxlan encap
by setting the L2, L3, L4 header fields specifically. The bnxt PMD
would have to do something very similar!
So, we do have NIC HW that suits/needs this.

I'm fine if this is a follow-on patch....but it best done now because
as soon as this patch-set is committed,
PMDs may start integration with the RTE attributes used here....

If you agree with this in principle, maybe we can help in enhancing this patch?
Pls let me know.
thanks!


More information about the dev mailing list