[ovs-dev] [PATCH ovs V2 12/21] dpif-netlink: Use netdev flow put api to insert a flow

Joe Stringer joe at ovn.org
Tue Jan 10 22:34:48 UTC 2017


On 10 January 2017 at 07:50, Paul Blakey <paulb at mellanox.com> wrote:
>>> +            nl_msg_put_u32(&buf, OVS_ACTION_ATTR_OUTPUT, ifindex_out);
>>> +            if (tnl_cfg && tnl_cfg->dst_port != 0) {
>>> +                nl_msg_put_u32(&buf, OVS_TUNNEL_KEY_ATTR_TP_DST,
>>> tnl_cfg->dst_port);
>>> +            }
>>> +            nl_msg_end_nested(&buf, out_off);
>>> +
>>> +            if (outdev) {
>>> +                netdev_close(outdev);
>>> +            }
>>> +        } else {
>>> +            nl_msg_put_unspec(&buf, nl_attr_type(nla), nl_attr_get(nla),
>>> +                              nl_attr_get_size(nla));
>>> +        }
>>> +    }
>>> +    nl_msg_end_nested(&buf, offset);
>>
>> Hmm. I'm a bit uneasy about this whole copying/rewriting of the
>> actions. Firstly, the tunnel stuff just looks wrong. A quick "git
>> grep" shows that OVS_TUNNEL_KEY_ATTR_TP_DST is using the wrong type,
>> and looking closer at the other places it is used, it looks like this
>> function is shortcutting a whole bunch of the tunnel config.
>
> Besides the wrong type, I'm not sure what you mean, dpif_netlink_port_add__
> seems to access the tunnel dst port the same way.

Yes, data type, bitwise differences compared to
dpif_netlink_port_add__(). Checking the ifindex.

Does tnl_cfg->exts need to be checked too? Are any of the other
'struct netdev_tunnel_config' fields relevant for output (or will
ignoring them potentially lead to a correctness issue)?

>>
>>
>> I recognise that the ODP port number needs to be translated into an
>> ifindex. Do we need to do anything else?
>
> tunnel dst port for set action was missing in ACTION_SET ,  TUNNEL pair.
> We needed that for offloading.

Broadly I was thinking that any and all fields that may be relevant to
tunnel outputting should be handled and checked - fail out if it can't
be interpreted or implemented on lower layers. Looking at 'enum
ovs_tunnel_key_attr', there's a whole bunch of fields which are
similar to OVS_TUNNEL_KEY_ATTR_TP_DST and it wasn't immediately
obvious to me why this function only serialises this one member of the
enum and doesn't need to handle any of the rest.

If I follow correctly, the answer is that for traditional OVS vports,
we configure the vport in kernel to include TP_DST (and a few other
things), then omit these from the flow. When the datapath performs
output action, it will fetch this information from the vport in
kernel. However, for the TC hardware offloads case there is no such
port configuration in kernel, so now we need to ensure that it is
encoded in the flow. Is that right?

>>> +
>>> +    if (outputs > 1) {
>>> +        return false;
>>> +    }
>>> +
>>> +    act = ofpbuf_at_assert(&buf, offset, sizeof(struct nlattr));
>>> +    dev = netdev_hmap_port_get(in_port, dpif->dpif.dpif_class);
>>> +    err = netdev_flow_put(dev, &match, CONST_CAST(struct nlattr *,
>>> +                                                  nl_attr_get(act)),
>>> +                          nl_attr_get_size(act), put->stats,
>>> +                          CONST_CAST(ovs_u128 *, put->ufid));
>>> +    netdev_close(dev);
>>> +
>>> +    if (!err) {
>>> +        if (put->flags & DPIF_FP_MODIFY) {
>>> +            struct dpif_op *opp;
>>> +            struct dpif_op op;
>>> +
>>> +            op.type = DPIF_OP_FLOW_DEL;
>>> +            op.u.flow_del.key = put->key;
>>> +            op.u.flow_del.key_len = put->key_len;
>>> +            op.u.flow_del.ufid = put->ufid;
>>> +            op.u.flow_del.pmd_id = put->pmd_id;
>>> +            op.u.flow_del.stats = NULL;
>>> +            op.u.flow_del.terse = false;
>>> +
>>> +            opp = &op;
>>> +            dpif_netlink_operate__(dpif, &opp, 1);
>>
>> Won't this just delete the flow that was just added?
>
> The above FLOW_DEL calls the original operate (without offload) to delete it
> if it from netlink kernel datapath. This will happen if we got a modify
> request and we now
> can offload the flow and previously couldn't (so its in kernel datapath).

Ah, I missed that it was the "dpif_netlink_operate__" variation. OK.

>>
>> Is there a modify at the tc layer?
>
> Yes there is, and implemented by given a already used handle.
> So if we find the ufid in the map (ufid to handle/prio) already, we modify
> it by using
> the same handle.

OK, this seems fine at a glance.


More information about the dev mailing list