[ovs-dev] [PATCH 2/3] datapath: Mega flow implementation

Andy Zhou azhou at nicira.com
Mon Jun 10 16:10:00 UTC 2013


Thanks for the review. I will make the suggested changes. 2 comments
inline:

On Fri, Jun 7, 2013 at 4:43 PM, Jesse Gross <jesse at nicira.com> wrote:

> OK, here's the second half:
>
> On Wed, Jun 5, 2013 at 10:46 PM, Andy Zhou <azhou at nicira.com> wrote:
> > diff --git a/datapath/flow.c b/datapath/flow.c
> > index 7f897bd..fc8fb86 100644
> > --- a/datapath/flow.c
> > +++ b/datapath/flow.c
> >  static struct kmem_cache *flow_cache;
> > +static LIST_HEAD(mask_list);
>
> We should add a comment indicating that this is protected by ovs_mutex.
>
> > +static void update_desc__(struct sw_flow_match *match,
> > +                         size_t offset, size_t size, bool is_mask)
> > +{
> > +       struct sw_flow_desc *desc = NULL;
> > +       size_t start= offset;
> > +       size_t end= offset + size;
>
> Can you put spaces on both sides of the equals sign?
>
> > +bool ovs_match_validate(const struct sw_flow_match *match,
> > +               u64 key_attrs, u64 mask_attrs)
>
> I think the halves of this function (expectations and prerequisites)
> are basically two aspects of the same problem and should be handled
> together. For example, if the ICMPv6 type is not
> NDISC_NEIGHBOUR_SOLICITATION or NDSIC_NEIGHBOUR_ADVERTISEMENT then we
> shouldn't have a corresponding attribute for that. Doing it in a
> single pass would make it less likely that we miss something. I think
> we could handle it this way:
>  - When extracting match values, record attributes that are all
> wildcards and remove then from the value attributes completely.
>  - When computing expected attributes during validation, when we have
> a protocol value check, also verify that the mask is exact match.
>  - AND the key_attributes with the mask values so we get a list of
> fields that have some matches in them.
>  - Check that the attributes with matches is exactly the same as the
> expected attributes (not a subset in either direction).
>
> One downside of this is that it won't validate attributes that aren't
> used for matching, which doesn't really matter to the kernel but if we
> get an invalid flow then we won't detect it at setup time. Instead
> we'll just silently return a different ID when we serialize back to
> Netlink. To avoid this, we may want to continue to do a check on the
> unmasked attributes.
>
> > +static void flow_key_mask(struct sw_flow_key *dst,
> > +                         const struct sw_flow_key *src,
> > +                         const struct sw_flow_mask *mfm)
> > +{
> > +       u8 *m = (u8 *)&mfm->key + mfm->desc.start;
> > +       u8 *s = (u8 *)src + mfm->desc.start;
> > +       u8 *d = (u8 *)dst + mfm->desc.start;
>
> You could improve the performance here by using longs instead of chars.
>
What about CPU architectures requires longs to be aligned? What is the
conventional #ifdef style here?

>
> > +static bool __flow_cmp_key_mask(struct sw_flow *flow, u32 hash, u8 *key,
> > +                               int key_start, int key_len,
> > +                               struct sw_flow_mask *mfm)
> > +{
> > +       return (flow->hash == hash && (flow->mfm == mfm) &&
> > +               !memcmp((u8 *)&flow->key + key_start, key, (key_len -
> key_start)));
> > +}
>
> Since the flows are non-overlapping, do we need to compare the masks?
>
Yes, we are comparing masked key, which is no longer unique without
comparing the mask.

>
> > +struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *table,
> > +                                   const struct sw_flow_key *flow_key,
> > +                                   struct sw_flow_mask *mfm,
> > +                                   int key_len)
> > +{
> > +       const struct sw_flow_key *key;
> > +       struct sw_flow *flow;
> > +       struct hlist_head *head;
> > +       u8 *_key;
> > +       int key_start = mfm->desc.start;
> > +       u32 hash;
> > +       struct sw_flow_key masked_key;
> > +       int end_roundup = mfm->desc.start
> > +                       + sw_flow_mask_roundup_size(mfm);
>
> Do we have to round up the size at this point? It seems like the
> individual components might be able to do this based on their own
> needs (for example, ovs_flow_hash() already rounds up to 4 bytes).
>
> > +
> > +       if (end_roundup < key_len)
> > +               key_len = end_roundup;
>
> Can't we just always use end_roundup?
>
> I'm not sure that it makes sense to calculate key_len in
> ovs_flow_extract() anymore. At this point, it's only used as a quick
> check to see whether a mask could potentially ever match. However,
> it's a pretty coarse filter though and I suspect that we could come up
> with something (if necessary) that is less complicated.
>
> > +       flow_key_mask(&masked_key, flow_key, mfm);
> > +       key = &masked_key;
> > +       hash = ovs_flow_hash(key, key_start, key_len);
> > +       _key = (u8 *) key + key_start;
> >         head = find_bucket(table, hash);
> >         hlist_for_each_entry_rcu(flow, head, hash_node[table->node_ver])
> {
> > -
> > -               if (flow->hash == hash &&
> > -                   !memcmp((u8 *)&flow->key + key_start, _key, key_len
> - key_start)) {
> > +               if (__flow_cmp_key_mask(flow, hash, _key, key_start,
> key_len, mfm))
>
> I'm not sure that we need to pass in both _key and key_start - they
> are more or less equivalent.
>
> >                         return flow;
> > -               }
> > +
> >         }
> >         return NULL;
> >  }
> >
> > +
>
> Extra blank line here.
>
> >  void ovs_flow_tbl_insert(struct flow_table *table, struct sw_flow *flow,
> > -                        struct sw_flow_key *key, int key_len)
> > +                        const struct sw_flow_key *key, int key_len)
> >  {
> > -       flow->hash = ovs_flow_hash(key, flow_key_start(key), key_len);
> > -       memcpy(&flow->key, key, sizeof(flow->key));
> > -       __flow_tbl_insert(table, flow);
> > +       flow->user_id = *key;
> > +       flow->id_hash = ovs_flow_hash(key, flow_key_start(key), key_len);
> > +       flow_key_mask(&flow->key, &flow->user_id, flow->mfm);
>
> We should annotate the definition of flow->mfm with __rcu so that
> sparse would complain here (since we're not using ovsl_dereference,
> which we should be).
>
> > @@ -1012,37 +1159,48 @@ int ipv4_tun_from_nlattr(const struct nlattr
> *attr,
> [...]
> >         }
> > +       if (is_mask) {
> > +               SW_FLOW_KEY_PUT(match, tun_key.tun_flags, 0xffff,
> is_mask);
> > +       }else {
> > +               SW_FLOW_KEY_PUT(match, tun_key.tun_flags, tun_flags,
> is_mask);
> > +       }
> > +
>
> Do we need to force the tunnel flags to be exact match? Can't we allow
> userspace to specify which flags to mask using the usual logic?
>
> You could also drop the braces above and add a blank line before the
> if statement.
>
> >  int ipv4_tun_to_nlattr(struct sk_buff *skb,
> [...]
> > +       if (tun_key == output) {
>
> Similar to the other direction, I think it's OK to just directly
> translate the flags even for masks.
>
> > +static int __metadata_from_nlattrs(struct sw_flow_match *match,  u64
> *attrs,
> > +               const struct nlattr **a, bool is_mask)
>
> I'm not sure that the underscores at the beginning are necessary since
> this isn't the internal version of any function.
>
> > +/**
> > + * ovs_match_from_nlattrs - parses Netlink attributes into a flow key
> and
> > + * mask. In case the mask is specified (mask == NULL), the flow is
> treated
>
> I think this should say if the mask is NOT specified.
>
> > +int ovs_match_from_nlattrs(struct sw_flow_match *match,
> > +                          const struct nlattr *key,
> > +                          const struct nlattr *mask)
> > +{
> > +       const struct nlattr *a[OVS_KEY_ATTR_MAX + 1];
> > +       const struct nlattr *m[OVS_KEY_ATTR_MAX + 1];
> > +       const struct nlattr *encap;
> > +       u64 key_attrs = 0;
> > +       u64 mask_attrs = 0;
> > +       bool encap_valid = false;
> > +       int err;
> > +
> > +       err = parse_flow_nlattrs(key, a, &key_attrs);
> > +       if (err)
> > +               return err;
> > +
> > +       if (key_attrs & 1ULL << OVS_KEY_ATTR_ENCAP) {
> > +               encap = a[OVS_KEY_ATTR_ENCAP];
> > +               key_attrs &= ~(1ULL << OVS_KEY_ATTR_ENCAP);
> > +               if (nla_len(encap)) {
> > +                       __be16 tci = 0;
> > +                       __be16 eth_type = 0; /* ETH_P_8021Q */
> > +
> > +                       if (a[OVS_KEY_ATTR_ETHERTYPE])
> > +                               eth_type =
> nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]);
>
> It might be good to mask out this EtherType because we've effectively
> consumed it and are expecting a new one.
>
> > +                       if (a[OVS_KEY_ATTR_VLAN])
> > +                               tci = nla_get_be16(a[OVS_KEY_ATTR_VLAN]);
>
> We may want to start enforcing the condition that we don't have
> duplicate attributes in parse_flow_nlattrs(). Otherwise, an inner vlan
> (which we don't support currently) could replace the value from the
> outer tag here and we would silently accept it.
>
> > +                       if  ( (eth_type == htons(ETH_P_8021Q))
> > +                                       && (tci &
> htons(VLAN_TAG_PRESENT)) ) {
>
> I would probably move the check for VLAN_TAG_PRESENT to
> ovs_key_from_nlattrs() and then just check for OVS_KEY_ATTR_VLAN here.
>
> It's somewhat unusual in kernel style to have extra spaces around the
> parentheses.
>
> > +                               encap_valid = true;
> > +                               err = parse_flow_nlattrs(encap, a,
> &key_attrs);
> > +                       }
> > +                       else
> > +                               err = -EINVAL;
>
> Similarly, it's normal to have the else on the same line as the closing
> brace.
>
> > +       if (ovs_match_validate(match, key_attrs, mask_attrs) == false)
> >                 return -EINVAL;
>
> My usual comment about NULL/false checks.
>
> > +struct sw_flow_mask *sw_flow_mask_alloc(void)
> > +{
> > +       struct sw_flow_mask *mfm;
> > +
> > +       mfm = kmalloc(sizeof *mfm, GFP_KERNEL);
> > +       if (mfm) {
> > +               kref_init(&mfm->kref);
> > +               INIT_LIST_HEAD(&mfm->list);
>
> You should have to init this member, since it's not a list head.
>
> > +/**
> > + * Set 'desc' fileds in the mask to the value of 'val'.
> > + */
> > +void sw_flow_mask_set(struct sw_flow_mask *mfm,
> > +               struct sw_flow_desc *desc, u8 val)
>
> Can this be static?
>
> In ovs_flow_cmd_new_or_set(), I think we should have a check to make
> sure that the mask is not changed when updating an existing flow.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openvswitch.org/pipermail/ovs-dev/attachments/20130610/125493d9/attachment-0003.html>


More information about the dev mailing list