[ovs-dev] [PATCH 2/3] datapath: Mega flow implementation
Andy Zhou
azhou at nicira.com
Mon Jun 10 16:10:00 UTC 2013
Thanks for the review. I will make the suggested changes. 2 comments
inline:
On Fri, Jun 7, 2013 at 4:43 PM, Jesse Gross <jesse at nicira.com> wrote:
> OK, here's the second half:
>
> On Wed, Jun 5, 2013 at 10:46 PM, Andy Zhou <azhou at nicira.com> wrote:
> > diff --git a/datapath/flow.c b/datapath/flow.c
> > index 7f897bd..fc8fb86 100644
> > --- a/datapath/flow.c
> > +++ b/datapath/flow.c
> > static struct kmem_cache *flow_cache;
> > +static LIST_HEAD(mask_list);
>
> We should add a comment indicating that this is protected by ovs_mutex.
>
> > +static void update_desc__(struct sw_flow_match *match,
> > + size_t offset, size_t size, bool is_mask)
> > +{
> > + struct sw_flow_desc *desc = NULL;
> > + size_t start= offset;
> > + size_t end= offset + size;
>
> Can you put spaces on both sides of the equals sign?
>
> > +bool ovs_match_validate(const struct sw_flow_match *match,
> > + u64 key_attrs, u64 mask_attrs)
>
> I think the halves of this function (expectations and prerequisites)
> are basically two aspects of the same problem and should be handled
> together. For example, if the ICMPv6 type is not
> NDISC_NEIGHBOUR_SOLICITATION or NDSIC_NEIGHBOUR_ADVERTISEMENT then we
> shouldn't have a corresponding attribute for that. Doing it in a
> single pass would make it less likely that we miss something. I think
> we could handle it this way:
> - When extracting match values, record attributes that are all
> wildcards and remove then from the value attributes completely.
> - When computing expected attributes during validation, when we have
> a protocol value check, also verify that the mask is exact match.
> - AND the key_attributes with the mask values so we get a list of
> fields that have some matches in them.
> - Check that the attributes with matches is exactly the same as the
> expected attributes (not a subset in either direction).
>
> One downside of this is that it won't validate attributes that aren't
> used for matching, which doesn't really matter to the kernel but if we
> get an invalid flow then we won't detect it at setup time. Instead
> we'll just silently return a different ID when we serialize back to
> Netlink. To avoid this, we may want to continue to do a check on the
> unmasked attributes.
>
> > +static void flow_key_mask(struct sw_flow_key *dst,
> > + const struct sw_flow_key *src,
> > + const struct sw_flow_mask *mfm)
> > +{
> > + u8 *m = (u8 *)&mfm->key + mfm->desc.start;
> > + u8 *s = (u8 *)src + mfm->desc.start;
> > + u8 *d = (u8 *)dst + mfm->desc.start;
>
> You could improve the performance here by using longs instead of chars.
>
What about CPU architectures requires longs to be aligned? What is the
conventional #ifdef style here?
>
> > +static bool __flow_cmp_key_mask(struct sw_flow *flow, u32 hash, u8 *key,
> > + int key_start, int key_len,
> > + struct sw_flow_mask *mfm)
> > +{
> > + return (flow->hash == hash && (flow->mfm == mfm) &&
> > + !memcmp((u8 *)&flow->key + key_start, key, (key_len -
> key_start)));
> > +}
>
> Since the flows are non-overlapping, do we need to compare the masks?
>
Yes, we are comparing masked key, which is no longer unique without
comparing the mask.
>
> > +struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *table,
> > + const struct sw_flow_key *flow_key,
> > + struct sw_flow_mask *mfm,
> > + int key_len)
> > +{
> > + const struct sw_flow_key *key;
> > + struct sw_flow *flow;
> > + struct hlist_head *head;
> > + u8 *_key;
> > + int key_start = mfm->desc.start;
> > + u32 hash;
> > + struct sw_flow_key masked_key;
> > + int end_roundup = mfm->desc.start
> > + + sw_flow_mask_roundup_size(mfm);
>
> Do we have to round up the size at this point? It seems like the
> individual components might be able to do this based on their own
> needs (for example, ovs_flow_hash() already rounds up to 4 bytes).
>
> > +
> > + if (end_roundup < key_len)
> > + key_len = end_roundup;
>
> Can't we just always use end_roundup?
>
> I'm not sure that it makes sense to calculate key_len in
> ovs_flow_extract() anymore. At this point, it's only used as a quick
> check to see whether a mask could potentially ever match. However,
> it's a pretty coarse filter though and I suspect that we could come up
> with something (if necessary) that is less complicated.
>
> > + flow_key_mask(&masked_key, flow_key, mfm);
> > + key = &masked_key;
> > + hash = ovs_flow_hash(key, key_start, key_len);
> > + _key = (u8 *) key + key_start;
> > head = find_bucket(table, hash);
> > hlist_for_each_entry_rcu(flow, head, hash_node[table->node_ver])
> {
> > -
> > - if (flow->hash == hash &&
> > - !memcmp((u8 *)&flow->key + key_start, _key, key_len
> - key_start)) {
> > + if (__flow_cmp_key_mask(flow, hash, _key, key_start,
> key_len, mfm))
>
> I'm not sure that we need to pass in both _key and key_start - they
> are more or less equivalent.
>
> > return flow;
> > - }
> > +
> > }
> > return NULL;
> > }
> >
> > +
>
> Extra blank line here.
>
> > void ovs_flow_tbl_insert(struct flow_table *table, struct sw_flow *flow,
> > - struct sw_flow_key *key, int key_len)
> > + const struct sw_flow_key *key, int key_len)
> > {
> > - flow->hash = ovs_flow_hash(key, flow_key_start(key), key_len);
> > - memcpy(&flow->key, key, sizeof(flow->key));
> > - __flow_tbl_insert(table, flow);
> > + flow->user_id = *key;
> > + flow->id_hash = ovs_flow_hash(key, flow_key_start(key), key_len);
> > + flow_key_mask(&flow->key, &flow->user_id, flow->mfm);
>
> We should annotate the definition of flow->mfm with __rcu so that
> sparse would complain here (since we're not using ovsl_dereference,
> which we should be).
>
> > @@ -1012,37 +1159,48 @@ int ipv4_tun_from_nlattr(const struct nlattr
> *attr,
> [...]
> > }
> > + if (is_mask) {
> > + SW_FLOW_KEY_PUT(match, tun_key.tun_flags, 0xffff,
> is_mask);
> > + }else {
> > + SW_FLOW_KEY_PUT(match, tun_key.tun_flags, tun_flags,
> is_mask);
> > + }
> > +
>
> Do we need to force the tunnel flags to be exact match? Can't we allow
> userspace to specify which flags to mask using the usual logic?
>
> You could also drop the braces above and add a blank line before the
> if statement.
>
> > int ipv4_tun_to_nlattr(struct sk_buff *skb,
> [...]
> > + if (tun_key == output) {
>
> Similar to the other direction, I think it's OK to just directly
> translate the flags even for masks.
>
> > +static int __metadata_from_nlattrs(struct sw_flow_match *match, u64
> *attrs,
> > + const struct nlattr **a, bool is_mask)
>
> I'm not sure that the underscores at the beginning are necessary since
> this isn't the internal version of any function.
>
> > +/**
> > + * ovs_match_from_nlattrs - parses Netlink attributes into a flow key
> and
> > + * mask. In case the mask is specified (mask == NULL), the flow is
> treated
>
> I think this should say if the mask is NOT specified.
>
> > +int ovs_match_from_nlattrs(struct sw_flow_match *match,
> > + const struct nlattr *key,
> > + const struct nlattr *mask)
> > +{
> > + const struct nlattr *a[OVS_KEY_ATTR_MAX + 1];
> > + const struct nlattr *m[OVS_KEY_ATTR_MAX + 1];
> > + const struct nlattr *encap;
> > + u64 key_attrs = 0;
> > + u64 mask_attrs = 0;
> > + bool encap_valid = false;
> > + int err;
> > +
> > + err = parse_flow_nlattrs(key, a, &key_attrs);
> > + if (err)
> > + return err;
> > +
> > + if (key_attrs & 1ULL << OVS_KEY_ATTR_ENCAP) {
> > + encap = a[OVS_KEY_ATTR_ENCAP];
> > + key_attrs &= ~(1ULL << OVS_KEY_ATTR_ENCAP);
> > + if (nla_len(encap)) {
> > + __be16 tci = 0;
> > + __be16 eth_type = 0; /* ETH_P_8021Q */
> > +
> > + if (a[OVS_KEY_ATTR_ETHERTYPE])
> > + eth_type =
> nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]);
>
> It might be good to mask out this EtherType because we've effectively
> consumed it and are expecting a new one.
>
> > + if (a[OVS_KEY_ATTR_VLAN])
> > + tci = nla_get_be16(a[OVS_KEY_ATTR_VLAN]);
>
> We may want to start enforcing the condition that we don't have
> duplicate attributes in parse_flow_nlattrs(). Otherwise, an inner vlan
> (which we don't support currently) could replace the value from the
> outer tag here and we would silently accept it.
>
> > + if ( (eth_type == htons(ETH_P_8021Q))
> > + && (tci &
> htons(VLAN_TAG_PRESENT)) ) {
>
> I would probably move the check for VLAN_TAG_PRESENT to
> ovs_key_from_nlattrs() and then just check for OVS_KEY_ATTR_VLAN here.
>
> It's somewhat unusual in kernel style to have extra spaces around the
> parentheses.
>
> > + encap_valid = true;
> > + err = parse_flow_nlattrs(encap, a,
> &key_attrs);
> > + }
> > + else
> > + err = -EINVAL;
>
> Similarly, it's normal to have the else on the same line as the closing
> brace.
>
> > + if (ovs_match_validate(match, key_attrs, mask_attrs) == false)
> > return -EINVAL;
>
> My usual comment about NULL/false checks.
>
> > +struct sw_flow_mask *sw_flow_mask_alloc(void)
> > +{
> > + struct sw_flow_mask *mfm;
> > +
> > + mfm = kmalloc(sizeof *mfm, GFP_KERNEL);
> > + if (mfm) {
> > + kref_init(&mfm->kref);
> > + INIT_LIST_HEAD(&mfm->list);
>
> You should have to init this member, since it's not a list head.
>
> > +/**
> > + * Set 'desc' fileds in the mask to the value of 'val'.
> > + */
> > +void sw_flow_mask_set(struct sw_flow_mask *mfm,
> > + struct sw_flow_desc *desc, u8 val)
>
> Can this be static?
>
> In ovs_flow_cmd_new_or_set(), I think we should have a check to make
> sure that the mask is not changed when updating an existing flow.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openvswitch.org/pipermail/ovs-dev/attachments/20130610/125493d9/attachment-0003.html>
More information about the dev
mailing list