[ovs-dev] [PATCH 2/2] datapath: Add basic MPLS support to kernel

Wed Mar 20 12:24:31 UTC 2013

On Tue, Mar 19, 2013 at 08:16:16AM -0700, Jesse Gross wrote:
> On Tue, Mar 19, 2013 at 5:31 AM, Simon Horman <horms at verge.net.au> wrote:
> > On Tue, Mar 19, 2013 at 11:14:58AM +0900, Simon Horman wrote:
> >> On Mon, Mar 18, 2013 at 06:35:48PM -0700, Jesse Gross wrote:
> >> > On Mon, Mar 18, 2013 at 6:22 PM, Simon Horman <horms at verge.net.au> wrote:
> >> > > On Mon, Mar 18, 2013 at 05:45:52PM -0700, Jesse Gross wrote:
> >> > >> I don't think that we want to allow an array of MPLS labels at this
> >> > >> point in time, since we'll just silently ignore the ones that we don't
> >> > >> expect, which isn't good.  However, we should define the interface in
> >> > >> such a way that anticipates this extension.  For example, I don't
> >> > >> think that it's good to call the struct member mpls_top_lse if it is
> >> > >> potentially a stack of labels.
> >> > >
> >> > > I'm not sure that I understand what you want the interface to look like.
> >> > >
> >> > > Of course we can change the name of the struct member, to say mpls_lse.
> >> > > But my understanding was that you simply wanted an array of these structs,
> >> > > which is what I have implemented.
> >> >
> >> > It's not really a code change (I don't think it would even break the
> >> > ABI if we made the change in the future, only the API).  I just think
> >> > that we should write include/linux/openvswitch.h as if we supported
> >> > multiple layers but then restrict it in the implementation.  So just
> >> > something like this:
> >> >
> >> > struct ovs_key_mpls {
> >> >         __be32 mpls_lse[];
> >> > };
> >> >
> >> > plus appropriate comments.
> >>
> >> Thanks, I understand.
> >>
> >> >
> >> > > I could add a check to reject a flow if the number of elements is
> >> > > greater than zero. That would avoid silently ignoring subsequent members
> >> > > while providing an interface that allows them. But I sense that this
> >> > > is not what you have in mind.
> >> >
> >> > That actually is what I have in mind (assuming that you mean rejet if
> >> > number of elements is greater than 1).
> >>
> >> Sorry for my typo, yes I meant 1 :)
> >> I'll make it so.
> >>
> >> > >> > @@ -755,6 +763,19 @@ static int validate_and_copy_actions(const struct nlattr *attr,
> >> > >> >                                 return -EINVAL;
> >> > >> >                         break;
> >> > >> >
> >> > >> > +               case OVS_ACTION_ATTR_PUSH_MPLS: {
> >> > >> > +                       const struct ovs_action_push_mpls *mpls = nla_data(a);
> >> > >> > +                       if (!eth_p_mpls(mpls->mpls_ethertype))
> >> > >> > +                               return -EINVAL;
> >> > >> > +                       current_eth_type = mpls->mpls_ethertype;
> >> > >> > +                       break;
> >> > >> > +               }
> >> > >> > +
> >> > >> > +               case OVS_ACTION_ATTR_POP_MPLS:
> >> > >> > +                       if (!eth_p_mpls(current_eth_type))
> >> > >> > +                               return -EINVAL;
> >> > >> > +                       current_eth_type = nla_get_u32(a);
> >> > >>
> >> > >> I don't think it is safe to assume that the provided EtherType is
> >> > >> correct: it's possible that the packet is not long enough to actually
> >> > >> contain that protocol.  Since all length checking happens at flow
> >> > >> extraction time, a later set could write off the end of the packet.
> >> > >
> >> > > I'm curious to know why this problem doesn't also exist
> >> > > for other set actions. For example set ipv4.
> >> >
> >> > No other action allows anything that would affect other layers to be
> >> > changed - for example, set IPv4 doesn't allow the next protocol to be
> >> > changed.  Therefore, the validation that has already been performed by
> >> > ovs_flow_extract() is still valid.
> >>
> >> Thanks. I'll have a think about how to fix this.
> >> But I wonder if it needs to be handled at the time that actions are executed.
> >
> > Thinking about this, I think it should be possible to resolve the problem
> > by constraining the check above further.
> >
> > In the case of an mpls_push action there should not be a problem
> > with a subsequent set action modifying mon-existent data as any
> > SET actions that modify l3 or l4 should occur before an mpls_push action.
> >
> > In the case of an mpls_pop action recirculation should occur
> > if there is a set action that modifies l3 or l3 data.
> >
> > So I think that validate_and_copy_actions() should be modified to disallow
> > set actions after mpls_push and mpls_pop.
> 
> Yes, I think that's right (although we could allow mpls_set after
> mpls_push easily enough).  Probably the simplest solution is to just
> set current_eth_type to 0 after an MPLS pop.

Yes I agree that seems to be the safest thing to do for now.  It will
disable setting of mpls, l3 and l4 data as well as subsequent MPLS push and
pop actions (the latter was already prohibited). In future we can handle
each of these using recirculation or by loosening the rule up a bit with
appropriate packet checks in place.