[ovs-dev] [RFC PATCH] datapath: allow tunnels to be created with rtnetlink

Thadeu Lima de Souza Cascardo cascardo at redhat.com
Tue Oct 25 18:00:22 UTC 2016


On Thu, Oct 20, 2016 at 10:30:41AM -0700, Pravin Shelar wrote:
> On Tue, Sep 20, 2016 at 7:01 AM, Thadeu Lima de Souza Cascardo
> <cascardo at redhat.com> wrote:
> > In order to use rtnetlink to create new tunnel vports, the backported
> > tunnels require some code that was removed from their upstream version,
> > mainly the necessary code for newlink and for start_xmit.
> >
> > This patch adds back the necessary code for VXLAN, GRE and Geneve
> > tunnels.
> >
> > Signed-off-by: Eric Garver <e at erig.me>
> > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo at redhat.com>
> > ---
> >  datapath/linux/Modules.mk                       |   1 +
> >  datapath/linux/compat/geneve.c                  |  15 +--
> >  datapath/linux/compat/include/linux/if_tunnel.h |  71 ++++++++++++
> >  datapath/linux/compat/ip_gre.c                  |  65 ++++++++---
> >  datapath/linux/compat/vxlan.c                   | 147 +++++++++++++++++++++---
> >  5 files changed, 261 insertions(+), 38 deletions(-)
> >  create mode 100644 datapath/linux/compat/include/linux/if_tunnel.h
> >
> > diff --git a/datapath/linux/Modules.mk b/datapath/linux/Modules.mk
> > index 26f6d22..ad7d14a 100644
> > --- a/datapath/linux/Modules.mk
> > +++ b/datapath/linux/Modules.mk
> > @@ -38,6 +38,7 @@ openvswitch_headers += \
> >         linux/compat/include/linux/if.h \
> >         linux/compat/include/linux/if_ether.h \
> >         linux/compat/include/linux/if_link.h \
> > +       linux/compat/include/linux/if_tunnel.h \
> >         linux/compat/include/linux/if_vlan.h \
> >         linux/compat/include/linux/in.h \
> >         linux/compat/include/linux/jiffies.h \
> > diff --git a/datapath/linux/compat/geneve.c b/datapath/linux/compat/geneve.c
> > index 0c5b58a..79bb0ba 100644
> > --- a/datapath/linux/compat/geneve.c
> > +++ b/datapath/linux/compat/geneve.c
> > @@ -1112,9 +1112,8 @@ tx_error:
> >  }
> >  #endif
> >
> > -netdev_tx_t rpl_geneve_xmit(struct sk_buff *skb)
> > +static netdev_tx_t geneve_dev_xmit(struct sk_buff *skb, struct net_device *dev)
> >  {
> > -       struct net_device *dev = skb->dev;
> >         struct geneve_dev *geneve = netdev_priv(dev);
> >         struct ip_tunnel_info *info = NULL;
> >
> > @@ -1128,18 +1127,12 @@ netdev_tx_t rpl_geneve_xmit(struct sk_buff *skb)
> >  #endif
> >         return geneve_xmit_skb(skb, dev, info);
> >  }
> > -EXPORT_SYMBOL_GPL(rpl_geneve_xmit);
> >
> > -static netdev_tx_t geneve_dev_xmit(struct sk_buff *skb, struct net_device *dev)
> > +netdev_tx_t rpl_geneve_xmit(struct sk_buff *skb)
> >  {
> > -       /* Drop All packets coming from networking stack. OVS-CB is
> > -        * not initialized for these packets.
> > -        */
> > -
> > -       dev_kfree_skb(skb);
> > -       dev->stats.tx_dropped++;
> > -       return NETDEV_TX_OK;
> > +       return geneve_dev_xmit(skb, skb->dev);
> >  }
> This would crash kernel.
> OVS compat code expect dst entry in skb-cb, packet coming from kernel
> would not have it.
> 
> > +EXPORT_SYMBOL_GPL(rpl_geneve_xmit);
> >
> >  static int __geneve_change_mtu(struct net_device *dev, int new_mtu, bool strict)
> >  {
> > diff --git a/datapath/linux/compat/include/linux/if_tunnel.h b/datapath/linux/compat/include/linux/if_tunnel.h
> > new file mode 100644
> > index 0000000..476fe3c
> > --- /dev/null
> > +++ b/datapath/linux/compat/include/linux/if_tunnel.h
> > @@ -0,0 +1,71 @@
> > +#ifndef _LINUX_IF_TUNNEL_WRAPPER_H
> > +#define _LINUX_IF_TUNNEL_WRAPPER_H
> > +
> > +#include_next<linux/if_tunnel.h>
> > +
> > +/* GRE section */
> > +enum {
> > +#define IFLA_GRE_UNSPEC rpl_IFLA_GRE_UNSPEC
> > +       IFLA_GRE_UNSPEC,
> > +
> > +#define IFLA_GRE_LINK rpl_IFLA_GRE_LINK
> > +       IFLA_GRE_LINK,
> > +
> > +#define IFLA_GRE_IFLAGS rpl_IFLA_GRE_IFLAGS
> > +       IFLA_GRE_IFLAGS,
> > +
> > +#define IFLA_GRE_OFLAGS rpl_IFLA_GRE_OFLAGS
> > +       IFLA_GRE_OFLAGS,
> > +
> > +#define IFLA_GRE_IKEY rpl_IFLA_GRE_IKEY
> > +       IFLA_GRE_IKEY,
> > +
> > +#define IFLA_GRE_OKEY rpl_IFLA_GRE_OKEY
> > +       IFLA_GRE_OKEY,
> > +
> > +#define IFLA_GRE_LOCAL rpl_IFLA_GRE_LOCAL
> > +       IFLA_GRE_LOCAL,
> > +
> > +#define IFLA_GRE_REMOTE rpl_IFLA_GRE_REMOTE
> > +       IFLA_GRE_REMOTE,
> > +
> > +#define IFLA_GRE_TTL rpl_IFLA_GRE_TTL
> > +       IFLA_GRE_TTL,
> > +
> > +#define IFLA_GRE_TOS rpl_IFLA_GRE_TOS
> > +       IFLA_GRE_TOS,
> > +
> > +#define IFLA_GRE_PMTUDISC rpl_IFLA_GRE_PMTUDISC
> > +       IFLA_GRE_PMTUDISC,
> > +
> > +#define IFLA_GRE_ENCAP_LIMIT rpl_IFLA_GRE_ENCAP_LIMIT
> > +       IFLA_GRE_ENCAP_LIMIT,
> > +
> > +#define IFLA_GRE_FLOWINFO rpl_IFLA_GRE_FLOWINFO
> > +       IFLA_GRE_FLOWINFO,
> > +
> > +#define IFLA_GRE_FLAGS rpl_IFLA_GRE_FLAGS
> > +       IFLA_GRE_FLAGS,
> > +
> > +#define IFLA_GRE_ENCAP_TYPE rpl_IFLA_GRE_ENCAP_TYPE
> > +       IFLA_GRE_ENCAP_TYPE,
> > +
> > +#define IFLA_GRE_ENCAP_FLAGS rpl_IFLA_GRE_ENCAP_FLAGS
> > +       IFLA_GRE_ENCAP_FLAGS,
> > +
> > +#define IFLA_GRE_ENCAP_SPORT rpl_IFLA_GRE_ENCAP_SPORT
> > +       IFLA_GRE_ENCAP_SPORT,
> > +
> > +#define IFLA_GRE_ENCAP_DPORT rpl_IFLA_GRE_ENCAP_DPORT
> > +       IFLA_GRE_ENCAP_DPORT,
> > +
> > +#define IFLA_GRE_COLLECT_METADATA rpl_IFLA_GRE_COLLECT_METADATA
> > +       IFLA_GRE_COLLECT_METADATA,
> > +
> > +#define __IFLA_GRE_MAX rpl__IFLA_GRE_MAX
> > +       __IFLA_GRE_MAX
> > +};
> > +#undef IFLA_GRE_MAX
> > +#define IFLA_GRE_MAX   (__IFLA_GRE_MAX - 1)
> > +
> 
> After thinking about the actual issue of using netdevices for tunnel
> port this is what I think.
> These are tree different implementations of OVS tunnel.
> 
> Case 1. OVS kernel module is upstream. It is straight forward to
> tunnel devices on upstream kernel module. STT and lisp are not
> available.
> Case 2. OVS kernel module is out of tree. In this case OVS has compat
> code and USE_UPSTREAM_TUNNEL is defined. We are using upstream kernel
> modules for geneve, gre and vxlan, for rest of vport. (stt and lisp)
> we are using netdevices from compat code.
> Case 3. OVS kernel module is out of tree and not using upstream tunnel
> devices. we have to fallback to  OVS compat code for all tunnel
> modules.
> 
> Now to detect these cases userspace could probe for tunnel device
> "ovs_geneve" or "ovs_vxlan" if it exist it is case 3, and userspace
> vswitchd has to use existing vport APIs. Otherwise we could use netdev
> based tunnel devices. And create tunnel devices for each type of
> tunnel port.

The fallback option should already work, then. We can make sure during testing
that is the case, so there would be no need to verify ovs_vxlan is present in
case 3. Would that be OK for you?

So, in summary, we drop this patch, submit what we had before, make sure it
works in the following scenarions:

1) upstream ovs and tunnels are used;
  1a) metadata tunnels can be created, those are used;
  1b) we use compat vports if the configuration allows that;

2) out-of-tree ovs and out-of-tree tunnels are used;
   we make sure using rtnetlink will fail and compat vport is used;
   NOTE: this should work even with the old out-of-tree code that named
         drivers as vxlan instead of ovs_vxlan.

3) out-of-tree ovs and upstream/in-tree tunnels are used;
   it should work just like with upstream ovs, unless the out-of-tree code does
   not support metadata tunnels, in which case, it should fallback to compat
   code.

In all cases, whenever a tunnel configuration that is not supported is used, it
will fail to setup the tunnel. For example, if GPE would be used and it was not
supported by creating the netdev, it won't work as well. As the compat code does
not receive new features, when out-of-tree tunnel drivers are used, those new
features won't be supported.

One question that is left (though I tried to cover it in the scenarios above)
is: do we need to support "old" out-of-tree versions with the new userspace?
That is, if the user updates the userspace, should we require that the
out-of-tree kernel datapath be updated to the matching release? In that case, we
don't need to test the new userspace with the old kernel datapath.

Thanks.
Cascardo.



More information about the dev mailing list