[ovs-dev] [PATCH v5] Extend OVS IPFIX exporter to export tunnel headers

Wenyu Zhang wenyuz at vmware.com
Tue Jul 22 23:14:58 UTC 2014


My comments inline.

发自我的 iPhone

在 2014-7-23,4:02,"Pravin Shelar" <pshelar at nicira.com> 写道:

> On Mon, Jul 21, 2014 at 8:35 PM, Wenyu Zhang <wenyuz at vmware.com> wrote:
>> Thanks for the review, my comments inline.
>> 
>> Bests,
>> Wenyu
>> -----Original Message-----
>> From: Pravin Shelar [mailto:pshelar at nicira.com]
>> Sent: Tuesday, July 22, 2014 6:56 AM
>> To: Wenyu Zhang
>> Cc: Romain Lenglet; Ben Pfaff; Jesse Gross; dev at openvswitch.org
>> Subject: Re: [PATCH v5] Extend OVS IPFIX exporter to export tunnel headers
>> 
>> On Mon, Jul 21, 2014 at 2:39 AM, Wenyu Zhang <wenyuz at vmware.com> wrote:
>>> Extend IPFIX exporter to export tunnel headers when both input and output
>>> of the port.
>>> Add three other_config options in IPFIX table: enable-input-sampling,
>>> enable-output-sampling and enable-tunnel-sampling, to control whether
>>> sampling tunnel info, on which direction (input or output).
>>> Insert sampling action before output action and the output tunnel port
>>> is sent to datapath in the sampling action.
>>> Make datapath collect output tunnel info and send it back to userpace
>>> in upcall message with a new additional optional attribute.
>>> Add a tunnel ports map to make the tunnel port lookup faster in sampling
>>> upcalls in IPFIX exporter. Make the IPFIX exporter generate IPFIX template
>>> sets with enterprise elements for the tunnel info, save the tunnel info
>>> in IPFIX cache entries, and send IPFIX DATA with tunnel info.
>>> Add flowDirection element in IPFIX templates.
>>> 
>>> Signed-off-by: Wenyu Zhang <wenyuz at vmware.com>
>>> Acked-by: Romain Lenglet <rlenglet at vmware.com>
>> 
>> This looking close, I have few comments below.
>> 
>>> ---
>>> v2: Address Romain's comments
>>> v3: Address Pravin's comments, make datapath sent all tunnel info,
>>>    not only tunnel key to userspace.
>>> v4: Address Pravin's comments, introduce a common function to get output
>>>    tunnel info for all vports, remove duplicated codes.
>>>    Rebase.
>>> v5: Address Pravin's comments on v4, correct sparse errors, make a common
>>>    function to setup tunnel info data for both input and output case.
>>> ---
>>> datapath/actions.c                    |   18 ++
>>> datapath/datapath.c                   |   46 +++-
>>> datapath/datapath.h                   |    2 +
>>> datapath/flow.h                       |   54 ++--
>>> datapath/flow_netlink.c               |   58 ++++-
>>> datapath/flow_netlink.h               |    2 +
>>> datapath/vport-geneve.c               |   36 ++-
>>> datapath/vport-gre.c                  |   35 ++-
>>> datapath/vport-lisp.c                 |   40 ++-
>>> datapath/vport-vxlan.c                |   39 ++-
>>> datapath/vport.c                      |   40 +++
>>> datapath/vport.h                      |   11 +
>>> include/linux/openvswitch.h           |   21 +-
>>> lib/dpif-linux.c                      |    2 +
>>> lib/dpif.h                            |    1 +
>>> lib/odp-util.c                        |  183 +++++++++-----
>>> lib/odp-util.h                        |    2 +
>>> lib/packets.h                         |    9 +-
>>> ofproto/automake.mk                   |    3 +
>>> ofproto/ipfix-enterprise-entities.def |   19 ++
>>> ofproto/ofproto-dpif-ipfix.c          |  444 ++++++++++++++++++++++++++++++---
>>> ofproto/ofproto-dpif-ipfix.h          |   14 +-
>>> ofproto/ofproto-dpif-upcall.c         |   20 +-
>>> ofproto/ofproto-dpif-xlate.c          |   51 +++-
>>> ofproto/ofproto-dpif.c                |   20 ++
>>> ofproto/ofproto.h                     |    3 +
>>> ofproto/tunnel.c                      |    4 +
>>> tests/odp.at                          |    9 +-
>>> utilities/ovs-vsctl.8.in              |    8 +-
>>> vswitchd/bridge.c                     |    9 +
>>> vswitchd/vswitch.ovsschema            |    5 +-
>>> vswitchd/vswitch.xml                  |   27 ++
>>> 32 files changed, 1048 insertions(+), 187 deletions(-)
>>> create mode 100644 ofproto/ipfix-enterprise-entities.def
>>> 
>>> diff --git a/datapath/actions.c b/datapath/actions.c
>>> index 39a21f4..c1ad4dc 100644
>>> --- a/datapath/actions.c
>>> +++ b/datapath/actions.c
>>> @@ -502,6 +502,7 @@ static int output_userspace(struct datapath *dp, struct sk_buff *skb,
>>>        struct dp_upcall_info upcall;
>>>        const struct nlattr *a;
>>>        int rem;
>>> +       struct ovs_tunnel_info output_tunnel_info;
>>> 
>>>        BUG_ON(!OVS_CB(skb)->pkt_key);
>>> 
>>> @@ -509,6 +510,7 @@ static int output_userspace(struct datapath *dp, struct sk_buff *skb,
>>>        upcall.key = OVS_CB(skb)->pkt_key;
>>>        upcall.userdata = NULL;
>>>        upcall.portid = 0;
>>> +       upcall.out_tun_info = NULL;
>>> 
>>>        for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>>>                 a = nla_next(a, &rem)) {
>>> @@ -520,7 +522,23 @@ static int output_userspace(struct datapath *dp, struct sk_buff *skb,
>>>                case OVS_USERSPACE_ATTR_PID:
>>>                        upcall.portid = nla_get_u32(a);
>>>                        break;
>>> +
>>> +               case OVS_USERSPACE_ATTR_TUNNEL_OUT_PORT: {
>>> +                       /* Get out tunnel info. */
>>> +                       int err;
>>> +                       struct vport *vport;
>>> +                       vport = ovs_vport_ovsl_rcu(dp, nla_get_u32(a));
>> This function is never called under ovs_lock, so you can use rcu
>> version of lookup.
>> Wenyu: Sorry, I am confused. I have used ovs_vport_ovsl_rcu(), do you mean another rcu version of lookup?
> 
> You can use ovs_vport_rcu().
Wenyu: Thanks, I will change it.
> 
>>> +                       if (vport->ops->get_out_tun_info){
>> 
>> You also need to check for vport, if it is NULL.
>> Wenyu: Sure, I will add check for vport and vport->ops.
> ok.
> 
>>> +                               err = vport->ops->get_out_tun_info(
>>> +                                       vport, skb, &output_tunnel_info);
>>> +                               if (err == 0) {
>>> +                                       upcall.out_tun_info = &output_tunnel_info;
>>> +                               }
>>> +                       }
>>> +                       break;
>>>                }
>>> +
>>> +               }/* End of switch. */
>>>        }
>> 
>> .......
>>> {
>>> +       BUILD_BUG_ON(sizeof(tun_info->tunnel) != OVS_TUNNEL_KEY_SIZE);
>>> +
>>>        tun_info->tunnel.tun_id = tun_id;
>>> -       tun_info->tunnel.ipv4_src = iph->saddr;
>>> -       tun_info->tunnel.ipv4_dst = iph->daddr;
>>> -       tun_info->tunnel.ipv4_tos = iph->tos;
>>> -       tun_info->tunnel.ipv4_ttl = iph->ttl;
>>> +       tun_info->tunnel.ipv4_src = saddr;
>>> +       tun_info->tunnel.ipv4_dst = daddr;
>>> +       tun_info->tunnel.ipv4_tos = tos;
>>> +       tun_info->tunnel.ipv4_ttl = ttl;
>>>        tun_info->tunnel.tun_flags = tun_flags;
>>> 
>>> -       /* clear struct padding. */
>>> -       memset((unsigned char *) &tun_info->tunnel + OVS_TUNNEL_KEY_SIZE, 0,
>>> -              sizeof(tun_info->tunnel) - OVS_TUNNEL_KEY_SIZE);
>> 
>> We still need this memset for case where we add another member to this
>> struct, you can check for size of padding to avoid sparse warning.
>> Wenyu:  The additional if condition  may affect performance, we'd better to add some comments to mention this usage.
>>               I have added a compling check about the size: BUILD_BUG_ON(sizeof(tun_info->tunnel) != OVS_TUNNEL_KEY_SIZE);
>>               If we add another member to this structre in future, this check will failed when compling to mention us, and then we can add the memset only when the failure happens.
>>                And I can add some comments  about the usage of memset here.
> 
> Compiler can determine size is zero and it does not generate code if
> it is not required. so there is no performance penalty.
> 
Wenyu: Ok, I will change it.
>>> +       /* For the tunnel types on the top of IPsec, the tp_src and tp_dst of
>>> +        * the upper tunnel are used.
>>> +        * E.g: GRE over IPSEC, the tp_src and tp_port are zero.
>>> +        */
>>> +       tun_info->tunnel.tp_src = tp_src;
>>> +       tun_info->tunnel.tp_dst = tp_dst;
>>> 
>>>        tun_info->options = opts;
>>>        tun_info->options_len = opts_len;
>>> }
>> .....
>> 
>>> 
>>> +static int geneve_get_out_tun_info(struct vport *vport, struct sk_buff *skb,
>>> +                                  struct ovs_tunnel_info *out_tun_info)
>>> +{
>>> +       struct geneve_port *geneve_port = geneve_vport(vport);
>>> +
>>> +       if (unlikely(!OVS_CB(skb)->tun_info))
>>> +               return -EINVAL;
>>> +
>> Can you move this check inside ovs_vport_out_tun_info.
>> Wenyu: Considering that the tp_src and tp_dst need be prepared before ovs_vport_get_out_tun_info() is called, we'd better do the check before doing the work to calculate tp_src and tp_dst. So I perfer to keep the check here.
> 
> tp_src and tp_dst does not depend on out_tun_info, so the check should
> be moved to the common function.

Wenyu: I mean that if out_tun_info is NULL, it is a waste to calculate tp_src and tp_dst. Anyway,out_tun_info won't be NULL in normal case, I can move it.

>>> +       /*
>>> +        * Get tp_src and tp_dst, refert to geneve_build_header().
>>> +        */
>>> +       return ovs_vport_get_out_tun_info(out_tun_info, OVS_CB(skb)->tun_info,
>>> +                                         ovs_dp_get_net(vport->dp),
>>> +                                         IPPROTO_UDP, skb->mark,
>>> +                                         vxlan_src_port(1, USHRT_MAX, skb),
>>> +                                         inet_sport(geneve_port->sock->sk));
>>> +
>>> +}
>>> +
>>> const struct vport_ops ovs_geneve_vport_ops = {
>>> -       .type           = OVS_VPORT_TYPE_GENEVE,
>>> -       .create         = geneve_tnl_create,
>>> -       .destroy        = geneve_tnl_destroy,
>>> -       .get_name       = geneve_get_name,
>>> -       .get_options    = geneve_get_options,
>>> -       .send           = geneve_send,
>>> +       .type                   = OVS_VPORT_TYPE_GENEVE,
>>> +       .create                 = geneve_tnl_create,
>>> +       .destroy                = geneve_tnl_destroy,
>>> +       .get_name               = geneve_get_name,
>>> +       .get_options            = geneve_get_options,
>>> +       .send                   = geneve_send,
>>> +       .get_out_tun_info       = geneve_get_out_tun_info,
>>> };


More information about the dev mailing list