[ovs-dev] [PATCH v2 4/4] tunneling: Avoid datapath-recirc by combining recirc actions at xlate.

Chandran, Sugesh sugesh.chandran at intel.com
Tue Jul 18 08:59:20 UTC 2017



Regards
_Sugesh


> -----Original Message-----
> From: Joe Stringer [mailto:joe at ovn.org]
> Sent: Tuesday, July 18, 2017 6:42 AM
> To: Chandran, Sugesh <sugesh.chandran at intel.com>
> Cc: Zoltán Balogh <zoltan.balogh at ericsson.com>; ovs dev
> <dev at openvswitch.org>; Andy Zhou <azhou at ovn.org>
> Subject: Re: [PATCH v2 4/4] tunneling: Avoid datapath-recirc by combining
> recirc actions at xlate.
> 
> On 17 July 2017 at 15:17, Chandran, Sugesh <sugesh.chandran at intel.com>
> wrote:
> > Comments inline
> >
> > Regards
> > _Sugesh
> >
> >> -----Original Message-----
> >> From: Joe Stringer [mailto:joe at ovn.org]
> >> Sent: Monday, July 17, 2017 7:33 PM
> >> To: Chandran, Sugesh <sugesh.chandran at intel.com>
> >> Cc: Zoltán Balogh <zoltan.balogh at ericsson.com>; ovs dev
> >> <dev at openvswitch.org>; Andy Zhou <azhou at ovn.org>
> >> Subject: Re: [PATCH v2 4/4] tunneling: Avoid datapath-recirc by
> >> combining recirc actions at xlate.
> >>
> >> On 17 July 2017 at 10:27, Chandran, Sugesh
> >> <sugesh.chandran at intel.com>
> >> wrote:
> >> >> > -----Original Message-----
> >> >> > From: Joe Stringer [mailto:joe at ovn.org]
> >> >> > Sent: Saturday, July 15, 2017 1:46 AM
> >> >> > To: Sugesh Chandran <sugesh.chandran at intel.com>
> >> >> > Cc: ovs dev <dev at openvswitch.org>; Andy Zhou <azhou at ovn.org>;
> >> >> > Zoltán Balogh <zoltan.balogh at ericsson.com>
> >> >> > Subject: Re: [PATCH v2 4/4] tunneling: Avoid datapath-recirc by
> >> >> > combining
> >> >> recirc actions at xlate.
> >> >> >
> >> >> > On 4 July 2017 at 02:46, Sugesh Chandran
> >> >> > <sugesh.chandran at intel.com>
> >> >> wrote:
> >>
> >> <snip>
> >>
> >> >> > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> >> >> > > 4e29085..4d996c1 100644
> >> >> > > --- a/lib/dpif-netdev.c
> >> >> > > +++ b/lib/dpif-netdev.c
> >> >> > > @@ -5028,24 +5028,8 @@ dp_execute_cb(void *aux_, struct
> >> >> > > dp_packet_batch *packets_,
> >> >> > >
> >> >> > >      case OVS_ACTION_ATTR_TUNNEL_PUSH:
> >> >> > >          if (*depth < MAX_RECIRC_DEPTH) {
> >> >> > > -            struct dp_packet_batch tnl_pkt;
> >> >> > > -            struct dp_packet_batch *orig_packets_ = packets_;
> >> >> > > -            int err;
> >> >> > > -
> >> >> > > -            if (!may_steal) {
> >> >> > > -                dp_packet_batch_clone(&tnl_pkt, packets_);
> >> >> > > -                packets_ = &tnl_pkt;
> >> >> > > -                dp_packet_batch_reset_cutlen(orig_packets_);
> >> >> > > -            }
> >> >> > > -
> >> >> > >              dp_packet_batch_apply_cutlen(packets_);
> >> >> > > -
> >> >> > > -            err = push_tnl_action(pmd, a, packets_);
> >> >> > > -            if (!err) {
> >> >> > > -                (*depth)++;
> >> >> > > -                dp_netdev_recirculate(pmd, packets_);
> >> >> > > -                (*depth)--;
> >> >> > > -            }
> >> >> > > +            push_tnl_action(pmd, a, packets_);
> >> >> >
> >> >> > If I follow, this was the previous datapath-level code to ensure
> >> >> > that when packets are output to a tunnel, only the copy of the
> >> >> > packet that goes to the tunnel gets the cutlen applied. With the
> >> >> > new version of the code, we replace this by making sure that the
> >> >> > ofproto layer generates a clone to wrap the push_tunnel and all
> >> >> > subsequent bridge translation, so then it results in the
> >> >> > equivalent
> >> >> > behaviour: The cutlen is only ever applied to a clone of the
> >> >> > packets. Is
> >> that right?
> >> > [Sugesh] Hmm. That’s the good catch. Looks like the packets are not
> >> > cloned when combine_tunnel_action is found impossible in ofproto
> layer.
> >> >
> >> > As a fix we should keep the code block in 'may_steal' to handle the
> >> > non tunnel-combine case.
> >>
> >> Oh, really? I thought that we always end up with a clone wrapping the
> >> tunnel_push now, which means we shouldn't need this bit any more. My
> >> comment was intended just to ask if I understood correctly to make
> >> sure we don't miss anything.
> >>
> >> > Any other suggestion to apply cut_len in ofproto?
> >>
> >> Unfortunately the cut_len is set up through a previous truncate
> >> action, and it is supposed to apply to the "next output action" (and
> >> reset afterwards), so it is part of the datapath actions interface
> >> definition; I don't think there's a way to push it up to ofproto.
> > [Sugesh] Or at ofproto layer,  the non-combine tunnel push also placed
> > in a clone, similar to the combine case? That will help to keep the datapath
> as simple as just a push.
> 
> That sounds simpler, but I don't think we can get rid of applying the cut_len
> just yet. This is related to some other discussions going on around the
> interaction between truncate() and clone(), so it's probably a bit safer to just
> leave this code to apply cut len in place for now.
[Sugesh] I don’t meant to get rid of cut_len. Instead we make sure any tunnel-push
always happen on a clone. We keep the push +cut_len as is in the patch.
So the ofproto actions would become
clone(tnl_push), trunc(200) -->Non-combine case. (instead of just [tnl_push, trunc(200)]
clone(tnl_push, output(1)) ---> Combine case
Do you think it has any other implication that we may missed?
> 
> >> >> > > +static bool
> >> >> > > +validate_and_combine_post_tnl_actions(struct xlate_ctx *ctx,
> >> >> > > +                                      const struct xport *xport,
> >> >> > > +                                      struct xport *out_dev,
> >> >> > > +                                      struct
> >> >> > > +ovs_action_push_tnl
> >> >> > > +tnl_push_data) {
> >> >> > > +    const struct dpif_flow_stats *backup_resubmit_stats;
> >> >> > > +    struct xlate_cache *backup_xcache;
> >> >> > > +    bool nested_act_flag = false;
> >> >> > > +    struct flow_wildcards tmp_flow_wc;
> >> >> > > +    struct flow_wildcards *backup_flow_wc_ptr;
> >> >> > > +    bool backup_side_effects;
> >> >> > > +    const struct dp_packet *backup_pkt;
> >> >> > > +
> >> >> > > +    memset(&tmp_flow_wc, 0 , sizeof tmp_flow_wc);
> >> >> > > +    backup_flow_wc_ptr = ctx->wc;
> >> >> > > +    ctx->wc = &tmp_flow_wc;
> >> >> > > +    ctx->xin->wc = NULL;
> >> >> > > +    backup_resubmit_stats = ctx->xin->resubmit_stats;
> >> >> > > +    backup_xcache = ctx->xin->xcache;
> >> >> > > +    backup_side_effects = ctx->xin->allow_side_effects;
> >> >> > > +    backup_pkt = ctx->xin->packet;
> >> >> > > +
> >> >> > > +    size_t push_action_size = 0;
> >> >> > > +    size_t clone_ofs = nl_msg_start_nested(ctx->odp_actions,
> >> >> > > +                                           OVS_ACTION_ATTR_CLONE);
> >> >> > > +    odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data);
> >> >> > > +    push_action_size = ctx->odp_actions->size;
> >> >> > > +
> >> >> > > +    ctx->xin->resubmit_stats =  NULL;
> >> >> > > +    ctx->xin->xcache = xlate_cache_new(); /* Use new
> >> >> > > + temporary
> >> cache.
> >> >> */
> >> >> > > +    ctx->xin->allow_side_effects = false;
> >> >> > > +    ctx->xin->packet = NULL;
> >> >> >
> >> >> > I realised that you asked about this after the previous patch,
> >> >> > but I didn't understand the implications of it. I believe that
> >> >> > this packet is how, for instance, the second bridge can forward
> >> >> > packets to the controller. If you reset it to NULL here, then I
> >> >> > don't think that the controller action is executed correctly
> >> >> > during the second bridge translation. (That's a test we could add).
> >> > [Sugesh] Yes, that’s a good test case for it.
> >> >> >
> >> >> > I suspect that some multicast snooping stuff and some other
> >> >> > pieces that would be affected by this.
> >> > [Sugesh] This is interesting. So does the controller expects a
> >> > encaped packets in that case? Do you think, the controller action
> >> > also be an exception case like the TRUNC?
> >>
> >> Yes, I think the controller expects encapped packets in this case.
> >> Yes, I think that the easiest way to handle this is to put controller
> >> in that exception case with truncate.
> > [Sugesh] Ok.  Will add that case as an exception.
> >
> > Before that, Would like to clarify the controller output action bit
> > more How does the following case are currently handled that involves
> > controller action (We haven’t run any test case that involves a
> > controller and very limited knowledge on that side)
> >
> > MOD_VLAN(10) --> (PATCH-PORT) --> (SECOND-BR) What if there is no
> > rules to handle the packet in 'second-br' and wanted to report to
> controller?
> >
> > Will the packet report with VLAN 10 to controller?
> 
> Currently, this will work because patch port doesn't NULLify this
> ctx->xin->packet pointer, then when it translates through the second
> bridge and gets to execute_controller_action(), that function will make a
> clone, apply the current actions to that packet, then send the modified
> packet to the controller.
[Sugesh] Ok. So all the pre-actions will be applied before 
cloning the packets(applying mod_vlan) using the callback.
> 
> So, longer term I think that this NULLification should be removed and we
> keep the same packet attached to the context. However at the moment,
> from execute_controller_action() ->
> xlate_execute_odp_actions() -> odp_execute_actions() there is no datapath
> callback provided for implementing the tunnel_push functionality. Maybe it's
> as simple as plugging
> dpif_execute_helper_cb() into the odp_execute_actions() call there, but
[Sugesh] I think we should offer the callback to do the modification.
Will work on when this patch series is accepted.
> that sounds like a separate change that should get some thorough review
> and testing as well. I don't think it's strictly necessary for getting this series in,
> this series can function correctly using the fallback path as we discussed
> above. It'd be nice to see this at some point in future though to consolidate
> the paths through this code.
[Sugesh] Will try to do as a separate patch as you suggested.


More information about the dev mailing list