[ovs-dev] [PATCH 4/4] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.

Joe Stringer joe at ovn.org
Tue Jun 27 17:25:53 UTC 2017


On 27 June 2017 at 09:30, Chandran, Sugesh <sugesh.chandran at intel.com> wrote:
>
>
> Regards
> _Sugesh
>
>
>> -----Original Message-----
>> From: Joe Stringer [mailto:joe at ovn.org]
>> Sent: Monday, June 26, 2017 7:06 PM
>> To: Chandran, Sugesh <sugesh.chandran at intel.com>
>> Cc: ovs dev <dev at openvswitch.org>; Zoltán Balogh
>> <zoltan.balogh at ericsson.com>; Andy Zhou <azhou at ovn.org>
>> Subject: Re: [PATCH 4/4] tunneling: Avoid recirculation on datapath by
>> computing the recirculate actions at translate time.
>>
>> On 16 June 2017 at 09:45, Sugesh Chandran <sugesh.chandran at intel.com>
>> wrote:
>> > This patch set removes the recirculation of encapsulated tunnel
>> > packets if possible. It is done by computing the post tunnel actions
>> > at the time of translation. The combined nested action set are
>> > programmed in the datapath using CLONE action.
>> >
>> > Signed-off-by: Sugesh Chandran <sugesh.chandran at intel.com>
>> > Signed-off-by: Zoltán Balogh <zoltan.balogh at ericsson.com>
>> > Co-authored-by: Zoltán Balogh <zoltan.balogh at ericsson.com>
>> > ---
>>
>> It seems like this approach attempts to simulate the later bridge traversal to
>> generate the actions to be processed, forgets all of this processing, then if it
>> won't work, generates the recirc action.. but if it would work, goes back
>> through the latter bridge traversal again, this time keeping all of the
>> generated netlink actions; attributing stats; and applying side effects.
> [Sugesh] yes, that’s right!
>>
>> So, the first traversal needs to *not* attribute stats/side effects initially, until
>> we later do the check and determine that it's the right thing to do... but later
>> on, if we determine that it's the correct set of actions, then ideally we
>> wouldn't have to go and do the whole same translation again. The common
>> case should be that the latter bridge traversal is the actual actions we want to
>> do.. Could we instead, for this first traversal, clear out the
>> packet/resubmit_stats/etc, then provide an xlate_cache object to collect
>> stats/side effects, traverse the other bridge, then if we determine that it was
>> viable to do it this way, keep the buffer as-is, use xlate_cache to handle the
>> stats/side effects, then (if there is already an xin->xc) add temporary
>> xlate_cache entries to the end of the xin->xc?
> [Sugesh] Our first attempt is to use single translation. The issue that we faced for that approach is
> with resubmit_stats. We noticed the resubmit_stats is used in translation to update stats
> of ports, rule and etc in the translation phase.

Right, I don't think that resubmit stats should be passed through for
the initial 'dummy'/'trial' translation.

> If we need to revert the actions in case of trunc, the stats update made by resubmit_stats must be
> reverted, which is bit difficult to track. Even if we keep resubmit_stats as 0, stats may change
> for cases like adding tunnel.

Right. With regards to stats changing, xlate_cache needs to be
properly updated to track these changes. I believe that this series is
already doing the right thing here.

> As you suggested another option that we can try out would be as
>
> * Backup the xc and resubmit_stats before making the first traversal. Also clear the  xin->packet and set side-effects to false.
> * Create a tmp_xc and use it while doing the traversal. But will set the resubmit_stats as null.
> * Once the traversal complete, validate if its safe to combine the actions
> * If yes, perform the stats update using xlate_push_stats(tmp_xc). Append temporary xc entries into xin->xc

Yup, that's what I'm suggesting.

>>
>> Modulo need for the clone action, of course... but in my eyes even if we just
>> needed to shift a portion of the odp buffer back/forward to insert/remove a
>> clone action afterwards, then doing this memmove would be cheaper than
>> performing the whole translation for the latter bridge again.
> [Sugesh] May be I am missing something here, why do we need a memmove?
> We can do the clone_cancel in case to remove the clone action in case we wanted revert
> The actions.

I couldn't find a clone_cancel. This comment I made was a bit
speculative, because there are no actual clone actions involved in
this part of translation yet. It's more related to the other series by
Andy than this one.

However, for instance, if you have two bridges, and the first bridge
outputs packets to a patch port into the second bridge, then the
second bridge does some stateful processing like NAT, then finally the
first bridge's actions list does stuff after the output(patch), then
currently the second bridge will see a NATed packet. To avoid this
kind of situation, we can look at inserting clone action around the
output(patch), so only the cloned packet gets NATed. This tunneling
work could plausibly trigger a similar issue. Now, you don't
necessarily always want to insert a clone action around an
output(patch) because that could result in more processing in the
fastpath than necessary. So that's where Andy's series needs a bit
more work. Hopefully with some revision, that series can end up
resolving this kind of issue for patch ports as well as this tunneling
series.

>> Now, in the packet processing path for xlate_actions I wouldn't expect the
>> xlate_cache to be there today. If implementing the above, then we would
>> introduce xlate_cache to that path, which can be expensive due to things like
>> taking atomic refcounts on rules. So, I am making an assumption that in this
>> situation, performing one xlate_actions in the subsequent bridge with
>> xlate_cache being collected is cheaper than performing the two
>> xlate_actions with no xlate_cache. It might be worthwhile to check out that
>> assumption.
> [Sugesh] Will try to do in the above approach in v2 as its avoid additional translation.

OK, thanks.

>> For what it's worth, I think Andy had some thoughts on doing cloning in this
>> area, but he's OOO so won't be able to provide that feedback just yet. He
>> had posted the patches below. There is outstanding feedback on these so I
>> imagine that he will want to respin them. It might be worth taking a look to
>> see how they might interact with your
>> series:
>> http://patchwork.ozlabs.org/patch/767655/
>> http://patchwork.ozlabs.org/patch/767656/
> [Sugesh] Will review this patch series to see how it can work with our proposal

Great.


More information about the dev mailing list