[ovs-dev] [PATCH] odp-util.c: Fix dp_hash execution with slowpath actions.

Han Zhou hzhou at ovn.org
Thu Jun 11 17:41:23 UTC 2020

On Thu, Jun 11, 2020 at 6:15 AM Ilya Maximets <i.maximets at ovn.org> wrote:
> On 5/15/20 8:55 PM, Han Zhou wrote:
> >
> >
> > On Fri, May 15, 2020 at 12:18 AM Han Zhou <hzhou at ovn.org> wrote:
> >>
> >> When dp_hash is executed with slowpath actions, it results in an endless
> >> recirc loop in the kernel datapath, which finally drops the packet, with
> >> kernel logs:
> >>
> >> openvswitch: ovs-system: deferred action limit reached, drop recirc
> >>
> >> The root cause is that the dp_hash value calculated by slowpath is not
> >> passed to datapath when executing the recirc action. When the packet's
> >> miss upcall comes to userspace again, it generates the dp_hash and
> >> recirc actions again, with the same recirc_id, which in turn generates
> >> a megaflow whose recirc action uses the same recirc_id as its match
> >> condition, which causes a loop in datapath.
> >>
> >> For example, this can be reproduced with the OVN setup below:
> >>
> >>                          LS1            LS2
> >>                           |              |
> >>                           |------R1------|
> >>         VIF--LS0---R0-----|              |------R3
> >>                           |------R2------|
> >>
> >> Assume there is a route from the VIF to R3: R0 -> R1 -> R3, and there
> >> are two routes (ECMP) from R3 to the VIF:
> >> R3 -> R1 -> R0
> >> R3 -> R2 -> R0
> >>
> >> Now if we ping from the VIF to R3, the OVS flow execution on the HV of
> >> the VIF will hit R3's datapath, which has flows that respond to the ICMP
> >> by setting ICMP fields, which requires slowpath actions, and in later
> >> tables it will hit the "group" action that selects between the ECMP routes.
> >>
> >> By default OVN uses "dp_hash" method for the "group" action.
> >>
> >> For the first miss upcall packet, dp_hash value is empty, so the group
> >> will be translated to "dp_hash" and "recirc".
> >>
> >> During action execution, because of the previous actions that set ICMP
> >> fields, the whole execution requires slowpath, so it tries to execute all
> >> actions in userspace in odp_execute_actions(), including the dp_hash
> >> action, except the recirc action, which can only be executed in datapath.
> >> So the dp_hash is calculated in userspace, and then the packet is
> >> injected into datapath for recirc action execution.
> >>
> >> However, the dp_hash calculated by the userspace is not passed to datapath.
> >>
> >> Because of this, the recirculated packet in datapath doesn't have a
> >> dp_hash, and the miss upcall for it hits the same flow tables and triggers
> >> the same "dp_hash" and "recirc" actions again, with exactly the same recirc_id.
> >>
> >> This time, the new upcall doesn't require any slowpath execution, so
> >> the dp_hash and recirc actions are executed in datapath, after creating a
> >> datapath megaflow like:
> >>
> >> recirc_id(XYZ),..., actions:hash(l4(0)),recirc(XYZ)
> >>
> >> with the recirc_id match equal to the recirc id in the action, thus
> >> creating a loop.
> >>
> >> This patch fixes the problem by passing the calculated dp_hash value to
> >> datapath in odp_key_from_dp_packet().
> >>
> >> Signed-off-by: Han Zhou <hzhou at ovn.org>
> >> ---
> >>  lib/odp-util.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/lib/odp-util.c b/lib/odp-util.c
> >> index b66d266..ac532fe 100644
> >> --- a/lib/odp-util.c
> >> +++ b/lib/odp-util.c
> >> @@ -6392,6 +6392,10 @@ odp_key_from_dp_packet(struct ofpbuf *buf, const struct dp_packet *packet)
> >>
> >>      nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, md->skb_priority);
> >>
> >> +    if (md->dp_hash) {
> >> +        nl_msg_put_u32(buf, OVS_KEY_ATTR_DP_HASH, md->dp_hash);
> >> +    }
> >> +
> >>      if (flow_tnl_dst_is_set(&md->tunnel)) {
> >>          tun_key_to_attr(buf, &md->tunnel, &md->tunnel, NULL, NULL);
> >>      }
> >> --
> >> 2.1.0
> >>
> >
> > Ben and Ilya, this is the fix to the dp_hash problem we discussed in
> > yesterday's meeting. The actual fix is simpler than I thought it would be.
> > I didn't take the approach of executing dp_hash in datapath because in
> > this case, since the flow is required to be slowpathed, all the following
> > packets for this flow will get upcalled anyway. If all the dp_hash for the
> > flow is executed in slowpath then there is no consistency problem. So I
> > think it is ok to keep the calculation in userspace and the fix is simple.
> > Let me know if you think differently.
>
> Hi.  Sorry that it took so long to reply.
Hi Ilya, thanks for the review!

> I understand that this patch fixes this particular case, however I still
> think it's dangerous to pass the hash calculated in userspace to kernel
> since it might cause mismatch for the later packets in case where the
> flow doesn't have actions that requires sending to userspace.

If the flow doesn't have actions that require slowpath, the dp_hash will
not get processed in userspace for the first upcall. This problem happens
only if it requires both slowpath actions (in the example of the commit
message it is the ICMP field setting) and dp_hash, and in such a case dp_hash
is always calculated in userspace for this flow. The fix is just to make
this scenario work without endless recirculation. Please correct me if I am
wrong.
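
To make the scenario concrete, here is a minimal toy model of the loop from
the commit message. It is only a sketch under stated assumptions: every name
in it (toy_key, toy_translate_group, toy_run, the recirc id 7 and the limit
of 10) is hypothetical and none of it is OVS or kernel code. It only models
"translation asks for hash + recirc when no hash is present" and "the
resulting megaflow matches on recirc_id only".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in OVS. */
enum { TOY_GROUP_RECIRC_ID = 7, TOY_RECIRC_LIMIT = 10 };

struct toy_key {
    uint32_t recirc_id;   /* 0: packet has not been recirculated. */
    uint32_t dp_hash;     /* 0: no hash available. */
};

enum toy_verdict { TOY_SELECT_BUCKET, TOY_HASH_AND_RECIRC };

/* Userspace translation of the "group" action: without a hash it asks for
 * "hash + recirc"; with a hash it can pick an ECMP bucket. */
static enum toy_verdict
toy_translate_group(const struct toy_key *key)
{
    return key->dp_hash ? TOY_SELECT_BUCKET : TOY_HASH_AND_RECIRC;
}

/* Returns the number of recirculations before a bucket is selected, or -1
 * if the assumed recirculation limit is reached (packet dropped). */
int
toy_run(bool carry_hash)
{
    struct toy_key key = { 0, 0 };
    uint32_t slowpath_hash = 0x1234abcd;   /* Hash computed in userspace. */
    int n_recirc = 0;

    /* First upcall: no hash yet, so translation yields hash + recirc.  The
     * flow is slowpathed, so the hash is computed in userspace and only the
     * recirc is handed to the datapath. */
    assert(toy_translate_group(&key) == TOY_HASH_AND_RECIRC);
    key.recirc_id = TOY_GROUP_RECIRC_ID;
    key.dp_hash = carry_hash ? slowpath_hash : 0;
    n_recirc++;

    /* Second upcall, for the recirculated packet. */
    if (toy_translate_group(&key) == TOY_SELECT_BUCKET) {
        return n_recirc;
    }

    /* No hash arrived, so the same actions are generated again and installed
     * as a megaflow "recirc_id(ID) -> hash, recirc(ID)".  It matches on
     * recirc_id only, so every packet it emits matches it again. */
    while (key.recirc_id == TOY_GROUP_RECIRC_ID) {
        if (n_recirc >= TOY_RECIRC_LIMIT) {
            return -1;
        }
        key.dp_hash = 0x5678;                 /* Hash computed in datapath. */
        key.recirc_id = TOY_GROUP_RECIRC_ID;  /* recirc(ID) */
        n_recirc++;
    }
    return n_recirc;
}
```

In this model toy_run(true) finishes after a single recirculation, while
toy_run(false) keeps recirculating until the assumed limit is hit.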

> One more thing.  We have such a comment in odp-execute.c:
>         /* Calculate a hash value directly. This might not match the
>          * value computed by the datapath, but it is much less expensive,
>          * and the current use case (bonding) does not require a strict
>          * match to work properly. */
> But we're using dp_hash not only for bonding for a long time now.
> And this doesn't look correct.  Even for bonding I'm not sure if that
> is a fully correct assumption.

Yes, I saw this comment and got confused. I think this out-of-date comment
can be fixed in a separate patch.
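
As a toy illustration of the mismatch that the quoted comment alludes to,
the sketch below uses two deliberately simple stand-in hash functions. They
are assumptions made for the example and are not the hashes used by OVS
userspace or by the kernel datapath; all names are hypothetical. Each
function is stable for a given flow, but the two can disagree on the bucket,
which only matters if packets of one flow are hashed in both places.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 4-tuple; not an OVS structure. */
struct toy_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Stand-in for "hash computed in userspace". */
static uint32_t
toy_hash_userspace(const struct toy_tuple *t)
{
    return t->src_ip + t->dst_ip + t->src_port + t->dst_port;
}

/* Stand-in for "hash computed in datapath": deliberately differs from the
 * userspace stand-in in the lowest bit. */
static uint32_t
toy_hash_datapath(const struct toy_tuple *t)
{
    return toy_hash_userspace(t) ^ 1u;
}

static unsigned int
toy_select_bucket(uint32_t hash, unsigned int n_buckets)
{
    assert(n_buckets > 0);
    return hash % n_buckets;
}

/* Picks one of two ECMP buckets for a fixed example flow that varies only
 * in its source port. */
unsigned int
toy_bucket_for(uint16_t src_port, bool in_datapath)
{
    struct toy_tuple t = { 0x0a000001, 0x0a000002, 0, 80 };
    uint32_t hash;

    t.src_port = src_port;
    hash = in_datapath ? toy_hash_datapath(&t) : toy_hash_userspace(&t);
    return toy_select_bucket(hash, 2);
}
```

With two buckets, the same flow always lands in the same bucket as long as
one function is used throughout, and in the other bucket if the other
function is used for some of its packets.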

> From the other side, AFAIU, OVS is not able to execute any "non-terminal"
> actions in datapath, get results and continue to execute further actions.
> The option here is to assume that dp_hash + recirc always goes together.

Sorry, I didn't catch this point. What do you mean by "non-terminal"
actions?

Here's how I think about this problem:
1. The current design is that when the flow has actions that require
slowpath, it tries to execute as many actions as possible in userspace
(see dpif_execute_with_help()), and only executes actions in datapath if
they can't be handled in userspace (e.g. recirc). For this design, the
current implementation has a bug: it doesn't pass the hash value calculated
by userspace when sending to datapath to continue executing "recirc", which
causes the endless loop of recirc. This patch just fixes this bug.

2. As you pointed out, it may be better to always execute dp_hash in
datapath, which is a design change. While that seems more reasonable, it
may not be really necessary. I think we can continue the discussion, and we
can address it with another patch if we conclude that it is necessary.

So do you think it makes sense to fix 1) with this patch and then continue
on 2)? (I can't promise that I can address 2) very soon, but I will try if
I have time)
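
The split described in 1) can be sketched roughly as below. This is a toy
under stated assumptions, not the real dpif_execute_with_help(): the action
enum, the structs and the pass_hash switch are all hypothetical names
invented for the example. It only shows that the hash is produced in
userspace and that the datapath sees it only if it is handed over together
with the recirc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in OVS. */
enum toy_action { TOY_SET_ICMP, TOY_HASH, TOY_RECIRC };

struct toy_md { uint32_t dp_hash; };   /* Userspace packet metadata. */

struct toy_datapath {
    int n_execs;             /* How many actions were handed over. */
    uint32_t dp_hash_seen;   /* Hash visible to the datapath, 0 if none. */
};

/* Only recirc has to be executed by the datapath in this model. */
static bool
toy_needs_datapath(enum toy_action a)
{
    return a == TOY_RECIRC;
}

/* Executes as much as possible in "userspace" and hands the rest over. */
static void
toy_execute_with_help(const enum toy_action *acts, size_t n_acts,
                      bool pass_hash, struct toy_md *md,
                      struct toy_datapath *dp)
{
    size_t i;

    assert(acts && md && dp);
    for (i = 0; i < n_acts; i++) {
        if (toy_needs_datapath(acts[i])) {
            /* Hand-over: the key carries the hash only if it is set and
             * the caller asked for it to be passed along. */
            dp->n_execs++;
            dp->dp_hash_seen = (pass_hash && md->dp_hash) ? md->dp_hash : 0;
        } else if (acts[i] == TOY_HASH) {
            md->dp_hash = 0xdecafbad;   /* Arbitrary nonzero example hash. */
        }
        /* TOY_SET_ICMP only rewrites packet fields; nothing to model. */
    }
}

/* Runs "set ICMP fields, hash, recirc" and reports what the datapath saw. */
uint32_t
toy_hash_seen_by_datapath(bool pass_hash)
{
    static const enum toy_action acts[] = {
        TOY_SET_ICMP, TOY_HASH, TOY_RECIRC
    };
    struct toy_md md = { 0 };
    struct toy_datapath dp = { 0, 0 };

    toy_execute_with_help(acts, sizeof acts / sizeof acts[0], pass_hash,
                          &md, &dp);
    return dp.dp_hash_seen;
}
```

Calling toy_hash_seen_by_datapath(false) models the current behaviour, where
the datapath sees no hash, and toy_hash_seen_by_datapath(true) models the
behaviour with this patch.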


> I'm not sure how to proceed here.
> Ben, what do you think?
> Best regards, Ilya Maximets.
