[ovs-git] [openvswitch/ovs] c95dee: odp-util.c: Fix dp_hash execution with slowpath ac...

Han Zhou noreply at github.com
Mon Jul 6 23:19:58 UTC 2020

  Branch: refs/heads/branch-2.12
  Home:   https://github.com/openvswitch/ovs
  Commit: c95deec94ef0231c19460e4f073c47614d0c3ab6
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2020-07-06 (Mon, 06 Jul 2020)

  Changed paths:
    M lib/odp-util.c

  Log Message:
  odp-util.c: Fix dp_hash execution with slowpath actions.

When dp_hash is executed with slowpath actions, it results in endless
recirc loop in kernel datapath, and finally drops the packet, with
kernel logs:

openvswitch: ovs-system: deferred action limit reached, drop recirc action

The root cause is that the dp_hash value calculated by slowpath is not
passed to datapath when executing the recirc action, thus when the recirced
packet miss upcall comes to userspace again, it generates the dp_hash
and recirc action again, with same recirc_id, which in turn generates
a megaflow with recirc action with the recird_id same as the recirc_id in
its match condition, which causes a loop in datapath.

For example, this can be reproduced with below setup of OVN environment:

                         LS1            LS2
                          |              |
        VIF--LS0---R0-----|              |------R3

Assume there is a route from the VIF to R3: R0 -> R1 -> R3, and there are two
routes (ECMP) from R3 to the VIF:
R3 -> R1 -> R0
R3 -> R2 -> R0

Now if we ping from the VIF to R3, the OVS flow execution on the HV of the VIF
will hit the R3's datapath which has flows that responds to the ICMP packet
by setting ICMP fields, which requires slowpath actions, and in later flow
tables it will hit the "group" action that selects between the ECMP routes.

By default OVN uses "dp_hash" method for the "group" action.

For the first miss upcall packet, dp_hash value is empty, so the group action
will be translated to "dp_hash" and "recirc".

During action execution, because of the previous actions that sets ICMP fields,
the whole execution requires slowpath, so it tries to execute all actions in
userspace in odp_execute_actions(), including dp_hash action, except the
recirc action, which can only be executed in datapath. So the dp_hash value
is calculated in userspace, and then the packet is injected to datapath for
recirc action execution.

However, the dp_hash calculated by the userspace is not passed to datapath.

Because of this, the packet recirc in datapath doesn't have dp_hash value,
and the miss upcall for the recirced packet hits the same flow tables and
triggers same "dp_hash" and "recirc" action again, with exactly same recirc_id!

This time, the new upcall doesn't require any slowpath execution, so both
the dp_hash and recirc actions are executed in datapath, after creating a
datapath megaflow like:

recirc_id(XYZ),..., actions:hash(l4(0)),recirc(XYZ)

with match recirc_id equals the recirc id in the action, thus creating a loop.

This patch fixes the problem by passing the calculated dp_hash value to
datapath in odp_key_from_dp_packet().

Fixes: 572f732ab078 ("dpif-netdev: user space datapath recirculation")
Signed-off-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Ben Pfaff <blp at ovn.org>

More information about the git mailing list