[ovs-dev] [PATCH ovn] northd: Remove the usage of 'ct.inv' in logical flows.

Han Zhou hzhou at ovn.org
Thu Mar 4 22:51:25 UTC 2021


On Mon, Mar 1, 2021 at 5:40 AM Numan Siddique <numans at ovn.org> wrote:
>
> On Fri, Feb 26, 2021 at 1:15 AM Han Zhou <hzhou at ovn.org> wrote:
> >
> > On Thu, Feb 25, 2021 at 1:25 AM Numan Siddique <numans at ovn.org> wrote:
> > >
> > > On Thu, Feb 25, 2021 at 1:12 PM Han Zhou <hzhou at ovn.org> wrote:
> > > >
> > > > On Wed, Feb 24, 2021 at 5:27 AM <numans at ovn.org> wrote:
> > > > >
> > > > > From: Numan Siddique <numans at ovn.org>
> > > > >
> > > > > Presently we add 65535 priority lflows in the stages -
> > > > > 'ls_in_acl' and 'ls_out_acl' to drop packets which
> > > > > match on 'ct.inv'.
> > > > >
> > > > > As per the 'ovs-fields' man page, this
> > > > > ct state field can be used to identify problems such as:
> > > > >  •  L3/L4 protocol handler is not loaded/unavailable.
> > > > >
> > > > >  •  L3/L4 protocol handler determines that the packet is
> > > > >     malformed.
> > > > >
> > > > >  •  Packets are unexpected length for protocol.
> > > > >
> > > > > This patch removes the usage of this field for the following
> > > > > reasons:
> > > > >
> > > > >  • Some of the smart NICs which support offloading datapath
> > > > >    flows don't support this field.
> > > >
> > > > What do you mean by "don't support this field"? Do you mean the NIC
> > > > offloading supports connection tracking, but cannot detect if a
packet
> > is
> > > > invalid and always populate the ct.inv as 0?
> > >
> > > I think so. From what I understand, the kernel conntrack feature is
used
> > > for the actual connection tracking. So NIC can't tell if the packet is
> > > invalid or not
> > > (say due to out-of-window tcp errors).
> > >
> > I know some NICs support CT offloading and some doesn't.
> > So here what you are referring are the NICs that doesn't support CT
> > offloading, which falls back to kernel datapath when CT is used, is it?
If
> > this is the case, then even without ct.inv it still couldn't support
> > ct.est, etc. right?
> > Or, do you mean this is specifically for NICs that support CT offloading
> > but not ct.inv, i.e. it can do regular conntrack in NIC but just can't
> > identify out-of-window packets, and that's why it supports ct.est but
not
> > ct.inv?
> > I am still quite confused. Could clarify a little more, which types of
NICs
> > would benefit from this, and how?
>
> I'm not sure if I can explain the issue well.  Can you please look
> into this bugzilla -
> https://bugzilla.redhat.com/show_bug.cgi?id=1921946
> We can discuss further if you have further questions or comments.
>
Unfortunately this one seems to require access permission.

>
> >
> > > >
> > > > >
> > > > >  • A recent commit in kernel ovs datapath sets the committed
> > > > >    connection tracking entry to be liberal for out-of-window
> > > > >    tcp packets (nf_ct_set_tcp_be_liberal()).  Such TCP
> > > > >    packets will not be marked as invalid.
> > > > >
> > > >
> > > > Could you share a link to this commit?
> > >
> > > Sure.
> > >
> >
https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next/+/e2ef5203c817a60bfb591343ffd851b6537370ff
> > >
> > Thanks for sharing. So OVS is not capable of detecting a out-of-window
> > packet now. Could you explain more about the motivation? I couldn't get
the
> > full picture from commit message of that patch. Do you have a link that
> > discusses more details?
>
> Let me share with you the patch which I first submitted to handle this
issue.
> During the review, @Florian Westphal suggested being liberal for
out-of-window
> packets to solve this issue.
>
> I think the patch discussions have enough information. Please let me know
> if you have further questions or comments and we can discuss further.
>
> Initial approach taken by me -
>
https://patchwork.ozlabs.org/project/netdev/patch/20201006083355.121018-1-nusiddiq@redhat.com/
> Final approach taken -
>
https://patchwork.ozlabs.org/project/netdev/patch/20201109072930.14048-1-nusiddiq@redhat.com/
>
Thanks for the pointers. Now I have a better context. It seems all these
work was to deal with (optimize) the LB (stateful) + stateless ACL use
cases.
1. we don't want to track packets coming from VIF (because there is no
stateful ACL)
2. but packets to VIF need to go through CT because there is LB configured
which requires CT (for nat), which regarded return packets as invalid.

With the above patch, the return packets won't be invalid any more in the
above scenario.
However, isn't the stateful ACL support also insufficient now? With the
above patch, middle-of-traffic packets without established connection will
be regarded as "new", instead of "invalid", right? Then I wonder if we lose
the value of stateful ACL completely. Any considerations here? (I am not a
fan of stateful ACLs but just thinking if our implementation reflects what
we are declaring to the users)

>
>
> >
> > > >
> > > > >  • Even if a ct.inv packet is delivered to a VIF, the
> > > > >    networking stack of the VIF's kernel can handle such
> > > > >    packets.
> > > >
> > > > I have some concern for this point. We shouldn't make assumptions
for
> > > > what's configured in the VIF's kernel, because it is independent of
> > what's
> > > > expected from OVN ACLs. In addition, egress rules are expected to
drop
> > > > invalid packets sent by the VIF (regardless of how the VIF's kernel
is
> > > > configured).
> > >
> > > Agree. I can delete this point from the commit message.
> > >
> > > >
> > > > However, I am not against this patch, but just want to double
confirm. I
> > > > think this deserves a description in NEWS if we do so.
> > >
> > > Sure. I will add to the NEWS.
> > >
> > > If there are concerns about removing this, how about we use ct.inv by
> > > default and
> > > add a config option to not use this flag in ovn-northd ?
> > > Deployments who want to make use of HWOL nics can turn on this option
?
> > >
> > Yes, I think an option would help. However, before moving there, I am
> > wondering if the user could simply use stateless ACLs to achieve the
same
> > outcome. What's the benefit of using stateful ACLs without the ability
to
> > detect invalid packets v.s. using stateless ACLs?
>
> Given that we enable conntrack for all the packets if a logical switch
> has "allow-related"
> ACLs or a load balancers associated, I don't think we can take this
> path.  In the case
> of ovn-k8s, the node logical switch will have a load balancer
> associated with it.
>
OK, so removing ct.inv is necessary (and ok) for a user who doesn't use
stateful ACL but uses LB and needs to deal with NIC offloading limitations.

However, for a user who really needs stateful ACL in OVN, it is now not
really supported by the CT actions because:
1. out-of-window packets and middle-of-traffic packets without established
connection are not checked (due to the net-next patch)
2. and with this patch, checksum errors are not checked.
Is this correct understanding?

Regardless of the above questions, I am totally ok with an option that
controls the behavior, but need to make the implications clear in the
document.

Thanks,
Han

> Thanks
> Numan
>
> >
> > > Thanks for the review.
> > >
> > > Numan
> > >
> > > >
> > > > Thanks,
> > > > Han
> > > >
> > > > >
> > > > > Signed-off-by: Numan Siddique <numans at ovn.org>
> > > > > ---
> > > > >  northd/ovn-northd.c | 24 ++++++++++++------------
> > > > >  tests/ovn-northd.at | 24 ++++++++++++------------
> > > > >  2 files changed, 24 insertions(+), 24 deletions(-)
> > > > >
> > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > > index dcdb777a2..e30fb532c 100644
> > > > > --- a/northd/ovn-northd.c
> > > > > +++ b/northd/ovn-northd.c
> > > > > @@ -5617,10 +5617,10 @@ build_acls(struct ovn_datapath *od, struct
> > hmap
> > > > *lflows,
> > > > >           *
> > > > >           * This is enforced at a higher priority than ACLs can be
> > > > defined. */
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> > > > > -                      "ct.inv || (ct.est && ct.rpl &&
> > ct_label.blocked
> > > > == 1)",
> > > > > +                      "ct.est && ct.rpl && ct_label.blocked ==
1",
> > > > >                        "drop;");
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> > > > > -                      "ct.inv || (ct.est && ct.rpl &&
> > ct_label.blocked
> > > > == 1)",
> > > > > +                      "ct.est && ct.rpl && ct_label.blocked ==
1",
> > > > >                        "drop;");
> > > > >
> > > > >          /* Ingress and Egress ACL Table (Priority 65535).
> > > > > @@ -5633,12 +5633,12 @@ build_acls(struct ovn_datapath *od, struct
> > hmap
> > > > *lflows,
> > > > >           *
> > > > >           * This is enforced at a higher priority than ACLs can be
> > > > defined. */
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> > > > > -                      "ct.est && !ct.rel && !ct.new && !ct.inv "
> > > > > -                      "&& ct.rpl && ct_label.blocked == 0",
> > > > > +                      "ct.est && !ct.rel && !ct.new && "
> > > > > +                      "ct.rpl && ct_label.blocked == 0",
> > > > >                        "next;");
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> > > > > -                      "ct.est && !ct.rel && !ct.new && !ct.inv "
> > > > > -                      "&& ct.rpl && ct_label.blocked == 0",
> > > > > +                      "ct.est && !ct.rel && !ct.new && "
> > > > > +                      "ct.rpl && ct_label.blocked == 0",
> > > > >                        "next;");
> > > > >
> > > > >          /* Ingress and Egress ACL Table (Priority 65535).
> > > > > @@ -5653,12 +5653,12 @@ build_acls(struct ovn_datapath *od, struct
> > hmap
> > > > *lflows,
> > > > >           * related traffic such as an ICMP Port Unreachable
through
> > > > >           * that's generated from a non-listening UDP port.  */
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> > > > > -                      "!ct.est && ct.rel && !ct.new && !ct.inv "
> > > > > -                      "&& ct_label.blocked == 0",
> > > > > +                      "!ct.est && ct.rel && !ct.new && "
> > > > > +                      "ct_label.blocked == 0",
> > > > >                        "next;");
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> > > > > -                      "!ct.est && ct.rel && !ct.new && !ct.inv "
> > > > > -                      "&& ct_label.blocked == 0",
> > > > > +                      "!ct.est && ct.rel && !ct.new && "
> > > > > +                      "ct_label.blocked == 0",
> > > > >                        "next;");
> > > > >
> > > > >          /* Ingress and Egress ACL Table (Priority 65535).
> > > > > @@ -5846,11 +5846,11 @@ build_lb(struct ovn_datapath *od, struct
hmap
> > > > *lflows)
> > > > >           *
> > > > >           * Send established traffic through conntrack for just
NAT.
> > */
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_IN_LB, UINT16_MAX - 1,
> > > > > -                      "ct.est && !ct.rel && !ct.new && !ct.inv
&& "
> > > > > +                      "ct.est && !ct.rel && !ct.new && "
> > > > >                        "ct_label.natted == 1",
> > > > >                        REGBIT_CONNTRACK_NAT" = 1; next;");
> > > > >          ovn_lflow_add(lflows, od, S_SWITCH_OUT_LB, UINT16_MAX -
1,
> > > > > -                      "ct.est && !ct.rel && !ct.new && !ct.inv
&& "
> > > > > +                      "ct.est && !ct.rel && !ct.new && "
> > > > >                        "ct_label.natted == 1",
> > > > >                        REGBIT_CONNTRACK_NAT" = 1; next;");
> > > > >      }
> > > > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > > > > index ad0f9f562..dc49c2543 100644
> > > > > --- a/tests/ovn-northd.at
> > > > > +++ b/tests/ovn-northd.at
> > > > > @@ -1940,9 +1940,9 @@ AT_CHECK([ovn-sbctl lflow-list ls | grep -e
> > > > ls_in_acl_hint -e ls_out_acl_hint -e
> > > > >    table=4 (ls_out_acl_hint    ), priority=6    , match=(!ct.new
&&
> > > > ct.est && !ct.rpl && ct_label.blocked == 1), action=(reg0[[7]] = 1;
> > > > reg0[[9]] = 1; next;)
> > > > >    table=4 (ls_out_acl_hint    ), priority=7    , match=(ct.new &&
> > > > !ct.est), action=(reg0[[7]] = 1; reg0[[9]] = 1; next;)
> > > > >    table=5 (ls_out_acl         ), priority=1    , match=(ip &&
> > (!ct.est
> > > > || (ct.est && ct_label.blocked == 1))), action=(reg0[[1]] = 1;
next;)
> > > > > -  table=5 (ls_out_acl         ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && !ct.inv && ct_label.blocked == 0),
action=(next;)
> > > > > -  table=5 (ls_out_acl         ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_label.blocked == 0),
> > > > action=(next;)
> > > > > -  table=5 (ls_out_acl         ), priority=65535, match=(ct.inv ||
> > > > (ct.est && ct.rpl && ct_label.blocked == 1)), action=(drop;)
> > > > > +  table=5 (ls_out_acl         ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && ct_label.blocked == 0), action=(next;)
> > > > > +  table=5 (ls_out_acl         ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && ct.rpl && ct_label.blocked == 0),
action=(next;)
> > > > > +  table=5 (ls_out_acl         ), priority=65535, match=(ct.est &&
> > ct.rpl
> > > > && ct_label.blocked == 1), action=(drop;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=1    , match=(ct.est &&
> > > > ct_label.blocked == 0), action=(reg0[[10]] = 1; next;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=2    , match=(ct.est &&
> > > > ct_label.blocked == 1), action=(reg0[[9]] = 1; next;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=3    , match=(!ct.est),
> > > > action=(reg0[[9]] = 1; next;)
> > > > > @@ -1951,9 +1951,9 @@ AT_CHECK([ovn-sbctl lflow-list ls | grep -e
> > > > ls_in_acl_hint -e ls_out_acl_hint -e
> > > > >    table=8 (ls_in_acl_hint     ), priority=6    , match=(!ct.new
&&
> > > > ct.est && !ct.rpl && ct_label.blocked == 1), action=(reg0[[7]] = 1;
> > > > reg0[[9]] = 1; next;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=7    , match=(ct.new &&
> > > > !ct.est), action=(reg0[[7]] = 1; reg0[[9]] = 1; next;)
> > > > >    table=9 (ls_in_acl          ), priority=1    , match=(ip &&
> > (!ct.est
> > > > || (ct.est && ct_label.blocked == 1))), action=(reg0[[1]] = 1;
next;)
> > > > > -  table=9 (ls_in_acl          ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && !ct.inv && ct_label.blocked == 0),
action=(next;)
> > > > > -  table=9 (ls_in_acl          ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_label.blocked == 0),
> > > > action=(next;)
> > > > > -  table=9 (ls_in_acl          ), priority=65535, match=(ct.inv ||
> > > > (ct.est && ct.rpl && ct_label.blocked == 1)), action=(drop;)
> > > > > +  table=9 (ls_in_acl          ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && ct_label.blocked == 0), action=(next;)
> > > > > +  table=9 (ls_in_acl          ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && ct.rpl && ct_label.blocked == 0),
action=(next;)
> > > > > +  table=9 (ls_in_acl          ), priority=65535, match=(ct.est &&
> > ct.rpl
> > > > && ct_label.blocked == 1), action=(drop;)
> > > > >  ])
> > > > >
> > > > >  AS_BOX([Check match ct_state with load balancer])
> > > > > @@ -1972,9 +1972,9 @@ AT_CHECK([ovn-sbctl lflow-list ls | grep -e
> > > > ls_in_acl_hint -e ls_out_acl_hint -e
> > > > >    table=4 (ls_out_acl_hint    ), priority=6    , match=(!ct.new
&&
> > > > ct.est && !ct.rpl && ct_label.blocked == 1), action=(reg0[[7]] = 1;
> > > > reg0[[9]] = 1; next;)
> > > > >    table=4 (ls_out_acl_hint    ), priority=7    , match=(ct.new &&
> > > > !ct.est), action=(reg0[[7]] = 1; reg0[[9]] = 1; next;)
> > > > >    table=5 (ls_out_acl         ), priority=1    , match=(ip &&
> > (!ct.est
> > > > || (ct.est && ct_label.blocked == 1))), action=(reg0[[1]] = 1;
next;)
> > > > > -  table=5 (ls_out_acl         ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && !ct.inv && ct_label.blocked == 0),
action=(next;)
> > > > > -  table=5 (ls_out_acl         ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_label.blocked == 0),
> > > > action=(next;)
> > > > > -  table=5 (ls_out_acl         ), priority=65535, match=(ct.inv ||
> > > > (ct.est && ct.rpl && ct_label.blocked == 1)), action=(drop;)
> > > > > +  table=5 (ls_out_acl         ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && ct_label.blocked == 0), action=(next;)
> > > > > +  table=5 (ls_out_acl         ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && ct.rpl && ct_label.blocked == 0),
action=(next;)
> > > > > +  table=5 (ls_out_acl         ), priority=65535, match=(ct.est &&
> > ct.rpl
> > > > && ct_label.blocked == 1), action=(drop;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=1    , match=(ct.est &&
> > > > ct_label.blocked == 0), action=(reg0[[10]] = 1; next;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=2    , match=(ct.est &&
> > > > ct_label.blocked == 1), action=(reg0[[9]] = 1; next;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=3    , match=(!ct.est),
> > > > action=(reg0[[9]] = 1; next;)
> > > > > @@ -1983,9 +1983,9 @@ AT_CHECK([ovn-sbctl lflow-list ls | grep -e
> > > > ls_in_acl_hint -e ls_out_acl_hint -e
> > > > >    table=8 (ls_in_acl_hint     ), priority=6    , match=(!ct.new
&&
> > > > ct.est && !ct.rpl && ct_label.blocked == 1), action=(reg0[[7]] = 1;
> > > > reg0[[9]] = 1; next;)
> > > > >    table=8 (ls_in_acl_hint     ), priority=7    , match=(ct.new &&
> > > > !ct.est), action=(reg0[[7]] = 1; reg0[[9]] = 1; next;)
> > > > >    table=9 (ls_in_acl          ), priority=1    , match=(ip &&
> > (!ct.est
> > > > || (ct.est && ct_label.blocked == 1))), action=(reg0[[1]] = 1;
next;)
> > > > > -  table=9 (ls_in_acl          ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && !ct.inv && ct_label.blocked == 0),
action=(next;)
> > > > > -  table=9 (ls_in_acl          ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_label.blocked == 0),
> > > > action=(next;)
> > > > > -  table=9 (ls_in_acl          ), priority=65535, match=(ct.inv ||
> > > > (ct.est && ct.rpl && ct_label.blocked == 1)), action=(drop;)
> > > > > +  table=9 (ls_in_acl          ), priority=65535, match=(!ct.est
&&
> > > > ct.rel && !ct.new && ct_label.blocked == 0), action=(next;)
> > > > > +  table=9 (ls_in_acl          ), priority=65535, match=(ct.est &&
> > > > !ct.rel && !ct.new && ct.rpl && ct_label.blocked == 0),
action=(next;)
> > > > > +  table=9 (ls_in_acl          ), priority=65535, match=(ct.est &&
> > ct.rpl
> > > > && ct_label.blocked == 1), action=(drop;)
> > > > >  ])
> > > > >
> > > > >  AT_CLEANUP
> > > > > --
> > > > > 2.29.2
> > > > >
> > > > > _______________________________________________
> > > > > dev mailing list
> > > > > dev at openvswitch.org
> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > > _______________________________________________
> > > > dev mailing list
> > > > dev at openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list