[ovs-dev] [PATCH ovn v2] ovn-northd: bypass ct for allow ACLs

Ihar Hrachyshka ihrachys at redhat.com
Thu Apr 22 01:30:47 UTC 2021


On Tue, Apr 13, 2021 at 4:02 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> On 4/13/21 2:26 AM, Ihar Hrachyshka wrote:
> > On Mon, Apr 12, 2021 at 5:00 AM Dumitru Ceara <dceara at redhat.com> wrote:
> >>
> >> On 4/6/21 6:35 PM, Ihar Hrachyshka wrote:
> >>> On Fri, Apr 2, 2021 at 7:23 AM Dumitru Ceara <dceara at redhat.com> wrote:
> >>>>
> >>>> On 3/24/21 3:40 PM, Ihar Hrachyshka wrote:
> >>>>> For allow ACLs, bypass connection tracking by avoiding setting ct
> >>>>> hints for matching traffic. Avoid sending all traffic to ct when a
> >>>>> stateful ACL is present. Before the patch, this unnecessarily hit
> >>>>> performance when mixed ACL action types were used for the same
> >>>>> datapath.
> >>>>>
> >>>>> ===
> >>>>>
> >>>>> For performance measurements, ovn-fake-multinode environment and qperf
> >>>>> were used. Performance measured between two virtual nodes, two ports
> >>>>> that belong to different LSs connected via router. Using qperf,
> >>>>> performance was measured for UDP, TCP, SCTP protocols (using
> >>>>> <proto>_lat and <proto>_bw tests). The qperf version used:
> >>>>> 0.4.9-16.fc31.x86_64.  Each test scenario was executed five times and
> >>>>> averages compared.
> >>>>>
> >>>>> Tests were executed with `allow` rules for the tested protocol and
> >>>>> `allow-related` for another protocol set for both ports, both
> >>>>> directions, e.g. for TCP scenario, the following ACLs were defined:
> >>>>>
> >>>>> ovn-nbctl acl-add sw0 to-lport 100 tcp allow
> >>>>> ovn-nbctl acl-add sw0 from-lport 100 tcp allow
> >>>>> ovn-nbctl acl-add sw1 to-lport 100 tcp allow
> >>>>> ovn-nbctl acl-add sw1 from-lport 100 tcp allow
> >>>>>
> >>>>> ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related
> >>>>> ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related
> >>>>> ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related
> >>>>> ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related
> >>>>>
> >>>>> In this particular environment, improvement was seen in send_bw,
> >>>>> latency, and msg_rate measurements, where applicable, for all three
> >>>>> protocols under test.
> >>>>>
> >>>>> for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%)
> >>>>>          latency: 16 us => 14.08 us (-12%)
> >>>>>          msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%)
> >>>>>
> >>>>> for TCP, latency: 18.6 us => 14.88 us (-20%)
> >>>>>          msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%)
> >>>>>
> >>>>> for SCTP, latency: 21.98 us => 19.42 us (-11.65%)
> >>>>>           msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%)
> >>>>>
> >>>>> Interestingly, some performance improvement was also seen for the same
> >>>>> scenarios with no ACLs set at all, albeit significantly more
> >>>>> negligible.
> >>>>>
> >>>>> for UDP, send_bw: 320.0 MB/sec => 338.6 MB/sec (+5.81%)
> >>>>>          latency: 13.74 us => 12.88 us (-6.68%)
> >>>>>          msg_rate: 73.02 K/sec => 77.84 K/sec (+6.6%)
> >>>>>
> >>>>> for TCP, latency: 15.62 us => 14.26 us (-9.54%)
> >>>>>          msg_rate: 64.02 K/sec => 70.26 K/sec (+9.75%)
> >>>>>
> >>>>> for SCTP, latency: 19.56 us => 18.16 us (-7.16%)
> >>>>>           msg_rate: 51.16 K/sec => 55.12 K/sec (+7.74%)
> >>>>>
> >>>>> Comparable numbers can be captured with iperf. It may be useful to run
> >>>>> more tests in a more elaborate (bare metal) environment.
> >>>>>
> >>>>> ===
> >>>>>
> >>>>> The patch takes inspiration from a now abandoned patch:
> >>>>>
> >>>>> "ovn-northd: Support mixing stateless/stateful ACLs with
> >>>>> Stateless_Filter." by Dumitru Ceara.
> >>>>>
> >>>>> Signed-off-by: Ihar Hrachyshka <ihrachys at redhat.com>
> >>>>>
> >>>>> ---
> >>>>
> >>>> Hi Ihar,
> >>>>
> >>>> Thanks for working on this and sorry for the delay in reviewing your patch.
> >>>>
> >>>>>
> >>>>> v1: initial version.
> >>>>> v2: rebased after conflict.
> >>>>> ---
> >>>>>  NEWS                    |   1 +
> >>>>>  northd/ovn-northd.8.xml |   9 +-
> >>>>>  northd/ovn-northd.c     | 166 +++++++++++++++--------
> >>>>>  tests/ovn-northd.at     | 287 ++++++++++++++++++++++++++++++++++++++++
> >>>>>  4 files changed, 400 insertions(+), 63 deletions(-)
> >>>>>
> >>>>> diff --git a/NEWS b/NEWS
> >>>>> index 530c5d42f..548a45fb7 100644
> >>>>> --- a/NEWS
> >>>>> +++ b/NEWS
> >>>>> @@ -7,6 +7,7 @@ Post-v21.03.0
> >>>>>      (This may take testing and tuning to be effective.)  This version of OVN
> >>>>>      requires DDLog 0.36.
> >>>>>    - Introduce ovn-controller incremetal processing engine statistics
> >>>>> +  - Bypass connection tracking for ACL "allow" action processing.
> >>>>>
> >>>>>  OVN v21.03.0 - 12 Mar 2021
> >>>>>  -------------------------
> >>>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> >>>>> index a62f5c057..f38d71682 100644
> >>>>> --- a/northd/ovn-northd.8.xml
> >>>>> +++ b/northd/ovn-northd.8.xml
> >>>>> @@ -419,7 +419,9 @@
> >>>>>        before eventually advancing to ingress table <code>ACLs</code>. If
> >>>>>        special ports such as route ports or localnet ports can't use ct(), a
> >>>>>        priority-110 flow is added to skip over stateful ACLs. IPv6 Neighbor
> >>>>> -      Discovery and MLD traffic also skips stateful ACLs.
> >>>>> +      Discovery and MLD traffic also skips stateful ACLs. For stateless "allow"
> >>>>> +      ACLs, a flow is added to bypass setting the hint for connection tracker
> >>>>> +      processing.
> >>>>>      </p>
> >>>>>
> >>>>>      <p>
> >>>>> @@ -603,10 +605,7 @@
> >>>>>      <ul>
> >>>>>        <li>
> >>>>>          <code>allow</code> ACLs translate into logical flows with
> >>>>> -        the <code>next;</code> action.  If there are any stateful ACLs
> >>>>> -        on this datapath, then <code>allow</code> ACLs translate to
> >>>>> -        <code>ct_commit; next;</code> (which acts as a hint for the next tables
> >>>>> -        to commit the connection to conntrack),
> >>>>> +        the <code>next;</code> action.
> >>>>>        </li>
> >>>>>        <li>
> >>>>>          <code>allow-related</code> ACLs translate into logical
> >>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> >>>>> index 4783e43d7..cd343a3e3 100644
> >>>>> --- a/northd/ovn-northd.c
> >>>>> +++ b/northd/ovn-northd.c
> >>>>> @@ -4943,7 +4943,58 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op,
> >>>>>  }
> >>>>>
> >>>>>  static void
> >>>>> -build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
> >>>>> +build_stateless_filter(struct ovn_datapath *od,
> >>>>> +                       const struct nbrec_acl *acl,
> >>>>> +                       struct hmap *lflows)
> >>>>> +{
> >>>>> +    /* Stateless filters must be applied in both directions so that reply
> >>>>> +     * traffic bypasses conntrack too.
> >>>>> +     */
> >>>>
> >>>> In the "Stateless_Filter" approach that I was proposing initially,
> >>>> filters didn't have a direction.  But ACLs have an explicit direction
> >>>> (to-lport/from-lport).
> >>>>
> >>>> So I think it would be enough to add the flow either in IN_PRE_ACL or in
> >>>> OUT_PRE_ACL, according to the ACL's direction.
> >>>>
> >>>> On the other hand now, if we have the following configuration:
> >>>>
> >>>> vm1 (10.0.0.1) --- LS --- vm2 (10.0.0.2)
> >>>>
> >>>> ACLs:
> >>>> priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow
> >>>> priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2",
> >>>> allow-related
> >>>> priority 0, from-lport, "inport==vm1 && tcp", drop
> >>>>
> >>>> Notice that there's no DROP rule for non-tcp traffic so I'd assume that
> >>>> ICMP traffic should be always allowed between vm1 and vm2.  However,
> >>>> ping from vm1 to vm2 fails.
> >>>>
> >>>> The problem is that replies from vm2 are sent to conntrack in the
> >>>> IN/OUT_PRE_STATEFUL stage because there's no high priority rule to match
> >>>> them.
> >>>>
> >>>> In this case the user must add an explicit ACL to allow returning
> >>>> stateless traffic:
> >>>> priority 1, from-lport, "inport==vm1 && ip4.src==10.0.0.2", allow
> >>>>
> >>>> But this feels very error prone and quite counter intuitive.
> >>>>
> >>>> The rest of the patch looks OK to me and it still needs the ddlog
> >>>> implementation but I think we need to find a way to fix the use cases
> >>>> like the one above before moving forward.
> >>>>
> >>>> What do you think?
> >>>>
> >>>
> >>> Dumitru,
> >>> thanks for reviewing the patch.
> >>>
> >>> On your concern, this sounds like a usability issue and not
> >>> necessarily an implementation issue, correct? While for simple cases
> >>> (like allowing *all* ICMP traffic both ways, stateless) we could
> >>> introduce a new CLI command to inject the same match into both
> >>> pipelines, it cannot be easily done for any rule that e.g. refers to
> >>> any "directional" fields, like IP src/dst in your example.
> >>>
> >>> How I see it, the whole idea of stateless rules is offloading some
> >>> magic from ct "related" machinery to explicit user configuration, if
> >>> returning traffic is expected (it is not necessarily the case and
> >>> depends on the app).
> >>>
> >>> While I appreciate that user interface becomes more quirky when
> >>> switching from allow-related to allow, I wonder if there is anything
> >>> one could do to mitigate it in general due to the very nature of
> >>> "directed" matching rules. (AFAIU Stateless_Filter implementation that
> >>> you mentioned won't simplify anything in this regard, but let me know
> >>> if you are onto something here that I miss.)
> >>
> >> With Stateless_Filter, my goal was to allow the CMS describe the type of
> >> traffic that should skip conntrack in the ACL stage in a very general
> >> way, e.g., "all TCP traffic should be firewalled in a 'stateless' way".
> >
> > In OpenStack, CMS allows for the same level of configurability for
> > both stateful and stateless security groups. Perhaps you have a
> > different CMS in mind that would have different scenarios for
> > stateless and stateful rules.
> >
>
> Yes, I was thinking of ovn-kubernetes where ALCs are used to implement
> network policies which AFAIU are mainly stateful.  Dan Winship suggested
> a while back an alternative implementation for TCP network policies that
> wouldn't require conntrack.  IIRC it was something like this:
> - only allow packets with tcp-flags=+syn-ack from pods that are allowed
>   to initiate connections
> - drop all other packets with tcp-flags=+syn-ack
> - allow all other TCP packets
>
> This would rely on the pod's TCP stack to implement the firewalling by
> dropping unexpected TCP packets and would also avoid sending TCP traffic
> to conntrack for ACL in OVN.
>
> But for UDP traffic we'd still have to send traffic to conntrack.
>
> This was where Stateless_Filter would come into picture.
>
> >>
> >> Also, with this in mind, the Stateless_Filter rules would've been
> >> generic enough to be applied directly in both directions.
> >>
> >> Back to the scenario above:
> >>
> >> priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow
> >> priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2",
> >> allow-related
> >> priority 0, from-lport, "inport==vm1 && tcp", drop
> >>
> >> As user I think I'd be very confused by the fact that ICMP replies from
> >> vm2 are dropped.  I agree that this rule set can be changed but I wonder
> >> how often such ACL combinations already exist in production deployments.
> >> Because, with the current changes, they will break.
> >>
> >
> > If it's just a matter of backwards compatibility and we believe that
> > it's unsafe to touch existing verbs, this can either be solved by a
> > new resource type, like in your Stateless_Filter approach, or by
> > introducing a new verb, f.e. 'allow-stateless' [plus as clearly
> > describing the difference between all the verbs in documentation].
>
> That might be the safest way indeed.
>

The latest (v4) up for review includes the switch from allow to new
allow-stateless.

> >
> > If it's also a matter of UX, this is largely a function of
> > configurability of the mechanism (which is a function of CMS
> > considered). For that, I don't have an easy answer on how to keep it
> > both simple / magical AND support all possible matching scenarios that
> > regular rules would support.
> >
> > We can continue discussing here or in the new version of the patch
> > that I'm about to send, incl. ddlog and other comments of yours
> > addressed. I will put some background as to the issue at stake in its
> > commit message for reference while we contemplate alternatives.
>
> Sounds good.
>
> >
> > Ihar
> >
>
> Thanks,
> Dumitru
>



More information about the dev mailing list