[ovs-dev] [PATCH ovn v2] ovn-northd: bypass ct for allow ACLs

Dumitru Ceara dceara at redhat.com
Tue Apr 13 08:01:57 UTC 2021


On 4/13/21 2:26 AM, Ihar Hrachyshka wrote:
> On Mon, Apr 12, 2021 at 5:00 AM Dumitru Ceara <dceara at redhat.com> wrote:
>>
>> On 4/6/21 6:35 PM, Ihar Hrachyshka wrote:
>>> On Fri, Apr 2, 2021 at 7:23 AM Dumitru Ceara <dceara at redhat.com> wrote:
>>>>
>>>> On 3/24/21 3:40 PM, Ihar Hrachyshka wrote:
>>>>> For allow ACLs, bypass connection tracking by avoiding setting ct
>>>>> hints for matching traffic. Avoid sending all traffic to ct when a
>>>>> stateful ACL is present. Before the patch, this unnecessarily hit
>>>>> performance when mixed ACL action types were used for the same
>>>>> datapath.
>>>>>
>>>>> ===
>>>>>
>>>>> For performance measurements, ovn-fake-multinode environment and qperf
>>>>> were used. Performance measured between two virtual nodes, two ports
>>>>> that belong to different LSs connected via router. Using qperf,
>>>>> performance was measured for UDP, TCP, SCTP protocols (using
>>>>> <proto>_lat and <proto>_bw tests). The qperf version used:
>>>>> 0.4.9-16.fc31.x86_64.  Each test scenario was executed five times and
>>>>> averages compared.
>>>>>
>>>>> Tests were executed with `allow` rules for the tested protocol and
>>>>> `allow-related` for another protocol set for both ports, both
>>>>> directions, e.g. for TCP scenario, the following ACLs were defined:
>>>>>
>>>>> ovn-nbctl acl-add sw0 to-lport 100 tcp allow
>>>>> ovn-nbctl acl-add sw0 from-lport 100 tcp allow
>>>>> ovn-nbctl acl-add sw1 to-lport 100 tcp allow
>>>>> ovn-nbctl acl-add sw1 from-lport 100 tcp allow
>>>>>
>>>>> ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related
>>>>> ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related
>>>>> ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related
>>>>> ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related
>>>>>
>>>>> In this particular environment, improvement was seen in send_bw,
>>>>> latency, and msg_rate measurements, where applicable, for all three
>>>>> protocols under test.
>>>>>
>>>>> for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%)
>>>>>          latency: 16 us => 14.08 us (-12%)
>>>>>          msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%)
>>>>>
>>>>> for TCP, latency: 18.6 us => 14.88 us (-20%)
>>>>>          msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%)
>>>>>
>>>>> for SCTP, latency: 21.98 us => 19.42 us (-11.65%)
>>>>>           msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%)
>>>>>
>>>>> Interestingly, some performance improvement was also seen for the same
>>>>> scenarios with no ACLs set at all, albeit significantly more
>>>>> negligible.
>>>>>
>>>>> for UDP, send_bw: 320.0 MB/sec => 338.6 MB/sec (+5.81%)
>>>>>          latency: 13.74 us => 12.88 us (-6.68%)
>>>>>          msg_rate: 73.02 K/sec => 77.84 K/sec (+6.6%)
>>>>>
>>>>> for TCP, latency: 15.62 us => 14.26 us (-9.54%)
>>>>>          msg_rate: 64.02 K/sec => 70.26 K/sec (+9.75%)
>>>>>
>>>>> for SCTP, latency: 19.56 us => 18.16 us (-7.16%)
>>>>>           msg_rate: 51.16 K/sec => 55.12 K/sec (+7.74%)
>>>>>
>>>>> Comparable numbers can be captured with iperf. It may be useful to run
>>>>> more tests in a more elaborate (bare metal) environment.
>>>>>
>>>>> ===
>>>>>
>>>>> The patch takes inspiration from a now abandoned patch:
>>>>>
>>>>> "ovn-northd: Support mixing stateless/stateful ACLs with
>>>>> Stateless_Filter." by Dumitru Ceara.
>>>>>
>>>>> Signed-off-by: Ihar Hrachyshka <ihrachys at redhat.com>
>>>>>
>>>>> ---
>>>>
>>>> Hi Ihar,
>>>>
>>>> Thanks for working on this and sorry for the delay in reviewing your patch.
>>>>
>>>>>
>>>>> v1: initial version.
>>>>> v2: rebased after conflict.
>>>>> ---
>>>>>  NEWS                    |   1 +
>>>>>  northd/ovn-northd.8.xml |   9 +-
>>>>>  northd/ovn-northd.c     | 166 +++++++++++++++--------
>>>>>  tests/ovn-northd.at     | 287 ++++++++++++++++++++++++++++++++++++++++
>>>>>  4 files changed, 400 insertions(+), 63 deletions(-)
>>>>>
>>>>> diff --git a/NEWS b/NEWS
>>>>> index 530c5d42f..548a45fb7 100644
>>>>> --- a/NEWS
>>>>> +++ b/NEWS
>>>>> @@ -7,6 +7,7 @@ Post-v21.03.0
>>>>>      (This may take testing and tuning to be effective.)  This version of OVN
>>>>>      requires DDLog 0.36.
>>>>>    - Introduce ovn-controller incremetal processing engine statistics
>>>>> +  - Bypass connection tracking for ACL "allow" action processing.
>>>>>
>>>>>  OVN v21.03.0 - 12 Mar 2021
>>>>>  -------------------------
>>>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>>>>> index a62f5c057..f38d71682 100644
>>>>> --- a/northd/ovn-northd.8.xml
>>>>> +++ b/northd/ovn-northd.8.xml
>>>>> @@ -419,7 +419,9 @@
>>>>>        before eventually advancing to ingress table <code>ACLs</code>. If
>>>>>        special ports such as route ports or localnet ports can't use ct(), a
>>>>>        priority-110 flow is added to skip over stateful ACLs. IPv6 Neighbor
>>>>> -      Discovery and MLD traffic also skips stateful ACLs.
>>>>> +      Discovery and MLD traffic also skips stateful ACLs. For stateless "allow"
>>>>> +      ACLs, a flow is added to bypass setting the hint for connection tracker
>>>>> +      processing.
>>>>>      </p>
>>>>>
>>>>>      <p>
>>>>> @@ -603,10 +605,7 @@
>>>>>      <ul>
>>>>>        <li>
>>>>>          <code>allow</code> ACLs translate into logical flows with
>>>>> -        the <code>next;</code> action.  If there are any stateful ACLs
>>>>> -        on this datapath, then <code>allow</code> ACLs translate to
>>>>> -        <code>ct_commit; next;</code> (which acts as a hint for the next tables
>>>>> -        to commit the connection to conntrack),
>>>>> +        the <code>next;</code> action.
>>>>>        </li>
>>>>>        <li>
>>>>>          <code>allow-related</code> ACLs translate into logical
>>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>>>> index 4783e43d7..cd343a3e3 100644
>>>>> --- a/northd/ovn-northd.c
>>>>> +++ b/northd/ovn-northd.c
>>>>> @@ -4943,7 +4943,58 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op,
>>>>>  }
>>>>>
>>>>>  static void
>>>>> -build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
>>>>> +build_stateless_filter(struct ovn_datapath *od,
>>>>> +                       const struct nbrec_acl *acl,
>>>>> +                       struct hmap *lflows)
>>>>> +{
>>>>> +    /* Stateless filters must be applied in both directions so that reply
>>>>> +     * traffic bypasses conntrack too.
>>>>> +     */
>>>>
>>>> In the "Stateless_Filter" approach that I was proposing initially,
>>>> filters didn't have a direction.  But ACLs have an explicit direction
>>>> (to-lport/from-lport).
>>>>
>>>> So I think it would be enough to add the flow either in IN_PRE_ACL or in
>>>> OUT_PRE_ACL, according to the ACL's direction.
>>>>
>>>> On the other hand now, if we have the following configuration:
>>>>
>>>> vm1 (10.0.0.1) --- LS --- vm2 (10.0.0.2)
>>>>
>>>> ACLs:
>>>> priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow
>>>> priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2",
>>>> allow-related
>>>> priority 0, from-lport, "inport==vm1 && tcp", drop
>>>>
>>>> Notice that there's no DROP rule for non-tcp traffic so I'd assume that
>>>> ICMP traffic should be always allowed between vm1 and vm2.  However,
>>>> ping from vm1 to vm2 fails.
>>>>
>>>> The problem is that replies from vm2 are sent to conntrack in the
>>>> IN/OUT_PRE_STATEFUL stage because there's no high priority rule to match
>>>> them.
>>>>
>>>> In this case the user must add an explicit ACL to allow returning
>>>> stateless traffic:
>>>> priority 1, from-lport, "inport==vm1 && ip4.src==10.0.0.2", allow
>>>>
>>>> But this feels very error prone and quite counter intuitive.
>>>>
>>>> The rest of the patch looks OK to me and it still needs the ddlog
>>>> implementation but I think we need to find a way to fix the use cases
>>>> like the one above before moving forward.
>>>>
>>>> What do you think?
>>>>
>>>
>>> Dumitru,
>>> thanks for reviewing the patch.
>>>
>>> On your concern, this sounds like a usability issue and not
>>> necessarily an implementation issue, correct? While for simple cases
>>> (like allowing *all* ICMP traffic both ways, stateless) we could
>>> introduce a new CLI command to inject the same match into both
>>> pipelines, it cannot be easily done for any rule that e.g. refers to
>>> any "directional" fields, like IP src/dst in your example.
>>>
>>> How I see it, the whole idea of stateless rules is offloading some
>>> magic from ct "related" machinery to explicit user configuration, if
>>> returning traffic is expected (it is not necessarily the case and
>>> depends on the app).
>>>
>>> While I appreciate that user interface becomes more quirky when
>>> switching from allow-related to allow, I wonder if there is anything
>>> one could do to mitigate it in general due to the very nature of
>>> "directed" matching rules. (AFAIU Stateless_Filter implementation that
>>> you mentioned won't simplify anything in this regard, but let me know
>>> if you are onto something here that I miss.)
>>
>> With Stateless_Filter, my goal was to allow the CMS describe the type of
>> traffic that should skip conntrack in the ACL stage in a very general
>> way, e.g., "all TCP traffic should be firewalled in a 'stateless' way".
> 
> In OpenStack, CMS allows for the same level of configurability for
> both stateful and stateless security groups. Perhaps you have a
> different CMS in mind that would have different scenarios for
> stateless and stateful rules.
> 

Yes, I was thinking of ovn-kubernetes where ALCs are used to implement
network policies which AFAIU are mainly stateful.  Dan Winship suggested
a while back an alternative implementation for TCP network policies that
wouldn't require conntrack.  IIRC it was something like this:
- only allow packets with tcp-flags=+syn-ack from pods that are allowed
  to initiate connections
- drop all other packets with tcp-flags=+syn-ack
- allow all other TCP packets

This would rely on the pod's TCP stack to implement the firewalling by
dropping unexpected TCP packets and would also avoid sending TCP traffic
to conntrack for ACL in OVN.

But for UDP traffic we'd still have to send traffic to conntrack.

This was where Stateless_Filter would come into picture.

>>
>> Also, with this in mind, the Stateless_Filter rules would've been
>> generic enough to be applied directly in both directions.
>>
>> Back to the scenario above:
>>
>> priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow
>> priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2",
>> allow-related
>> priority 0, from-lport, "inport==vm1 && tcp", drop
>>
>> As user I think I'd be very confused by the fact that ICMP replies from
>> vm2 are dropped.  I agree that this rule set can be changed but I wonder
>> how often such ACL combinations already exist in production deployments.
>> Because, with the current changes, they will break.
>>
> 
> If it's just a matter of backwards compatibility and we believe that
> it's unsafe to touch existing verbs, this can either be solved by a
> new resource type, like in your Stateless_Filter approach, or by
> introducing a new verb, f.e. 'allow-stateless' [plus as clearly
> describing the difference between all the verbs in documentation].

That might be the safest way indeed.

> 
> If it's also a matter of UX, this is largely a function of
> configurability of the mechanism (which is a function of CMS
> considered). For that, I don't have an easy answer on how to keep it
> both simple / magical AND support all possible matching scenarios that
> regular rules would support.
> 
> We can continue discussing here or in the new version of the patch
> that I'm about to send, incl. ddlog and other comments of yours
> addressed. I will put some background as to the issue at stake in its
> commit message for reference while we contemplate alternatives.

Sounds good.

> 
> Ihar
> 

Thanks,
Dumitru



More information about the dev mailing list