[ovs-dev] [PATCH ovn v2] ovn-northd: bypass ct for allow ACLs

Dumitru Ceara dceara at redhat.com
Mon Apr 12 09:00:34 UTC 2021


On 4/6/21 6:35 PM, Ihar Hrachyshka wrote:
> On Fri, Apr 2, 2021 at 7:23 AM Dumitru Ceara <dceara at redhat.com> wrote:
>>
>> On 3/24/21 3:40 PM, Ihar Hrachyshka wrote:
>>> For allow ACLs, bypass connection tracking by avoiding setting ct
>>> hints for matching traffic. Avoid sending all traffic to ct when a
>>> stateful ACL is present. Before the patch, this unnecessarily hit
>>> performance when mixed ACL action types were used for the same
>>> datapath.
>>>
>>> ===
>>>
>>> For performance measurements, ovn-fake-multinode environment and qperf
>>> were used. Performance measured between two virtual nodes, two ports
>>> that belong to different LSs connected via router. Using qperf,
>>> performance was measured for UDP, TCP, SCTP protocols (using
>>> <proto>_lat and <proto>_bw tests). The qperf version used:
>>> 0.4.9-16.fc31.x86_64.  Each test scenario was executed five times and
>>> averages compared.
>>>
>>> Tests were executed with `allow` rules for the tested protocol and
>>> `allow-related` for another protocol set for both ports, both
>>> directions, e.g. for TCP scenario, the following ACLs were defined:
>>>
>>> ovn-nbctl acl-add sw0 to-lport 100 tcp allow
>>> ovn-nbctl acl-add sw0 from-lport 100 tcp allow
>>> ovn-nbctl acl-add sw1 to-lport 100 tcp allow
>>> ovn-nbctl acl-add sw1 from-lport 100 tcp allow
>>>
>>> ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related
>>> ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related
>>> ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related
>>> ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related
>>>
>>> In this particular environment, improvement was seen in send_bw,
>>> latency, and msg_rate measurements, where applicable, for all three
>>> protocols under test.
>>>
>>> for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%)
>>>          latency: 16 us => 14.08 us (-12%)
>>>          msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%)
>>>
>>> for TCP, latency: 18.6 us => 14.88 us (-20%)
>>>          msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%)
>>>
>>> for SCTP, latency: 21.98 us => 19.42 us (-11.65%)
>>>           msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%)
>>>
>>> Interestingly, some performance improvement was also seen for the same
>>> scenarios with no ACLs set at all, albeit significantly more
>>> negligible.
>>>
>>> for UDP, send_bw: 320.0 MB/sec => 338.6 MB/sec (+5.81%)
>>>          latency: 13.74 us => 12.88 us (-6.68%)
>>>          msg_rate: 73.02 K/sec => 77.84 K/sec (+6.6%)
>>>
>>> for TCP, latency: 15.62 us => 14.26 us (-9.54%)
>>>          msg_rate: 64.02 K/sec => 70.26 K/sec (+9.75%)
>>>
>>> for SCTP, latency: 19.56 us => 18.16 us (-7.16%)
>>>           msg_rate: 51.16 K/sec => 55.12 K/sec (+7.74%)
>>>
>>> Comparable numbers can be captured with iperf. It may be useful to run
>>> more tests in a more elaborate (bare metal) environment.
>>>
>>> ===
>>>
>>> The patch takes inspiration from a now abandoned patch:
>>>
>>> "ovn-northd: Support mixing stateless/stateful ACLs with
>>> Stateless_Filter." by Dumitru Ceara.
>>>
>>> Signed-off-by: Ihar Hrachyshka <ihrachys at redhat.com>
>>>
>>> ---
>>
>> Hi Ihar,
>>
>> Thanks for working on this and sorry for the delay in reviewing your patch.
>>
>>>
>>> v1: initial version.
>>> v2: rebased after conflict.
>>> ---
>>>  NEWS                    |   1 +
>>>  northd/ovn-northd.8.xml |   9 +-
>>>  northd/ovn-northd.c     | 166 +++++++++++++++--------
>>>  tests/ovn-northd.at     | 287 ++++++++++++++++++++++++++++++++++++++++
>>>  4 files changed, 400 insertions(+), 63 deletions(-)
>>>
>>> diff --git a/NEWS b/NEWS
>>> index 530c5d42f..548a45fb7 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -7,6 +7,7 @@ Post-v21.03.0
>>>      (This may take testing and tuning to be effective.)  This version of OVN
>>>      requires DDLog 0.36.
>>>    - Introduce ovn-controller incremetal processing engine statistics
>>> +  - Bypass connection tracking for ACL "allow" action processing.
>>>
>>>  OVN v21.03.0 - 12 Mar 2021
>>>  -------------------------
>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>>> index a62f5c057..f38d71682 100644
>>> --- a/northd/ovn-northd.8.xml
>>> +++ b/northd/ovn-northd.8.xml
>>> @@ -419,7 +419,9 @@
>>>        before eventually advancing to ingress table <code>ACLs</code>. If
>>>        special ports such as route ports or localnet ports can't use ct(), a
>>>        priority-110 flow is added to skip over stateful ACLs. IPv6 Neighbor
>>> -      Discovery and MLD traffic also skips stateful ACLs.
>>> +      Discovery and MLD traffic also skips stateful ACLs. For stateless "allow"
>>> +      ACLs, a flow is added to bypass setting the hint for connection tracker
>>> +      processing.
>>>      </p>
>>>
>>>      <p>
>>> @@ -603,10 +605,7 @@
>>>      <ul>
>>>        <li>
>>>          <code>allow</code> ACLs translate into logical flows with
>>> -        the <code>next;</code> action.  If there are any stateful ACLs
>>> -        on this datapath, then <code>allow</code> ACLs translate to
>>> -        <code>ct_commit; next;</code> (which acts as a hint for the next tables
>>> -        to commit the connection to conntrack),
>>> +        the <code>next;</code> action.
>>>        </li>
>>>        <li>
>>>          <code>allow-related</code> ACLs translate into logical
>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>> index 4783e43d7..cd343a3e3 100644
>>> --- a/northd/ovn-northd.c
>>> +++ b/northd/ovn-northd.c
>>> @@ -4943,7 +4943,58 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op,
>>>  }
>>>
>>>  static void
>>> -build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
>>> +build_stateless_filter(struct ovn_datapath *od,
>>> +                       const struct nbrec_acl *acl,
>>> +                       struct hmap *lflows)
>>> +{
>>> +    /* Stateless filters must be applied in both directions so that reply
>>> +     * traffic bypasses conntrack too.
>>> +     */
>>
>> In the "Stateless_Filter" approach that I was proposing initially,
>> filters didn't have a direction.  But ACLs have an explicit direction
>> (to-lport/from-lport).
>>
>> So I think it would be enough to add the flow either in IN_PRE_ACL or in
>> OUT_PRE_ACL, according to the ACL's direction.
>>
>> On the other hand now, if we have the following configuration:
>>
>> vm1 (10.0.0.1) --- LS --- vm2 (10.0.0.2)
>>
>> ACLs:
>> priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow
>> priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2",
>> allow-related
>> priority 0, from-lport, "inport==vm1 && tcp", drop
>>
>> Notice that there's no DROP rule for non-tcp traffic so I'd assume that
>> ICMP traffic should be always allowed between vm1 and vm2.  However,
>> ping from vm1 to vm2 fails.
>>
>> The problem is that replies from vm2 are sent to conntrack in the
>> IN/OUT_PRE_STATEFUL stage because there's no high priority rule to match
>> them.
>>
>> In this case the user must add an explicit ACL to allow returning
>> stateless traffic:
>> priority 1, from-lport, "inport==vm1 && ip4.src==10.0.0.2", allow
>>
>> But this feels very error prone and quite counter intuitive.
>>
>> The rest of the patch looks OK to me and it still needs the ddlog
>> implementation but I think we need to find a way to fix the use cases
>> like the one above before moving forward.
>>
>> What do you think?
>>
> 
> Dumitru,
> thanks for reviewing the patch.
> 
> On your concern, this sounds like a usability issue and not
> necessarily an implementation issue, correct? While for simple cases
> (like allowing *all* ICMP traffic both ways, stateless) we could
> introduce a new CLI command to inject the same match into both
> pipelines, it cannot be easily done for any rule that e.g. refers to
> any "directional" fields, like IP src/dst in your example.
> 
> How I see it, the whole idea of stateless rules is offloading some
> magic from ct "related" machinery to explicit user configuration, if
> returning traffic is expected (it is not necessarily the case and
> depends on the app).
> 
> While I appreciate that user interface becomes more quirky when
> switching from allow-related to allow, I wonder if there is anything
> one could do to mitigate it in general due to the very nature of
> "directed" matching rules. (AFAIU Stateless_Filter implementation that
> you mentioned won't simplify anything in this regard, but let me know
> if you are onto something here that I miss.)

With Stateless_Filter, my goal was to allow the CMS describe the type of
traffic that should skip conntrack in the ACL stage in a very general
way, e.g., "all TCP traffic should be firewalled in a 'stateless' way".

Also, with this in mind, the Stateless_Filter rules would've been
generic enough to be applied directly in both directions.

Back to the scenario above:

priority 2, from-lport, "inport==vm1 && ip4.dst==10.0.0.2", allow
priority 1, from-lport, "inport==vm1 && tcp && ip4.dst==10.0.0.2",
allow-related
priority 0, from-lport, "inport==vm1 && tcp", drop

As user I think I'd be very confused by the fact that ICMP replies from
vm2 are dropped.  I agree that this rule set can be changed but I wonder
how often such ACL combinations already exist in production deployments.
Because, with the current changes, they will break.

> 
> Ack on ddlog, my fault.
> Ihar
> 

Thanks,
Dumitru



More information about the dev mailing list