[ovs-dev] [PATCH] ovn: Add second ACL stage

Tue Aug 2 21:38:52 UTC 2016

On Tue, Aug 2, 2016 at 1:39 PM, Darrell Ball <dlu998 at gmail.com> wrote:

>
>
> On Tue, Aug 2, 2016 at 12:05 PM, Russell Bryant <russell at ovn.org> wrote:
>
>>
>>
>> On Tue, Aug 2, 2016 at 3:02 PM, Darrell Ball <dlu998 at gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Aug 2, 2016 at 10:23 AM, Mickey Spiegel <mickeys.dev at gmail.com>
>>> wrote:
>>>
>>>> On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998 at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell at ovn.org>
>>>>> wrote:
>>>>>
>>>>>> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <
>>>>>> mickeys.dev at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <
>>>>>> emspiege at us.ibm.com>
>>>>>> > wrote:
>>>>>> > >
>>>>>> > > -----"dev" <dev-bounces at openvswitch.org> wrote: -----
>>>>>> > >> To: Mickey Spiegel <mickeys.dev at gmail.com>
>>>>>> > >> From: Russell Bryant
>>>>>> > >> Sent by: "dev"
>>>>>> > >> Date: 07/29/2016 10:02AM
>>>>>> > >> Cc: ovs dev <dev at openvswitch.org>
>>>>>> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>>>>>> > >>
>>>>>> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
>>>>>> mickeys.dev at gmail.com
>>>>>> > >
>>>>>> > >> wrote:
>>>>>> > >>
>>>>>> > >>>
>>>>>> > >>> This patch adds a second logical switch ingress ACL stage, and
>>>>>> > >>> correspondingly a second logical switch egress ACL stage.  This
>>>>>> > >>> allows for more than one ACL-based feature to be applied in the
>>>>>> > >>> ingress and egress logical switch pipelines.  The features
>>>>>> > >>> driving the different ACL stages may be configured by different
>>>>>> > >>> users, for example an application deployer managing security
>>>>>> > >>> groups and a network or security admin configuring network ACLs
>>>>>> > >>> or firewall rules.
>>>>>> > >>>
>>>>>> > >>> Each ACL stage is self contained.  The "action" for the
>>>>>> > >>> highest-"priority" matching row in an ACL stage determines a
>>>>>> > >>> packet's treatment.  A separate "action" will be determined in
>>>>>> > >>> each ACL stage, according to the ACL rules configured for that
>>>>>> > >>> ACL stage.  The "priority" values are only relevant within the
>>>>>> > >>> context of an ACL stage.
>>>>>> > >>>
>>>>>> > >>> ACL rules that do not specify an ACL stage are applied to the
>>>>>> > >>> default "acl" stage.
>>>>>> > >>>
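The per-stage semantics described in the patch summary above can be sketched in a few lines of Python. This is a toy model, not OVN code: the `Acl` class, the `evaluate` helper, the packet dictionary keys, and the assumption that a stage with no matching rule allows the packet are all illustrative choices made for this sketch.

```python
import ipaddress
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Packet = Dict[str, Any]  # hypothetical packet representation


@dataclass
class Acl:
    priority: int
    match: Callable[[Packet], bool]
    action: str            # "allow-related" or "drop"
    stage: str = "acl"     # rules that omit a stage land in the default "acl" stage


STAGES = ("acl", "acl2")   # stage names taken from the patch description


def stage_action(rules: List[Acl], stage: str, pkt: Packet) -> str:
    """Return the action of the highest-priority matching rule in one stage."""
    hits = [r for r in rules if r.stage == stage and r.match(pkt)]
    if not hits:
        return "allow-related"  # assumption: no match means the stage allows
    return max(hits, key=lambda r: r.priority).action


def evaluate(rules: List[Acl], pkt: Packet) -> str:
    """A packet must be allowed by every stage; priorities never cross stages."""
    for stage in STAGES:
        if stage_action(rules, stage, pkt) == "drop":
            return "drop"
    return "allow-related"


def dst_in(cidr: str) -> Callable[[Packet], bool]:
    """Build a predicate that matches packets whose ip4_dst is inside cidr."""
    net = ipaddress.ip_network(cidr)
    return lambda pkt: ipaddress.ip_address(pkt["ip4_dst"]) in net


RULES = [
    Acl(100, lambda pkt: pkt.get("tcp_dst") == 80, "allow-related"),  # stage "acl"
    Acl(1, lambda pkt: True, "drop"),                                 # stage "acl"
    Acl(200, dst_in("192.168.0.0/16"), "drop", stage="acl2"),
]
```

Note that the priority-200 rule in "acl2" does not override the priority-100 rule in "acl"; each stage reaches its own verdict, and a drop in either stage drops the packet.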
>>>>>> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev at gmail.com>
>>>>>> > >>
>>>>>> > >>
>>>>>> > >> Could you expand on why priorities in a single stage aren't enough
>>>>>> > >> to satisfy the use case?
>>>>>> > >
>>>>>> > > If two features are configured independently with a mix of
>>>>>> > > prioritized allow and drop rules, then with a single stage, a
>>>>>> > > new set of ACL rules must be produced that achieves the same
>>>>>> > > behavior.  This is sometimes referred to as an "ACL merge"
>>>>>> > > algorithm, for example:
>>>>>> > >
>>>>>> >
>>>>>> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>>>>>> > >
>>>>>> > > In the worst case, for example when the features act on different
>>>>>> > > packet fields (e.g. one on IP address and another on L4 port),
>>>>>> > > the number of rules required can approach
>>>>>> > > (# of ACL1 rules) * (# of ACL2 rules).
>>>>>> > >
>>>>>> > > While it is possible to code up such an algorithm, it adds
>>>>>> > > significant complexity and complicates whichever layer
>>>>>> > > implements the merge algorithm, either OVN or the CMS above.
>>>>>> > >
>>>>>> > > By using multiple independent pipeline stages, all of this
>>>>>> > > software complexity is avoided, achieving the proper result
>>>>>> > > in a simple and straightforward manner.
>>>>>> > >
>>>>>> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
>>>>>> > > stages, though they tend to evaluate these in parallel given
>>>>>> > > all the emphasis on low latency these days.
>>>>>> >
>>>>>> > Throwing in an example to illustrate the difference between one
>>>>>> > ACL stage and two ACL stages:
>>>>>> >
>>>>>> > If two separate ACL stages:
>>>>>> > Feature 1
>>>>>> > acl  from-lport  100 (tcp == 80) allow-related
>>>>>> > acl  from-lport  100 (tcp == 8080) allow-related
>>>>>> > acl  from-lport  100 (udp) allow-related
>>>>>> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>>> >
>>>>>> > Feature 2
>>>>>> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
>>>>>> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
>>>>>> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>>> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>>> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>>>>>> >
>>>>>> > Combined in one stage, to get the equivalent behavior, this would require:
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80) allow-related
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080) allow-related
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80) allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080) allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>>> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>>> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080) allow-related
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>>> >
>>>>>>
>>>>>> Or have an address set, "addrset1", which contains {172.16.10.0/24,
>>>>>> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>>>>>>
>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80, 8080}) allow-related
>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>>>
>>>>>>
>>>>>> >
>>>>>> > If there are more IP addresses in feature 2, then the number
>>>>>> > of ACL rules will climb geometrically:
>>>>>> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2 drop rules)
>>>>>> >
>>>>>> > With 2 separate ACL stages, the rules just go straight into
>>>>>> > the corresponding ACL table, no merge required:
>>>>>> > (# feature 1 rules + # feature 2 rules)
>>>>>> >
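The combined rule list and the two rule-count formulas above can be reproduced with a short Python sketch. The `merge_single_stage` function is a hypothetical helper written for this illustration, not part of OVN, and it is not a general ACL merge algorithm: it assumes, as in the example, that every feature 1 rule is an allow-related rule at a single priority, so feature 2 priorities can be reused for the merged rules. The match strings are copied from the example as plain text.

```python
# Rules as (priority, match, action); match strings copied from the example.
FEATURE1 = [  # default "acl" stage
    (100, "tcp == 80", "allow-related"),
    (100, "tcp == 8080", "allow-related"),
    (100, "udp", "allow-related"),
    (100, "ip4.src == 10.1.1.0/24 && tcp", "allow-related"),
]
FEATURE2 = [  # "acl2" stage
    (300, "ip4.dst == 172.16.10.0/24", "allow-related"),
    (300, "ip4.dst == 192.168.20.0/24", "allow-related"),
    (200, "ip4.dst == 172.16.0.0/20", "drop"),
    (200, "ip4.dst == 192.168.0.0/16", "drop"),
    (100, "ip4.dst == 172.16.0.0/16", "allow-related"),
]


def merge_single_stage(f1, f2):
    """Naive merge; valid only when all f1 rules allow at one priority."""
    merged = []
    for prio, match2, action2 in f2:
        if action2 == "drop":
            # A feature 2 drop wins regardless of feature 1: one rule.
            merged.append((prio, match2, "drop"))
        else:
            # A feature 2 allow must be intersected with every feature 1 allow.
            for _, match1, _ in f1:
                merged.append((prio, f"{match2} && {match1}", "allow-related"))
    return merged


MERGED = merge_single_stage(FEATURE1, FEATURE2)
ONE_STAGE_COUNT = len(MERGED)                     # 4 * 3 allows + 2 drops
TWO_STAGE_COUNT = len(FEATURE1) + len(FEATURE2)   # rules go in unmodified

if __name__ == "__main__":
    for prio, match, action in MERGED:
        print(f"from-lport  {prio} ({match}) {action}")
    print(ONE_STAGE_COUNT, "merged rules vs", TWO_STAGE_COUNT, "with two stages")
```

For this example the merged single-stage table has 14 rules, against 9 when each feature keeps its own stage; the gap widens with every allow-related rule added to feature 2.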
>>>>>>
>>>>>> Thanks for elaborating.  I'm not opposed.  It seems harmless if not
>>>>>> being used.
>>>>>>
>>>>>
>>>>>
>>>>> There are presently no unit tests for ACLs in the system tests
>>>>> (system-ovn.at).
>>>>> The first step should be to add unit tests for single-stage ACLs,
>>>>> and then add a delta of tests if other stages are desired.
>>>>>
>>>>> It will be good to test the coordination between multiple stages
>>>>> coming directly from northbound APIs and check what happens when
>>>>> multistage ACLs are set up and torn down stage by stage, particularly
>>>>> when the datapath ends up in a more permissive state for some period
>>>>> of time.
>>>>>
>>>>
>>> This feature proposal has a problem for both setup and teardown where
>>> the staging will result in a more permissive state for periods of time.
>>>
>>> Here is a simple example based on your example above:
>>> If one only wants to allow TCP and src IP 20.20.20.20, and the stage with
>>> TCP is added first with the stage with src IP 20.20.20.20 lagging, one
>>> will have the following:
>>>
>>> 200 TCP permit
>>> 100 DROP ALL
>>>
>>> which permits all TCP - not what we want.
>>>
>>> We cannot enforce a transaction across multiple databases (NB, SB,
>>> ovn-controller).
>>>
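The transient state described above can be illustrated with a toy Python model. It is only a sketch of the concern as stated, under the assumptions that each stage is a list of (priority, predicate, action) tuples, that a stage with no matching rule allows the packet, and that the two stages can become visible to the datapath at different times; the `verdict` helper and the packet fields are hypothetical.

```python
def verdict(stages, pkt):
    """Drop if any installed stage's highest-priority matching rule drops."""
    for rules in stages:
        hits = [(prio, action) for prio, pred, action in rules if pred(pkt)]
        if hits and max(hits, key=lambda h: h[0])[1] == "drop":
            return "drop"
    return "allow"


# Intent split across two stages: allow only TCP from 20.20.20.20.
TCP_STAGE = [
    (200, lambda p: p["proto"] == "tcp", "allow"),
    (100, lambda p: True, "drop"),
]
SRC_STAGE = [
    (200, lambda p: p["src"] == "20.20.20.20", "allow"),
    (100, lambda p: True, "drop"),
]

# The same intent expressed as one rule in one stage.
SINGLE_STAGE = [
    (200, lambda p: p["proto"] == "tcp" and p["src"] == "20.20.20.20", "allow"),
    (100, lambda p: True, "drop"),
]

OTHER_TCP = {"proto": "tcp", "src": "10.0.0.1"}

# Only the TCP stage has been installed so far: all TCP is permitted.
DURING_WINDOW = verdict([TCP_STAGE], OTHER_TCP)
# Both stages installed: the intended policy holds.
AFTER_BOTH = verdict([TCP_STAGE, SRC_STAGE], OTHER_TCP)
# One combined rule has no intermediate state to pass through.
COMBINED = verdict([SINGLE_STAGE], OTHER_TCP)
```

In this model the unwanted TCP packet is allowed only while one stage lags the other; keeping correlated conditions in a single rule in a single stage removes that intermediate state.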
>>
That is not how this is meant to be used. I used one stage for IP addresses
and one stage for L4 port as a worst-case example of expansion due to ACL
merge. That is not the motivation for using two stages. The motivation is for
two different features that are configured separately, with one example being
OpenStack Security Groups versus OpenStack FWaaS v2, another example being
Security Groups versus Network ACLs as in a rather popular public cloud.

If you have correlated intent, with TCP and src IP 20.20.20.20 belonging
together, they should absolutely be put together in one rule in one common
ACL stage.

>> I don't understand this.  Rules for both stages could be added in the same
>> transaction.  It's all in the same table of the northbound database.
>>
>>
>
> I am assuming that the rules would be entered into the Northbound database
> in the same transaction. That part is fine.
>
> However, there is no enforcement of a transaction across multiple
> databases in OVN. So there is no requirement that northd and ovn-controller
> maintain that NB DB transaction across different tables when generating
> their respective output (i.e. SB DB and OpenFlow rules).
>
>
>
>
>
>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Can you update the docs to indicate the specific accepted values for
>>>>>> "stage"?
>>>>>
>>>>>
>>>>>
>>>>> This would significantly complicate the usage of northbound ACL APIs,
>>>>> since multi-staging would be exposed at the top (northbound) OVN layer.
>>>>>
>>>>
>>>> The default behavior when "stage" is not specified is to apply the ACL
>>>> to the existing "acl" stage. If you don't care about the second ACL
>>>> stage, continue to use ACLs as you do today and it will work. There is
>>>> no complication.
>>>>
>>>
>>> You need a set of guidelines.
>>> You cannot just assume the northbound API usage will avoid this feature.
>>> How does one know whether this feature should be avoided, or when to use
>>> it? Assuming one decides to use it, how does one know how to use it?
>>>
>>
If you are exposing the OVN northbound API directly, then you have two
options:
1. Hide stage, and just program everything in the default "acl" stage.
2. Expose stage and try to explain how it works.

Hardware switches have had multiple ACL tables for many many years.
As far as I can remember, they are always used for different features that
are configured separately:
Security based on VLANs
Security based on L3 interface
QoS
Service Function Chaining
Control Plane Protection

>>>>> This would need a clear set of guidelines for how northbound
>>>>> multistage ACLs would be used by a CMS, at the user level.
>>>>>
>>>>
>>>> The CMS typically does not expose ACLs directly to the user. For
>>>> example, with OpenStack, Security Groups use the default "acl" stage.
>>>> OpenStack FWaaS v2 would use the "acl2" stage. These are two separate
>>>> OpenStack features with separate OpenStack northbound APIs to the user.
>>>>
>>>
>>>
>>> First of all, OVN features should not each be tied to OpenStack.
>>>
>>
>> It was just used as an example of how it would be used ...
>>
>
As I said above and Russell reiterated, OpenStack FWaaS is just one example.
That is why I went with the generic stage names of "acl" and "acl2" rather
than something like "fw".

Mickey

> --
>> Russell Bryant
>>
>
>