[ovs-dev] [OVN][Scale] Conjunctive matches exponentially increase in Table 45

Han Zhou hzhou at ovn.org
Tue Apr 27 05:42:35 UTC 2021


On Mon, Mar 15, 2021 at 5:18 PM Krzysztof Klimonda <
kklimonda at syntaxhighlighted.com> wrote:

> Hi,
>
> Sorry for what is most likely an unconnected reply to a thread - I can't
> seem to figure out how to reply to a thread from before I was subscribed to
> the ML.
>
> We've been testing OVN scaling for our OpenStack cloud, and found what
> seems to be an OF flow explosion that is basically a mirror of the issue
> reported by Girish a week ago or so.
>
> In OpenStack, neutron creates a "default" security group that has 4 rules
> (2 each for IPv4 and IPv6):
>
> - allow all egress traffic from the port
> - allow all ingress traffic from other ports belonging to the same default
> group
>
> What we have discovered in our testing is that this second rule
> translates into the following ACL in OVN:
>
> ```
> outport == @pg_304cc336_8db3_4efd_a558_408e648e6259 && ip4 && ip4.src == $pg_304cc336_8db3_4efd_a558_408e648e6259_ip4
> ```
> where port_group `pg_304cc336_8db3_4efd_a558_408e648e6259` is defined
> in nbdb and contains all ports attached to the SG, and address_set
> `pg_304cc336_8db3_4efd_a558_408e648e6259_ip4` is defined in sbdb and seems
> to have a list of addresses that are assigned to ports from that
> port_group[1].
>
> As Girish has explained in his email, such ACLs are translated into a
> bunch of duplicated flows that only seem to differ in metadata:
>
> ```
> # ovs-ofctl dump-flows br-int |egrep "(12474|12475)"
> [...]
>  cookie=0x0, duration=47132.116s, table=45, n_packets=0, n_bytes=0, idle_age=47132, priority=2002,ip,reg0=0x100/0x100,reg15=0x3,metadata=0x20e actions=conjunction(12475,2/2)
>  cookie=0x0, duration=47132.116s, table=45, n_packets=0, n_bytes=0, idle_age=47132, priority=2002,ip,reg0=0x100/0x100,metadata=0x20e,nw_src=1.0.0.67 actions=conjunction(12475,1/2)
>  cookie=0x0, duration=47132.116s, table=45, n_packets=0, n_bytes=0, idle_age=47132, priority=2002,ip,reg0=0x100/0x100,metadata=0x20e,nw_src=2.0.0.52 actions=conjunction(12475,1/2)
>  cookie=0x0, duration=47132.116s, table=45, n_packets=0, n_bytes=0, idle_age=47132, priority=2002,ip,reg0=0x100/0x100,metadata=0x20d,nw_src=1.0.0.67 actions=conjunction(12475,1/2)
>  cookie=0x0, duration=47132.116s, table=45, n_packets=0, n_bytes=0, idle_age=47132, priority=2002,ip,reg0=0x100/0x100,metadata=0x20d,nw_src=2.0.0.52 actions=conjunction(12475,1/2)
>  cookie=0xb25108c3, duration=47132.116s, table=45, n_packets=0, n_bytes=0, idle_age=47132, priority=2002,conj_id=12475,ip,reg0=0x100/0x100,metadata=0x20e actions=resubmit(,46)
>  cookie=0xb25108c3, duration=47132.116s, table=45, n_packets=0, n_bytes=0, idle_age=47132, priority=2002,conj_id=12475,ip,reg0=0x100/0x100,metadata=0x20d actions=resubmit(,46)
> [...]
> #
> ```
> (See http://paste.openstack.org/show/803598/ for the full output of grep)
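
One way to see the duplication is to group the nw_src clause flows by the
metadata value they match on. The Python sketch below does that for the
conjunction flows quoted above (cookies and counters dropped). The helper
name datapaths_per_source is made up for illustration and is not part of
OVN or OVS.

```python
import re
from collections import defaultdict

# Match/action parts of the conjunction flows quoted in the dump above.
FLOWS = [
    "priority=2002,ip,reg0=0x100/0x100,reg15=0x3,metadata=0x20e actions=conjunction(12475,2/2)",
    "priority=2002,ip,reg0=0x100/0x100,metadata=0x20e,nw_src=1.0.0.67 actions=conjunction(12475,1/2)",
    "priority=2002,ip,reg0=0x100/0x100,metadata=0x20e,nw_src=2.0.0.52 actions=conjunction(12475,1/2)",
    "priority=2002,ip,reg0=0x100/0x100,metadata=0x20d,nw_src=1.0.0.67 actions=conjunction(12475,1/2)",
    "priority=2002,ip,reg0=0x100/0x100,metadata=0x20d,nw_src=2.0.0.52 actions=conjunction(12475,1/2)",
]


def datapaths_per_source(flows):
    """Map each nw_src address to the set of metadata values it appears with."""
    seen = defaultdict(set)
    for flow in flows:
        src = re.search(r"nw_src=([0-9.]+)", flow)
        md = re.search(r"metadata=(0x[0-9a-f]+)", flow)
        # Flows without nw_src (the reg15 clause) are skipped.
        if src and md:
            seen[src.group(1)].add(md.group(1))
    return dict(seen)


result = datapaths_per_source(FLOWS)
for address, metadata_values in sorted(result.items()):
    print(address, sorted(metadata_values))
```

In this excerpt each source address appears once per metadata value, which
is consistent with clause 1/2 growing with (addresses x datapaths) rather
than with the number of addresses alone.
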
>
> His idea of changing this conjunction into one that matches additionally
> on metadata seems to make sense in this particular instance, given that all
> ports from all datapaths need to evaluate the same set of rules, and
> possibly it makes sense for all ACLs too?
>
> Anyway, to understand how OF flows are generated by ovn-controller, I took
> a quick look at the source code, and it seems that right now all flows are
> forcefully matched to their datapath (by unconditional matching on metadata
> field).
> Would it make sense to introduce a notion of "datapath unbound flow" when
> the conjunction is already matching on metadata?
> Are there some other parts of OVN code that heavily depend on flows being
> installed per-dp?
> How would that affect OVS performance when matching packets in userspace?
> In our testing we've ended up with over 1M flows installed in table 45,
> which seems to be dwarfing any potential performance loss from having flows
> that don't match on metadata field, but perhaps I'm wrong? Still, that's a
> lot of flows, and puts a hard scaling limit on some OpenStack deployments
> given it's an SG that is by default attached to all ports on all VMs.
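
As a rough back-of-the-envelope model of the question above, the Python
sketch below counts table 45 flows for a single ACL of this shape under two
layouts: one where every clause flow matches on metadata, as in the dump,
and a hypothetical one where the address clause is installed once without a
metadata match. The function names and the example numbers are assumptions
for illustration only. They are not measurements, and this is not how
ovn-controller is implemented.

```python
def per_datapath_flows(num_addresses, ports_per_datapath):
    """Flow count when every clause flow also matches on metadata."""
    datapaths = len(ports_per_datapath)
    address_clause = num_addresses * datapaths  # nw_src flows, one set per datapath
    port_clause = sum(ports_per_datapath)       # reg15 flows, one per local port
    conj_flows = datapaths                      # conj_id flows, one per datapath
    return address_clause + port_clause + conj_flows


def shared_address_clause_flows(num_addresses, ports_per_datapath):
    """Hypothetical count if the address clause did not match on metadata."""
    datapaths = len(ports_per_datapath)
    return num_addresses + sum(ports_per_datapath) + datapaths


# Assumed example: 1000 addresses in the set, 500 datapaths with 2 ports each.
current = per_datapath_flows(1000, [2] * 500)
shared = shared_address_clause_flows(1000, [2] * 500)
print(current, shared)
```

The model ignores everything else installed in table 45 and only shows how
the address clause dominates the total once it is multiplied by the number
of datapaths.
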
>
>
> [1] (although apparently not additional IP addresses allowed on port via
> allowed-address-pair - I think I've seen this issue before while testing
> magnum).
>
>
> --
>   Krzysztof Klimonda
>   kklimonda at syntaxhighlighted.com


Hi Krzysztof,

Sorry for the late response, but here is a patch series for the problem:
https://patchwork.ozlabs.org/project/ovn/list/?series=240419

Would you give it a try?

Thanks,
Han

