[ovs-discuss] [ovn] lflows explosion when using a lot of FIPs (dnat_and_snat NAT entries)

Daniel Alvarez Sanchez dalvarez at redhat.com
Tue Jan 28 15:55:41 UTC 2020


Hi all,

Based on some problems that we've detected at scale, I've been doing an
analysis of how logical flows are distributed on a system which makes heavy
use of Floating IPs (dnat_and_snat NAT entries) and DVR.

[root@central ~]# ovn-nbctl list NAT | grep -c dnat_and_snat
985

With 985 Floating IPs (and ~1.2K ACLs), I can see that 680K logical flows
are generated. This puts terrible stress on every component (ovsdb-server,
ovn-northd, ovn-controller), especially when ovn-controllers reconnect to
the SB database and each of them has to read and process ~0.7 million
logical flows:

[root@central ~]# time ovn-sbctl list logical_flow > logical_flows.txt
real    1m17.465s
user    0m41.916s
sys     0m1.996s
[root@central ~]# grep -c _uuid logical_flows.txt
680276

The problem is even worse when a lot of clients are simultaneously reading
the dump from the SB DB server (this could certainly be alleviated by using
RAFT, but we're not there yet). It has even triggered the OOM killer on
ovsdb-server/ovn-northd and severely delays the control plane becoming
operational again.

I have looked into the generated lflows and their distribution per stage,
and found that 62.2% of them are in the lr_out_egr_loop stage and 31.2% are
in the lr_in_ip_routing stage:

[root@central ~]# head -n 10 logical_flows_distribution_sorted.txt
lr_out_egr_loop: 423414  62.24%
lr_in_ip_routing: 212199  31.19%
lr_in_ip_input: 10831  1.59%
ls_out_acl: 4831  0.71%
ls_in_port_sec_ip: 3471  0.51%
ls_in_l2_lkup: 2360  0.34%
....
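
A per-stage breakdown like the one above can be derived directly from the
`ovn-sbctl list logical_flow` dump. The following is a minimal Python sketch,
assuming that every record carries a stage-name key in its external_ids
column (as the records quoted in this message do). The function name
stage_distribution and the SAMPLE text are illustrative only; this is not
necessarily how logical_flows_distribution_sorted.txt was produced.

```python
import re
from collections import Counter

# Assumption: each Logical_Flow record has a stage-name key in external_ids.
# The optional quote tolerates dumps that print the value quoted.
STAGE_RE = re.compile(r'stage-name="?([A-Za-z0-9_]+)')


def stage_distribution(dump_text):
    """Return (stage, count, percent) tuples, most populated stage first."""
    counts = Counter(STAGE_RE.findall(dump_text))
    total = sum(counts.values())
    return [(stage, n, 100.0 * n / total) for stage, n in counts.most_common()]


# Tiny sample, abridged from records quoted in this message.
SAMPLE = """\
external_ids        : {source="ovn-northd.c:8807", stage-name=lr_out_egr_loop}
match               : "ip4.src == 10.142.140.39 && ip4.dst == 10.142.140.112"

external_ids        : {source="ovn-northd.c:8799", stage-name=lr_out_egr_loop}
match               : "ip4.src == 10.142.140.39 && ip4.dst == 10.142.141.19"

external_ids        : {source="ovn-northd.c:6782", stage-name=lr_in_ip_routing}
match               : "ip4.src == 10.1.0.170 && ip4.dst == 10.142.140.39"
"""

for stage, n, pct in stage_distribution(SAMPLE):
    print(f"{stage}: {n}  {pct:.2f}%")
```

For a real dump, the text would be read from the saved file (for example
logical_flows.txt) instead of SAMPLE.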

Looking first at the lflows in lr_out_egr_loop, I can see that there are
mainly two types of lflow:

1)

external_ids        : {source="ovn-northd.c:8807",
stage-name=lr_out_egr_loop}
logical_datapath    : 261206d2-72c5-4e79-ae5c-669e6ee4e71a
match               : "ip4.src == 10.142.140.39 && ip4.dst ==
10.142.140.112"
pipeline            : egress
priority            : 200
table_id            : 2
hash                : 0

2)
actions             : "inport = outport; outport = \"\"; flags = 0;
flags.loopback = 1; reg9[1] = 1; next(pipeline=ingress, table=0); "
external_ids        : {source="ovn-northd.c:8799",
stage-name=lr_out_egr_loop}
logical_datapath    : 161206d2-72c5-4e79-ae5c-669e6ee4e71a
match               :
"is_chassis_resident(\"42f64a6c-a52d-4712-8c56-876e8fb30c03\") && ip4.src
== 10.142.140.39 && ip4.dst == 10.142.141.19"
pipeline            : egress
priority            : 300

Looks like these lflows are added by this commit:
https://github.com/ovn-org/ovn/commit/551e3d989557bd2249d5bbe0978b44b775c5e619


Each Floating IP contributes ~1.2K lflows to this stage (and of course this
number grows as the number of FIPs grows):

[root@central ~]# grep 10.142.140.39 lr_out_egr_loop.txt | grep -c match
1233

Similarly, for the lr_in_ip_routing stage, we find the same pattern:

1)
actions             : "outport =
\"lrp-d2d745f5-91f0-4626-81c0-715c63d35716\"; eth.src = fa:16:3e:22:02:29;
eth.dst = fa:16:5e:6f:36:e4; reg0 = ip4.dst; reg1 = 10.142.143.147; reg9[2]
= 1; reg9[0] = 0; next;"
external_ids        : {source="ovn-northd.c:6782",
stage-name=lr_in_ip_routing}
logical_datapath    : 161206d2-72c5-4e79-ae5c-669e6ee4e71a
match               : "inport ==
\"lrp-09f7eba5-54b7-48f4-9820-80423b65c608\" && ip4.src == 10.1.0.170 &&
ip4.dst == 10.142.140.39"
pipeline            : ingress
priority            : 400

Looks like these last flows are added by this commit:
https://github.com/ovn-org/ovn/commit/8244c6b6bd8802a018e4ec3d3665510ebb16a9c7

Each FIP contributes 599 lflows to this stage:

[root@central ~]# grep -c 10.142.140.39 lr_in_ip_routing.txt
599
[root@central ~]# grep -c 10.142.140.185 lr_in_ip_routing.txt
599
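
The per-FIP counts above are taken one address at a time with grep. As a
rough alternative, the Python sketch below counts, for every IPv4 address at
once, how many lflows mention it in their match column. It assumes the dump
prints each column on a single line in "name : value" form. The function name
flows_per_ip, the regular expression and the SAMPLE text are illustrative
assumptions, not part of any OVN tool.

```python
import re
from collections import Counter

# Dotted-quad pattern that refuses to match inside a longer number, so
# 10.142.140.3 is not counted when the text says 10.142.140.39.
IPV4_RE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def flows_per_ip(dump_text):
    """Map each IPv4 address to the number of lflows whose match mentions it."""
    counts = Counter()
    for line in dump_text.splitlines():
        column, sep, value = line.partition(":")
        if sep and column.strip() == "match":
            # set() so a flow naming the same address twice counts once.
            counts.update(set(IPV4_RE.findall(value)))
    return counts


# Tiny sample, abridged from records quoted in this message.
SAMPLE = """\
match               : "ip4.src == 10.142.140.39 && ip4.dst == 10.142.140.112"
priority            : 200

match               : "ip4.src == 10.142.140.39 && ip4.dst == 10.142.141.19"
priority            : 300

match               : "ip4.src == 10.1.0.170 && ip4.dst == 10.142.140.39"
priority            : 400
"""

for ip, n in flows_per_ip(SAMPLE).most_common():
    print(f"{ip}: {n}")
```

Run against a per-stage file such as lr_in_ip_routing.txt, this would show
whether all FIPs contribute the same number of lflows or only some of them do.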

To figure out the relationship between the number of FIPs and the number of
lflows, I removed some of the FIPs. The percentage of lflows in both stages
remains constant:


[root@central ~]# ovn-nbctl find NAT type=dnat_and_snat | grep -c _uuid
833

[root@central ~]# grep -c _uuid logical_flows_2.txt
611640

lr_out_egr_loop: 379740  62.08%
lr_in_ip_routing: 190295   31.11%
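
As a cross-check on the two measurements, the arithmetic can be written out
as a short Python sketch. All figures are the ones quoted in this message;
the helper names share and marginal_per_fip are illustrative. The
per-removed-FIP numbers rest on only two data points and assume nothing else
changed between the two dumps, so they are a rough indication rather than a
scaling law.

```python
# Figures quoted in this message: FIP count, total lflows, and the two
# dominant stages, before and after removing some FIPs.
BEFORE = {"fips": 985, "total": 680276,
          "lr_out_egr_loop": 423414, "lr_in_ip_routing": 212199}
AFTER = {"fips": 833, "total": 611640,
         "lr_out_egr_loop": 379740, "lr_in_ip_routing": 190295}


def share(sample, stage):
    """Percentage of all lflows that belong to the given stage."""
    return 100.0 * sample[stage] / sample["total"]


def marginal_per_fip(before, after, key):
    """Average number of lflows that disappeared per removed FIP."""
    return (before[key] - after[key]) / (before["fips"] - after["fips"])


for stage in ("lr_out_egr_loop", "lr_in_ip_routing"):
    print(f"{stage}: {share(BEFORE, stage):.2f}% -> {share(AFTER, stage):.2f}%, "
          f"{marginal_per_fip(BEFORE, AFTER, stage):.1f} lflows per removed FIP")

print(f"total: {marginal_per_fip(BEFORE, AFTER, 'total'):.1f} "
      f"lflows per removed FIP")
```

The shares agree with the percentages listed above to within 0.01, and the
total drops by roughly 450 lflows for each FIP removed.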


I'd like to gather feedback on the mentioned commits to see if there's a way
we can avoid inserting those lflows, or somehow offload the calculation to
ovn-controller on the chassis where the logical port is bound. This way
we'd avoid the stress on ovsdb-server and ovn-northd.

Any thoughts?

Thanks,
Daniel