[ovs-dev] [RFC v3 0/5] ovn: add distributed NAT capability
Mickey Spiegel
mickeys.dev at gmail.com
Thu Dec 15 14:34:10 UTC 2016
Currently OVN supports NAT functionality by connecting each distributed
logical router to a centralized "l3gateway" router that resides on a
single chassis. NAT is only carried out in the "l3gateway" router.
This patch set introduces NAT capability in the distributed logical
router itself, avoiding the need to pass through a transit logical
switch and a second logical router, and in many cases avoiding the need
to pass through a centralized chassis.
NAT functionality is associated with the logical router gateway port.
In order to support one-to-many SNAT (aka IP masquerading), where
multiple private IP addresses spread across multiple chassis are mapped
to a single public IP address, it will be necessary to handle some of
the logical router processing on a specific chassis in a centralized
manner. Some NAT flows are handled in a distributed manner on all
chassis (following the local "patch" port as is normally done for
distributed logical routers), while other NAT flows are handled on a
centralized "redirect-chassis".
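To illustrate why one-to-many SNAT needs a single point of handling,
here is a toy Python model. It is only a sketch: SnatAllocator, the
sequential port allocation, and the addresses are all made up for the
example, and real conntrack NAT picks ports differently. The point is
just that independent per-chassis state can map two different private
endpoints to the same public (IP, port) pair, while a single allocator
on the redirect-chassis cannot.

```python
# Toy model only; none of these names exist in OVN or conntrack.

class SnatAllocator:
    """Hands out source ports for one public IP, like one chassis's NAT state."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self.next_port = first_port
        # (private_ip, private_port) -> (public_ip, public_port)
        self.bindings = {}

    def translate(self, private_ip, private_port):
        key = (private_ip, private_port)
        if key not in self.bindings:
            # Assumption: sequential allocation, chosen for simplicity.
            self.bindings[key] = (self.public_ip, self.next_port)
            self.next_port += 1
        return self.bindings[key]


PUBLIC_IP = "203.0.113.10"  # documentation address, illustrative

# Distributed: each chassis keeps its own state for the same public IP.
chassis1 = SnatAllocator(PUBLIC_IP)
chassis2 = SnatAllocator(PUBLIC_IP)
a = chassis1.translate("10.0.0.5", 40000)
b = chassis2.translate("10.0.0.6", 40000)
distributed_collision = (a == b)  # two different VMs, same public tuple

# Centralized: one allocator on the redirect-chassis sees both connections.
redirect = SnatAllocator(PUBLIC_IP)
c = redirect.translate("10.0.0.5", 40000)
d = redirect.translate("10.0.0.6", 40000)
centralized_collision = (c == d)
```
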
North/south DNAT and SNAT are working, and some automated tests are
included. East/west NAT requires one more patch, which depends on the
pending "clone" patch:
1. Add egress loopback capability, along with associated
flags.egress_loopback. When flags.egress_loopback is set, at the
end of the egress pipeline, instead of the packet being sent out the
outport, the packet is forced back to the beginning of the ingress
pipeline with inport = outport. All other registers are cleared, as
if the packet just arrived on that inport.
This capability is needed in order to implement some of the
east/west NAT flows.
Note: The existing flags.loopback allows a packet to go from the end
of the ingress pipeline to the beginning of the egress pipeline with
outport = inport, which is different.
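The difference between the two flags in item 1 can be sketched with a
toy Python model. This is not the OpenFlow implementation; Packet,
end_of_ingress and end_of_egress are hypothetical names. Two points
are assumptions on my part: that flags.loopback works by suppressing
the usual drop when outport == inport, and that "all other registers
are cleared" includes the flags themselves (which also keeps the
packet from looping forever).

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    inport: str = ""
    outport: str = ""
    regs: dict = field(default_factory=dict)
    flags: set = field(default_factory=set)


def end_of_ingress(pkt):
    """Existing flags.loopback: permit ingress -> egress with outport == inport."""
    if pkt.outport == pkt.inport and "loopback" not in pkt.flags:
        return "drop"
    return "egress"


def end_of_egress(pkt):
    """Proposed flags.egress_loopback: egress -> start of ingress."""
    if "egress_loopback" in pkt.flags:
        pkt.inport = pkt.outport   # packet re-enters on the port it was leaving
        pkt.outport = ""           # as if the packet just arrived
        pkt.regs.clear()
        pkt.flags.clear()          # assumption: flags are cleared as well
        return "ingress"
    return "output"


pkt = Packet(inport="lrp-int", outport="lrp-gw",
             regs={"reg0": 1}, flags={"egress_loopback"})
next_stage = end_of_egress(pkt)
```

After the call, next_stage is "ingress", pkt.inport is "lrp-gw", and
the registers and flags are empty, so a second pass through
end_of_egress would send the packet out normally.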
Other to-do items include:
2. Rewrite the chassisredirect port logic to avoid creating an ofport.
This is dependent on patch 7 in blp's ovn-controller patch series.
As well as streamlining the code, this will remove a restriction
on the underlying distributed port name being at most 12 characters
long. The current patch set would not be able to work with
OpenStack until this limitation is addressed.
3. Unless there are local VIFs on a chassis, the localnet port on the
switch connected to the distributed router gateway port is not
getting instantiated. This would be resolved by patch 6 in blp's
ovn-controller patch set, which extends the notion of local
datapaths to include all reachable patched datapaths.
4. The NAT flows patch lifts the restriction that conntrack zones are
only assigned to datapaths for gateway routers. At the moment
conntrack zones are assigned to all datapaths. This should be
restricted. If the "datapaths of interest" work and/or blp's
ovn-controller patch set limit this to only reachable datapaths, is
that good enough?
5. The current automated test for NAT flows is single node, so it does
not cover the distributed functionality. Full coverage requires a
multi-node test with conntrack NAT capability, either in the kernel
or userspace. Is this possible?
Multi-node tests have been added for the chassisredirect patch,
testing non-NAT aspects of the distributed router gateway port.
6. Consider how to generalize distributed versus centralized handling
of non-NAT traffic being output on the distributed gateway port.
If MAC learning is used in the upstream network, then the
distributed gateway port’s MAC address must be restricted to the
redirect-chassis by using the chassisredirect port. In the
presence of dynamic protocols such as BGP EVPN, non-NAT traffic
could be handled in a distributed manner.
7. Gratuitous ARP for NAT addresses needs to be updated for
distributed NAT.
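The MAC learning concern in item 6 can be shown with a toy learning
switch in Python. This is only a sketch: LearningSwitch, the
hypervisor port names and the MAC address are invented for the
example. If several chassis source traffic from the distributed
gateway port's MAC, the upstream switch sees that MAC move between
ports; if only the redirect-chassis does, the entry stays put.

```python
# Toy model only; not OVN code.

class LearningSwitch:
    """Minimal MAC learning table that counts how often a MAC changes port."""

    def __init__(self):
        self.table = {}  # mac -> port where it was last seen
        self.moves = 0

    def receive(self, src_mac, port):
        old = self.table.get(src_mac)
        if old is not None and old != port:
            self.moves += 1  # the MAC "flapped" to a different port
        self.table[src_mac] = port


GW_MAC = "00:00:5e:00:53:01"  # illustrative gateway port MAC


def count_moves(sending_ports):
    """Feed frames with source GW_MAC arriving on the given switch ports."""
    sw = LearningSwitch()
    for port in sending_ports:
        sw.receive(GW_MAC, port)
    return sw.moves


# Every chassis sends with the gateway MAC: the entry keeps moving.
distributed_moves = count_moves(["hv1", "hv2", "hv1", "hv3"])

# Only the redirect-chassis sends with the gateway MAC: no moves.
centralized_moves = count_moves(["hv1", "hv1", "hv1", "hv1"])
```
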
v2 -> v3
Reordered the first two patches.
Moved non-NAT specific flows from patch 5 to patch 2.
Added automated tests for is_chassis_resident (which is ready for
review) and chassisredirect patches.
Added flows to limit ICMP echo replies for router IPs on the gateway
interface, so that they are only generated on the redirect-chassis.
Mickey Spiegel (5):
ovn: add is_chassis_resident match expression component
ovn: Introduce "chassisredirect" port binding
ovn: move load balancing flows after NAT flows
ovn: avoid snat recirc only on gateway routers
ovn: distributed NAT flows
include/ovn/actions.h | 3 +
include/ovn/expr.h | 22 +-
ovn/controller/binding.c | 143 +++++++-
ovn/controller/lflow.c | 45 ++-
ovn/controller/lflow.h | 1 +
ovn/controller/ovn-controller.8.xml | 15 +
ovn/controller/ovn-controller.c | 11 +-
ovn/controller/physical.c | 68 +++-
ovn/controller/physical.h | 2 +
ovn/lib/actions.c | 15 +-
ovn/lib/expr.c | 155 ++++++++-
ovn/northd/ovn-northd.8.xml | 322 ++++++++++++++++-
ovn/northd/ovn-northd.c | 663 ++++++++++++++++++++++++++++--------
ovn/ovn-nb.ovsschema | 13 +-
ovn/ovn-nb.xml | 66 +++-
ovn/ovn-sb.xml | 35 ++
ovn/utilities/ovn-trace.c | 21 +-
tests/ovn.at | 314 ++++++++++++++++-
tests/system-ovn.at | 155 +++++++++
tests/test-ovn.c | 15 +-
20 files changed, 1894 insertions(+), 190 deletions(-)
--
1.9.1