[ovs-dev] [PATCH v11 3/5] ovn: distributed NAT flows
Guru Shetty
guru at ovn.org
Thu Jan 26 17:57:22 UTC 2017
On 26 January 2017 at 09:47, Mickey Spiegel <mickeys.dev at gmail.com> wrote:
>
> On Thu, Jan 26, 2017 at 8:53 AM, Guru Shetty <guru at ovn.org> wrote:
>
>>
>>
>> On 21 January 2017 at 16:52, Mickey Spiegel <mickeys.dev at gmail.com>
>> wrote:
>>
>>> This patch implements the flows required in the ingress and egress
>>> pipeline stages in order to support NAT on a distributed logical router.
>>>
>>> NAT functionality is associated with the logical router gateway port.
>>> The flows that carry out NAT functionality all have match conditions on
>>> inport or outport equal to the logical router gateway port. There are
>>> additional flows that are used to redirect traffic when necessary,
>>> using the tunnel key of a "chassisredirect" SB port binding in order to
>>> redirect traffic to the instance of the logical router gateway port on
>>> the centralized "redirect-chassis".
>>>
>>> North/south traffic subject to one-to-one "dnat_and_snat" is handled
>>> in a distributed manner, with south-to-north traffic going to the
>>> local instance of the logical router gateway port. North/south
>>> traffic subject to (possibly one-to-many) "snat" is handled in a
>>> centralized manner, with south-to-north traffic going to the instance
>>> of the logical router gateway port on the "redirect-chassis".
>>> North-to-south traffic is directed to the corresponding chassis by
>>> limiting ARP responses to the appropriate instance of the logical
>>> router gateway port on one chassis. For centralized NAT rules, this
>>> is the instance on the "redirect-chassis". For distributed NAT rules,
>>> this is the chassis where the corresponding logical port resides, using
>>> an ethernet address specified in the NB NAT rule to trigger upstream
>>> MAC learning.
>>>
>>> East/west NAT traffic is all handled in a centralized manner. While it
>>> is certainly possible to handle some of this traffic in a distributed
>>> manner, the centralized approach keeps the NAT flows simpler and
>>> cleaner. The expectation is that east/west NAT traffic is not as
>>> important to optimize as north/south NAT traffic, with most east/west
>>> traffic not requiring NAT.
>>>
>>> Automated tests are currently limited to only a single node. The
>>> single node automated tests cover both north/south and east/west
>>> traffic flows.
>>>
>>> Signed-off-by: Mickey Spiegel <mickeys.dev at gmail.com>
>>>
>> Acked-by: Gurucharan Shetty <guru at ovn.org>
>>
>> A few comments below...
>>
>
> Thanks for the review.
>
>
>>
>>
>>> + /* For distributed router NAT, determine whether this NAT
>>> rule
>>> + * satisfies the conditions for distributed NAT processing.
>>> */
>>> + bool distributed = false;
>>> + struct eth_addr mac;
>>> + if (od->l3dgw_port && !strcmp(nat->type, "dnat_and_snat") &&
>>> + nat->logical_port && nat->external_mac) {
>>> + if (eth_addr_from_string(nat->external_mac, &mac)) {
>>> + distributed = true;
>>> + } else {
>>> + static struct vlog_rate_limit rl =
>>> + VLOG_RATE_LIMIT_INIT(5, 1);
>>> + VLOG_WARN_RL(&rl, "bad mac %s for dnat in router "
>>> + ""UUID_FMT"", nat->external_mac,
>>> UUID_ARGS(&od->key));
>>>
>> Does bad mac need a "continue" ?
>>
>
> The way I have it right now, if the MAC is bad (or if there is an
> external_mac but no logical_port, or there is a logical_port but no
> external_mac) then the NAT rule is still installed, but only on the
> centralized instance of the distributed gateway port.
> Do you think it is better to "continue", where the NAT rule will not
> work at all until the MAC is fixed?
>
Since we are warning any way, it may make sense to continue.
>
>
>>
>>
>>
>>> + }
>>> + }
>>> +
>>> /* Ingress UNSNAT table: It is for already established
>>> connections'
>>> * reverse traffic. i.e., SNAT has already been done in
>>> egress
>>> * pipeline and now the packet has entered the ingress
>>> pipeline as
>>>
>>
>> ...snip..
>>
>>
>>> @@ -4161,21 +4290,87 @@ build_lrouter_flows(struct hmap *datapaths,
>>> struct hmap *ports,
>>> * to a logical IP address. */
>>> if (!strcmp(nat->type, "dnat")
>>> || !strcmp(nat->type, "dnat_and_snat")) {
>>> - /* Packet when it goes from the initiator to
>>> destination.
>>> - * We need to zero the inport because the router can
>>> - * send the packet back through the same interface. */
>>> + if (!od->l3dgw_port) {
>>> + /* Gateway router. */
>>> + /* Packet when it goes from the initiator to
>>> destination.
>>> + * We need to set flags.loopback because the router
>>> can
>>> + * send the packet back through the same interface.
>>> */
>>> + ds_clear(&match);
>>> + ds_put_format(&match, "ip && ip4.dst == %s",
>>> + nat->external_ip);
>>> + ds_clear(&actions);
>>> + if (dnat_force_snat_ip) {
>>> + /* Indicate to the future tables that a DNAT
>>> has taken
>>> + * place and a force SNAT needs to be done in
>>> the
>>> + * Egress SNAT table. */
>>> + ds_put_format(&actions,
>>> + "flags.force_snat_for_dnat = 1;
>>> ");
>>> + }
>>> + ds_put_format(&actions, "flags.loopback = 1;
>>> ct_dnat(%s);",
>>> + nat->logical_ip);
>>> + ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
>>> + ds_cstr(&match), ds_cstr(&actions));
>>> + } else {
>>> + /* Distributed router. */
>>> +
>>> + /* Traffic received on l3dgw_port is subject to
>>> NAT. */
>>> + ds_clear(&match);
>>> + ds_put_format(&match, "ip && ip4.dst == %s"
>>> + " && inport == %s",
>>> + nat->external_ip,
>>> + od->l3dgw_port->json_key);
>>> + if (!distributed && od->l3redirect_port) {
>>> + /* Flows for NAT rules that are centralized are
>>> only
>>> + * programmed on the "redirect-chassis". */
>>> + ds_put_format(&match, " &&
>>> is_chassis_resident(%s)",
>>> + od->l3redirect_port->json_key);
>>> + }
>>> + ds_clear(&actions);
>>> + ds_put_format(&actions,"ct_dnat(%s);",
>>>
>>
>> A space before ct_dnat
>>
>
> ct_dnat(%s) is the only action for this case.
>
I looked like there is no space after comma. Just a codingstyle thing.
>
>
>>
>> + nat->logical_ip);
>>> + ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
>>> + ds_cstr(&match), ds_cstr(&actions));
>>> +
>>> + /* Traffic received on other router ports must be
>>> + * redirected to the central instance of the
>>> l3dgw_port
>>> + * for NAT processing. */
>>> + ds_clear(&match);
>>> + ds_put_format(&match, "ip && ip4.dst == %s",
>>> + nat->external_ip);
>>> + ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
>>> + ds_cstr(&match),
>>> + REGBIT_NAT_REDIRECT" = 1; next;");
>>> + }
>>> + }
>>> +
>>> + /* Egress UNDNAT table: It is for already established
>>> connections'
>>> + * reverse traffic. i.e., DNAT has already been done in
>>> ingress
>>> + * pipeline and now the packet has entered the egress
>>> pipeline as
>>> + * part of a reply. We undo the DNAT here.
>>> + *
>>> + * Note that this only applies for NAT on a distributed
>>> router.
>>> + * Undo DNAT on a gateway router is done in the ingress DNAT
>>> + * pipeline stage. */
>>> + if (od->l3dgw_port && (!strcmp(nat->type, "dnat")
>>> + || !strcmp(nat->type, "dnat_and_snat"))) {
>>> ds_clear(&match);
>>> - ds_put_format(&match, "ip && ip4.dst == %s",
>>> nat->external_ip);
>>> + ds_put_format(&match, "ip && ip4.src == %s"
>>> + " && outport == %s",
>>> + nat->logical_ip,
>>> + od->l3dgw_port->json_key);
>>> + if (!distributed && od->l3redirect_port) {
>>> + /* Flows for NAT rules that are centralized are only
>>> + * programmed on the "redirect-chassis". */
>>> + ds_put_format(&match, " && is_chassis_resident(%s)",
>>> + od->l3redirect_port->json_key);
>>> + }
>>> ds_clear(&actions);
>>> - if (dnat_force_snat_ip) {
>>> - /* Indicate to the future tables that a DNAT has
>>> taken
>>> - * place and a force SNAT needs to be done in the
>>> Egress
>>> - * SNAT table. */
>>> - ds_put_format(&actions, "flags.force_snat_for_dnat
>>> = 1; ");
>>> + if (distributed) {
>>> + ds_put_format(&actions, "eth.src = "ETH_ADDR_FMT";
>>> ",
>>> + ETH_ADDR_ARGS(mac));
>>> }
>>> - ds_put_format(&actions, "flags.loopback = 1;
>>> ct_dnat(%s);",
>>> - nat->logical_ip);
>>> - ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
>>> + ds_put_format(&actions, "ct_dnat;");
>>> + ovn_lflow_add(lflows, od, S_ROUTER_OUT_UNDNAT, 100,
>>> ds_cstr(&match), ds_cstr(&actions));
>>> }
>>>
>>>
>>>
>> ...snip....
>>
>>
>>> +
>>> + /* Load balancing and packet defrag are only valid on
>>> + * Gateway routers. */
>>> + if (!smap_get(&od->nbr->options, "chassis")) {
>>> + continue;
>>> + }
>>>
>> Looks like load-balancing is not supported on the router. Is there a
>> technical reason? If we don't intend to support this, would it make sense
>> to only have only one force snat flag? (I am not asking for this in this
>> patch series)
>>
>
> I just did not get to it, ran out of time. However, I could
> get away with a different register bit instead of the flag.
> On the distributed router, force snat would have to be
> carried out in the ingress pipeline on the distributed
> gateway port, since the egress pipeline is likely to be
> carried out on a different chassis. I would have to add
> another ingress pipeline stage to do it. I would have
> to differentiate between the gateway router case and
> the distributed router case in add_router_lb_flow(),
> but that is not difficult.
>
> The need for two flags is tied to the separate IP
> addresses "dnat_force_snat_ip" and "lb_force_snat_ip"
> in the northbound database. If this is changed to only
> one IP address, then for a distributed router the
> "force_snat_ip" would only apply to lb traffic, not to
> dnat traffic. This would have to be explained in
> documentation.
>
Lets just leave the 2 flags for now.
>
> Mickey
>
>
More information about the dev
mailing list