[ovs-dev] [PATCH v11 3/5] ovn: distributed NAT flows

Guru Shetty guru at ovn.org
Thu Jan 26 17:57:22 UTC 2017


On 26 January 2017 at 09:47, Mickey Spiegel <mickeys.dev at gmail.com> wrote:

>
> On Thu, Jan 26, 2017 at 8:53 AM, Guru Shetty <guru at ovn.org> wrote:
>
>>
>>
>> On 21 January 2017 at 16:52, Mickey Spiegel <mickeys.dev at gmail.com>
>> wrote:
>>
>>> This patch implements the flows required in the ingress and egress
>>> pipeline stages in order to support NAT on a distributed logical router.
>>>
>>> NAT functionality is associated with the logical router gateway port.
>>> The flows that carry out NAT functionality all have match conditions on
>>> inport or outport equal to the logical router gateway port.  There are
>>> additional flows that are used to redirect traffic when necessary,
>>> using the tunnel key of a "chassisredirect" SB port binding in order to
>>> redirect traffic to the instance of the logical router gateway port on
>>> the centralized "redirect-chassis".
>>>
>>> North/south traffic subject to one-to-one "dnat_and_snat" is handled
>>> in a distributed manner, with south-to-north traffic going to the
>>> local instance of the logical router gateway port.  North/south
>>> traffic subject to (possibly one-to-many) "snat" is handled in a
>>> centralized manner, with south-to-north traffic going to the instance
>>> of the logical router gateway port on the "redirect-chassis".
>>> North-to-south traffic is directed to the corresponding chassis by
>>> limiting ARP responses to the appropriate instance of the logical
>>> router gateway port on one chassis.  For centralized NAT rules, this
>>> is the instance on the "redirect-chassis".  For distributed NAT rules,
>>> this is the chassis where the corresponding logical port resides, using
>>> an ethernet address specified in the NB NAT rule to trigger upstream
>>> MAC learning.
>>>
>>> East/west NAT traffic is all handled in a centralized manner.  While it
>>> is certainly possible to handle some of this traffic in a distributed
>>> manner, the centralized approach keeps the NAT flows simpler and
>>> cleaner.  The expectation is that east/west NAT traffic is not as
>>> important to optimize as north/south NAT traffic, with most east/west
>>> traffic not requiring NAT.
>>>
>>> Automated tests are currently limited to only a single node.  The
>>> single node automated tests cover both north/south and east/west
>>> traffic flows.
>>>
>>> Signed-off-by: Mickey Spiegel <mickeys.dev at gmail.com>
>>>
>>  Acked-by: Gurucharan Shetty <guru at ovn.org>
>>
>> A few comments below...
>>
>
> Thanks for the review.
>
>
>>
>>
>>> +            /* For distributed router NAT, determine whether this NAT
>>> rule
>>> +             * satisfies the conditions for distributed NAT processing.
>>> */
>>> +            bool distributed = false;
>>> +            struct eth_addr mac;
>>> +            if (od->l3dgw_port && !strcmp(nat->type, "dnat_and_snat") &&
>>> +                nat->logical_port && nat->external_mac) {
>>> +                if (eth_addr_from_string(nat->external_mac, &mac)) {
>>> +                    distributed = true;
>>> +                } else {
>>> +                    static struct vlog_rate_limit rl =
>>> +                        VLOG_RATE_LIMIT_INIT(5, 1);
>>> +                    VLOG_WARN_RL(&rl, "bad mac %s for dnat in router "
>>> +                        ""UUID_FMT"", nat->external_mac,
>>> UUID_ARGS(&od->key));
>>>
>> Does bad mac need a "continue" ?
>>
>
> The way I have it right now, if the MAC is bad (or if there is an
> external_mac but no logical_port, or there is a logical_port but no
> external_mac) then the NAT rule is still installed, but only on the
> centralized instance of the distributed gateway port.
> Do you think it is better to "continue", where the NAT rule will not
> work at all until the MAC is fixed?
>
Since we are warning any way, it may make sense to continue.


>
>
>>
>>
>>
>>> +                }
>>> +            }
>>> +
>>>              /* Ingress UNSNAT table: It is for already established
>>> connections'
>>>               * reverse traffic. i.e., SNAT has already been done in
>>> egress
>>>               * pipeline and now the packet has entered the ingress
>>> pipeline as
>>>
>>
>> ...snip..
>>
>>
>>> @@ -4161,21 +4290,87 @@ build_lrouter_flows(struct hmap *datapaths,
>>> struct hmap *ports,
>>>               * to a logical IP address. */
>>>              if (!strcmp(nat->type, "dnat")
>>>                  || !strcmp(nat->type, "dnat_and_snat")) {
>>> -                /* Packet when it goes from the initiator to
>>> destination.
>>> -                 * We need to zero the inport because the router can
>>> -                 * send the packet back through the same interface. */
>>> +                if (!od->l3dgw_port) {
>>> +                    /* Gateway router. */
>>> +                    /* Packet when it goes from the initiator to
>>> destination.
>>> +                     * We need to set flags.loopback because the router
>>> can
>>> +                     * send the packet back through the same interface.
>>> */
>>> +                    ds_clear(&match);
>>> +                    ds_put_format(&match, "ip && ip4.dst == %s",
>>> +                                  nat->external_ip);
>>> +                    ds_clear(&actions);
>>> +                    if (dnat_force_snat_ip) {
>>> +                        /* Indicate to the future tables that a DNAT
>>> has taken
>>> +                         * place and a force SNAT needs to be done in
>>> the
>>> +                         * Egress SNAT table. */
>>> +                        ds_put_format(&actions,
>>> +                                      "flags.force_snat_for_dnat = 1;
>>> ");
>>> +                    }
>>> +                    ds_put_format(&actions, "flags.loopback = 1;
>>> ct_dnat(%s);",
>>> +                                  nat->logical_ip);
>>> +                    ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
>>> +                                  ds_cstr(&match), ds_cstr(&actions));
>>> +                } else {
>>> +                    /* Distributed router. */
>>> +
>>> +                    /* Traffic received on l3dgw_port is subject to
>>> NAT. */
>>> +                    ds_clear(&match);
>>> +                    ds_put_format(&match, "ip && ip4.dst == %s"
>>> +                                          " && inport == %s",
>>> +                                  nat->external_ip,
>>> +                                  od->l3dgw_port->json_key);
>>> +                    if (!distributed && od->l3redirect_port) {
>>> +                        /* Flows for NAT rules that are centralized are
>>> only
>>> +                         * programmed on the "redirect-chassis". */
>>> +                        ds_put_format(&match, " &&
>>> is_chassis_resident(%s)",
>>> +                                      od->l3redirect_port->json_key);
>>> +                    }
>>> +                    ds_clear(&actions);
>>> +                    ds_put_format(&actions,"ct_dnat(%s);",
>>>
>>
>> A space before ct_dnat
>>
>
> ct_dnat(%s) is the only action for this case.
>
I looked like there is no space after comma. Just a codingstyle thing.


>
>
>>
>> +                                  nat->logical_ip);
>>> +                    ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
>>> +                                  ds_cstr(&match), ds_cstr(&actions));
>>> +
>>> +                    /* Traffic received on other router ports must be
>>> +                     * redirected to the central instance of the
>>> l3dgw_port
>>> +                     * for NAT processing. */
>>> +                    ds_clear(&match);
>>> +                    ds_put_format(&match, "ip && ip4.dst == %s",
>>> +                                  nat->external_ip);
>>> +                    ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
>>> +                                  ds_cstr(&match),
>>> +                                  REGBIT_NAT_REDIRECT" = 1; next;");
>>> +                }
>>> +            }
>>> +
>>> +            /* Egress UNDNAT table: It is for already established
>>> connections'
>>> +             * reverse traffic. i.e., DNAT has already been done in
>>> ingress
>>> +             * pipeline and now the packet has entered the egress
>>> pipeline as
>>> +             * part of a reply. We undo the DNAT here.
>>> +             *
>>> +             * Note that this only applies for NAT on a distributed
>>> router.
>>> +             * Undo DNAT on a gateway router is done in the ingress DNAT
>>> +             * pipeline stage. */
>>> +            if (od->l3dgw_port && (!strcmp(nat->type, "dnat")
>>> +                || !strcmp(nat->type, "dnat_and_snat"))) {
>>>                  ds_clear(&match);
>>> -                ds_put_format(&match, "ip && ip4.dst == %s",
>>> nat->external_ip);
>>> +                ds_put_format(&match, "ip && ip4.src == %s"
>>> +                                      " && outport == %s",
>>> +                              nat->logical_ip,
>>> +                              od->l3dgw_port->json_key);
>>> +                if (!distributed && od->l3redirect_port) {
>>> +                    /* Flows for NAT rules that are centralized are only
>>> +                     * programmed on the "redirect-chassis". */
>>> +                    ds_put_format(&match, " && is_chassis_resident(%s)",
>>> +                                  od->l3redirect_port->json_key);
>>> +                }
>>>                  ds_clear(&actions);
>>> -                if (dnat_force_snat_ip) {
>>> -                    /* Indicate to the future tables that a DNAT has
>>> taken
>>> -                     * place and a force SNAT needs to be done in the
>>> Egress
>>> -                     * SNAT table. */
>>> -                    ds_put_format(&actions, "flags.force_snat_for_dnat
>>> = 1; ");
>>> +                if (distributed) {
>>> +                    ds_put_format(&actions, "eth.src = "ETH_ADDR_FMT";
>>> ",
>>> +                                  ETH_ADDR_ARGS(mac));
>>>                  }
>>> -                ds_put_format(&actions, "flags.loopback = 1;
>>> ct_dnat(%s);",
>>> -                              nat->logical_ip);
>>> -                ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
>>> +                ds_put_format(&actions, "ct_dnat;");
>>> +                ovn_lflow_add(lflows, od, S_ROUTER_OUT_UNDNAT, 100,
>>>                                ds_cstr(&match), ds_cstr(&actions));
>>>              }
>>>
>>>
>>>
>> ...snip....
>>
>>
>>> +
>>> +        /* Load balancing and packet defrag are only valid on
>>> +         * Gateway routers. */
>>> +        if (!smap_get(&od->nbr->options, "chassis")) {
>>> +            continue;
>>> +        }
>>>
>> Looks like load-balancing is not supported on the router.  Is there a
>> technical reason? If we don't intend to support this, would it make sense
>> to only have only one force snat flag? (I am not asking for this in this
>> patch series)
>>
>
> I just did not get to it, ran out of time. However, I could
> get away with a different register bit instead of the flag.
> On the distributed router, force snat would have to be
> carried out in the ingress pipeline on the distributed
> gateway port, since the egress pipeline is likely to be
> carried out on a different chassis. I would have to add
> another ingress pipeline stage to do it. I would have
> to differentiate between the gateway router case and
> the distributed router case in add_router_lb_flow(),
> but that is not difficult.
>
> The need for two flags is tied to the separate IP
> addresses "dnat_force_snat_ip" and "lb_force_snat_ip"
> in the northbound database. If this is changed to only
> one IP address, then for a distributed router the
> "force_snat_ip" would only apply to lb traffic, not to
> dnat traffic. This would have to be explained in
> documentation.
>
Lets just leave the 2 flags for now.


>
> Mickey
>
>


More information about the dev mailing list