[ovs-dev] [PATCH v4] ovn: DNAT and SNAT on a gateway router.

Flaviof flavio at flaviof.com
Tue Jun 21 02:36:21 UTC 2016


On Mon, Jun 13, 2016 at 6:45 AM, Gurucharan Shetty <guru at ovn.org> wrote:

> For traffic from physical space to virtual space we need DNAT.
> The DNAT happens in the gateway router and reaches the logical
> port. The return traffic should be unDNATed.
>
> Traffic originating in virtual space heading to physical space
> should be SNATed. The return traffic is unSNATted.
>
> East-west traffic with the public destination IP address needs
> a DNAT. This traffic is punted to the l3 gateway where DNAT
> takes place. This traffic is also SNATed and eventually loops back to
> its destination. The SNAT is needed because we need the reverse traffic
> to go back to the l3 gateway and not short-circuit directly to the source.
>
> This commit introduces 4 new logical actions.
> 1. ct_snat: To send the packet through SNAT zone to unSNAT packets.
> 2. ct_snat(IP): To SNAT to the provided IP address.
> 3. ct_dnat: To send the packet through DNAT zone to unDNAT packets.
> 4. ct_dnat(IP): To DNAT to the provided IP.
>
> This commit only provides the ability to do IP based NAT. This will
> eventually be enhanced to do PORT based NAT too.
>
> Command hints:
>
> Consider a distributed router "R1" that has switch foo (192.168.1.0/24)
> with a lport foo1 (192.168.1.2) and bar (192.168.2.0/24) with lport bar1
> (192.168.2.2) connected to it. You connect "R1" to
> a gateway router "R2" via a switch "join" in (20.0.0.0/24) network.
>
> R2 has a switch "alice" (172.16.1.0/24) connected to it (to simulate
> external network).
>
> case: Add pure DNAT (north-south)
>
> Add a DNAT rule in R2:
> ovn-nbctl -- --id=@nat create nat type="dnat" logical_ip=192.168.1.2 \
> external_ip=30.0.0.2 -- add logical_router R2 nat @nat
>
> Now alice1 should be able to ping 192.168.1.2 via 30.0.0.2.
>
> case2 : Add pure SNAT (south-north)
>
> Add a SNAT rule in R2:
>
> ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
> external_ip=30.0.0.1 -- add logical_router R2 nat @nat
>
> (You need a static route in R1 to send packets destined to outside
> world to go through R2. The logical_ip can be a subnet.)
>
> When bar1 pings alice1, alice1 receives traffic from 30.0.0.1
>
> case3 : SNAT and DNAT (east-west traffic)
>
> When bar1 pings 30.0.0.2, the traffic jumps to the gateway router
> and loops back to foo1 with a source ip address of 30.0.0.1
>
>
So, is the 30.0.0.0/x network an external network that R2 has a port on?
What next hop would R2 use to reach a destination beyond that subnet?

I think this may become clearer once a test that uses foo, bar, join and
alice is added to ovn.at.

Based on the code and my little test setup, DNAT entries seem to carry a
high cost: an ARP response rule is added for every combination of DNAT
entry and router port. In the example used by the commit message, ingress
table 1 of the logical router will have ARP response entries for inports
alice and R2_join.

Is that expected? It may well be fine, since it only takes place on the
gateway router (i.e. a logical router associated with a chassis).

Side note: After adding DNAT, I tried removing and re-adding 'alice' and
'rp-alice'. As expected, the ARP reply rules were properly removed when the
ports were deleted, and properly re-added when the ports were re-created.
Nice!

Table 3: do we really intend to apply the actions 'inport = ""; ct_dnat;'
to all ip packets that do not have an explicit dnat mapping?

SNAT: do we need ARP reply rules for the SNAT addresses, similar to the
ones added for DNAT?

SNAT: looking at the OpenFlow table I see no mention of the address added
to support SNAT. Is that because it is all handled by the connection tracker
and there is nothing to be done via OpenFlow? Or is it maybe part of another
patchset?

Thanks,

-- flaviof




> Signed-off-by: Gurucharan Shetty <guru at ovn.org>
> ---
>  ovn/lib/actions.c           |  83 ++++++++++++++++++++
>  ovn/northd/ovn-northd.8.xml | 131 ++++++++++++++++++++++++++++---
>  ovn/northd/ovn-northd.c     | 187 ++++++++++++++++++++++++++++++++++++++++++--
>  ovn/ovn-nb.ovsschema        |  19 ++++-
>  ovn/ovn-nb.xml              |  65 +++++++++++++--
>  ovn/ovn-sb.xml              |  41 ++++++++++
>  ovn/utilities/ovn-nbctl.c   |   5 ++
>  tests/ovn.at                |  17 ++++
>  8 files changed, 524 insertions(+), 24 deletions(-)
>
> diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
> index 5f0bf19..4a486a0 100644
> --- a/ovn/lib/actions.c
> +++ b/ovn/lib/actions.c
> @@ -442,6 +442,85 @@ emit_ct(struct action_context *ctx, bool recirc_next, bool commit)
>      add_prerequisite(ctx, "ip");
>  }
>
> +static void
> +parse_ct_nat(struct action_context *ctx, bool snat)
> +{
> +    const size_t ct_offset = ctx->ofpacts->size;
> +    ofpbuf_pull(ctx->ofpacts, ct_offset);
> +
> +    struct ofpact_conntrack *ct = ofpact_put_CT(ctx->ofpacts);
> +
> +    if (ctx->ap->cur_ltable < ctx->ap->n_tables) {
> +        ct->recirc_table = ctx->ap->first_ptable + ctx->ap->cur_ltable + 1;
> +    } else {
> +        action_error(ctx,
> +                     "\"ct_[sd]nat\" action not allowed in last table.");
> +        return;
> +    }
> +
> +    if (snat) {
> +        ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
> +    } else {
> +        ct->zone_src.field = mf_from_id(MFF_LOG_DNAT_ZONE);
> +    }
> +    ct->zone_src.ofs = 0;
> +    ct->zone_src.n_bits = 16;
> +    ct->flags = 0;
> +    ct->alg = 0;
> +
> +    add_prerequisite(ctx, "ip");
> +
> +    struct ofpact_nat *nat;
> +    size_t nat_offset;
> +    nat_offset = ctx->ofpacts->size;
> +    ofpbuf_pull(ctx->ofpacts, nat_offset);
> +
> +    nat = ofpact_put_NAT(ctx->ofpacts);
> +    nat->flags = 0;
> +    nat->range_af = AF_UNSPEC;
> +
> +    int commit = 0;
> +    if (lexer_match(ctx->lexer, LEX_T_LPAREN)) {
> +        ovs_be32 ip;
> +        if (ctx->lexer->token.type == LEX_T_INTEGER
> +            && ctx->lexer->token.format == LEX_F_IPV4) {
> +            ip = ctx->lexer->token.value.ipv4;
> +        } else {
> +            action_syntax_error(ctx, "invalid ip");
> +            return;
> +        }
> +
> +        nat->range_af = AF_INET;
> +        nat->range.addr.ipv4.min = ip;
> +        if (snat) {
> +            nat->flags |= NX_NAT_F_SRC;
> +        } else {
> +            nat->flags |= NX_NAT_F_DST;
> +        }
> +        commit = NX_CT_F_COMMIT;
> +        lexer_get(ctx->lexer);
> +        if (!lexer_match(ctx->lexer, LEX_T_RPAREN)) {
> +            action_syntax_error(ctx, "expecting `)'");
> +            return;
> +        }
> +    }
> +
> +    ctx->ofpacts->header = ofpbuf_push_uninit(ctx->ofpacts, nat_offset);
> +    ct = ctx->ofpacts->header;
> +    ct->flags |= commit;
> +
> +    /* XXX: For performance reasons, we try to prevent additional
> +     * recirculations.  So far, ct_snat which is used in a gateway router
> +     * does not need a recirculation. ct_snat(IP) does need a recirculation.
> +     * Should we consider a method to let the actions specify whether a action
> +     * needs recirculation if there more use cases?. */
> +    if (!commit && snat) {
> +        ct->recirc_table = NX_CT_RECIRC_NONE;
> +    }
> +    ofpact_finish(ctx->ofpacts, &ct->ofpact);
> +    ofpbuf_push_uninit(ctx->ofpacts, ct_offset);
> +}
> +
>  static bool
>  parse_action(struct action_context *ctx)
>  {
> @@ -469,6 +548,10 @@ parse_action(struct action_context *ctx)
>          emit_ct(ctx, true, false);
>      } else if (lexer_match_id(ctx->lexer, "ct_commit")) {
>          emit_ct(ctx, false, true);
> +    } else if (lexer_match_id(ctx->lexer, "ct_dnat")) {
> +        parse_ct_nat(ctx, false);
> +    } else if (lexer_match_id(ctx->lexer, "ct_snat")) {
> +        parse_ct_nat(ctx, true);
>      } else if (lexer_match_id(ctx->lexer, "arp")) {
>          parse_arp_action(ctx);
>      } else if (lexer_match_id(ctx->lexer, "get_arp")) {
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 1983812..c237604 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -517,11 +517,40 @@ next;
>
>        <li>
>          <p>
> -          Reply to ARP requests.  These flows reply to ARP requests for the
> -          router's own IP address.  For each router port <var>P</var> that owns
> -          IP address <var>A</var> and Ethernet address <var>E</var>, a
> -          priority-90 flow matches <code>inport == <var>P</var> &amp;&amp;
> -          arp.op == 1 &amp;&amp; arp.tpa == <var>A</var></code> (ARP request)
> +          Reply to ARP requests.
> +        </p>
> +
> +        <p>
> +          These flows reply to ARP requests for the router's own IP address.
> +          For each router port <var>P</var> that owns IP address <var>A</var>
> +          and Ethernet address <var>E</var>, a priority-90 flow matches
> +          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
> +          arp.tpa == <var>A</var></code> (ARP request) with the following
> +          actions:
> +        </p>
> +
> +        <pre>
> +eth.dst = eth.src;
> +eth.src = <var>E</var>;
> +arp.op = 2; /* ARP reply. */
> +arp.tha = arp.sha;
> +arp.sha = <var>E</var>;
> +arp.tpa = arp.spa;
> +arp.spa = <var>A</var>;
> +outport = <var>P</var>;
> +inport = ""; /* Allow sending out inport. */
> +output;
> +        </pre>
> +      </li>
> +
> +      <li>
> +        <p>
> +          These flows reply to ARP requests for the virtual IP addresses
> +          configured in the router for DNAT. For a configured DNAT IP address
> +          <var>A</var>, for each router port <var>P</var> with Ethernet
> +          address <var>E</var>, a priority-90 flow matches
> +          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
> +          arp.tpa == <var>A</var></code> (ARP request)
>            with the following actions:
>          </p>
>
> @@ -663,7 +692,62 @@ icmp4 {
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 2: IP Routing</h3>
> +    <h3>Ingress Table 2: UNSNAT</h3>
> +
> +    <p>
> +      This is for already established connections' reverse traffic.
> +      i.e., SNAT has already been done in egress pipeline and now the
> +      packet has entered the ingress pipeline as part of a reply.  It is
> +      unSNATted here.
> +    </p>
> +
> +    <ul>
> +      <li>
> +        <p>
> +          For each configuration in the OVN Northbound database, that asks
> +          to change the source IP address of a packet from <var>A</var> to
> +          <var>B</var>, a priority-100 flow matches <code>ip &amp;&amp;
> +          ip4.dst == <var>B</var></code> with an action
> +          <code>ct_snat; next;</code>.
> +        </p>
> +
> +        <p>
> +          A priority-0 logical flow with match <code>1</code> has actions
> +          <code>next;</code>.
> +        </p>
> +      </li>
> +    </ul>
> +
> +    <h3>Ingress Table 3: DNAT</h3>
> +
> +    <p>
> +      Packets enter the pipeline with destination IP address that needs to
> +      be DNATted from a virtual IP address to a real IP address.  Packets
> +      in the reverse direction needs to be unDNATed.
> +    </p>
> +    <ul>
> +      <li>
> +        <p>
> +          For each configuration in the OVN Northbound database, that asks
> +          to change the destination IP address of a packet from <var>A</var> to
> +          <var>B</var>, a priority-100 flow matches <code>ip &amp;&amp;
> +          ip4.dst == <var>A</var></code> with an action <code>inport = "";
> +          ct_dnat(<var>B</var>);</code>.
> +        </p>
> +
> +        <p>
> +          For all IP packets of a Gateway router, a priority-50 flow with an
> +          action <code>inport = ""; ct_dnat;</code>.
> +        </p>
> +
> +        <p>
> +          A priority-0 logical flow with match <code>1</code> has actions
> +          <code>next;</code>.
> +        </p>
> +      </li>
> +    </ul>
> +
> +    <h3>Ingress Table 4: IP Routing</h3>
>
>      <p>
>        A packet that arrives at this table is an IP packet that should be routed
> @@ -672,7 +756,7 @@ icmp4 {
>        <code>ip4.dst</code>, the packet's final destination, unchanged) and
>        advances to the next table for ARP resolution.  It also sets
>        <code>reg1</code> to the IP address owned by the selected router port
> -      (which is used later in table 4 as the IP source address for an ARP
> +      (which is used later in table 6 as the IP source address for an ARP
>        request, if needed).
>      </p>
>
> @@ -743,7 +827,7 @@ icmp4 {
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 3: ARP Resolution</h3>
> +    <h3>Ingress Table 5: ARP Resolution</h3>
>
>      <p>
>        Any packet that reaches this table is an IP packet whose next-hop IP
> @@ -798,7 +882,7 @@ icmp4 {
>        </li>
>      </ul>
>
> -    <h3>Ingress Table 4: ARP Request</h3>
> +    <h3>Ingress Table 6: ARP Request</h3>
>
>      <p>
>        In the common case where the Ethernet destination has been resolved, this
> @@ -823,7 +907,7 @@ arp {
>          </pre>
>
>          <p>
> -          (Ingress table 2 initialized <code>reg1</code> with the IP address
> +          (Ingress table 4 initialized <code>reg1</code> with the IP address
>            owned by <code>outport</code>.)
>          </p>
>
> @@ -838,7 +922,32 @@ arp {
>        </li>
>      </ul>
>
> -    <h3>Egress Table 0: Delivery</h3>
> +    <h3>Egress Table 0: SNAT</h3>
> +
> +    <p>
> +      Packets that are configured to be SNATed get their source IP address
> +      changed based on the configuration in the OVN Northbound database.
> +    </p>
> +    <ul>
> +      <li>
> +        <p>
> +          For each configuration in the OVN Northbound database, that asks
> +          to change the source IP address of a packet from an IP address of
> +          <var>A</var> or to change the source IP address of a packet that
> +          belongs to network <var>A</var> to <var>B</var>, a flow matches
> +          <code>ip &amp;&amp; ip4.src == <var>A</var></code> with an action
> +          <code>ct_snat(<var>B</var>);</code>.  The priority of the flow
> +          is calculated based on the mask of <var>A</var>, with matches
> +          having larger masks getting higher priorities.
> +        </p>
> +        <p>
> +          A priority-0 logical flow with match <code>1</code> has actions
> +          <code>next;</code>.
> +        </p>
> +      </li>
> +    </ul>
> +
> +    <h3>Egress Table 1: Delivery</h3>
>
>      <p>
>        Packets that reach this table are ready for delivery.  It contains
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index cac0148..4683780 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -105,12 +105,15 @@ enum ovn_stage {
>      /* Logical router ingress stages. */                              \
>      PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission")    \
>      PIPELINE_STAGE(ROUTER, IN,  IP_INPUT,    1, "lr_in_ip_input")     \
> -    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  2, "lr_in_ip_routing")   \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 3, "lr_in_arp_resolve")  \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 4, "lr_in_arp_request")  \
> +    PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      2, "lr_in_unsnat")       \
> +    PIPELINE_STAGE(ROUTER, IN,  DNAT,        3, "lr_in_dnat")         \
> +    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  4, "lr_in_ip_routing")   \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 5, "lr_in_arp_resolve")  \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 6, "lr_in_arp_request")  \
>                                                                        \
>      /* Logical router egress stages. */                               \
> -    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,    0, "lr_out_delivery")
> +    PIPELINE_STAGE(ROUTER, OUT, SNAT,      0, "lr_out_snat")          \
> +    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,  1, "lr_out_delivery")
>
>  #define PIPELINE_STAGE(DP_TYPE, PIPELINE, STAGE, TABLE, NAME)   \
>      S_##DP_TYPE##_##PIPELINE##_##STAGE                          \
> @@ -1998,6 +2001,51 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          free(match);
>          free(actions);
>
> +        /* ARP handling for external IP addresses.
> +         *
> +         * DNAT IP addresses are external IP addresses that need ARP
> +         * handling. */
> +        for (int i = 0; i < op->od->nbr->n_nat; i++) {
> +            const struct nbrec_nat *nat;
> +
> +            nat = op->od->nbr->nat[i];
> +
> +            if(!strcmp(nat->type, "snat")) {
> +                continue;
> +            }
> +
> +            ovs_be32 ip;
> +            if (!ip_parse(nat->external_ip, &ip) || !ip) {
> +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> +                VLOG_WARN_RL(&rl, "bad ip address %s in dnat configuration "
> +                             "for router %s", nat->external_ip, op->key);
> +                continue;
> +            }
> +
> +            match = xasprintf(
> +                "inport == %s && arp.tpa == "IP_FMT" && arp.op == 1",
> +                op->json_key, IP_ARGS(ip));
> +            actions = xasprintf(
> +                "eth.dst = eth.src; "
> +                "eth.src = "ETH_ADDR_FMT"; "
> +                "arp.op = 2; /* ARP reply */ "
> +                "arp.tha = arp.sha; "
> +                "arp.sha = "ETH_ADDR_FMT"; "
> +                "arp.tpa = arp.spa; "
> +                "arp.spa = "IP_FMT"; "
> +                "outport = %s; "
> +                "inport = \"\"; /* Allow sending out inport. */ "
> +                "output;",
> +                ETH_ADDR_ARGS(op->mac),
> +                ETH_ADDR_ARGS(op->mac),
> +                IP_ARGS(ip),
> +                op->json_key);
> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
> +                          match, actions);
> +            free(match);
> +            free(actions);
> +        }
> +
>          /* Drop IP traffic to this router. */
>          match = xasprintf("ip4.dst == "IP_FMT, IP_ARGS(op->ip));
>          ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 60,
> @@ -2005,6 +2053,135 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          free(match);
>      }
>
> +    /* NAT in Gateway routers. */
> +    HMAP_FOR_EACH (od, key_node, datapaths) {
> +        if (!od->nbr) {
> +            continue;
> +        }
> +
> +        /* Packets are allowed by default. */
> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 0, "1", "next;");
> +        ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 0, "1", "next;");
> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 0, "1", "next;");
> +
> +        /* NAT rules are only valid on Gateway routers. */
> +        if (!smap_get(&od->nbr->options, "chassis")) {
> +            continue;
> +        }
> +
> +        for (int i = 0; i < od->nbr->n_nat; i++) {
> +            const struct nbrec_nat *nat;
> +
> +            nat = od->nbr->nat[i];
> +
> +            ovs_be32 ip, mask;
> +
> +            char *error = ip_parse_masked(nat->external_ip, &ip, &mask);
> +            if (error || mask != OVS_BE32_MAX) {
> +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> +                VLOG_WARN_RL(&rl, "bad external ip %s for nat",
> +                             nat->external_ip);
> +                free(error);
> +                continue;
> +            }
> +
> +            /* Check the validity of nat->logical_ip. 'logical_ip' can
> +             * be a subnet when the type is "snat". */
> +            error = ip_parse_masked(nat->logical_ip, &ip, &mask);
> +            if (!strcmp(nat->type, "snat")) {
> +                if (error) {
> +                    static struct vlog_rate_limit rl =
> +                        VLOG_RATE_LIMIT_INIT(5, 1);
> +                    VLOG_WARN_RL(&rl, "bad ip network or ip %s for snat "
> +                                 "in router "UUID_FMT"",
> +                                 nat->logical_ip, UUID_ARGS(&od->key));
> +                    free(error);
> +                    continue;
> +                }
> +            } else {
> +                if (error || mask != OVS_BE32_MAX) {
> +                    static struct vlog_rate_limit rl =
> +                        VLOG_RATE_LIMIT_INIT(5, 1);
> +                    VLOG_WARN_RL(&rl, "bad ip %s for dnat in router "
> +                        ""UUID_FMT"", nat->logical_ip, UUID_ARGS(&od->key));
> +                    free(error);
> +                    continue;
> +                }
> +            }
> +
> +
> +            char *match, *actions;
> +
> +            /* Ingress UNSNAT table: It is for already established connections'
> +             * reverse traffic. i.e., SNAT has already been done in egress
> +             * pipeline and now the packet has entered the ingress pipeline as
> +             * part of a reply. We undo the SNAT here.
> +             *
> +             * Undoing SNAT has to happen before DNAT processing.  This is
> +             * because when the packet was DNATed in ingress pipeline, it did
> +             * not know about the possibility of eventual additional SNAT in
> +             * egress pipeline. */
> +            if (!strcmp(nat->type, "snat")
> +                || !strcmp(nat->type, "dnat_and_snat")) {
> +                match = xasprintf("ip && ip4.dst == %s", nat->external_ip);
> +                ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 100,
> +                              match, "ct_snat; next;");
> +                free(match);
> +            }
> +
> +            /* Ingress DNAT table: Packets enter the pipeline with destination
> +             * IP address that needs to be DNATted from a external IP address
> +             * to a logical IP address. */
> +            if (!strcmp(nat->type, "dnat")
> +                || !strcmp(nat->type, "dnat_and_snat")) {
> +                /* Packet when it goes from the initiator to destination.
> +                 * We need to zero the inport because the router can
> +                 * send the packet back through the same interface. */
> +                match = xasprintf("ip && ip4.dst == %s", nat->external_ip);
> +                actions = xasprintf("inport = \"\"; ct_dnat(%s);",
> +                                    nat->logical_ip);
> +                ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
> +                           match, actions);
> +                free(match);
> +                free(actions);
> +            }
> +
> +            /* Egress SNAT table: Packets enter the egress pipeline with
> +             * source ip address that needs to be SNATted to a external ip
> +             * address. */
> +            if (!strcmp(nat->type, "snat")
> +                || !strcmp(nat->type, "dnat_and_snat")) {
> +                match = xasprintf("ip && ip4.src == %s", nat->logical_ip);
> +                actions = xasprintf("ct_snat(%s);", nat->external_ip);
> +
> +                /* The priority here is calculated such that the
> +                 * nat->logical_ip with the longest mask gets a higher
> +                 * priority. */
> +                ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT,
> +                              count_1bits(ntohl(mask)) + 1, match, actions);
> +                free(match);
> +                free(actions);
> +            }
> +        }
> +
> +        /* Re-circulate every packet through the DNAT zone.
> +        * This helps with two things.
> +        *
> +        * 1. Any packet that needs to be unDNATed in the reverse
> +        * direction gets unDNATed. Ideally this could be done in
> +        * the egress pipeline. But since the gateway router
> +        * does not have any feature that depends on the source
> +        * ip address being external IP address for IP routing,
> +        * we can do it here, saving a future re-circulation.
> +        *
> +        * 2. Any packet that was sent through SNAT zone in the
> +        * previous table automatically gets re-circulated to get
> +        * back the new destination IP address that is needed for
> +        * routing in the openflow pipeline. */
> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
> +                      "ip", "inport = \"\"; ct_dnat;");
> +    }
> +
>      /* Logical router ingress table 2: IP Routing.
>       *
>       * A packet that arrives at this table is an IP packet that should be
> @@ -2205,7 +2382,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          ovn_lflow_add(lflows, od, S_ROUTER_IN_ARP_REQUEST, 0, "1", "output;");
>      }
>
> -    /* Logical router egress table 0: Delivery (priority 100).
> +    /* Logical router egress table 1: Delivery (priority 100).
>       *
>       * Priority 100 rules deliver packets to enabled logical ports. */
>      HMAP_FOR_EACH (op, key_node, ports) {
> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> index fa21b30..ac6ca14 100644
> --- a/ovn/ovn-nb.ovsschema
> +++ b/ovn/ovn-nb.ovsschema
> @@ -1,7 +1,7 @@
>  {
>      "name": "OVN_Northbound",
> -    "version": "2.1.2",
> -    "cksum": "429668869 5325",
> +    "version": "2.1.3",
> +    "cksum": "3631923697 6121",
>      "tables": {
>          "Logical_Switch": {
>              "columns": {
> @@ -78,6 +78,11 @@
>                                     "max": "unlimited"}},
>                  "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
>                  "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
> +                "nat": {"type": {"key": {"type": "uuid",
> +                                         "refTable": "NAT",
> +                                         "refType": "strong"},
> +                                 "min": 0,
> +                                 "max": "unlimited"}},
>                  "options": {
>                       "type": {"key": "string",
>                                "value": "string",
> @@ -104,6 +109,16 @@
>                  "ip_prefix": {"type": "string"},
>                  "nexthop": {"type": "string"},
>                  "output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
> +            "isRoot": false},
> +        "NAT": {
> +            "columns": {
> +                "external_ip": {"type": "string"},
> +                "logical_ip": {"type": "string"},
> +                "type": {"type": {"key": {"type": "string",
> +                                           "enum": ["set", ["dnat",
> +                                                             "snat",
> +                                                             "dnat_and_snat"
> +                                                               ]]}}}},
>              "isRoot": false}
>      }
>  }
> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> index 130b63b..36d1158 100644
> --- a/ovn/ovn-nb.xml
> +++ b/ovn/ovn-nb.xml
> @@ -631,18 +631,31 @@
>        router has all ingress and egress traffic dropped.
>      </column>
>
> +    <column name="nat">
> +      One or more NAT rules for the router. NAT rules only work on the
> +      Gateway routers.
> +    </column>
> +
>      <group title="Options">
>        <p>
>          Additional options for the logical router.
>        </p>
>
>        <column name="options" key="chassis">
> -        If set, indicates that the logical router in question is
> -        a Gateway router (which is centralized) and resides in the set
> -        chassis.  The same value is also used by <code>ovn-controller</code>
> -        to uniquely identify the chassis in the OVN deployment and
> -        comes from <code>external_ids:system-id</code> in the
> -        <code>Open_vSwitch</code> table of Open_vSwitch database.
> +        <p>
> +          If set, indicates that the logical router in question is a Gateway
> +          router (which is centralized) and resides in the set chassis.  The
> +          same value is also used by <code>ovn-controller</code> to
> +          uniquely identify the chassis in the OVN deployment and
> +          comes from <code>external_ids:system-id</code> in the
> +          <code>Open_vSwitch</code> table of Open_vSwitch database.
> +        </p>
> +
> +        <p>
> +          The Gateway router can only be connected to a distributed router
> +          via a switch if SNAT and DNAT are to be configured in the Gateway
> +          router.
> +        </p>
>        </column>
>      </group>
>
> @@ -765,4 +778,44 @@
>      </column>
>    </table>
>
> +  <table name="NAT" title="NAT rules for a Gateway router.">
> +    <p>
> +      Each record represents a NAT rule in a Gateway router.
> +    </p>
> +
> +    <column name="type">
> +      <p>Type of the NAT rule.</p>
> +      <ul>
> +        <li>
> +          When <ref column="type"/> is <code>dnat</code>, the externally
> +          visible IP address <ref column="external_ip"/> is DNATted to the IP
> +          address <ref column="logical_ip"/> in the logical space.
> +        </li>
> +        <li>
> +          When <ref column="type"/> is <code>snat</code>, IP packets
> +          with their source IP address that either matches the IP address
> +          in <ref column="logical_ip"/> or is in the network provided by
> +          <ref column="logical_ip"/> is SNATed into the IP address in
> +          <ref column="external_ip"/>.
> +        </li>
> +        <li>
> +          When <ref column="type"/> is <code>dnat_and_snat</code>, the
> +          externally visible IP address <ref column="external_ip"/> is
> +          DNATted to the IP address <ref column="logical_ip"/> in the
> +          logical space. In addition, IP packets with the source IP
> +          address that matches <ref column="logical_ip"/> is SNATed into
> +          the IP address in <ref column="external_ip"/>.
> +        </li>
> +      </ul>
> +    </column>
> +
> +    <column name="external_ip">
> +      An IPv4 address.
> +    </column>
> +
> +    <column name="logical_ip">
> +      An IPv4 network (e.g 192.168.1.0/24) or an IPv4 address.
> +    </column>
> +  </table>
> +
>  </database>
> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> index 1231b4e..5665871 100644
> --- a/ovn/ovn-sb.xml
> +++ b/ovn/ovn-sb.xml
> @@ -951,6 +951,47 @@
>            </p>
>          </dd>
>
> +        <dt><code>ct_dnat;</code></dt>
> +        <dt><code>ct_dnat(<var>IP</var>);</code></dt>
> +        <dd>
> +          <p>
> +            <code>ct_dnat</code> sends the packet through the DNAT zone in
> +            connection tracking table to unDNAT any packet that was DNATed in
> +            the opposite direction.  The packet is then automatically sent to
> +            to the next tables as if followed by <code>next;</code> action.
> +            The next tables will see the changes in the packet caused by
> +            the connection tracker.
> +          </p>
> +          <p>
> +            <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
> +            DNAT zone to change the destination IP address of the packet to
> +            the one provided inside the parenthesis and commits the connection.
> +            The packet is then automatically sent to the next tables as if
> +            followed by <code>next;</code> action.  The next tables will see
> +            the changes in the packet caused by the connection tracker.
> +          </p>
> +        </dd>
> +
> +        <dt><code>ct_snat;</code></dt>
> +        <dt><code>ct_snat(<var>IP</var>);</code></dt>
> +        <dd>
> +          <p>
> +            <code>ct_snat</code> sends the packet through the SNAT zone to
> +            unSNAT any packet that was SNATed in the opposite direction.  If
> +            the packet needs to be sent to the next tables, then it should be
> +            followed by a <code>next;</code> action.  The next tables will not
> +            see the changes in the packet caused by the connection tracker.
> +          </p>
> +          <p>
> +            <code>ct_snat(<var>IP</var>)</code> sends the packet through the
> +            SNAT zone to change the source IP address of the packet to
> +            the one provided inside the parenthesis and commits the connection.
> +            The packet is then automatically sent to the next tables as if
> +            followed by <code>next;</code> action.  The next tables will see the
> +            changes in the packet caused by the connection tracker.
> +          </p>
> +        </dd>
> +
>          <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
>          <dd>
>            <p>
> diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
> index 321040e..b821307 100644
> --- a/ovn/utilities/ovn-nbctl.c
> +++ b/ovn/utilities/ovn-nbctl.c
> @@ -1449,6 +1449,11 @@ static const struct ctl_table_class tables[] = {
>         NULL},
>        {NULL, NULL, NULL}}},
>
> +    {&nbrec_table_nat,
> +     {{&nbrec_table_nat, NULL,
> +       NULL},
> +      {NULL, NULL, NULL}}},
> +
>      {NULL, {{NULL, NULL, NULL}, {NULL, NULL, NULL}}}
>  };
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 633cf35..19d5c73 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -507,6 +507,23 @@ ip.ttl => Syntax error at end of input expecting `--'.
>  ct_next; => actions=ct(table=27,zone=NXM_NX_REG5[0..15]), prereqs=ip
>  ct_commit; => actions=ct(commit,zone=NXM_NX_REG5[0..15]), prereqs=ip
>
> +# dnat
> +ct_dnat; => actions=ct(table=27,zone=NXM_NX_REG3[0..15],nat), prereqs=ip
> +ct_dnat(192.168.1.2); => actions=ct(commit,table=27,zone=NXM_NX_REG3[0..15],nat(dst=192.168.1.2)), prereqs=ip
> +ct_dnat(192.168.1.2, 192.168.1.3); => Syntax error at `,' expecting `)'.
> +ct_dnat(foo); => Syntax error at `foo' invalid ip.
> +ct_dnat(foo, bar); => Syntax error at `foo' invalid ip.
> +ct_dnat(); => Syntax error at `)' invalid ip.
> +
> +# snat
> +ct_snat; => actions=ct(zone=NXM_NX_REG4[0..15],nat), prereqs=ip
> +ct_snat(192.168.1.2); => actions=ct(commit,table=27,zone=NXM_NX_REG4[0..15],nat(src=192.168.1.2)), prereqs=ip
> +ct_snat(192.168.1.2, 192.168.1.3); => Syntax error at `,' expecting `)'.
> +ct_snat(foo); => Syntax error at `foo' invalid ip.
> +ct_snat(foo, bar); => Syntax error at `foo' invalid ip.
> +ct_snat(); => Syntax error at `)' invalid ip.
> +
> +
>  # arp
>  arp { eth.dst = ff:ff:ff:ff:ff:ff; output; }; => actions=controller(userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.ff.ff.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.40.00.00.00), prereqs=ip4
>
> --
> 1.9.1
>
>


