[ovs-dev] [PATCH v4] ovn: DNAT and SNAT on a gateway router.

Guru Shetty guru at ovn.org
Tue Jun 21 14:46:12 UTC 2016


On 20 June 2016 at 19:36, Flaviof <flavio at flaviof.com> wrote:

> On Mon, Jun 13, 2016 at 6:45 AM, Gurucharan Shetty <guru at ovn.org> wrote:
>
> > For traffic from physical space to virtual space we need DNAT.
> > The DNAT happens in the gateway router and reaches the logical
> > port. The return traffic should be unDNATed.
> >
> > Traffic originating in virtual space heading to physical space
> > should be SNATed. The return traffic is unSNATted.
> >
> > East-west traffic with the public destination IP address needs
> > a DNAT. This traffic is punted to the l3 gateway where DNAT
> > takes place. This traffic is also SNATed and eventually loops back to
> > its destination. The SNAT is needed because we need the reverse traffic
> > to go back to the l3 gateway and not short-circuit directly to the
> > source.
> >
> > This commit introduces 4 new logical actions.
> > 1. ct_snat: To send the packet through SNAT zone to unSNAT packets.
> > 2. ct_snat(IP): To SNAT to the provided IP address.
> > 3. ct_dnat: To send the packet through DNAT zone to unDNAT packets.
> > 4. ct_dnat(IP): To DNAT to the provided IP.
> >
> > This commit only provides the ability to do IP based NAT. This will
> > eventually be enhanced to do PORT based NAT too.
> >
> > Command hints:
> >
> > Consider a distributed router "R1" that has switch foo (192.168.1.0/24)
> > with a lport foo1 (192.168.1.2) and bar (192.168.2.0/24) with lport bar1
> > (192.168.2.2) connected to it. You connect "R1" to
> > a gateway router "R2" via a switch "join" in (20.0.0.0/24) network.
> >
> > R2 has a switch "alice" (172.16.1.0/24) connected to it (to simulate
> > external network).
> >
> > case1 : Add pure DNAT (north-south)
> >
> > Add a DNAT rule in R2:
> > ovn-nbctl -- --id=@nat create nat type="dnat" logical_ip=192.168.1.2 \
> > external_ip=30.0.0.2 -- add logical_router R2 nat @nat
> >
> > Now alice1 should be able to ping 192.168.1.2 via 30.0.0.2.
> >
> > case2 : Add pure SNAT (south-north)
> >
> > Add a SNAT rule in R2:
> >
> > ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
> > external_ip=30.0.0.1 -- add logical_router R2 nat @nat
> >
> > (You need a static route in R1 to send packets destined to outside
> > world to go through R2. The logical_ip can be a subnet.)
> >
> > When bar1 pings alice1, alice1 receives traffic from 30.0.0.1
> >
> > case3 : SNAT and DNAT (east-west traffic)
> >
> > When bar1 pings 30.0.0.2, the traffic jumps to the gateway router
> > and loops back to foo1 with a source ip address of 30.0.0.1
> >
> >
> So, is the 30.0.0.0/x network an external network that R2 has a port on
> too?
>

The example above does not have that. In the above example, 30.0.0.0/x is
being treated as a virtual address range. But in a real (non-simulated)
setup, you are right: R2 will be connected to a 30.0.0.0/x network and will
have a port in it. It will also have a static route (0.0.0.0/0) or a
default_gateway pointing to the physical router's IP address as its next hop.
(I have not tested this, as I do not have a real setup at hand, but based on
the simulation, it should work.)
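
To make the address handling in the simulated example concrete, here is a
rough sketch that traces case 3 from the commit message (bar1 pinging
30.0.0.2). The struct and function names are made up for illustration and
are not OVN code; the only things taken from the patch are the two NAT rules
and the fact that DNAT happens in the ingress pipeline and SNAT in the egress
pipeline.

```c
#include <stdint.h>

/* Hypothetical model of a packet's IPv4 addresses; not an OVN type. */
struct pkt {
    uint32_t src;
    uint32_t dst;
};

/* Builds an IPv4 address in host byte order from dotted-quad parts. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* DNAT rule on R2: external_ip=30.0.0.2 -> logical_ip=192.168.1.2. */
static void
ingress_dnat(struct pkt *p)
{
    if (p->dst == IP4(30, 0, 0, 2)) {
        p->dst = IP4(192, 168, 1, 2);
    }
}

/* SNAT rule on R2: logical_ip=192.168.2.2 -> external_ip=30.0.0.1. */
static void
egress_snat(struct pkt *p)
{
    if (p->src == IP4(192, 168, 2, 2)) {
        p->src = IP4(30, 0, 0, 1);
    }
}

/* Case 3: bar1 (192.168.2.2) pings 30.0.0.2.  The packet is DNATed in the
 * ingress pipeline of R2 and SNATed in its egress pipeline. */
static struct pkt
trace_case3(void)
{
    struct pkt p = { IP4(192, 168, 2, 2), IP4(30, 0, 0, 2) };
    ingress_dnat(&p);
    egress_snat(&p);
    return p;
}
```

After both steps the packet is addressed to foo1 (192.168.1.2) with source
30.0.0.1, which is why the reply goes back through the gateway router instead
of short-circuiting to bar1.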


> What is the next hop that R2 would use to reach a destination beyond
> that subnet?
>
Answered above.


>
> I think this may be clear when a test is added to ovn.at, which uses foo,
> bar, join, alice
>
The unit tests do not have the ability to do conntrack NAT right now. I
think we should add one once Daniele introduces NAT to userspace conntrack.
But the unit test "ovn -- 2 HVs, 2 LRs connected via LS, gateway router"
does something very similar (it has foo - R1 - join - R2 - alice).


>
> Based on the code and my little test setup, there seems to be a high cost
> for DNAT entries in that an ARP response rule will be added per DNAT x all
> router ports.

The intention was to add it only on the router where the DNAT entry is
defined, and not on all router ports of all routers. Is that not what you
see? (If it is added on all routers, that is a bug.) The for loop which adds
this entry only looks at that datapath's NAT entries.
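
As a rough way to reason about the cost, the number of ARP responder flows
can be modelled like the sketch below. The types here are hypothetical
stand-ins, not the real northd structures; the sketch only mirrors the shape
of the loop in the patch (for each port of a router, walk that router's NAT
entries and skip the "snat" ones).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a NAT row and a logical router. */
struct nat_rule {
    const char *type;       /* "dnat", "snat" or "dnat_and_snat". */
};

struct router {
    size_t n_ports;         /* Number of ports on this router. */
    const struct nat_rule *nat;
    size_t n_nat;
};

/* Counts the ARP responder flows added for one router: one flow per
 * (port, non-snat NAT entry) pair.  Other routers are never consulted. */
static size_t
count_dnat_arp_flows(const struct router *r)
{
    size_t n = 0;
    for (size_t p = 0; p < r->n_ports; p++) {
        for (size_t i = 0; i < r->n_nat; i++) {
            if (!strcmp(r->nat[i].type, "snat")) {
                continue;
            }
            n++;
        }
    }
    return n;
}
```

So a router with two ports and two DNAT-type entries gets four such flows,
and a router without NAT entries gets none.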

On the gateway router itself, there would be typically two DNAT entries.
One of them connected to internal network (for east-west) and another one
at external port (facing physical router).



> In the example used by the commit message, ingress table 1 of
> the logical router will have arp response entries for inports alice and
> R2_join.
>
Right. That is because, as explained above, I need to do DNAT for both
east-west as well as north-south. (It is very possible that I did not
understand your concern.)


>
>
> Table 3: do we really intend to apply the actions 'inport = ""; ct_dnat;'
> to all ip packets that do not have an explicit dnat mapping?
>
Yes. This is a little tricky. I have tried to explain the rationale in a
comment in the patch. The general idea is that in a gateway router, there
will be at least one DNAT or SNAT entry. Otherwise, why have a gateway
router? Also, a re-circulation is considered to be very expensive, so we want
to minimize re-circulations. With the code above, we have a minimum of
one re-circulation no matter what and a maximum of two re-circulations. I
have tried different ways to optimize it. There was a possibility of 3
re-circulations as a worst case if I did not force the minimum one
re-circulation. Probably there is a different way to optimize it (that I
haven't thought about).
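
For reference, the three flows that the patch adds to the ingress DNAT table
of a gateway router can be modelled as below. This is only a sketch with
made-up names, not OVN code; it shows which priority wins for a given packet,
using the priorities from the patch (100 for an explicit mapping, 50 for any
other IP packet, 0 for everything else).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DNAT entry; not an OVN structure. */
struct dnat_entry {
    uint32_t external_ip;   /* Address that packets are sent to. */
};

/* Returns the priority of the flow that would win in the ingress DNAT
 * table of a gateway router:
 *   100: explicit mapping, actions 'inport = ""; ct_dnat(B);'
 *    50: any other IP packet, actions 'inport = ""; ct_dnat;'
 *     0: everything else, actions 'next;' */
static int
dnat_table_winner(bool is_ip, uint32_t ip_dst,
                  const struct dnat_entry *entries, size_t n)
{
    if (!is_ip) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        if (entries[i].external_ip == ip_dst) {
            return 100;
        }
    }
    return 50;
}
```

The priority-50 case is the forced re-circulation discussed above: every IP
packet without an explicit mapping still goes through the DNAT zone once.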




>
> SNAT: do we need ARP reply rules for the SNAT addresses, similar to the
> ones added for DNAT?
>
I don't think we need ARP reply rules for SNAT entries. What is the use
case?


>
> SNAT: looking at the openflow table I see no mention of the address added
> to support SNAT. Is that because that is all handled by connect_tracker
> and there is nothing to be done via openflow? Or maybe part of another
> patchset?
>

We do add SNAT specific rules. Search for S_ROUTER_IN_UNSNAT
and S_ROUTER_OUT_SNAT.
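
One detail of the S_ROUTER_OUT_SNAT flows that is easy to miss is the
priority, which the patch computes as count_1bits(ntohl(mask)) + 1. A
standalone sketch of that calculation is below; count_bits() is a local
stand-in for the OVS helper and the function names are made up for this
example.

```c
#include <stdint.h>

/* Local stand-in for a population count; clears the lowest set bit on
 * each iteration. */
static int
count_bits(uint32_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Priority of an egress SNAT flow, given the logical_ip mask in host byte
 * order.  Longer masks give higher priorities, and the "+ 1" keeps every
 * SNAT flow above the priority-0 default flow. */
static int
snat_flow_priority(uint32_t mask_host_order)
{
    return count_bits(mask_host_order) + 1;
}
```

With this, a rule for a single address (a /32) gets priority 33 and wins over
a rule for the /24 network containing it, which gets priority 25.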


>
> Thanks,
>
> -- flaviof
>
>
>
>
> > Signed-off-by: Gurucharan Shetty <guru at ovn.org>
> > ---
> >  ovn/lib/actions.c           |  83 ++++++++++++++++++++
> >  ovn/northd/ovn-northd.8.xml | 131 ++++++++++++++++++++++++++++---
> >  ovn/northd/ovn-northd.c     | 187
> > ++++++++++++++++++++++++++++++++++++++++++--
> >  ovn/ovn-nb.ovsschema        |  19 ++++-
> >  ovn/ovn-nb.xml              |  65 +++++++++++++--
> >  ovn/ovn-sb.xml              |  41 ++++++++++
> >  ovn/utilities/ovn-nbctl.c   |   5 ++
> >  tests/ovn.at                |  17 ++++
> >  8 files changed, 524 insertions(+), 24 deletions(-)
> >
> > diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
> > index 5f0bf19..4a486a0 100644
> > --- a/ovn/lib/actions.c
> > +++ b/ovn/lib/actions.c
> > @@ -442,6 +442,85 @@ emit_ct(struct action_context *ctx, bool recirc_next, bool commit)
> >      add_prerequisite(ctx, "ip");
> >  }
> >
> > +static void
> > +parse_ct_nat(struct action_context *ctx, bool snat)
> > +{
> > +    const size_t ct_offset = ctx->ofpacts->size;
> > +    ofpbuf_pull(ctx->ofpacts, ct_offset);
> > +
> > +    struct ofpact_conntrack *ct = ofpact_put_CT(ctx->ofpacts);
> > +
> > +    if (ctx->ap->cur_ltable < ctx->ap->n_tables) {
> > +        ct->recirc_table = ctx->ap->first_ptable + ctx->ap->cur_ltable + 1;
> > +    } else {
> > +        action_error(ctx,
> > +                     "\"ct_[sd]nat\" action not allowed in last table.");
> > +        return;
> > +    }
> > +
> > +    if (snat) {
> > +        ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
> > +    } else {
> > +        ct->zone_src.field = mf_from_id(MFF_LOG_DNAT_ZONE);
> > +    }
> > +    ct->zone_src.ofs = 0;
> > +    ct->zone_src.n_bits = 16;
> > +    ct->flags = 0;
> > +    ct->alg = 0;
> > +
> > +    add_prerequisite(ctx, "ip");
> > +
> > +    struct ofpact_nat *nat;
> > +    size_t nat_offset;
> > +    nat_offset = ctx->ofpacts->size;
> > +    ofpbuf_pull(ctx->ofpacts, nat_offset);
> > +
> > +    nat = ofpact_put_NAT(ctx->ofpacts);
> > +    nat->flags = 0;
> > +    nat->range_af = AF_UNSPEC;
> > +
> > +    int commit = 0;
> > +    if (lexer_match(ctx->lexer, LEX_T_LPAREN)) {
> > +        ovs_be32 ip;
> > +        if (ctx->lexer->token.type == LEX_T_INTEGER
> > +            && ctx->lexer->token.format == LEX_F_IPV4) {
> > +            ip = ctx->lexer->token.value.ipv4;
> > +        } else {
> > +            action_syntax_error(ctx, "invalid ip");
> > +            return;
> > +        }
> > +
> > +        nat->range_af = AF_INET;
> > +        nat->range.addr.ipv4.min = ip;
> > +        if (snat) {
> > +            nat->flags |= NX_NAT_F_SRC;
> > +        } else {
> > +            nat->flags |= NX_NAT_F_DST;
> > +        }
> > +        commit = NX_CT_F_COMMIT;
> > +        lexer_get(ctx->lexer);
> > +        if (!lexer_match(ctx->lexer, LEX_T_RPAREN)) {
> > +            action_syntax_error(ctx, "expecting `)'");
> > +            return;
> > +        }
> > +    }
> > +
> > +    ctx->ofpacts->header = ofpbuf_push_uninit(ctx->ofpacts, nat_offset);
> > +    ct = ctx->ofpacts->header;
> > +    ct->flags |= commit;
> > +
> > +    /* XXX: For performance reasons, we try to prevent additional
> > +     * recirculations.  So far, ct_snat which is used in a gateway router
> > +     * does not need a recirculation. ct_snat(IP) does need a recirculation.
> > +     * Should we consider a method to let the actions specify whether an
> > +     * action needs recirculation if there are more use cases? */
> > +    if (!commit && snat) {
> > +        ct->recirc_table = NX_CT_RECIRC_NONE;
> > +    }
> > +    ofpact_finish(ctx->ofpacts, &ct->ofpact);
> > +    ofpbuf_push_uninit(ctx->ofpacts, ct_offset);
> > +}
> > +
> >  static bool
> >  parse_action(struct action_context *ctx)
> >  {
> > @@ -469,6 +548,10 @@ parse_action(struct action_context *ctx)
> >          emit_ct(ctx, true, false);
> >      } else if (lexer_match_id(ctx->lexer, "ct_commit")) {
> >          emit_ct(ctx, false, true);
> > +    } else if (lexer_match_id(ctx->lexer, "ct_dnat")) {
> > +        parse_ct_nat(ctx, false);
> > +    } else if (lexer_match_id(ctx->lexer, "ct_snat")) {
> > +        parse_ct_nat(ctx, true);
> >      } else if (lexer_match_id(ctx->lexer, "arp")) {
> >          parse_arp_action(ctx);
> >      } else if (lexer_match_id(ctx->lexer, "get_arp")) {
> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> > index 1983812..c237604 100644
> > --- a/ovn/northd/ovn-northd.8.xml
> > +++ b/ovn/northd/ovn-northd.8.xml
> > @@ -517,11 +517,40 @@ next;
> >
> >        <li>
> >          <p>
> > -          Reply to ARP requests.  These flows reply to ARP requests for
> > the
> > -          router's own IP address.  For each router port <var>P</var>
> > that owns
> > -          IP address <var>A</var> and Ethernet address <var>E</var>, a
> > -          priority-90 flow matches <code>inport == <var>P</var>
> &amp;&amp;
> > -          arp.op == 1 &amp;&amp; arp.tpa == <var>A</var></code> (ARP
> > request)
> > +          Reply to ARP requests.
> > +        </p>
> > +
> > +        <p>
> > +          These flows reply to ARP requests for the router's own IP
> > address.
> > +          For each router port <var>P</var> that owns IP address
> > <var>A</var>
> > +          and Ethernet address <var>E</var>, a priority-90 flow matches
> > +          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
> > +          arp.tpa == <var>A</var></code> (ARP request) with the
> following
> > +          actions:
> > +        </p>
> > +
> > +        <pre>
> > +eth.dst = eth.src;
> > +eth.src = <var>E</var>;
> > +arp.op = 2; /* ARP reply. */
> > +arp.tha = arp.sha;
> > +arp.sha = <var>E</var>;
> > +arp.tpa = arp.spa;
> > +arp.spa = <var>A</var>;
> > +outport = <var>P</var>;
> > +inport = ""; /* Allow sending out inport. */
> > +output;
> > +        </pre>
> > +      </li>
> > +
> > +      <li>
> > +        <p>
> > +          These flows reply to ARP requests for the virtual IP addresses
> > +          configured in the router for DNAT. For a configured DNAT IP
> > address
> > +          <var>A</var>, for each router port <var>P</var> with Ethernet
> > +          address <var>E</var>, a priority-90 flow matches
> > +          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
> > +          arp.tpa == <var>A</var></code> (ARP request)
> >            with the following actions:
> >          </p>
> >
> > @@ -663,7 +692,62 @@ icmp4 {
> >        </li>
> >      </ul>
> >
> > -    <h3>Ingress Table 2: IP Routing</h3>
> > +    <h3>Ingress Table 2: UNSNAT</h3>
> > +
> > +    <p>
> > +      This is for already established connections' reverse traffic.
> > +      i.e., SNAT has already been done in egress pipeline and now the
> > +      packet has entered the ingress pipeline as part of a reply.  It is
> > +      unSNATted here.
> > +    </p>
> > +
> > +    <ul>
> > +      <li>
> > +        <p>
> > +          For each configuration in the OVN Northbound database, that
> asks
> > +          to change the source IP address of a packet from <var>A</var>
> to
> > +          <var>B</var>, a priority-100 flow matches <code>ip &amp;&amp;
> > +          ip4.dst == <var>B</var></code> with an action
> > +          <code>ct_snat; next;</code>.
> > +        </p>
> > +
> > +        <p>
> > +          A priority-0 logical flow with match <code>1</code> has
> actions
> > +          <code>next;</code>.
> > +        </p>
> > +      </li>
> > +    </ul>
> > +
> > +    <h3>Ingress Table 3: DNAT</h3>
> > +
> > +    <p>
> > +      Packets enter the pipeline with destination IP address that needs to
> > +      be DNATted from a virtual IP address to a real IP address.  Packets
> > +      in the reverse direction need to be unDNATed.
> > +    </p>
> > +    <ul>
> > +      <li>
> > +        <p>
> > +          For each configuration in the OVN Northbound database, that asks
> > +          to change the destination IP address of a packet from <var>A</var> to
> > +          <var>B</var>, a priority-100 flow matches <code>ip &amp;&amp;
> > +          ip4.dst == <var>A</var></code> with an action <code>inport = "";
> > +          ct_dnat(<var>B</var>);</code>.
> > +        </p>
> > +
> > +        <p>
> > +          For all IP packets of a Gateway router, a priority-50 flow with an
> > +          action <code>inport = ""; ct_dnat;</code>.
> > +        </p>
> > +
> > +        <p>
> > +          A priority-0 logical flow with match <code>1</code> has
> actions
> > +          <code>next;</code>.
> > +        </p>
> > +      </li>
> > +    </ul>
> > +
> > +    <h3>Ingress Table 4: IP Routing</h3>
> >
> >      <p>
> >        A packet that arrives at this table is an IP packet that should be
> > routed
> > @@ -672,7 +756,7 @@ icmp4 {
> >        <code>ip4.dst</code>, the packet's final destination, unchanged)
> and
> >        advances to the next table for ARP resolution.  It also sets
> >        <code>reg1</code> to the IP address owned by the selected router
> > port
> > -      (which is used later in table 4 as the IP source address for an
> ARP
> > +      (which is used later in table 6 as the IP source address for an
> ARP
> >        request, if needed).
> >      </p>
> >
> > @@ -743,7 +827,7 @@ icmp4 {
> >        </li>
> >      </ul>
> >
> > -    <h3>Ingress Table 3: ARP Resolution</h3>
> > +    <h3>Ingress Table 5: ARP Resolution</h3>
> >
> >      <p>
> >        Any packet that reaches this table is an IP packet whose next-hop
> IP
> > @@ -798,7 +882,7 @@ icmp4 {
> >        </li>
> >      </ul>
> >
> > -    <h3>Ingress Table 4: ARP Request</h3>
> > +    <h3>Ingress Table 6: ARP Request</h3>
> >
> >      <p>
> >        In the common case where the Ethernet destination has been
> > resolved, this
> > @@ -823,7 +907,7 @@ arp {
> >          </pre>
> >
> >          <p>
> > -          (Ingress table 2 initialized <code>reg1</code> with the IP
> > address
> > +          (Ingress table 4 initialized <code>reg1</code> with the IP
> > address
> >            owned by <code>outport</code>.)
> >          </p>
> >
> > @@ -838,7 +922,32 @@ arp {
> >        </li>
> >      </ul>
> >
> > -    <h3>Egress Table 0: Delivery</h3>
> > +    <h3>Egress Table 0: SNAT</h3>
> > +
> > +    <p>
> > +      Packets that are configured to be SNATed get their source IP
> address
> > +      changed based on the configuration in the OVN Northbound database.
> > +    </p>
> > +    <ul>
> > +      <li>
> > +        <p>
> > +          For each configuration in the OVN Northbound database, that
> asks
> > +          to change the source IP address of a packet from an IP address
> > of
> > +          <var>A</var> or to change the source IP address of a packet
> that
> > +          belongs to network <var>A</var> to <var>B</var>, a flow
> matches
> > +          <code>ip &amp;&amp; ip4.src == <var>A</var></code> with an
> > action
> > +          <code>ct_snat(<var>B</var>);</code>.  The priority of the flow
> > +          is calculated based on the mask of <var>A</var>, with matches
> > +          having larger masks getting higher priorities.
> > +        </p>
> > +        <p>
> > +          A priority-0 logical flow with match <code>1</code> has
> actions
> > +          <code>next;</code>.
> > +        </p>
> > +      </li>
> > +    </ul>
> > +
> > +    <h3>Egress Table 1: Delivery</h3>
> >
> >      <p>
> >        Packets that reach this table are ready for delivery.  It contains
> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > index cac0148..4683780 100644
> > --- a/ovn/northd/ovn-northd.c
> > +++ b/ovn/northd/ovn-northd.c
> > @@ -105,12 +105,15 @@ enum ovn_stage {
> >      /* Logical router ingress stages. */                              \
> >      PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission")    \
> >      PIPELINE_STAGE(ROUTER, IN,  IP_INPUT,    1, "lr_in_ip_input")     \
> > -    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  2, "lr_in_ip_routing")   \
> > -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 3, "lr_in_arp_resolve")  \
> > -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 4, "lr_in_arp_request")  \
> > +    PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      2, "lr_in_unsnat")       \
> > +    PIPELINE_STAGE(ROUTER, IN,  DNAT,        3, "lr_in_dnat")         \
> > +    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  4, "lr_in_ip_routing")   \
> > +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 5, "lr_in_arp_resolve")  \
> > +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 6, "lr_in_arp_request")  \
> >                                                                        \
> >      /* Logical router egress stages. */                               \
> > -    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,    0, "lr_out_delivery")
> > +    PIPELINE_STAGE(ROUTER, OUT, SNAT,      0, "lr_out_snat")          \
> > +    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,  1, "lr_out_delivery")
> >
> >  #define PIPELINE_STAGE(DP_TYPE, PIPELINE, STAGE, TABLE, NAME)   \
> >      S_##DP_TYPE##_##PIPELINE##_##STAGE                          \
> > @@ -1998,6 +2001,51 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >          free(match);
> >          free(actions);
> >
> > +        /* ARP handling for external IP addresses.
> > +         *
> > +         * DNAT IP addresses are external IP addresses that need ARP
> > +         * handling. */
> > +        for (int i = 0; i < op->od->nbr->n_nat; i++) {
> > +            const struct nbrec_nat *nat;
> > +
> > +            nat = op->od->nbr->nat[i];
> > +
> > +            if (!strcmp(nat->type, "snat")) {
> > +                continue;
> > +            }
> > +
> > +            ovs_be32 ip;
> > +            if (!ip_parse(nat->external_ip, &ip) || !ip) {
> > +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > +                VLOG_WARN_RL(&rl, "bad ip address %s in dnat configuration "
> > +                             "for router %s", nat->external_ip, op->key);
> > +                continue;
> > +            }
> > +
> > +            match = xasprintf(
> > +                "inport == %s && arp.tpa == "IP_FMT" && arp.op == 1",
> > +                op->json_key, IP_ARGS(ip));
> > +            actions = xasprintf(
> > +                "eth.dst = eth.src; "
> > +                "eth.src = "ETH_ADDR_FMT"; "
> > +                "arp.op = 2; /* ARP reply */ "
> > +                "arp.tha = arp.sha; "
> > +                "arp.sha = "ETH_ADDR_FMT"; "
> > +                "arp.tpa = arp.spa; "
> > +                "arp.spa = "IP_FMT"; "
> > +                "outport = %s; "
> > +                "inport = \"\"; /* Allow sending out inport. */ "
> > +                "output;",
> > +                ETH_ADDR_ARGS(op->mac),
> > +                ETH_ADDR_ARGS(op->mac),
> > +                IP_ARGS(ip),
> > +                op->json_key);
> > +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
> > +                          match, actions);
> > +            free(match);
> > +            free(actions);
> > +        }
> > +
> >          /* Drop IP traffic to this router. */
> >          match = xasprintf("ip4.dst == "IP_FMT, IP_ARGS(op->ip));
> >          ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 60,
> > @@ -2005,6 +2053,135 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >          free(match);
> >      }
> >
> > +    /* NAT in Gateway routers. */
> > +    HMAP_FOR_EACH (od, key_node, datapaths) {
> > +        if (!od->nbr) {
> > +            continue;
> > +        }
> > +
> > +        /* Packets are allowed by default. */
> > +        ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 0, "1", "next;");
> > +        ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 0, "1", "next;");
> > +        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 0, "1", "next;");
> > +
> > +        /* NAT rules are only valid on Gateway routers. */
> > +        if (!smap_get(&od->nbr->options, "chassis")) {
> > +            continue;
> > +        }
> > +
> > +        for (int i = 0; i < od->nbr->n_nat; i++) {
> > +            const struct nbrec_nat *nat;
> > +
> > +            nat = od->nbr->nat[i];
> > +
> > +            ovs_be32 ip, mask;
> > +
> > +            char *error = ip_parse_masked(nat->external_ip, &ip, &mask);
> > +            if (error || mask != OVS_BE32_MAX) {
> > +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > +                VLOG_WARN_RL(&rl, "bad external ip %s for nat",
> > +                             nat->external_ip);
> > +                free(error);
> > +                continue;
> > +            }
> > +
> > +            /* Check the validity of nat->logical_ip. 'logical_ip' can
> > +             * be a subnet when the type is "snat". */
> > +            error = ip_parse_masked(nat->logical_ip, &ip, &mask);
> > +            if (!strcmp(nat->type, "snat")) {
> > +                if (error) {
> > +                    static struct vlog_rate_limit rl =
> > +                        VLOG_RATE_LIMIT_INIT(5, 1);
> > +                    VLOG_WARN_RL(&rl, "bad ip network or ip %s for snat "
> > +                                 "in router "UUID_FMT"",
> > +                                 nat->logical_ip, UUID_ARGS(&od->key));
> > +                    free(error);
> > +                    continue;
> > +                }
> > +            } else {
> > +                if (error || mask != OVS_BE32_MAX) {
> > +                    static struct vlog_rate_limit rl =
> > +                        VLOG_RATE_LIMIT_INIT(5, 1);
> > +                    VLOG_WARN_RL(&rl, "bad ip %s for dnat in router "
> > +                        ""UUID_FMT"", nat->logical_ip, UUID_ARGS(&od->key));
> > +                    free(error);
> > +                    continue;
> > +                }
> > +            }
> > +
> > +
> > +            char *match, *actions;
> > +
> > +            /* Ingress UNSNAT table: It is for already established connections'
> > +             * reverse traffic. i.e., SNAT has already been done in egress
> > +             * pipeline and now the packet has entered the ingress pipeline as
> > +             * part of a reply. We undo the SNAT here.
> > +             *
> > +             * Undoing SNAT has to happen before DNAT processing.  This is
> > +             * because when the packet was DNATed in ingress pipeline, it did
> > +             * not know about the possibility of eventual additional SNAT in
> > +             * egress pipeline. */
> > +            if (!strcmp(nat->type, "snat")
> > +                || !strcmp(nat->type, "dnat_and_snat")) {
> > +                match = xasprintf("ip && ip4.dst == %s", nat->external_ip);
> > +                ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 100,
> > +                              match, "ct_snat; next;");
> > +                free(match);
> > +            }
> > +
> > +            /* Ingress DNAT table: Packets enter the pipeline with destination
> > +             * IP address that needs to be DNATted from an external IP address
> > +             * to a logical IP address. */
> > +            if (!strcmp(nat->type, "dnat")
> > +                || !strcmp(nat->type, "dnat_and_snat")) {
> > +                /* Packet when it goes from the initiator to destination.
> > +                 * We need to zero the inport because the router can
> > +                 * send the packet back through the same interface. */
> > +                match = xasprintf("ip && ip4.dst == %s", nat->external_ip);
> > +                actions = xasprintf("inport = \"\"; ct_dnat(%s);",
> > +                                    nat->logical_ip);
> > +                ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
> > +                           match, actions);
> > +                free(match);
> > +                free(actions);
> > +            }
> > +
> > +            /* Egress SNAT table: Packets enter the egress pipeline with
> > +             * source ip address that needs to be SNATted to an external ip
> > +             * address. */
> > +            if (!strcmp(nat->type, "snat")
> > +                || !strcmp(nat->type, "dnat_and_snat")) {
> > +                match = xasprintf("ip && ip4.src == %s", nat->logical_ip);
> > +                actions = xasprintf("ct_snat(%s);", nat->external_ip);
> > +
> > +                /* The priority here is calculated such that the
> > +                 * nat->logical_ip with the longest mask gets a higher
> > +                 * priority. */
> > +                ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT,
> > +                              count_1bits(ntohl(mask)) + 1, match, actions);
> > +                free(match);
> > +                free(actions);
> > +            }
> > +        }
> > +
> > +        /* Re-circulate every packet through the DNAT zone.
> > +         * This helps with two things.
> > +         *
> > +         * 1. Any packet that needs to be unDNATed in the reverse
> > +         * direction gets unDNATed. Ideally this could be done in
> > +         * the egress pipeline. But since the gateway router
> > +         * does not have any feature that depends on the source
> > +         * ip address being external IP address for IP routing,
> > +         * we can do it here, saving a future re-circulation.
> > +         *
> > +         * 2. Any packet that was sent through SNAT zone in the
> > +         * previous table automatically gets re-circulated to get
> > +         * back the new destination IP address that is needed for
> > +         * routing in the openflow pipeline. */
> > +        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
> > +                      "ip", "inport = \"\"; ct_dnat;");
> > +    }
> > +
> >      /* Logical router ingress table 2: IP Routing.
> >       *
> >       * A packet that arrives at this table is an IP packet that should be
> > @@ -2205,7 +2382,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >          ovn_lflow_add(lflows, od, S_ROUTER_IN_ARP_REQUEST, 0, "1", "output;");
> >      }
> >
> > -    /* Logical router egress table 0: Delivery (priority 100).
> > +    /* Logical router egress table 1: Delivery (priority 100).
> >       *
> >       * Priority 100 rules deliver packets to enabled logical ports. */
> >      HMAP_FOR_EACH (op, key_node, ports) {
> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> > index fa21b30..ac6ca14 100644
> > --- a/ovn/ovn-nb.ovsschema
> > +++ b/ovn/ovn-nb.ovsschema
> > @@ -1,7 +1,7 @@
> >  {
> >      "name": "OVN_Northbound",
> > -    "version": "2.1.2",
> > -    "cksum": "429668869 5325",
> > +    "version": "2.1.3",
> > +    "cksum": "3631923697 6121",
> >      "tables": {
> >          "Logical_Switch": {
> >              "columns": {
> > @@ -78,6 +78,11 @@
> >                                     "max": "unlimited"}},
> >                  "default_gw": {"type": {"key": "string", "min": 0,
> "max":
> > 1}},
> >                  "enabled": {"type": {"key": "boolean", "min": 0, "max":
> > 1}},
> > +                "nat": {"type": {"key": {"type": "uuid",
> > +                                         "refTable": "NAT",
> > +                                         "refType": "strong"},
> > +                                 "min": 0,
> > +                                 "max": "unlimited"}},
> >                  "options": {
> >                       "type": {"key": "string",
> >                                "value": "string",
> > @@ -104,6 +109,16 @@
> >                  "ip_prefix": {"type": "string"},
> >                  "nexthop": {"type": "string"},
> >                  "output_port": {"type": {"key": "string", "min": 0,
> > "max": 1}}},
> > +            "isRoot": false},
> > +        "NAT": {
> > +            "columns": {
> > +                "external_ip": {"type": "string"},
> > +                "logical_ip": {"type": "string"},
> > +                "type": {"type": {"key": {"type": "string",
> > +                                           "enum": ["set", ["dnat",
> > +                                                             "snat",
> > +                                                             "dnat_and_snat"
> > +                                                               ]]}}}},
> >              "isRoot": false}
> >      }
> >  }
> > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> > index 130b63b..36d1158 100644
> > --- a/ovn/ovn-nb.xml
> > +++ b/ovn/ovn-nb.xml
> > @@ -631,18 +631,31 @@
> >        router has all ingress and egress traffic dropped.
> >      </column>
> >
> > +    <column name="nat">
> > +      One or more NAT rules for the router. NAT rules only work on the
> > +      Gateway routers.
> > +    </column>
> > +
> >      <group title="Options">
> >        <p>
> >          Additional options for the logical router.
> >        </p>
> >
> >        <column name="options" key="chassis">
> > -        If set, indicates that the logical router in question is
> > -        a Gateway router (which is centralized) and resides in the set
> > -        chassis.  The same value is also used by
> > <code>ovn-controller</code>
> > -        to uniquely identify the chassis in the OVN deployment and
> > -        comes from <code>external_ids:system-id</code> in the
> > -        <code>Open_vSwitch</code> table of Open_vSwitch database.
> > +        <p>
> > +          If set, indicates that the logical router in question is a
> > Gateway
> > +          router (which is centralized) and resides in the set chassis.
> > The
> > +          same value is also used by <code>ovn-controller</code> to
> > +          uniquely identify the chassis in the OVN deployment and
> > +          comes from <code>external_ids:system-id</code> in the
> > +          <code>Open_vSwitch</code> table of Open_vSwitch database.
> > +        </p>
> > +
> > +        <p>
> > +          The Gateway router can only be connected to a distributed
> router
> > +          via a switch if SNAT and DNAT are to be configured in the
> > Gateway
> > +          router.
> > +        </p>
> >        </column>
> >      </group>
> >
> > @@ -765,4 +778,44 @@
> >      </column>
> >    </table>
> >
> > +  <table name="NAT" title="NAT rules for a Gateway router.">
> > +    <p>
> > +      Each record represents a NAT rule in a Gateway router.
> > +    </p>
> > +
> > +    <column name="type">
> > +      <p>Type of the NAT rule.</p>
> > +      <ul>
> > +        <li>
> > +          When <ref column="type"/> is <code>dnat</code>, the externally
> > +          visible IP address <ref column="external_ip"/> is DNATted to the IP
> > +          address <ref column="logical_ip"/> in the logical space.
> > +        </li>
> > +        <li>
> > +          When <ref column="type"/> is <code>snat</code>, IP packets
> > +          whose source IP address either matches the IP address
> > +          in <ref column="logical_ip"/> or is in the network provided by
> > +          <ref column="logical_ip"/> are SNATed to the IP address in
> > +          <ref column="external_ip"/>.
> > +        </li>
> > +        <li>
> > +          When <ref column="type"/> is <code>dnat_and_snat</code>, the
> > +          externally visible IP address <ref column="external_ip"/> is
> > +          DNATted to the IP address <ref column="logical_ip"/> in the
> > +          logical space. In addition, IP packets whose source IP
> > +          address matches <ref column="logical_ip"/> are SNATed to
> > +          the IP address in <ref column="external_ip"/>.
> > +        </li>
> > +      </ul>
> > +    </column>
> > +
> > +    <column name="external_ip">
> > +      An IPv4 address.
> > +    </column>
> > +
> > +    <column name="logical_ip">
> > +      An IPv4 network (e.g. 192.168.1.0/24) or an IPv4 address.
> > +    </column>
> > +  </table>
> > +
> >  </database>
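
To make the three NAT types described above concrete, here is a rough
sketch in Python. It is only an illustration of the documented semantics
for a new connection, not how ovn-northd implements them. The names
NatRule and translate are made up for this example, and the addresses
come from the command hints in the commit message.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NatRule:
    """Hypothetical stand-in for one row of the NAT table."""
    type: str         # "dnat", "snat" or "dnat_and_snat"
    logical_ip: str   # IPv4 address, or IPv4 network for "snat"
    external_ip: str  # IPv4 address


def translate(rule, src, dst):
    """Return (src, dst) after applying one rule to a new connection."""
    if rule.type not in ("dnat", "snat", "dnat_and_snat"):
        raise ValueError("unknown NAT type: %s" % rule.type)

    external = ipaddress.ip_address(rule.external_ip)

    # DNAT half: traffic addressed to external_ip is redirected to
    # logical_ip, which must be a single address here (a network
    # makes ip_address() raise ValueError).
    if rule.type in ("dnat", "dnat_and_snat"):
        if ipaddress.ip_address(dst) == external:
            dst = str(ipaddress.ip_address(rule.logical_ip))

    # SNAT half: traffic whose source equals logical_ip, or lies in
    # the logical_ip network, leaves with external_ip as its source.
    if rule.type in ("snat", "dnat_and_snat"):
        logical = ipaddress.ip_network(rule.logical_ip, strict=False)
        if ipaddress.ip_address(src) in logical:
            src = str(external)

    return src, dst
```

With the DNAT rule from the commit message (logical_ip=192.168.1.2,
external_ip=30.0.0.2), translate(rule, "172.16.1.2", "30.0.0.2") gives
back ("172.16.1.2", "192.168.1.2"). The reverse (unDNAT/unSNAT)
direction is handled by the connection tracker and is not modelled here.
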
> > diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> > index 1231b4e..5665871 100644
> > --- a/ovn/ovn-sb.xml
> > +++ b/ovn/ovn-sb.xml
> > @@ -951,6 +951,47 @@
> >            </p>
> >          </dd>
> >
> > +        <dt><code>ct_dnat;</code></dt>
> > +        <dt><code>ct_dnat(<var>IP</var>);</code></dt>
> > +        <dd>
> > +          <p>
> > +            <code>ct_dnat</code> sends the packet through the DNAT zone in
> > +            the connection tracking table to unDNAT any packet that was
> > +            DNATed in the opposite direction.  The packet is then
> > +            automatically sent to the next tables as if followed by
> > +            <code>next;</code> action.
> > +            The next tables will see the changes in the packet caused by
> > +            the connection tracker.
> > +          </p>
> > +          <p>
> > +            <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
> > +            DNAT zone to change the destination IP address of the packet to
> > +            the one provided inside the parentheses and commits the
> > +            connection.
> > +            The packet is then automatically sent to the next tables as if
> > +            followed by <code>next;</code> action.  The next tables will see
> > +            the changes in the packet caused by the connection tracker.
> > +          </p>
> > +        </dd>
> > +
> > +        <dt><code>ct_snat;</code></dt>
> > +        <dt><code>ct_snat(<var>IP</var>);</code></dt>
> > +        <dd>
> > +          <p>
> > +            <code>ct_snat</code> sends the packet through the SNAT zone to
> > +            unSNAT any packet that was SNATed in the opposite direction.  If
> > +            the packet needs to be sent to the next tables, then it should be
> > +            followed by a <code>next;</code> action.  The next tables will not
> > +            see the changes in the packet caused by the connection tracker.
> > +          </p>
> > +          <p>
> > +            <code>ct_snat(<var>IP</var>)</code> sends the packet through the
> > +            SNAT zone to change the source IP address of the packet to
> > +            the one provided inside the parentheses and commits the
> > +            connection.
> > +            The packet is then automatically sent to the next tables as if
> > +            followed by <code>next;</code> action.  The next tables will see
> > +            the changes in the packet caused by the connection tracker.
> > +          </p>
> > +        </dd>
> > +
> >          <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
> >          <dd>
> >            <p>
> > diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
> > index 321040e..b821307 100644
> > --- a/ovn/utilities/ovn-nbctl.c
> > +++ b/ovn/utilities/ovn-nbctl.c
> > @@ -1449,6 +1449,11 @@ static const struct ctl_table_class tables[] = {
> >         NULL},
> >        {NULL, NULL, NULL}}},
> >
> > +    {&nbrec_table_nat,
> > +     {{&nbrec_table_nat, NULL,
> > +       NULL},
> > +      {NULL, NULL, NULL}}},
> > +
> >      {NULL, {{NULL, NULL, NULL}, {NULL, NULL, NULL}}}
> >  };
> >
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 633cf35..19d5c73 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -507,6 +507,23 @@ ip.ttl => Syntax error at end of input expecting `--'.
> >  ct_next; => actions=ct(table=27,zone=NXM_NX_REG5[0..15]), prereqs=ip
> >  ct_commit; => actions=ct(commit,zone=NXM_NX_REG5[0..15]), prereqs=ip
> >
> > +# dnat
> > +ct_dnat; => actions=ct(table=27,zone=NXM_NX_REG3[0..15],nat), prereqs=ip
> > +ct_dnat(192.168.1.2); => actions=ct(commit,table=27,zone=NXM_NX_REG3[0..15],nat(dst=192.168.1.2)), prereqs=ip
> > +ct_dnat(192.168.1.2, 192.168.1.3); => Syntax error at `,' expecting `)'.
> > +ct_dnat(foo); => Syntax error at `foo' invalid ip.
> > +ct_dnat(foo, bar); => Syntax error at `foo' invalid ip.
> > +ct_dnat(); => Syntax error at `)' invalid ip.
> > +
> > +# snat
> > +ct_snat; => actions=ct(zone=NXM_NX_REG4[0..15],nat), prereqs=ip
> > +ct_snat(192.168.1.2); => actions=ct(commit,table=27,zone=NXM_NX_REG4[0..15],nat(src=192.168.1.2)), prereqs=ip
> > +ct_snat(192.168.1.2, 192.168.1.3); => Syntax error at `,' expecting `)'.
> > +ct_snat(foo); => Syntax error at `foo' invalid ip.
> > +ct_snat(foo, bar); => Syntax error at `foo' invalid ip.
> > +ct_snat(); => Syntax error at `)' invalid ip.
> > +
> > +
> >  # arp
> >  arp { eth.dst = ff:ff:ff:ff:ff:ff; output; }; => actions=controller(userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.ff.ff.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.40.00.00.00), prereqs=ip4
> >
> > --
> > 1.9.1
> >
> >
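
As a reading aid for the new tests/ovn.at expectations above, the
mapping from the four logical actions to OpenFlow ct() actions can be
sketched roughly as below. This is Python written only for
illustration. The actual translation is done in C, and the function
name render, the regular expression and the error text are assumptions
of this sketch. The zone registers and table=27 are copied from the
test expectations and are specific to that test setup.

```python
import ipaddress
import re

# Zone registers as they appear in the expected test output.
ZONE_REG = {"dnat": "NXM_NX_REG3[0..15]", "snat": "NXM_NX_REG4[0..15]"}
NEXT_TABLE = 27  # value used by the test harness, not a fixed constant

_ACTION = re.compile(r"^ct_(dnat|snat)(?:\(([^)]*)\))?;$")


def render(action):
    """Render one ct_dnat/ct_snat logical action as an OpenFlow ct() string."""
    m = _ACTION.match(action.strip())
    if m is None:
        raise ValueError("not a ct_dnat/ct_snat action: %r" % action)
    kind, arg = m.groups()
    zone = ZONE_REG[kind]

    if arg is None:
        # No argument: only undo NAT for the reverse direction.
        # ct_dnat implies "next;", ct_snat does not (no table= field).
        if kind == "dnat":
            return "ct(table=%d,zone=%s,nat)" % (NEXT_TABLE, zone)
        return "ct(zone=%s,nat)" % zone

    # With an argument: exactly one IPv4 address is accepted.
    # IPv4Address raises a ValueError subclass for "", "foo", "a, b".
    ip = ipaddress.IPv4Address(arg.strip())
    field = "dst" if kind == "dnat" else "src"
    return "ct(commit,table=%d,zone=%s,nat(%s=%s))" % (
        NEXT_TABLE, zone, field, ip)
```

Calling render("ct_snat;") yields a ct() action without a table= field,
which matches the documentation saying that ct_snat must be followed by
an explicit next; action, while the other three forms continue to the
next table on their own.
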


