[ovs-dev] [PATCH ovn v2] northd: Provide the Gateway router option 'lb_force_snat_ip' to take router port ips.

Mark Gray mark.d.gray at redhat.com
Thu Feb 18 17:27:12 UTC 2021


On 18/02/2021 17:21, Numan Siddique wrote:
> On Thu, Feb 18, 2021 at 9:08 PM Mark Gray <mark.d.gray at redhat.com> wrote:
>>
>> On 18/02/2021 14:30, numans at ovn.org wrote:
>>> From: Numan Siddique <numans at ovn.org>
>>>
>>> When a Gateway router is configured with a load balancer
>>> and it is also configured with options:lb_force_snat_ip=<IP>,
>>> OVN after load balancing the destination IP to one of the
>>> backend also does a NAT on the source ip with the
>>> lb_force_snat_ip if the packet is destined to a load balancer
>>> VIP.
>>>
>>> There is a problem with the snat of source ip to 'lb_force_snat_ip'
>>> in one particular usecase.  When the packet enters the Gateway router
>>> from a provider logical switch destined to the load balancer VIP,
>>> then it is first load balanced to one of the backend and then
>>> the source ip is snatted to 'lb_force_snat_ip'.  If the chosen
>>> backend is reachable via the provider logical switch, then the
>>> packet is hairpinned back and it may hit the wire with
>>> the source ip 'lb_force_snat_ip'.  If 'lb_force_snat_ip' happens
>>> to be an OVN internal IP then the packet may be dropped.
>>>
>>> This patch addresses this issue by providing the option to
>>> set the option - 'lb_force_snat_ip=router_ip'.  If 'router_ip'
>>> is set, then OVN will snat the load balanced packet to the
>>> router ip of the logical router port which is chosen as 'outport'
>>> in lr_in_ip_routing stage.
>>>
>>> Example.
>>>
>>> If the gateway router is
>>>
>>> ovn-nbctl show lr0
>>> router 68f20092-5563-44b8-9ccb-b11de3e3a66c (lr0)
>>>     port lr0-sw0
>>>         mac: "00:00:00:00:ff:01"
>>>         networks: ["10.0.0.1/24"]
>>>     port lr0-public
>>>         mac: "00:00:20:20:12:13"
>>>         networks: ["172.168.0.100/24"]
>>>
>>> Then the below logical flows are added if 'lb_force_snat_ip'
>>> is configured to 'router_ip'.
>>>
>>> table=1 (lr_out_snat), priority=110
>>>    match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"),
>>>    action=(ct_snat(172.168.0.100);)
>>>
>>> table=1 (lr_out_snat), priority=110
>>>    match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0")
>>>    action=(ct_snat(10.0.0.1);)
>>>
>>> For the above described scenario, the packet will have source ip as
>>> 172.168.0.100 which belongs to the provider logical switch CIDR.
>>>
>>> Reported-by: Tim Rozet <trozet at redhat.com>
>>> Signed-off-by: Numan Siddique <numans at ovn.org>
>>> ---
>>>
>>> v1 -> v2
>>> -------
>>>   * Addressed review comments from Mark Gray.
>>>
>>>  northd/ovn-northd.8.xml | 35 ++++++++++++++++
>>>  northd/ovn-northd.c     | 93 +++++++++++++++++++++++++++++++++++++----
>>>  ovn-nb.xml              | 33 ++++++++++-----
>>>  tests/ovn-northd.at     | 79 ++++++++++++++++++++++++++++++++++
>>>  4 files changed, 222 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>>> index 2f8b4e8c30..1d8f5edee1 100644
>>> --- a/northd/ovn-northd.8.xml
>>> +++ b/northd/ovn-northd.8.xml
>>> @@ -3654,6 +3654,32 @@ nd_ns {
>>>            <code>flags.force_snat_for_dnat == 1 && ip</code> with an
>>>            action <code>ct_snat(<var>B</var>);</code>.
>>>          </p>
>>> +      </li>
>>> +
>>> +      <li>
>>> +        <p>
>>> +          If the Gateway router in the OVN Northbound database has been
>>> +          configured to force SNAT a packet (that has been previously
>>> +          load-balanced) using router IP (i.e <ref column="options"
>>> +          table="Logical_Router"/>:lb_force_snat_ip=router_ip), then for
>>> +          each logical router port <var>P</var> attached to the Gateway
>>> +          router, a priority-110 flow matches
>>> +          <code>flags.force_snat_for_lb == 1 && outport == <var>P</var>
>>> +          </code> with an action <code>ct_snat(<var>R</var>);</code>
>>> +          where <var>R</var> is the IP configured on the router port.
>>> +          If <code>R</code> is an IPv4 address then the match will also
>>> +          include <code>ip4</code> and if it is an IPv6 address, then the
>>> +          match will also include <code>ip6</code>.
>>> +        </p>
>>> +
>>> +        <p>
>>> +          If the logical router port <var>P</var> is configured with multiple
>>> +          IPv4 and multiple IPv6 addresses, only the first IPv4 and first IPv6
>>> +          address is considered.
>>> +        </p>
>>> +      </li>
>>> +
>>> +      <li>
>>>          <p>
>>>            If the Gateway router in the OVN Northbound database has been
>>>            configured to force SNAT a packet (that has been previously
>>> @@ -3661,6 +3687,9 @@ nd_ns {
>>>            <code>flags.force_snat_for_lb == 1 && ip</code> with an
>>>            action <code>ct_snat(<var>B</var>);</code>.
>>>          </p>
>>> +      </li>
>>> +
>>> +      <li>
>>>          <p>
>>>            For each configuration in the OVN Northbound database, that asks
>>>            to change the source IP address of a packet from an IP address of
>>> @@ -3674,14 +3703,18 @@ nd_ns {
>>>            options, then the action would be <code>ip4/6.src=
>>>            (<var>B</var>)</code>.
>>>          </p>
>>> +      </li>
>>>
>>> +      <li>
>>>          <p>
>>>            If the NAT rule has <code>allowed_ext_ips</code> configured, then
>>>            there is an additional match <code>ip4.dst == <var>allowed_ext_ips
>>>            </var></code>. Similarly, for IPV6, match would be <code>ip6.dst ==
>>>            <var>allowed_ext_ips</var></code>.
>>>          </p>
>>> +      </li>
>>>
>>> +      <li>
>>>          <p>
>>>            If the NAT rule has <code>exempted_ext_ips</code> set, then
>>>            there is an additional flow configured at the priority + 1 of
>>> @@ -3690,7 +3723,9 @@ nd_ns {
>>>            </code>. This flow is used to bypass the ct_snat action for a packet
>>>            which is destinted to <code>exempted_ext_ips</code>.
>>>          </p>
>>> +      </li>
>>>
>>> +      <li>
>>>          <p>
>>>            A priority-0 logical flow with match <code>1</code> has actions
>>>            <code>next;</code>.
>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>> index 39d7987820..816836ab02 100644
>>> --- a/northd/ovn-northd.c
>>> +++ b/northd/ovn-northd.c
>>> @@ -624,6 +624,7 @@ struct ovn_datapath {
>>>
>>>      struct lport_addresses dnat_force_snat_addrs;
>>>      struct lport_addresses lb_force_snat_addrs;
>>> +    bool lb_force_snat_router_ip;
>>>
>>>      struct ovn_port **localnet_ports;
>>>      size_t n_localnet_ports;
>>> @@ -717,14 +718,29 @@ init_nat_entries(struct ovn_datapath *od)
>>>          }
>>>      }
>>>
>>> -    if (get_force_snat_ip(od, "lb", &od->lb_force_snat_addrs)) {
>>> -        if (od->lb_force_snat_addrs.n_ipv4_addrs) {
>>> -            snat_ip_add(od, od->lb_force_snat_addrs.ipv4_addrs[0].addr_s,
>>> -                        NULL);
>>> -        }
>>> -        if (od->lb_force_snat_addrs.n_ipv6_addrs) {
>>> -            snat_ip_add(od, od->lb_force_snat_addrs.ipv6_addrs[0].addr_s,
>>> -                        NULL);
>>> +    /* Check if 'lb_force_snat_ip' is configured with 'router_ip'. */
>>> +    const char *lb_force_snat =
>>> +        smap_get(&od->nbr->options, "lb_force_snat_ip");
>>> +    if (lb_force_snat && !strcmp(lb_force_snat, "router_ip")
>>> +            && smap_get(&od->nbr->options, "chassis")) {
>>> +        /* Set it to true only if its gateway router and
>>> +         * options:lb_force_snat_ip=router_ip. */
>>> +        od->lb_force_snat_router_ip = true;
>>> +    } else {
>>> +        od->lb_force_snat_router_ip = false;
>>> +
>>> +        /* Check if 'lb_force_snat_ip' is configured with a set of
>>> +         * IP address(es). */
>>> +
>> Remove vertical whitespace ^

I just mean the newline. There seems to be an unnecessary gap between
the comments and code. Sorry about the confusion.
> 
> You mean to have comments like below ?
> 
>    /* Check if 'lb_force_snat_ip' is configured with a set of
>     * IP address(es).
>     */
>>
>>> +        if (get_force_snat_ip(od, "lb", &od->lb_force_snat_addrs)) {
>>> +            if (od->lb_force_snat_addrs.n_ipv4_addrs) {
>>> +                snat_ip_add(od, od->lb_force_snat_addrs.ipv4_addrs[0].addr_s,
>>> +                            NULL);
>>> +            }
>>> +            if (od->lb_force_snat_addrs.n_ipv6_addrs) {
>>> +                snat_ip_add(od, od->lb_force_snat_addrs.ipv6_addrs[0].addr_s,
>>> +                            NULL);
>>> +            }
>>>          }
>>>      }
>>>
>>> @@ -8953,6 +8969,65 @@ build_lrouter_force_snat_flows(struct hmap *lflows, struct ovn_datapath *od,
>>>      ds_destroy(&actions);
>>>  }
>>>
>>> +static void
>>> +build_lrouter_force_snat_flows_op(struct ovn_port *op,
>>> +                                  struct hmap *lflows,
>>> +                                  struct ds *match, struct ds *actions)
>>> +{
>>> +    if (!op->nbrp || !op->peer || !op->od->lb_force_snat_router_ip) {
>>> +        return;
>>> +    }
>>> +
>>> +    if (op->lrp_networks.n_ipv4_addrs) {
>>> +        ds_clear(match);
>>> +        ds_clear(actions);
>>> +
>>> +        /* Higher priority rules to force SNAT with the router port ip.
>>> +         * This only takes effect when the packet has already been
>>> +         * load balanced once. */
>>> +        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip4 && "
>>> +                      "outport == %s", op->json_key);
>>> +        ds_put_format(actions, "ct_snat(%s);",
>>> +                      op->lrp_networks.ipv4_addrs[0].addr_s);
>>> +        ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
>>> +                      ds_cstr(match), ds_cstr(actions));
>>> +
>> Remove vertical whitespace ^
> 
> I am confused. Can you please elaborate ?

Same as above
> 
> 
>>> +        if (op->lrp_networks.n_ipv4_addrs > 2) {
>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>> +            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
>>> +                              "multiple IPv4 networks.  Only the first "
>> s/networks/addresses?
>>> +                              "IP [%s] is considered as SNAT for load "
>>> +                              "balancer", op->json_key,
>>> +                              op->lrp_networks.ipv4_addrs[0].addr_s);
>>> +        }
>>> +    }
>>> +
>>> +    /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
>>> +     * last in the list. So add the flows only if n_ipv6_addrs > 1. */
>>
>> How is the ordering guaranteed? This would cause problems if it was
>> changed. Any way to check?
> 
> I don't think we can check that. If a logical port networks has ['a', 'b'], then
> ovn-northd will always use 'a' in every loop it runs.
> 
> However if you add another item to networks like
> ovn-nbctl add logical_router_port networks 'c',
> then ovn-northd could see it as ['c', 'b', 'a'] for example and in that case
> it will use 'c'.
> 
> Since in most deployments, a router port would be configured with
> only one network, I think it's fine if we just consider the first address
> ovn-northd sees.

Ok

> 
> 
>>
>>> +    if (op->lrp_networks.n_ipv6_addrs > 1) {
>>> +        ds_clear(match);
>>> +        ds_clear(actions);
>>> +
>>> +        /* Higher priority rules to force SNAT with the router port ip.
>>> +         * This only takes effect when the packet has already been
>>> +         * load balanced once. */
>>> +        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
>>> +                      "outport == %s", op->json_key);
>>> +        ds_put_format(actions, "ct_snat(%s);",
>>> +                      op->lrp_networks.ipv6_addrs[0].addr_s);
>>> +        ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
>>> +                      ds_cstr(match), ds_cstr(actions));
>>> +        if (op->lrp_networks.n_ipv6_addrs > 2) {
>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>> +            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
>>> +                              "multiple IPv6 networks.  Only the first "
>> s/networks/addresses?
> 
> I used the word networks because the column name is 'networks'. The host part
> of the network obviously will be an IP address. Although I don't mind changing
> it to addresses if you prefer. Let me know what you think.
> 
> Thanks
> Numan
> 

It looked like the .xml documentation specified 'addresses' which is why
I suggested it. Change to whatever you think makes sense.

> 
>>> +                              "IP [%s] is considered as SNAT for load "
>>> +                              "balancer", op->json_key,
>>> +                              op->lrp_networks.ipv6_addrs[0].addr_s);
>>> +        }
>>> +    }
>>> +}
>>> +
>>>  static void
>>>  build_lrouter_bfd_flows(struct hmap *lflows, struct ovn_port *op)
>>>  {
>>> @@ -11503,6 +11578,8 @@ build_lswitch_and_lrouter_iterate_by_op(struct ovn_port *op,
>>>                                              &lsi->match, &lsi->actions);
>>>      build_lrouter_ipv4_ip_input(op, lsi->lflows,
>>>                                  &lsi->match, &lsi->actions);
>>> +    build_lrouter_force_snat_flows_op(op, lsi->lflows, &lsi->match,
>>> +                                      &lsi->actions);
>>>  }
>>>
>>>  static void
>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>> index a94918bb63..b0a4adffe5 100644
>>> --- a/ovn-nb.xml
>>> +++ b/ovn-nb.xml
>>> @@ -1936,16 +1936,29 @@
>>>        </column>
>>>        <column name="options" key="lb_force_snat_ip">
>>>          <p>
>>> -          If set, indicates a set of IP addresses to use to force SNAT a packet
>>> -          that has already been load-balanced in the gateway router.  When
>>> -          multiple gateway routers are configured, a packet can potentially
>>> -          enter any of the gateway routers, get DNATted as part of the load-
>>> -          balancing and eventually reach the logical switch port.
>>> -          For the return traffic to go back to the same gateway router (for
>>> -          unDNATing), the packet needs a SNAT in the first place.  This can be
>>> -          achieved by setting the above option with a gateway specific set of
>>> -          IP addresses. This option may have exactly one IPv4 and/or one IPv6
>>> -          address on it, separated by a space character.
>>> +          If set, this option can take two possible type of values.  Either
>>> +          a set of IP addresses or the string value - <code>router_ip</code>.
>>> +        </p>
>>> +
>>> +        <p>
>>> +          If a set of IP addresses are configured, it indicates to use to
>>> +          force SNAT a packet that has already been load-balanced in the
>>> +          gateway router.  When multiple gateway routers are configured, a
>>> +          packet can potentially enter any of the gateway routers, get
>>> +          DNATted as part of the load-balancing and eventually reach the
>>> +          logical switch port.  For the return traffic to go back to the
>>> +          same gateway router (for unDNATing), the packet needs a SNAT in the
>>> +          first place.  This can be achieved by setting the above option with
>>> +          a gateway specific set of IP addresses. This option may have exactly
>>> +          one IPv4 and/or one IPv6 address on it, separated by a space
>>> +          character.
>>> +        </p>
>>> +
>>> +        <p>
>>> +          If it is configured with the value <code>router_ip</code>, then
>>> +          the load balanced packet is SNATed with the IP of router port
>>> +          (attached to the gateway router) selected as the destination after
>>> +          taking the routing decision.
>>>          </p>
>>>        </column>
>>>        <column name="options" key="mcast_relay" type='{"type": "boolean"}'>
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index 64a7880677..e4831aea9b 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -2520,3 +2520,82 @@ check ovn-sbctl set chassis hv1 other_config:port-up-notif=true
>>>  wait_row_count nb:Logical_Switch_Port 1 up=false name=lsp1
>>>
>>>  AT_CLEANUP
>>> +
>>> +AT_SETUP([ovn -- lb_force_snat_ip for Gateway Routers])
>>> +ovn_start
>>> +
>>> +check ovn-nbctl ls-add sw0
>>> +check ovn-nbctl ls-add sw1
>>> +
>>> +# Create a logical router and attach both logical switches
>>> +check ovn-nbctl lr-add lr0
>>> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
>>> +check ovn-nbctl lsp-add sw0 sw0-lr0
>>> +check ovn-nbctl lsp-set-type sw0-lr0 router
>>> +check ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01
>>> +check ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
>>> +
>>> +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24
>>> +check ovn-nbctl lsp-add sw1 sw1-lr0
>>> +check ovn-nbctl lsp-set-type sw1-lr0 router
>>> +check ovn-nbctl lsp-set-addresses sw1-lr0 00:00:00:00:ff:02
>>> +check ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1
>>> +
>>> +check ovn-nbctl ls-add public
>>> +check ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.100/24
>>> +check ovn-nbctl lsp-add public public-lr0
>>> +check ovn-nbctl lsp-set-type public-lr0 router
>>> +check ovn-nbctl lsp-set-addresses public-lr0 router
>>> +check ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
>>> +
>>> +check ovn-nbctl set logical_router lr0 options:chassis=ch1
>>> +
>>> +ovn-sbctl dump-flows lr0 > lr0flows
>>> +AT_CAPTURE_FILE([lr0flows])
>>> +
>>> +AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
>>> +])
>>> +
>>> +check ovn-nbctl --wait=sb set logical_router lr0 options:lb_force_snat_ip="20.0.0.4 aef0::4"
>>> +
>>> +ovn-sbctl dump-flows lr0 > lr0flows
>>> +AT_CAPTURE_FILE([lr0flows])
>>> +
>>> +AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
>>> +  table=1 (lr_out_snat        ), priority=100  , match=(flags.force_snat_for_lb == 1 && ip4), action=(ct_snat(20.0.0.4);)
>>> +  table=1 (lr_out_snat        ), priority=100  , match=(flags.force_snat_for_lb == 1 && ip6), action=(ct_snat(aef0::4);)
>>> +])
>>> +
>>> +check ovn-nbctl --wait=sb set logical_router lr0 options:lb_force_snat_ip="router_ip"
>>> +
>>> +ovn-sbctl dump-flows lr0 > lr0flows
>>> +AT_CAPTURE_FILE([lr0flows])
>>> +
>>> +AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
>>> +  table=1 (lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
>>> +  table=1 (lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>>> +  table=1 (lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>>> +])
>>> +
>>> +check ovn-nbctl --wait=sb remove logical_router lr0 options chassis
>>> +
>>> +ovn-sbctl dump-flows lr0 > lr0flows
>>> +AT_CAPTURE_FILE([lr0flows])
>>> +
>>> +AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
>>> +])
>>> +
>>> +check ovn-nbctl set logical_router lr0 options:chassis=ch1
>>> +check ovn-nbctl --wait=sb add logical_router_port lr0-sw1 networks "bef0\:\:1/64"
>>> +
>>> +ovn-sbctl dump-flows lr0 > lr0flows
>>> +AT_CAPTURE_FILE([lr0flows])
>>> +
>>> +AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
>>> +  table=1 (lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
>>> +  table=1 (lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>>> +  table=1 (lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>>> +  table=1 (lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
>>> +])
>>> +
>>> +AT_CLEANUP
>>>
>>
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
> 



More information about the dev mailing list