[ovs-dev] [PATCH v3 2/5] ovn: Introduce l3 gateway router.

Darrell Ball dlu998 at gmail.com
Tue May 24 22:17:14 UTC 2016


On Tue, May 24, 2016 at 7:41 AM, Guru Shetty <guru at ovn.org> wrote:

>
>
> On 21 May 2016 at 11:48, Darrell Ball <dlu998 at gmail.com> wrote:
>
>> I made some modifications to code in Patches 1 and 2 to remove the Transit
>> LS
>> requirements.
>>
>
> These are the reasons for the need of a LS to be able to be connected to
> multiple routers.
> 1. I think it should be left to the upstream user on how they want to
> connect their DRs with GRs. On a 1000 node k8s cluster using peering would
> mean that I need to add 1000 DR router ports and manage 1000 subnets. If I
> use a LS in-between I need to add only one router port for the DR and
> manage only one subnet.
>

I realize part of the topology is not controllable for k8s and this case
specifically.

This is a case where 1000 HVs each have their own GR connected to
a DR for east-west support.
There is a tradeoff between distributing 1000 DR ports and the extra
static-route flows for one DR datapath to each HV
 vs
1 DR port, 1000 Transit LS ports in total, a Transit LS datapath required
on all 1000 HVs, distributing the
Transit LS datapath flows to all 1000 HVs, and each Transit LS peer
port requiring an extra ARP flow. That is 1000
extra ARP flows on each HV.
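To put rough numbers on that tradeoff (the counts below just restate the
reasoning above for 1000 HVs; they are illustrative bookkeeping, not
measured OVN numbers):

```python
# Back-of-envelope comparison for 1000 HVs, each with its own gateway
# router (GR) attached to one distributed router (DR).  Illustrative
# assumptions only, taken from the discussion above.
HVS = 1000

# Peered model: one DR router port per GR; the DR datapath's
# static-route flows get distributed to every HV.
peered_dr_ports = HVS                      # 1000 DR ports to manage

# Transit LS model: a single DR port, but one Transit LS port per GR
# plus one toward the DR, the LS datapath present on every HV, and one
# extra ARP flow per LS peer port on every HV.
transit_dr_ports = 1
transit_ls_ports = HVS + 1                 # 1001 LS ports total
arp_flows_per_hv = HVS                     # ~1000 extra ARP flows per HV
arp_flows_total = arp_flows_per_hv * HVS   # deployment-wide

print(peered_dr_ports, transit_ls_ports, arp_flows_per_hv, arp_flows_total)
```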

For subnet management, I don't see many issues either way. /31 subnet
management is trivial and easy to automate.
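To illustrate the automation claim, a few lines of stdlib Python can carve
one /31 per DR-GR link out of a private block (the block and function name
here are hypothetical, purely for illustration):

```python
import ipaddress

def p2p_links(block, count):
    """Yield (dr_ip, gr_ip) pairs, one /31 per point-to-point link
    (RFC 3021 allows both addresses of a /31 to be used)."""
    nets = ipaddress.ip_network(block).subnets(new_prefix=31)
    for _, net in zip(range(count), nets):
        a, b = net          # a /31 holds exactly two addresses
        yield str(a), str(b)

# One /31 per GR for a 1000-node cluster, out of an assumed 20.0.0.0/16:
links = list(p2p_links("20.0.0.0/16", 1000))
print(links[0])    # ('20.0.0.0', '20.0.0.1')
print(len(links))  # 1000
```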

For this specific k8s case, it's not clear whether using a Transit LS is
worse overall, factoring in the
data packet pipeline, the number of flows, the number of datapaths, and the
extra complexity.
In most cases, avoiding a Transit LS would be better.

Also, supporting a Transit LS should not prevent other optimizations.



> 2. The ability to connect multiple routers to a switch is needed on the
> north side of the GR as we will need to connect multiple GRs to a switch to
> be able to access the physical network for ARP resolution. This is for both
> north-south as well as east-west.
>

This case does not involve an OVN Transit LS.



> 3. A Transit LS is needed for the final patch in this series to work (i.e
> actual DNAT and SNAT). The final patch needs the packet to enter the
> ingress pipeline of a router. The current implementation cannot handle
> peering as packets enter the egress pipeline of the router. To support
> peering, it will need further enhancements.
>

The dependency of the final patch on Transit LS usage/topology is something
I wanted to make clear with
this exchange, especially for folks who were not part of last week's discussion.



>
>
>
>>
>> In summary:
>>
>> I removed all changes to lflow.c thereby reinstating the previous
>> optimization.
>>
>> I made some modifications to ovn-northd.c changes to remove the Transit LS
>> special
>> aspects and additional arp flows.
>>
>> I left the other code changes in Patches 1 and 2 as they were.
>>
>> The overall resulting diff to support both patches 1 and 2 is reduced in
>> ovn-northd.c
>> and becomes:
>>
>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> index b271f7f..2e2236b 100644
>> --- a/ovn/northd/ovn-northd.c
>> +++ b/ovn/northd/ovn-northd.c
>> @@ -702,11 +702,25 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>>  {
>>      sbrec_port_binding_set_datapath(op->sb, op->od->sb);
>>      if (op->nbr) {
>> -        sbrec_port_binding_set_type(op->sb, "patch");
>> +        /* If the router is for l3 gateway, it resides on a chassis
>> +         * and its port type is "gateway". */
>> +        const char *chassis = smap_get(&op->od->nbr->options, "chassis");
>> +
>> +        if (chassis && op->peer && op->peer->od && op->peer->od->nbs){
>> +            sbrec_port_binding_set_type(op->sb, "gateway");
>> +        } else {
>> +            sbrec_port_binding_set_type(op->sb, "patch");
>> +        }
>>
>>          const char *peer = op->peer ? op->peer->key : "<error>";
>> -        const struct smap ids = SMAP_CONST1(&ids, "peer", peer);
>> -        sbrec_port_binding_set_options(op->sb, &ids);
>> +        struct smap new;
>> +        smap_init(&new);
>> +        smap_add(&new, "peer", peer);
>> +        if (chassis) {
>> +            smap_add(&new, "gateway-chassis", chassis);
>> +        }
>> +        sbrec_port_binding_set_options(op->sb, &new);
>> +        smap_destroy(&new);
>>
>>          sbrec_port_binding_set_parent_port(op->sb, NULL);
>>          sbrec_port_binding_set_tag(op->sb, NULL, 0);
>> @@ -716,15 +730,31 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>>              sbrec_port_binding_set_type(op->sb, op->nbs->type);
>>              sbrec_port_binding_set_options(op->sb, &op->nbs->options);
>>          } else {
>> -            sbrec_port_binding_set_type(op->sb, "patch");
>> +            const char *chassis = NULL;
>> +            if (op->peer && op->peer->od && op->peer->od->nbr) {
>> +                chassis = smap_get(&op->peer->od->nbr->options,
>> "chassis");
>> +            }
>> +            /* A switch port connected to a gateway router is also of
>> +             * type "gateway". */
>> +            if (chassis) {
>> +                sbrec_port_binding_set_type(op->sb, "gateway");
>> +            } else {
>> +                sbrec_port_binding_set_type(op->sb, "patch");
>> +            }
>>
>>              const char *router_port = smap_get(&op->nbs->options,
>>                                                 "router-port");
>>              if (!router_port) {
>>                  router_port = "<error>";
>>              }
>> -            const struct smap ids = SMAP_CONST1(&ids, "peer",
>> router_port);
>> -            sbrec_port_binding_set_options(op->sb, &ids);
>> +            struct smap new;
>> +            smap_init(&new);
>> +            smap_add(&new, "peer", router_port);
>> +            if (chassis) {
>> +                smap_add(&new, "gateway-chassis", chassis);
>> +            }
>> +            sbrec_port_binding_set_options(op->sb, &new);
>> +            smap_destroy(&new);
>>          }
>>          sbrec_port_binding_set_parent_port(op->sb, op->nbs->parent_name);
>>          sbrec_port_binding_set_tag(op->sb, op->nbs->tag, op->nbs->n_tag);
>>
>>
>> I added a new test to demonstrate direct DR<->GR connectivity.
>>
>>
>> AT_SETUP([ovn -- 2 HVs, 3 LRs, 1 DR directly connected to 2 gateway
>> routers
>> ])
>> AT_KEYWORDS([ovndirectlyconnectedrouters])
>> AT_SKIP_IF([test $HAVE_PYTHON = no])
>> ovn_start
>>
>> # Logical network:
>> # Three LRs - R1, R2 and R3 that are connected to each other directly
>> # in 20.0.0.2/31 and 21.0.0.2/31 networks. R1 has switch foo (
>> 192.168.1.0/24
>> )
>> # connected to it. R2 has alice (172.16.1.0/24) and R3 has bob (
>> 10.32.1.0/24
>> )
>> # connected to it.
>>
>> ovn-nbctl create Logical_Router name=R1
>> ovn-nbctl create Logical_Router name=R2 options:chassis="hv2"
>> ovn-nbctl create Logical_Router name=R3 options:chassis="hv2"
>>
>> ovn-nbctl lswitch-add foo
>> ovn-nbctl lswitch-add alice
>> ovn-nbctl lswitch-add bob
>>
>> # Connect foo to R1
>> ovn-nbctl -- --id=@lrp create Logical_Router_port name=foo \
>> network=192.168.1.1/24 mac=\"00:00:01:01:02:03\" -- add Logical_Router
>> R1 \
>> ports @lrp -- lport-add foo rp-foo
>>
>> ovn-nbctl set Logical_port rp-foo type=router options:router-port=foo \
>> addresses=\"00:00:01:01:02:03\"
>>
>> # Connect alice to R2
>> ovn-nbctl -- --id=@lrp create Logical_Router_port name=alice \
>> network=172.16.1.1/24 mac=\"00:00:02:01:02:03\" -- add Logical_Router R2
>> \
>> ports @lrp -- lport-add alice rp-alice
>>
>> ovn-nbctl set Logical_port rp-alice type=router options:router-port=alice
>> \
>> addresses=\"00:00:02:01:02:03\"
>>
>> # Connect bob to R3
>> ovn-nbctl -- --id=@lrp create Logical_Router_port name=bob \
>> network=10.32.1.1/24 mac=\"00:00:03:01:02:03\" -- add Logical_Router R3 \
>> ports @lrp -- lport-add bob rp-bob
>>
>> ovn-nbctl set Logical_port rp-bob type=router options:router-port=bob \
>> addresses=\"00:00:03:01:02:03\"
>>
>> # Interconnect R1 and R2
>> lrp1_uuid_2_R2=`ovn-nbctl -- --id=@lrp create Logical_Router_port
>> name=R1_R2 \
>> network="20.0.0.2/31" mac=\"00:00:00:02:03:04\" \
>> -- add Logical_Router R1 ports @lrp`
>>
>> lrp2_uuid_2_R1=`ovn-nbctl -- --id=@lrp create Logical_Router_port
>> name=R2_R1 \
>> network="20.0.0.3/31" mac=\"00:00:00:02:03:05\" \
>> -- add Logical_Router R2 ports @lrp`
>>
>> ovn-nbctl set logical_router_port $lrp1_uuid_2_R2 peer="R2_R1"
>> ovn-nbctl set logical_router_port $lrp2_uuid_2_R1 peer="R1_R2"
>>
>> # Interconnect R1 and R3
>> lrp1_uuid_2_R3=`ovn-nbctl -- --id=@lrp create Logical_Router_port
>> name=R1_R3 \
>> network="21.0.0.2/31" mac=\"00:00:21:02:03:04\" \
>> -- add Logical_Router R1 ports @lrp`
>>
>> lrp3_uuid_2_R1=`ovn-nbctl -- --id=@lrp create Logical_Router_port
>> name=R3_R1 \
>> network="21.0.0.3/31" mac=\"00:00:21:02:03:05\" \
>> -- add Logical_Router R3 ports @lrp`
>>
>> ovn-nbctl set logical_router_port $lrp1_uuid_2_R3 peer="R3_R1"
>> ovn-nbctl set logical_router_port $lrp3_uuid_2_R1 peer="R1_R3"
>>
>> #install static route in R1 to get to alice
>> ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
>> ip_prefix=172.16.1.0/24 nexthop=20.0.0.3 -- add Logical_Router \
>> R1 static_routes @lrt
>>
>> #install static route in R1 to get to bob
>> ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
>> ip_prefix=10.32.1.0/24 nexthop=21.0.0.3 -- add Logical_Router \
>> R1 static_routes @lrt
>>
>> #install static route in R2 to get to foo
>> ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
>> ip_prefix=192.168.1.0/24 nexthop=20.0.0.2 -- add Logical_Router \
>> R2 static_routes @lrt
>>
>> # Create terminal logical ports
>> # Create logical port foo1 in foo
>> ovn-nbctl lport-add foo foo1 \
>> -- lport-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2"
>>
>> # Create logical port alice1 in alice
>> ovn-nbctl lport-add alice alice1 \
>> -- lport-set-addresses alice1 "f0:00:00:01:02:04 172.16.1.2"
>>
>> # Create logical port bob1 in bob
>> ovn-nbctl lport-add bob bob1 \
>> -- lport-set-addresses bob1 "f0:00:00:01:02:05 10.32.1.2"
>>
>> # Create two hypervisor and create OVS ports corresponding to logical
>> ports.
>> net_add n1
>>
>> sim_add hv1
>> as hv1
>> ovs-vsctl add-br br-phys
>> ovn_attach n1 br-phys 192.168.0.1
>> ovs-vsctl -- add-port br-int hv1-vif1 -- \
>>     set interface hv1-vif1 external-ids:iface-id=foo1 \
>>     options:tx_pcap=hv1/vif1-tx.pcap \
>>     options:rxq_pcap=hv1/vif1-rx.pcap \
>>     ofport-request=1
>>
>> sim_add hv2
>> as hv2
>> ovs-vsctl add-br br-phys
>> ovn_attach n1 br-phys 192.168.0.2
>> ovs-vsctl -- add-port br-int hv2-vif1 -- \
>>     set interface hv2-vif1 external-ids:iface-id=bob1 \
>>     options:tx_pcap=hv2/vif1-tx.pcap \
>>     options:rxq_pcap=hv2/vif1-rx.pcap \
>>     ofport-request=1
>>
>> ovs-vsctl -- add-port br-int hv2-vif2 -- \
>>     set interface hv2-vif2 external-ids:iface-id=alice1 \
>>     options:tx_pcap=hv2/vif2-tx.pcap \
>>     options:rxq_pcap=hv2/vif2-rx.pcap \
>>     ofport-request=2
>>
>> # Pre-populate the hypervisors' ARP tables so that we don't lose any
>> # packets for ARP resolution (native tunneling doesn't queue packets
>> # for ARP resolution).
>> ovn_populate_arp
>>
>> # Allow some time for ovn-northd and ovn-controller to catch up.
>> # XXX This should be more systematic.
>> sleep 1
>>
>> ip_to_hex() {
>>     printf "%02x%02x%02x%02x" "$@"
>> }
>> trim_zeros() {
>>     sed 's/\(00\)\{1,\}$//'
>> }
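As an aside, a Python analogue of these helpers may make the injected
packet layout easier to read; this is illustrative only, and the test of
course uses the shell versions above:

```python
def ip_to_hex(addr):
    # "192.168.1.2" -> "c0a80102", same as the shell helper above.
    return "".join("%02x" % int(o) for o in addr.split("."))

def udp_packet(dst_mac, src_mac, src_ip, dst_ip):
    # Same layout as the injected hex strings: Ethernet header,
    # 20-byte IPv4 header (TTL 0x40, protocol 17, checksum 0), then an
    # 8-byte UDP header (sport 53, dport 0x1111, length 8, checksum 0).
    eth = dst_mac + src_mac + "0800"
    ip = "4500001c" + "0000" + "0000" + "4011" + "0000" \
         + ip_to_hex(src_ip) + ip_to_hex(dst_ip)
    udp = "0035" + "1111" + "0008" + "0000"
    return eth + ip + udp

pkt = udp_packet("000001010203", "f00000010203",
                 "192.168.1.2", "172.16.1.2")
print(pkt)
```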
>>
>> # Send ip packets between foo1 and alice1
>> src_mac="f00000010203"
>> dst_mac="000001010203"
>> src_ip=`ip_to_hex 192 168 1 2`
>> dst_ip=`ip_to_hex 172 16 1 2`
>>
>> packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>> as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
>> as hv1 ovs-appctl ofproto/trace br-int in_port=1 $packet
>>
>> # Send ip packets between foo1 and bob1
>> src_mac="f00000010203"
>> dst_mac="000001010203"
>> src_ip=`ip_to_hex 192 168 1 2`
>> dst_ip=`ip_to_hex 10 32 1 2`
>>
>> packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>> as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
>>
>> # Send ip packets from alice1 to foo1
>> src_mac="f00000010204"
>> dst_mac="000002010203"
>> src_ip=`ip_to_hex 172 16 1 2`
>> dst_ip=`ip_to_hex 192 168 1 2`
>>
>> packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>> as hv2 ovs-appctl netdev-dummy/receive hv2-vif2 $packet
>>
>> echo "---------NB dump-----"
>> ovn-nbctl show
>> echo "---------------------"
>> ovn-nbctl list logical_router
>> echo "---------------------"
>> ovn-nbctl list logical_router_port
>> echo "---------------------"
>>
>> echo "---------SB dump-----"
>> ovn-sbctl list datapath_binding
>> echo "---------------------"
>> ovn-sbctl list port_binding
>> echo "---------------------"
>> #ovn-sbctl dump-flows
>> echo "---------------------"
>>
>> echo "------ hv1 dump ----------"
>> as hv1 ovs-vsctl show
>> as hv1 ovs-ofctl show br-int
>> as hv1 ovs-ofctl dump-flows br-int
>> echo "------ hv2 dump ----------"
>> as hv2 ovs-vsctl show
>> as hv2 ovs-ofctl show br-int
>> as hv2 ovs-ofctl dump-flows br-int
>> echo "----------------------------"
>>
>> # Packet to Expect at alice1
>> src_mac="000002010203"
>> dst_mac="f00000010204"
>> src_ip=`ip_to_hex 192 168 1 2`
>> dst_ip=`ip_to_hex 172 16 1 2`
>>
>> expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
>>
>> $PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/vif2-tx.pcap |
>> trim_zeros >
>> received.packets
>> echo $expected | trim_zeros > expout
>> AT_CHECK([cat received.packets], [0], [expout])
>>
>> # Packet to Expect at bob1
>> src_mac="000003010203"
>> dst_mac="f00000010205"
>> src_ip=`ip_to_hex 192 168 1 2`
>> dst_ip=`ip_to_hex 10 32 1 2`
>>
>> expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
>>
>> $PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/vif1-tx.pcap |
>> trim_zeros >
>> received1.packets
>> echo $expected | trim_zeros > expout
>> AT_CHECK([cat received1.packets], [0], [expout])
>>
>> # Packet to Expect at foo1
>> src_mac="000001010203"
>> dst_mac="f00000010203"
>> src_ip=`ip_to_hex 172 16 1 2`
>> dst_ip=`ip_to_hex 192 168 1 2`
>>
>> expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
>>
>> $PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap |
>> trim_zeros >
>> received2.packets
>> echo $expected | trim_zeros > expout
>> AT_CHECK([cat received2.packets], [0], [expout])
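For reference, the expected packets above differ from the injected ones
only in the TTL/checksum bytes (`4011 0000` becoming `3e11 0200`): each of
the two router hops decrements the TTL and incrementally updates the IPv4
header checksum. A rough sketch of that arithmetic (simplified
one's-complement update in the spirit of RFC 1624; illustrative, not OVN
code):

```python
def decrement_ttl(ttl, csum):
    # One router hop: TTL drops by 1.  TTL is the high byte of the
    # TTL/protocol 16-bit word, so the header sum drops by 0x0100 and
    # the stored (complemented) checksum field rises by 0x0100.
    ttl -= 1
    csum += 0x0100
    csum = (csum & 0xffff) + (csum >> 16)  # fold any carry
    return ttl, csum

# Injected packet: TTL 0x40, checksum field 0x0000; two hops (R1, R2).
ttl, csum = 0x40, 0x0000
for _ in range(2):
    ttl, csum = decrement_ttl(ttl, csum)
print("%02x %04x" % (ttl, csum))  # 3e 0200, matching the expected packets
```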
>>
>> for sim in hv1 hv2; do
>>     as $sim
>>     OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>     OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>>     OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> done
>>
>> as ovn-sb
>> OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>
>> as ovn-nb
>> OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>
>> as northd
>> OVS_APP_EXIT_AND_WAIT([ovn-northd])
>>
>> as main
>> OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>> OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>
>> AT_CLEANUP
>>
>> On Thu, May 19, 2016 at 1:02 PM, Gurucharan Shetty <guru at ovn.org> wrote:
>>
>> > Currently OVN has distributed switches and routers. When a packet
>> > exits a container or a VM, the entire lifecycle of the packet
>> > through multiple switches and routers is calculated in the source
>> > chassis itself. When the destination endpoint resides on a different
>> > chassis, the packet is sent to the other chassis and it only goes
>> > through the egress pipeline of that chassis once and eventually to
>> > the real destination.
>> >
>> > When the packet returns, the same thing happens. The return
>> > packet leaves the VM/container on the chassis where it resides.
>> > The packet goes through all the switches and routers in the logical
>> > pipeline on that chassis and is then sent to the eventual destination
>> > over the tunnel.
>> >
>> > The above makes the logical pipeline very flexible and easy. But it
>> > creates a problem for cases where you need to add stateful services
>> > (via conntrack) on switches and routers.
>> >
>> > For l3 gateways, we plan to leverage DNAT and SNAT functionality
>> > and we want to apply DNAT and SNAT rules on a router. So we ideally need
>> > the packet to go through that router in both directions in the same
>> > chassis. To achieve this, this commit introduces a new gateway router
>> > which is
>> > static and can be connected to your distributed router via a switch.
>> >
>> > To make minimal changes in OVN's logical pipeline, this commit
>> > tries to make the switch port connected to a l3 gateway router look like
>> > a container/VM endpoint for every other chassis except the chassis
>> > on which the l3 gateway router resides. On the chassis where the
>> > gateway router resides, the connection looks just like a patch port.
>> >
>> > This is achieved by doing the following:
>> > Introduces a new type of port_binding record called 'gateway'.
>> > On the chassis where the gateway router resides, this port behaves just
>> > like the port of type 'patch'. The ovn-controller on that chassis
>> > populates the "chassis" column for this record as an indication for
>> > other ovn-controllers of its physical location. Other ovn-controllers
>> > treat this port as they would treat a VM/Container port on a different
>> > chassis.
>> >
>> > Signed-off-by: Gurucharan Shetty <guru at ovn.org>
>> > ---
>> >  ovn/controller/binding.c        |   3 +-
>> >  ovn/controller/ovn-controller.c |   5 +-
>> >  ovn/controller/patch.c          |  29 ++++++-
>> >  ovn/controller/patch.h          |   3 +-
>> >  ovn/northd/ovn-northd.c         |  42 +++++++--
>> >  ovn/ovn-nb.ovsschema            |   9 +-
>> >  ovn/ovn-nb.xml                  |  15 ++++
>> >  ovn/ovn-sb.xml                  |  35 +++++++-
>> >  tests/ovn.at                    | 184
>> > ++++++++++++++++++++++++++++++++++++++++
>> >  9 files changed, 309 insertions(+), 16 deletions(-)
>> >
>> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
>> > index a0d8b96..e5e55b1 100644
>> > --- a/ovn/controller/binding.c
>> > +++ b/ovn/controller/binding.c
>> > @@ -200,7 +200,8 @@ binding_run(struct controller_ctx *ctx, const struct
>> > ovsrec_bridge *br_int,
>> >                  }
>> >                  sbrec_port_binding_set_chassis(binding_rec,
>> chassis_rec);
>> >              }
>> > -        } else if (chassis_rec && binding_rec->chassis == chassis_rec)
>> {
>> > +        } else if (chassis_rec && binding_rec->chassis == chassis_rec
>> > +                   && strcmp(binding_rec->type, "gateway")) {
>> >              if (ctx->ovnsb_idl_txn) {
>> >                  VLOG_INFO("Releasing lport %s from this chassis.",
>> >                            binding_rec->logical_port);
>> > diff --git a/ovn/controller/ovn-controller.c
>> > b/ovn/controller/ovn-controller.c
>> > index 511b184..bc4c24f 100644
>> > --- a/ovn/controller/ovn-controller.c
>> > +++ b/ovn/controller/ovn-controller.c
>> > @@ -364,8 +364,9 @@ main(int argc, char *argv[])
>> >                      &local_datapaths);
>> >          }
>> >
>> > -        if (br_int) {
>> > -            patch_run(&ctx, br_int, &local_datapaths,
>> &patched_datapaths);
>> > +        if (br_int && chassis_id) {
>> > +            patch_run(&ctx, br_int, chassis_id, &local_datapaths,
>> > +                      &patched_datapaths);
>> >
>> >              struct lport_index lports;
>> >              struct mcgroup_index mcgroups;
>> > diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
>> > index 4808146..e8abe30 100644
>> > --- a/ovn/controller/patch.c
>> > +++ b/ovn/controller/patch.c
>> > @@ -267,12 +267,28 @@ add_patched_datapath(struct hmap
>> *patched_datapaths,
>> >  static void
>> >  add_logical_patch_ports(struct controller_ctx *ctx,
>> >                          const struct ovsrec_bridge *br_int,
>> > +                        const char *local_chassis_id,
>> >                          struct shash *existing_ports,
>> >                          struct hmap *patched_datapaths)
>> >  {
>> > +    const struct sbrec_chassis *chassis_rec;
>> > +    chassis_rec = get_chassis(ctx->ovnsb_idl, local_chassis_id);
>> > +    if (!chassis_rec) {
>> > +        return;
>> > +    }
>> > +
>> >      const struct sbrec_port_binding *binding;
>> >      SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
>> > -        if (!strcmp(binding->type, "patch")) {
>> > +        bool local_port = false;
>> > +        if (!strcmp(binding->type, "gateway")) {
>> > +            const char *chassis = smap_get(&binding->options,
>> > +                                           "gateway-chassis");
>> > +            if (!strcmp(local_chassis_id, chassis)) {
>> > +                local_port = true;
>> > +            }
>> > +        }
>> > +
>> > +        if (!strcmp(binding->type, "patch") || local_port) {
>> >              const char *local = binding->logical_port;
>> >              const char *peer = smap_get(&binding->options, "peer");
>> >              if (!peer) {
>> > @@ -287,13 +303,19 @@ add_logical_patch_ports(struct controller_ctx
>> *ctx,
>> >              free(dst_name);
>> >              free(src_name);
>> >              add_patched_datapath(patched_datapaths, binding);
>> > +            if (local_port) {
>> > +                if (binding->chassis != chassis_rec &&
>> > ctx->ovnsb_idl_txn) {
>> > +                    sbrec_port_binding_set_chassis(binding,
>> chassis_rec);
>> > +                }
>> > +            }
>> >          }
>> >      }
>> >  }
>> >
>> >  void
>> >  patch_run(struct controller_ctx *ctx, const struct ovsrec_bridge
>> *br_int,
>> > -          struct hmap *local_datapaths, struct hmap *patched_datapaths)
>> > +          const char *chassis_id, struct hmap *local_datapaths,
>> > +          struct hmap *patched_datapaths)
>> >  {
>> >      if (!ctx->ovs_idl_txn) {
>> >          return;
>> > @@ -313,7 +335,8 @@ patch_run(struct controller_ctx *ctx, const struct
>> > ovsrec_bridge *br_int,
>> >       * 'existing_ports' any patch ports that do exist in the database
>> and
>> >       * should be there. */
>> >      add_bridge_mappings(ctx, br_int, &existing_ports, local_datapaths);
>> > -    add_logical_patch_ports(ctx, br_int, &existing_ports,
>> > patched_datapaths);
>> > +    add_logical_patch_ports(ctx, br_int, chassis_id, &existing_ports,
>> > +                            patched_datapaths);
>> >
>> >      /* Now 'existing_ports' only still contains patch ports that exist
>> in
>> > the
>> >       * database but shouldn't.  Delete them from the database. */
>> > diff --git a/ovn/controller/patch.h b/ovn/controller/patch.h
>> > index d5d842e..7920a48 100644
>> > --- a/ovn/controller/patch.h
>> > +++ b/ovn/controller/patch.h
>> > @@ -27,6 +27,7 @@ struct hmap;
>> >  struct ovsrec_bridge;
>> >
>> >  void patch_run(struct controller_ctx *, const struct ovsrec_bridge
>> > *br_int,
>> > -               struct hmap *local_datapaths, struct hmap
>> > *patched_datapaths);
>> > +               const char *chassis_id, struct hmap *local_datapaths,
>> > +               struct hmap *patched_datapaths);
>> >
>> >  #endif /* ovn/patch.h */
>> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> > index f469e89..7852d83 100644
>> > --- a/ovn/northd/ovn-northd.c
>> > +++ b/ovn/northd/ovn-northd.c
>> > @@ -690,11 +690,24 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>> >  {
>> >      sbrec_port_binding_set_datapath(op->sb, op->od->sb);
>> >      if (op->nbr) {
>> > -        sbrec_port_binding_set_type(op->sb, "patch");
>> > +        /* If the router is for l3 gateway, it resides on a chassis
>> > +         * and its port type is "gateway". */
>> > +        const char *chassis = smap_get(&op->od->nbr->options,
>> "chassis");
>> > +        if (chassis) {
>> > +            sbrec_port_binding_set_type(op->sb, "gateway");
>> > +        } else {
>> > +            sbrec_port_binding_set_type(op->sb, "patch");
>> > +        }
>> >
>> >          const char *peer = op->peer ? op->peer->key : "<error>";
>> > -        const struct smap ids = SMAP_CONST1(&ids, "peer", peer);
>> > -        sbrec_port_binding_set_options(op->sb, &ids);
>> > +        struct smap new;
>> > +        smap_init(&new);
>> > +        smap_add(&new, "peer", peer);
>> > +        if (chassis) {
>> > +            smap_add(&new, "gateway-chassis", chassis);
>> > +        }
>> > +        sbrec_port_binding_set_options(op->sb, &new);
>> > +        smap_destroy(&new);
>> >
>> >          sbrec_port_binding_set_parent_port(op->sb, NULL);
>> >          sbrec_port_binding_set_tag(op->sb, NULL, 0);
>> > @@ -704,15 +717,32 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>> >              sbrec_port_binding_set_type(op->sb, op->nbs->type);
>> >              sbrec_port_binding_set_options(op->sb, &op->nbs->options);
>> >          } else {
>> > -            sbrec_port_binding_set_type(op->sb, "patch");
>> > +            const char *chassis = NULL;
>> > +            if (op->peer && op->peer->od && op->peer->od->nbr) {
>> > +                chassis = smap_get(&op->peer->od->nbr->options,
>> > "chassis");
>> > +            }
>> > +
>> > +            /* A switch port connected to a gateway router is also of
>> > +             * type "gateway". */
>> > +            if (chassis) {
>> > +                sbrec_port_binding_set_type(op->sb, "gateway");
>> > +            } else {
>> > +                sbrec_port_binding_set_type(op->sb, "patch");
>> > +            }
>> >
>> >              const char *router_port = smap_get(&op->nbs->options,
>> >                                                 "router-port");
>> >              if (!router_port) {
>> >                  router_port = "<error>";
>> >              }
>> > -            const struct smap ids = SMAP_CONST1(&ids, "peer",
>> > router_port);
>> > -            sbrec_port_binding_set_options(op->sb, &ids);
>> > +            struct smap new;
>> > +            smap_init(&new);
>> > +            smap_add(&new, "peer", router_port);
>> > +            if (chassis) {
>> > +                smap_add(&new, "gateway-chassis", chassis);
>> > +            }
>> > +            sbrec_port_binding_set_options(op->sb, &new);
>> > +            smap_destroy(&new);
>> >          }
>> >          sbrec_port_binding_set_parent_port(op->sb,
>> op->nbs->parent_name);
>> >          sbrec_port_binding_set_tag(op->sb, op->nbs->tag,
>> op->nbs->n_tag);
>> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>> > index 8163f6a..fa21b30 100644
>> > --- a/ovn/ovn-nb.ovsschema
>> > +++ b/ovn/ovn-nb.ovsschema
>> > @@ -1,7 +1,7 @@
>> >  {
>> >      "name": "OVN_Northbound",
>> > -    "version": "2.1.1",
>> > -    "cksum": "2615511875 5108",
>> > +    "version": "2.1.2",
>> > +    "cksum": "429668869 5325",
>> >      "tables": {
>> >          "Logical_Switch": {
>> >              "columns": {
>> > @@ -78,6 +78,11 @@
>> >                                     "max": "unlimited"}},
>> >                  "default_gw": {"type": {"key": "string", "min": 0,
>> "max":
>> > 1}},
>> >                  "enabled": {"type": {"key": "boolean", "min": 0, "max":
>> > 1}},
>> > +                "options": {
>> > +                     "type": {"key": "string",
>> > +                              "value": "string",
>> > +                              "min": 0,
>> > +                              "max": "unlimited"}},
>> >                  "external_ids": {
>> >                      "type": {"key": "string", "value": "string",
>> >                               "min": 0, "max": "unlimited"}}},
>> > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>> > index d7fd595..d239499 100644
>> > --- a/ovn/ovn-nb.xml
>> > +++ b/ovn/ovn-nb.xml
>> > @@ -630,6 +630,21 @@
>> >        column is set to <code>false</code>, the router is disabled.  A
>> > disabled
>> >        router has all ingress and egress traffic dropped.
>> >      </column>
>> > +
>> > +    <group title="Options">
>> > +      <p>
>> > +        Additional options for the logical router.
>> > +      </p>
>> > +
>> > +      <column name="options" key="chassis">
>> > +        If set, indicates that the logical router in question is
>> > +        non-distributed and resides in the set chassis. The same
>> > +        value is also used by <code>ovn-controller</code> to
>> > +        uniquely identify the chassis in the OVN deployment and
>> > +        comes from <code>external_ids:system-id</code> in the
>> > +        <code>Open_vSwitch</code> table of Open_vSwitch database.
>> > +      </column>
>> > +    </group>
>> >
>> >      <group title="Common Columns">
>> >        <column name="external_ids">
>> > diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
>> > index efd2f9a..741228c 100644
>> > --- a/ovn/ovn-sb.xml
>> > +++ b/ovn/ovn-sb.xml
>> > @@ -1220,7 +1220,12 @@ tcp.flags = RST;
>> >        which
>> <code>ovn-controller</code>/<code>ovn-controller-vtep</code>
>> > in
>> >        turn finds out by monitoring the local hypervisor's Open_vSwitch
>> >        database, which identifies logical ports via the conventions
>> > described
>> > -      in <code>IntegrationGuide.md</code>.
>> > +      in <code>IntegrationGuide.md</code>. (The exceptions are for
>> > +      <code>Port_Binding</code> records of <code>type</code> 'gateway',
>> > +      whose locations are identified by <code>ovn-northd</code> via
>> > +      the <code>options:gateway-chassis</code> column in this table.
>> > +      <code>ovn-controller</code> is still responsible to populate the
>> > +      <code>chassis</code> column.)
>> >      </p>
>> >
>> >      <p>
>> > @@ -1298,6 +1303,14 @@ tcp.flags = RST;
>> >              a logical router to a logical switch or to another logical
>> > router.
>> >            </dd>
>> >
>> > +          <dt><code>gateway</code></dt>
>> > +          <dd>
>> > +            One of a pair of logical ports that act as if connected by
>> a
>> > patch
>> > +            cable across multiple chassis.  Useful for connecting a
>> > logical
>> > +            switch with a gateway router (which is only resident on a
>> > +            particular chassis).
>> > +          </dd>
>> > +
>> >            <dt><code>localnet</code></dt>
>> >            <dd>
>> >              A connection to a locally accessible network from each
>> > @@ -1336,6 +1349,26 @@ tcp.flags = RST;
>> >        </column>
>> >      </group>
>> >
>> > +    <group title="Gateway Options">
>> > +      <p>
>> > +        These options apply to logical ports with <ref column="type"/>
>> of
>> > +        <code>gateway</code>.
>> > +      </p>
>> > +
>> > +      <column name="options" key="peer">
>> > +        The <ref column="logical_port"/> in the <ref
>> > table="Port_Binding"/>
>> > +        record for the other side of the 'gateway' port.  The named
>> <ref
>> > +        column="logical_port"/> must specify this <ref
>> > column="logical_port"/>
>> > +        in its own <code>peer</code> option.  That is, the two
>> 'gateway'
>> > +        logical ports must have reversed <ref column="logical_port"/>
>> and
>> > +        <code>peer</code> values.
>> > +      </column>
>> > +
>> > +      <column name="options" key="gateway-chassis">
>> > +        The <code>chassis</code> in which the port resides.
>> > +      </column>
>> > +    </group>
>> > +
>> >      <group title="Localnet Options">
>> >        <p>
>> >          These options apply to logical ports with <ref column="type"/>
>> of
>> > diff --git a/tests/ovn.at b/tests/ovn.at
>> > index a827b71..9d93064 100644
>> > --- a/tests/ovn.at
>> > +++ b/tests/ovn.at
>> > @@ -2848,3 +2848,187 @@ OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>> >  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> >
>> >  AT_CLEANUP
>> > +
>> > +
>> > +AT_SETUP([ovn -- 2 HVs, 2 LRs connected via LS, gateway router])
>> > +AT_KEYWORDS([ovngatewayrouter])
>> > +AT_SKIP_IF([test $HAVE_PYTHON = no])
>> > +ovn_start
>> > +
>> > +# Logical network:
>> > +# Two LRs - R1 and R2 that are connected to each other via LS "join"
>> > +# in the 20.0.0.0/24 network. R1 has switch foo (192.168.1.0/24)
>> > +# connected to it. R2 has alice (172.16.1.0/24) connected to it.
>> > +# R2 is a gateway router.
>> > +
>> > +
>> > +
>> > +# Create two hypervisors and create OVS ports corresponding to logical
>> > ports.
>> > +net_add n1
>> > +
>> > +sim_add hv1
>> > +as hv1
>> > +ovs-vsctl add-br br-phys
>> > +ovn_attach n1 br-phys 192.168.0.1
>> > +ovs-vsctl -- add-port br-int hv1-vif1 -- \
>> > +    set interface hv1-vif1 external-ids:iface-id=foo1 \
>> > +    options:tx_pcap=hv1/vif1-tx.pcap \
>> > +    options:rxq_pcap=hv1/vif1-rx.pcap \
>> > +    ofport-request=1
>> > +
>> > +
>> > +sim_add hv2
>> > +as hv2
>> > +ovs-vsctl add-br br-phys
>> > +ovn_attach n1 br-phys 192.168.0.2
>> > +ovs-vsctl -- add-port br-int hv2-vif1 -- \
>> > +    set interface hv2-vif1 external-ids:iface-id=alice1 \
>> > +    options:tx_pcap=hv2/vif1-tx.pcap \
>> > +    options:rxq_pcap=hv2/vif1-rx.pcap \
>> > +    ofport-request=1
>> > +
>> > +# Pre-populate the hypervisors' ARP tables so that we don't lose any
>> > +# packets for ARP resolution (native tunneling doesn't queue packets
>> > +# for ARP resolution).
>> > +ovn_populate_arp
>> > +
>> > +ovn-nbctl create Logical_Router name=R1
>> > +ovn-nbctl create Logical_Router name=R2 options:chassis="hv2"
>> > +
>> > +ovn-nbctl lswitch-add foo
>> > +ovn-nbctl lswitch-add alice
>> > +ovn-nbctl lswitch-add join
>> > +
>> > +# Connect foo to R1
>> > +ovn-nbctl -- --id=@lrp create Logical_Router_port name=foo \
>> > +network=192.168.1.1/24 mac=\"00:00:01:01:02:03\" -- add Logical_Router
>> > R1 \
>> > +ports @lrp -- lport-add foo rp-foo
>> > +
>> > +ovn-nbctl set Logical_port rp-foo type=router options:router-port=foo \
>> > +addresses=\"00:00:01:01:02:03\"
>> > +
>> > +# Connect alice to R2
>> > +ovn-nbctl -- --id=@lrp create Logical_Router_port name=alice \
>> > +network=172.16.1.1/24 mac=\"00:00:02:01:02:03\" -- add Logical_Router
>> R2
>> > \
>> > +ports @lrp -- lport-add alice rp-alice
>> > +
>> > +ovn-nbctl set Logical_port rp-alice type=router
>> options:router-port=alice
>> > \
>> > +addresses=\"00:00:02:01:02:03\"
>> > +
>> > +
>> > +# Connect R1 to join
>> > +ovn-nbctl -- --id=@lrp create Logical_Router_port name=R1_join \
>> > +network=20.0.0.1/24 mac=\"00:00:04:01:02:03\" -- add Logical_Router
>> R1 \
>> > +ports @lrp -- lport-add join r1-join
>> > +
>> > +ovn-nbctl set Logical_port r1-join type=router
>> > options:router-port=R1_join \
>> > +addresses='"00:00:04:01:02:03"'
>> > +
>> > +# Connect R2 to join
>> > +ovn-nbctl -- --id=@lrp create Logical_Router_port name=R2_join \
>> > +network=20.0.0.2/24 mac=\"00:00:04:01:02:04\" -- add Logical_Router
>> R2 \
>> > +ports @lrp -- lport-add join r2-join
>> > +
>> > +ovn-nbctl set Logical_port r2-join type=router
>> > options:router-port=R2_join \
>> > +addresses='"00:00:04:01:02:04"'
>> > +
>> > +
>> > +#install static routes
>> > +ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
>> > +ip_prefix=172.16.1.0/24 nexthop=20.0.0.2 -- add Logical_Router \
>> > +R1 static_routes @lrt
>> > +
>> > +ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
>> > +ip_prefix=192.168.1.0/24 nexthop=20.0.0.1 -- add Logical_Router \
>> > +R2 static_routes @lrt
>> > +
>> > +# Create logical port foo1 in foo
>> > +ovn-nbctl lport-add foo foo1 \
>> > +-- lport-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2"
>> > +
>> > +# Create logical port alice1 in alice
>> > +ovn-nbctl lport-add alice alice1 \
>> > +-- lport-set-addresses alice1 "f0:00:00:01:02:04 172.16.1.2"
>> > +
>> > +
>> > +# Allow some time for ovn-northd and ovn-controller to catch up.
>> > +# XXX This should be more systematic.
>> > +sleep 2
>> > +
>> > +ip_to_hex() {
>> > +    printf "%02x%02x%02x%02x" "$@"
>> > +}
>> > +trim_zeros() {
>> > +    sed 's/\(00\)\{1,\}$//'
>> > +}
>> > +
>> > +# Send ip packets between foo1 and alice1
>> > +src_mac="f00000010203"
>> > +dst_mac="000001010203"
>> > +src_ip=`ip_to_hex 192 168 1 2`
>> > +dst_ip=`ip_to_hex 172 16 1 2`
>> >
>> >
>> +packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>> > +
>> > +echo "---------NB dump-----"
>> > +ovn-nbctl show
>> > +echo "---------------------"
>> > +ovn-nbctl list logical_router
>> > +echo "---------------------"
>> > +ovn-nbctl list logical_router_port
>> > +echo "---------------------"
>> > +
>> > +echo "---------SB dump-----"
>> > +ovn-sbctl list datapath_binding
>> > +echo "---------------------"
>> > +ovn-sbctl list port_binding
>> > +echo "---------------------"
>> > +ovn-sbctl dump-flows
>> > +echo "---------------------"
>> > +ovn-sbctl list chassis
>> > +ovn-sbctl list encap
>> > +echo "---------------------"
>> > +
>> > +echo "------ hv1 dump ----------"
>> > +as hv1 ovs-ofctl show br-int
>> > +as hv1 ovs-ofctl dump-flows br-int
>> > +echo "------ hv2 dump ----------"
>> > +as hv2 ovs-ofctl show br-int
>> > +as hv2 ovs-ofctl dump-flows br-int
>> > +echo "----------------------------"
>> > +
>> > +# Packet to Expect at alice1
>> > +src_mac="000002010203"
>> > +dst_mac="f00000010204"
>> > +src_ip=`ip_to_hex 192 168 1 2`
>> > +dst_ip=`ip_to_hex 172 16 1 2`
>> >
>> >
>> +expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
>> > +
>> > +
>> > +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
>> > +as hv1 ovs-appctl ofproto/trace br-int in_port=1 $packet
>> > +
>> > +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/vif1-tx.pcap |
>> > trim_zeros > received1.packets
>> > +echo $expected | trim_zeros > expout
>> > +AT_CHECK([cat received1.packets], [0], [expout])
>> > +
>> > +for sim in hv1 hv2; do
>> > +    as $sim
>> > +    OVS_APP_EXIT_AND_WAIT([ovn-controller])
>> > +    OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>> > +    OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> > +done
>> > +
>> > +as ovn-sb
>> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> > +
>> > +as ovn-nb
>> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> > +
>> > +as northd
>> > +OVS_APP_EXIT_AND_WAIT([ovn-northd])
>> > +
>> > +as main
>> > +OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> > +
>> > +AT_CLEANUP
>> > --
>> > 1.9.1
>> >
>> > _______________________________________________
>> > dev mailing list
>> > dev at openvswitch.org
>> > http://openvswitch.org/mailman/listinfo/dev
>> >
>
>


