[ovs-git] [ovn-org/ovn] 895cdc: Prepare for post-21.09.0.

Han Zhou noreply at github.com
Tue Oct 5 17:35:14 UTC 2021


  Branch: refs/heads/main
  Home:   https://github.com/ovn-org/ovn
  Commit: 895cdc1c1b52189017e15a827cd3e67b39917e4c
      https://github.com/ovn-org/ovn/commit/895cdc1c1b52189017e15a827cd3e67b39917e4c
  Author: Mark Michelson <mmichels at redhat.com>
  Date:   2021-09-17 (Fri, 17 Sep 2021)

  Changed paths:
    M NEWS
    M configure.ac
    M debian/changelog

  Log Message:
  -----------
  Prepare for post-21.09.0.

Signed-off-by: Mark Michelson <mmichels at redhat.com>


  Commit: 62ca8b9620cc1168ace6905575b7d36438363aed
      https://github.com/ovn-org/ovn/commit/62ca8b9620cc1168ace6905575b7d36438363aed
  Author: Vladislav Odintsov <odivlad at gmail.com>
  Date:   2021-09-17 (Fri, 17 Sep 2021)

  Changed paths:
    M northd/northd.c
    M northd/ovn-northd.8.xml
    M northd/ovn_northd.dl
    M tests/ovn-northd.at

  Log Message:
  -----------
  northd: support HW VTEP with stateful datapath

A packet going from a HW VTEP device to a VIF port, when it arrives at the
hypervisor chassis, should go through the LS ingress pipeline to the l2_lkp
stage without any match. In the l2_lkp stage an output port is
determined, and the packet is then passed to the LS egress pipeline for
further processing and delivery to the VIF port.

Prior to this commit, a packet received from a HW VTEP device was
dropped in an LS ingress datapath where stateful services
were defined (ACLs, LBs).

To fix this issue we add a special flag-bit which can be used in LS
pipelines, to check whether the packet came from HW VTEP devices.
In ls_in_pre_acl and ls_in_pre_lb we add a new flow with priority 110
to skip such packets.

Signed-off-by: Vladislav Odintsov <odivlad at gmail.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: b2614db87060312bb4193212adfe919414429f95
      https://github.com/ovn-org/ovn/commit/b2614db87060312bb4193212adfe919414429f95
  Author: zhen wang <zhewang at nvidia.com>
  Date:   2021-09-21 (Tue, 21 Sep 2021)

  Changed paths:
    M northd/northd.c
    M northd/ovn-northd.c

  Log Message:
  -----------
  northd: Update the probe interval in main loop.

When ovn-northd works in HA mode, ovn-northd will not update the
probe interval in standby mode. This patch addresses the problem by
updating the probe value in the main loop.

Signed-off-by: zhen wang <zhewang at nvidia.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>


  Commit: 4748dbfa0a87404d1387e0688b667b12ab053ef7
      https://github.com/ovn-org/ovn/commit/4748dbfa0a87404d1387e0688b667b12ab053ef7
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-09-21 (Tue, 21 Sep 2021)

  Changed paths:
    M controller/physical.c
    M northd/lrouter.dl
    M northd/multicast.dl
    M northd/northd.c
    M northd/ovn_northd.dl
    M tests/ovn.at

  Log Message:
  -----------
  northd: Fix multicast relay when DGP are configured.

IP multicast relay didn't work properly if traffic had to be forwarded
across a distributed gateway port in the router pipeline.  That is
because the multicast_group used as output logical port is expanded in
the egress pipeline, way after 'lr_in_gw_redirect' where unicast traffic
would normally be forwarded to the chassis-redirect port.

In order to achieve the same behavior for IP multicast routed traffic we
now store the chassis-redirect port binding in the multicast_group on
which IP multicast is routed.  On the remote hypervisor, to make sure
traffic is delivered to the correct destination switch pipeline, we make
sure that ovn-controller translates chassis-redirect ports from
multicast groups to the logical patch ports they were created from.

This patch also adds a test to simulate the ovn-kubernetes IP multicast
use case (where this issue was first observed).

Fixes: 5d1527b11e94 ("ovn-northd: Add IGMP Relay support")
Reported-by: Alexander Constantinescu <aconstan at redhat.com>
Reported-at: https://bugzilla.redhat.com/2006306
Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Acked-by: Han Zhou <hzhou at ovn.org>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 9f2eb5cdb6d67fc8c7e6f477bc4bdc94cfbca028
      https://github.com/ovn-org/ovn/commit/9f2eb5cdb6d67fc8c7e6f477bc4bdc94cfbca028
  Author: Venugopal Iyer <venugopali at nvidia.com>
  Date:   2021-09-22 (Wed, 22 Sep 2021)

  Changed paths:
    M controller/encaps.c
    M controller/encaps.h
    M controller/ovn-controller.8.xml
    M controller/ovn-controller.c
    M tests/ovn-controller.at

  Log Message:
  -----------
  controller: Allow specifying tos option for tunnel interface

Currently, OVN tunnel interface supports the csum option along
with remote_ip and key. There are applications (e.g. RoCE) that rely
on setting the DSCP bits and expect it to be moved to the outer/
tunnel header as well.

This commit adds an "ovn-encap-tos" external-id that can be used to
set the tos option on the OVS tunnel interface, using:

ovs-vsctl set Open_vSwitch . external_ids:ovn-encap-tos=inherit

Tested by setting the external_id (as above) and checking the geneve
interfaces created, e.g:

    options             : {csum="true", key=flow, remote_ip="X.X.X.X", tos=inherit}

Also, added a simple test case to make sure the tos option is carried to the
tunnel interface when set.

Signed-off-by: Venugopal Iyer <venugopali at nvidia.com>
Signed-off-by: Han Zhou <hzhou at ovn.org>


  Commit: 3fb27238f510fcd11c934549e3501e0072743444
      https://github.com/ovn-org/ovn/commit/3fb27238f510fcd11c934549e3501e0072743444
  Author: Vladislav Odintsov <odivlad at gmail.com>
  Date:   2021-09-23 (Thu, 23 Sep 2021)

  Changed paths:
    M rhel/ovn-fedora.spec.in

  Log Message:
  -----------
  rhel: replace try-restart with restart in ovn-controller %postun

In commit [1] support for graceful stop during ovn-controller RPM
upgrade was added. Unfortunately there was an error: after the
ovn-controller service stop was invoked via the ctl socket, the systemd
service transitioned to the dead state and the subsequent try-restart
didn't start the service.

This commit fixes this situation by checking the actual ovn-controller
service status and doing an unconditional restart if the service was
running before the upgrade.

[1] https://github.com/ovn-org/ovn/commit/8540c544f0e67d3dc475bbeb350ea3053a1772dd

Signed-off-by: Vladislav Odintsov <odivlad at gmail.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 9242f27f6358582995ac1ad06a1f414940f56e82
      https://github.com/ovn-org/ovn/commit/9242f27f6358582995ac1ad06a1f414940f56e82
  Author: Dumitru Ceara <dceara at redhat.com>
  Date:   2021-09-28 (Tue, 28 Sep 2021)

  Changed paths:
    M ovs

  Log Message:
  -----------
  ovs: Include ovsdb-data optimizations.

Recent OVS commits have been optimizing the ovsdb-data module, with
immediate goal of optimizing ovsdb-server operations.  However, some of
the ovsdb-data functions are also used on the client side (e.g.,
db-ctl-base.c, ovsdb-idl.c).

Update the OVS submodule to include the following patches:
  51946d22274c ("ovsdb-data: Optimize union of sets.")
  bb12b6317638 ("ovsdb-data: Optimize subtraction of sets.")
  429b114c5aad ("ovsdb-data: Deduplicate string atoms.")

Signed-off-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 758bb52d99213afdd2e2651d9e7b0b047f171562
      https://github.com/ovn-org/ovn/commit/758bb52d99213afdd2e2651d9e7b0b047f171562
  Author: Frode Nordahl <frode.nordahl at canonical.com>
  Date:   2021-09-30 (Thu, 30 Sep 2021)

  Changed paths:
    M controller/ovn-controller.c
    M tests/ovn.at

  Log Message:
  -----------
  controller: Allocate zone ids for localport port bindings.

Commit d4bca93 limited ct zone allocation to VIF-ports only, which
breaks OpenStack's use of localport for providing metadata to
instances.

This patch adds lports of type 'localport' to the list of lport
types to allocate zone ids for.  An existing test case is
updated to cover this scenario.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-September/051473.html
Reported-by: Benjamin Reichel <b.reichel at syseleven.de>
Fixes: d4bca93 ("controller: Don't allocate zone ids for non-VIF port bindings. ")
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 973c78b9a162eeee59379f3558fbdbe1dd054bfe
      https://github.com/ovn-org/ovn/commit/973c78b9a162eeee59379f3558fbdbe1dd054bfe
  Author: Lorenzo Bianconi <lorenzo.bianconi at redhat.com>
  Date:   2021-10-01 (Fri, 01 Oct 2021)

  Changed paths:
    M tests/ovn-northd.at
    M utilities/ovn-nbctl.c

  Log Message:
  -----------
  nbctl: allow multiple bfd sessions with same nexthop and different outports

Allow CMS to create multiple bfd sessions with the same nexthop but
different outports:

ovn-nbctl --bfd --policy=src-ip --ecmp-symmetric-reply lr-route-add GR_ovn-worker 10.244.0.5/32 172.18.0.4 rtoe-GR_ovn-worker
ovn-nbctl --bfd --policy=src-ip --ecmp-symmetric-reply lr-route-add GR_ovn-worker2 10.244.2.5/32 172.18.0.4 rtoe-GR_ovn-worker2

https://bugzilla.redhat.com/show_bug.cgi?id=2007549
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi at redhat.com>
Acked-by: Mark Michelson
Signed-off-by: Mark Michelson <mmichels at redhat.com>


  Commit: 0272f8d903c05d0f475401bfe01af93cfda80699
      https://github.com/ovn-org/ovn/commit/0272f8d903c05d0f475401bfe01af93cfda80699
  Author: Vladislav Odintsov <odivlad at gmail.com>
  Date:   2021-10-04 (Mon, 04 Oct 2021)

  Changed paths:
    M controller/encaps.c
    M controller/ovn-controller.h
    M tests/ovn-controller.at

  Log Message:
  -----------
  controller: configure only matching encaps between chassis

Previously, the tunnel encap configured on a chassis was based on the
remote chassis' "best" encap type. Suppose we have 2 chassis: one with
STT, another with GENEVE. In this case an STT tunnel was configured on
chassis 1, and a GENEVE tunnel on the other one. No traffic could be sent
between these chassis.

With this approach it was impossible to change the encap type
for chassis one-by-one, because different tunnel types were
configured on different edges of the link.
Suppose we have 2 chassis: one with STT and VXLAN encaps configured,
another with GENEVE and STT. In this case an STT tunnel was configured
on chassis 1 ("best" of VXLAN and STT) and a GENEVE tunnel on the other
one ("best" of GENEVE and STT). No traffic could be sent between these
chassis, even though the common STT could be used.

Now we configure only matching encaps between nodes.
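
As a rough illustration of the idea (not the code in controller/encaps.c),
the selection can be sketched as intersecting the encap sets of the two
chassis and then ranking what is left. The enum, the function names, and
the choice to pick a single best common type are assumptions made for this
sketch; the GENEVE > STT > VXLAN ranking is inferred from the examples
above.

```c
#include <stdint.h>

/* Hypothetical bit flags, one per encap type. */
enum encap_type {
    ENCAP_NONE   = 0,
    ENCAP_VXLAN  = 1 << 0,
    ENCAP_STT    = 1 << 1,
    ENCAP_GENEVE = 1 << 2,
};

/* Returns the highest-ranked encap present in 'set', or ENCAP_NONE.
 * Ranking assumed here: GENEVE > STT > VXLAN. */
enum encap_type
best_encap(uint32_t set)
{
    if (set & ENCAP_GENEVE) {
        return ENCAP_GENEVE;
    }
    if (set & ENCAP_STT) {
        return ENCAP_STT;
    }
    if (set & ENCAP_VXLAN) {
        return ENCAP_VXLAN;
    }
    return ENCAP_NONE;
}

/* Considers only encap types that both chassis have configured. */
enum encap_type
matching_encap(uint32_t local_set, uint32_t remote_set)
{
    return best_encap(local_set & remote_set);
}
```

With one chassis offering STT and VXLAN and the other offering GENEVE and
STT, matching_encap() yields STT on both sides, so the two ends of the
link agree.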

Acked-by: Dumitru Ceara <dceara at redhat.com>
Signed-off-by: Vladislav Odintsov <odivlad at gmail.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: fd44d75959cedcedf1f103173be1d9fa1abd9cb8
      https://github.com/ovn-org/ovn/commit/fd44d75959cedcedf1f103173be1d9fa1abd9cb8
  Author: Ihar Hrachyshka <ihrachys at redhat.com>
  Date:   2021-10-04 (Mon, 04 Oct 2021)

  Changed paths:
    M lib/mcast-group-index.h
    M northd/northd.c
    M northd/ovn_northd.dl
    M tests/ovn.at

  Log Message:
  -----------
  Enforce datapath and port key constraints in vxlan mode

With vxlan enabled for in-cluster communication, the ranges available
for port and datapath keys are limited to 12 bits (including multigroup
keys). (The default range is 16 bit long.)

This means that OVN should avoid allocating, or allowing to request,
tunnel keys for datapaths and ports that are equal to or higher than
2 << 11. This was not enforced before, and this patch adds the missing
enforcement rules.
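
The bound itself is simple arithmetic: 2 << 11 is 4096, the first value
that no longer fits in 12 bits. A minimal sketch of such a range check
follows; the macro and function names are hypothetical and are not taken
from northd.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define VXLAN_MODE_KEY_BITS  12u
#define VXLAN_MODE_KEY_LIMIT (1u << VXLAN_MODE_KEY_BITS)  /* 4096 == 2 << 11 */

/* Returns true if 'key' fits into the 12-bit space available in
 * vxlan mode, i.e. it is strictly below 2 << 11. */
bool
vxlan_mode_key_in_range(uint32_t key)
{
    return key < VXLAN_MODE_KEY_LIMIT;
}
```

Key 4095 passes the check; key 4096 is rejected.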

Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Ihar Hrachyshka <ihrachys at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 2cc42bdefd0e703d8d4cd9b02f13ceb5efcbe905
      https://github.com/ovn-org/ovn/commit/2cc42bdefd0e703d8d4cd9b02f13ceb5efcbe905
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-10-04 (Mon, 04 Oct 2021)

  Changed paths:
    M NEWS
    M northd/northd.c
    M northd/ovn_northd.dl
    M ovn-nb.xml
    M tests/ovn-northd.at

  Log Message:
  -----------
  northd: Change the default value of ignore_lsp_down to true.

The current default behavior is that ARP responder flows for VIFs are
added by northd after the port-binding state is UP, which creates more
trouble than benefit in most use cases. To make the default behavior
desirable for the majority of use cases, set the option ignore_lsp_down
to true by default. This helps save control plane cost in
large scale environments, reduces the e2e latency for all flows to be
installed for a VIF, and makes VIF readiness checking more convenient
in test cases and likely in CMS as well. Users can still set it to false
in circumstances (if any) when this behavior is not desired.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Acked-by: Mark D. Gray <mark.d.gray at redhat.com>
Acked-by: Numan Siddique <numans at ovn.org>


  Commit: 1c360bbd911cab9fadd6df8cd528d992ffa7a998
      https://github.com/ovn-org/ovn/commit/1c360bbd911cab9fadd6df8cd528d992ffa7a998
  Author: Ihar Hrachyshka <ihrachys at redhat.com>
  Date:   2021-10-04 (Mon, 04 Oct 2021)

  Changed paths:
    M controller-vtep/gateway.c
    M controller/physical.c
    M ovn-architecture.7.xml
    M tests/ovn.at

  Log Message:
  -----------
  Fix basic multicast flows for vxlan (non-vtep) tunnels

The 15-bit port key range used for multicast groups can't be covered
by 12-bit key space available for port keys in VXLAN. To make
multicast keys work, we have to transform 16-bit multicast port keys
to 12-bit keys before fanning out packets through VXLAN tunnels.
Otherwise significant bits are not retained, and multicast / broadcast
traffic does not reach ports located on other chassis.

This patch introduces a mapping scheme between core 16-bit multicast
port keys and 12-bit key range available in VXLAN. The scheme is as
follows:

1) Before sending a packet through VXLAN tunnel, the most significant
   bit of a 16-bit port key is copied into the most significant bit of
   12-bit VXLAN key. The 11 least significant bits of a 16-bit port
   key are copied to the least significant bits of 12-bit VXLAN key.

2) When receiving a packet through VXLAN tunnel, the most significant
   bit of a VXLAN 12-bit port key is copied into the most significant
   bit of 16-bit port key used in core. The 11 least significant bits
   of a VXLAN 12-bit port key are copied into the least significant
   bits of a 16-bit port key used in core.
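
The two steps above can be sketched as a pair of bit manipulations. This
is only an illustration of the mapping as described in this message; the
function and macro names are hypothetical, and the real patch expresses
the mapping in OpenFlow flows in controller/physical.c rather than in a
C helper like this.

```c
#include <stdint.h>

#define CORE_KEY_MSB   0x8000u  /* bit 15 of a 16-bit core port key */
#define VXLAN_KEY_MSB  0x0800u  /* bit 11 of a 12-bit VXLAN port key */
#define LOW_11_BITS    0x07ffu  /* the 11 least significant bits */

/* Step 1: 16-bit core key -> 12-bit VXLAN key (before sending). */
uint16_t
core_key_to_vxlan(uint16_t core_key)
{
    uint16_t msb = (core_key & CORE_KEY_MSB) ? VXLAN_KEY_MSB : 0;
    return msb | (core_key & LOW_11_BITS);
}

/* Step 2: 12-bit VXLAN key -> 16-bit core key (after receiving). */
uint16_t
vxlan_key_to_core(uint16_t vxlan_key)
{
    uint16_t msb = (vxlan_key & VXLAN_KEY_MSB) ? CORE_KEY_MSB : 0;
    return msb | (vxlan_key & LOW_11_BITS);
}
```

For example, core key 0x8001 becomes VXLAN key 0x801 and maps back to
0x8001 on the receiving side. A key with any of bits 11 to 14 set, such
as 0x8801, does not survive the round trip, which is why the usable
range shrinks to 11 bits.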

This change also implies that the range available for multicast port
keys is more limited and fits into 11-bit space. The same rule should
be enforced for unicast port keys, like we already do for tunnel keys
when a VXLAN encap is present in a cluster. This enforcement is
implied here but missing in master and will be implemented in a
separate patch. (The missing enforcement is an oversight of the
original patch that added support for VXLAN tunnels.)

Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Ihar Hrachyshka <ihrachys at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 8c897f5ae1a747c55365d68454b5d318897dbd36
      https://github.com/ovn-org/ovn/commit/8c897f5ae1a747c55365d68454b5d318897dbd36
  Author: Vladislav Odintsov <odivlad at gmail.com>
  Date:   2021-10-04 (Mon, 04 Oct 2021)

  Changed paths:
    M rhel/ovn-fedora.spec.in

  Log Message:
  -----------
  spec: add tcpdump to BuildRequires

tcpdump is used in tests. When the RPM package is built, the
corresponding test fails if the tcpdump package is not installed.

This commit fixes this issue by adding a new conditional
BuildRequires: tcpdump to specfile.

Signed-off-by: Vladislav Odintsov <odivlad at gmail.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: 9dadbd482cc09b86a81e72887f8ce7849f70b2af
      https://github.com/ovn-org/ovn/commit/9dadbd482cc09b86a81e72887f8ce7849f70b2af
  Author: Frode Nordahl <frode.nordahl at canonical.com>
  Date:   2021-10-04 (Mon, 04 Oct 2021)

  Changed paths:
    M tests/ovn.at

  Log Message:
  -----------
  test: Fix options:requested-chassis with hostname

This test currently passes, but is broken in two ways.

1) The `fetch_column` helper should be used to retrieve the value
   of hostname, not `fetch`, which results in a "fetch: command not
   found" error which is currently not caught by the test.  As a
   consequence the requested-chassis option was set to the empty
   string ("") and not the chassis hostname.

2) When we introduced testing with TLS+RBAC in c948d6bb05b4 the
   ovn_az_attach helper was updated to set the hostname to match
   system-id.  This of course also makes the name and hostname
   columns in the Chassis record contain the same value which made
   this test no longer do what it says on the tin.

Update test to explicitly set the value to be used for
requested-chassis option in the Chassis hostname record, and
add a check for it not being empty nor equal to chassis name.

Fixes: 4afe409e95c7 ("tests: Introduce new testing helpers.")
Fixes: c948d6bb05b4 ("tests: Test with SSL and RBAC for controller by default")
Signed-off-by: Frode Nordahl <frode.nordahl at canonical.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: f5e2120ee19bc1005d21375b843b2cb3aee67ccc
      https://github.com/ovn-org/ovn/commit/f5e2120ee19bc1005d21375b843b2cb3aee67ccc
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-10-05 (Tue, 05 Oct 2021)

  Changed paths:
    M northd/northd.c

  Log Message:
  -----------
  northd.c: Remove redundant condition check in ovn_dp_group_add_with_reference().

Signed-off-by: Han Zhou <hzhou at ovn.org>
Acked-by: Numan Siddique <numans at ovn.org>


  Commit: a1945b2154059b37839e0a4bfbb00b59df8cfe36
      https://github.com/ovn-org/ovn/commit/a1945b2154059b37839e0a4bfbb00b59df8cfe36
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-10-05 (Tue, 05 Oct 2021)

  Changed paths:
    M northd/northd.c

  Log Message:
  -----------
  northd.c: Lock to protect against possible od->group corruption.

When parallel build is used, od->group can be updated by threads outside
of the function do_ovn_lflow_add_pd (for lb related flow building). So
use the function ovn_dp_group_add_with_reference() to update it in
function do_ovn_lflow_add() when it is not a newly created flow.

Signed-off-by: Han Zhou <hzhou at ovn.org>
Acked-by: Numan Siddique <numans at ovn.org>


  Commit: ab4bdb55cd04ca38f1823f367f6ffb28a3fd5250
      https://github.com/ovn-org/ovn/commit/ab4bdb55cd04ca38f1823f367f6ffb28a3fd5250
  Author: Lorenzo Bianconi <lorenzo.bianconi at redhat.com>
  Date:   2021-10-05 (Tue, 05 Oct 2021)

  Changed paths:
    M northd/northd.c

  Log Message:
  -----------
  northd: do not run find_lrp_member_ip with empty nexthop

Do not run find_lrp_member_ip in find_static_route_outport if the route
has been configured without a valid nexthop. This patch fixes the
following northd warning:

2021-09-18T06:01:37.909Z|00008|ovn_northd|WARN|bad ipv6 address

Fixes: c00852288 ("northd: allow to configure routes with no nexthop")
Reported-by: Jianlin Shi <jishi at redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi at redhat.com>
Signed-off-by: Numan Siddique <numans at ovn.org>


  Commit: f9c302d3317a8f48451294cf8979c97d2a9a1aef
      https://github.com/ovn-org/ovn/commit/f9c302d3317a8f48451294cf8979c97d2a9a1aef
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-10-05 (Tue, 05 Oct 2021)

  Changed paths:
    M northd/northd.c

  Log Message:
  -----------
  northd.c: Optimize parallel build performance with hash based locks.

The current implementation of parallel build in northd with dp-groups
enabled results in bad performance when the below assumption is not
true:

 * 3. Most RW ops are not flow adds, but changes to the
 * od groups.

In fact most (if not all) of the ovn port based flows don't share
dp-groups, so the assumption doesn't hold in reality, and in a scale
test environment with ovn-k8s topology of 800 nodes, the parallel build
shows 5 times more time spent for one northd iteration than without
parallel build on a test machine with 24 cores (24 northd worker
threads). This is because of lock contention on the global rw lock
protecting the lflow hmap.

This patch optimizes it by using an array of hash based locks instead of
a single global lock. It is similar to the approach prior to the commit
8c980ce6, but with two major differences:

1) It uses a fixed length mutex array instead of the dynamic array of
   the struct hashrow_locks. It is equally efficient considering the low
   chance of contention in a large array of locks, but without the burden
   of resizing every time the hmap size changes. The uniqueness of the
   lock is guaranteed by combining the masks of both the hmap and the
   mutex array.

2) It fixes the corrupted hmap size after each round of parallel flow
   build. The hash based lock protects the list in each bucket, but
   doesn't protect the hmap size. The patch uses thread-local counters
   and aggregates them at the end of each iteration, which is lock free.
   This approach has lower cost than the alternative of atomically
   incrementing a global counter.
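
A minimal sketch of the two ideas above, using plain pthread mutexes
instead of the OVS wrappers. All names, the array size of 1024, and the
per-worker counter array (standing in for thread-local counters) are
assumptions for illustration; this is not the northd.c implementation.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_ARRAY_SIZE 1024u                  /* assumed; power of two */
#define LOCK_ARRAY_MASK (LOCK_ARRAY_SIZE - 1)
#define MAX_WORKERS     24                     /* assumed worker count */

static pthread_mutex_t bucket_locks[LOCK_ARRAY_SIZE];
static size_t inserted_by_worker[MAX_WORKERS]; /* one slot per worker */

void
bucket_locks_init(void)
{
    for (size_t i = 0; i < LOCK_ARRAY_SIZE; i++) {
        pthread_mutex_init(&bucket_locks[i], NULL);
    }
}

/* Combines the hmap mask and the lock array mask, so that two hashes
 * landing in the same hmap bucket always select the same lock. */
size_t
bucket_lock_index(uint32_t hash, size_t hmap_mask)
{
    return hash & hmap_mask & LOCK_ARRAY_MASK;
}

/* Called by worker 'worker' for each entry it adds. */
void
record_insert(size_t worker, uint32_t hash, size_t hmap_mask)
{
    pthread_mutex_t *lock = &bucket_locks[bucket_lock_index(hash, hmap_mask)];

    pthread_mutex_lock(lock);
    /* ... insert the entry into the bucket's list here ... */
    pthread_mutex_unlock(lock);

    /* Only this worker writes its own slot, so no lock is needed. */
    inserted_by_worker[worker]++;
}

/* Run by a single thread after all workers have finished an iteration. */
size_t
total_inserted(size_t n_workers)
{
    size_t total = 0;
    for (size_t i = 0; i < n_workers; i++) {
        total += inserted_by_worker[i];
    }
    return total;
}
```

The map size is then set once from total_inserted() after the workers
have joined, so it is never touched concurrently.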

This patch results in an 8x speedup over the current parallel build
with dp-groups enabled for the same scale test (which makes it 30%
faster than without parallel build).

Test command: ovn-nbctl --print-wait-time --wait=sb sync

Before:

no parallel:
ovn-northd completion: 7807ms

parallel:
ovn-northd completion: 41267ms

After:

no parallel: (no change)

parallel:
ovn-northd completion: 5081ms
(8x faster than before, 30% faster than no parallel)

(Note: all the above tests are with dp-groups enabled.)

Signed-off-by: Han Zhou <hzhou at ovn.org>
Acked-by: Numan Siddique <numans at ovn.org>


  Commit: 3fb397b63663297acbcbf794e1233951222ae5af
      https://github.com/ovn-org/ovn/commit/3fb397b63663297acbcbf794e1233951222ae5af
  Author: Han Zhou <hzhou at ovn.org>
  Date:   2021-10-05 (Tue, 05 Oct 2021)

  Changed paths:
    M lib/ovn-parallel-hmap.h

  Log Message:
  -----------
  ovn-parallel-hmap.h: Minor fixes for hashrow_lock.

Although not used currently, it is better to fix:
1. The type of the mask field should be the same as hmap->mask: size_t.
2. Calculating the index is better to use & instead of %.
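
The second point relies on the mask having the form 2^n - 1, as hmap
masks do. Under that assumption, masking with & selects the same row as
a modulo by the number of rows, without a division. A small sketch with
hypothetical helper names:

```c
#include <stddef.h>
#include <stdint.h>

/* Row selection by modulo over the number of rows (mask + 1). */
size_t
row_index_mod(uint32_t hash, size_t mask)
{
    return hash % (mask + 1);
}

/* Row selection by masking; equivalent when mask == 2^n - 1. */
size_t
row_index_and(uint32_t hash, size_t mask)
{
    return hash & mask;
}
```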

Signed-off-by: Han Zhou <hzhou at ovn.org>
Acked-by: Numan Siddique <numans at ovn.org>


Compare: https://github.com/ovn-org/ovn/compare/895cdc1c1b52%5E...3fb397b63663

