[ovs-dev] [PATCH ovn 0/6] Combine Logical Flows by Logical Datapath.

Mark Michelson mmichels at redhat.com
Fri Nov 20 02:06:04 UTC 2020


Hi Ilya,

With regards to the test failures, I also see test 58 fail.

Test 59 sometimes passes and sometimes fails. I saw the crash you 
referenced one time out of probably 25 test runs. As a double-check, I 
reverted this series and ran the test in a loop. I never saw the test 
fail or crash :(. here's something strange about the backtrace:

#0  0x00000000004afb29 in hmap_next_with_hash__ (hash=1182893543, 
node=0xe1) at ./include/openvswitch/hmap.h:323
#1  hmap_first_with_hash (hmap=0x2510070, hmap=0x2510070, 
hash=1182893543) at ./include/openvswitch/hmap.h:334

(gdb) f 1
#1  hmap_first_with_hash (hmap=0x2510070, hmap=0x2510070, 
hash=1182893543) at ./include/openvswitch/hmap.h:334
334         return hmap_next_with_hash__(hmap->buckets[hash & 
hmap->mask], hash);
(gdb) list
329     /* Returns the first node in 'hmap' with the given 'hash', or a 
null pointer if
330      * no nodes have that hash value. */
331     static inline struct hmap_node *
332     hmap_first_with_hash(const struct hmap *hmap, size_t hash)
333     {
334         return hmap_next_with_hash__(hmap->buckets[hash & 
hmap->mask], hash);
335     }
336
337     /* Returns the first node in 'hmap' in the bucket in which the 
given 'hash'
338      * would land, or a null pointer if that bucket is empty. */

Notice that on line 334, we pass the node as the first parameter and the 
hash as the second parameter. This seems to match the function signature 
of hmap_next_with_hash__(). But if you look at the backtrace, frame #0 
shows the parameters reversed. Is that a normal gdb thing?

Test 153 passes for me every time. I ran it about 20 times and it always 
passed. That test has had some trouble in the past with being too strict 
with the matches it tries to make in the openflow table. If that's the 
issue, then the match may need to be reworked again.

Test 159's failure was caused by my SNAT CT zone request patch and I 
have a fix up for it now: 
https://patchwork.ozlabs.org/project/ovn/patch/20201120001343.1485936-1-mmichels@redhat.com/

I think before we get this merged, we definitely need to get the test 
failures figured out. Test 59 in particular is worrying.

On 11/14/20 2:56 AM, Ilya Maximets wrote:
> This series is aimed to reduce size of a Southbound DB.
> In modern OVN cluster there is a big probability that same logical
> flows are installed for several logical switches.  Currently, OVN
> will create a separate logical flow for each logical switch (datapath)
> filled with exact same match, actions and so on.  The only difference
> between these flows is a logical_datapath column.  Depending on
> a size of a cluster, there could be hundreds or even thousands of
> copies of the same flow.
> 
> This change introduces new table Logical Datapath Group.  With
> this table we could combine datapaths in a single set and link
> the whole datapath group to a single logical flow.
> 
> In my testing for a 7MB Northbound DB with 100 logical switches
> taken from the ovn-k8s cluster test, size of a Southbound DB
> decreased from 480 MB to 90 MB.
> 
> Number of logical flows in SB DB for this test case:
>    Currently    - 1173779
>    With patches - 171632
> 
> Current implementation is functional, but it has few small
> unfinished parts:
> 
> 1. Backward compatibility code for ovn-controller is missing.
> 
> 2. Some tests are failing and I don't know why yet (didn't spend
>     much time on this):
>      58: ovn -- dhcpv6 : 1 HV, 2 LS, 5 LSPs              FAILED (ovn.at:6209)
>      59: ovn -- 2 HVs, 2 LRs connected via LS, gateway router FAILED (ovs-macros.at:222)
>     153: ovn -- Symmetric ECMP reply flows               FAILED (ovn.at:21589)
>     159: ovn -- ARP replies for SNAT external ips        FAILED (ovn.at:22328)
> 
>    Test 59 fails with crash inside idl code on ovn-controller side
>    and that is suspicious (might be not entirely my fault):
> 
>    SIGSEGV detected, backtrace:
>    0x4567de         <fatal_signal_handler+0x1e>
>    0x7f27270b96b0   <killpg+0x40>
>    0x4b1e40         <hmap_next_with_hash__+0x10>
>    0x4abeb7         <ovsdb_idl_get_row+0x27>
>    0x4b0cbf         <ovsdb_idl_process_update2+0x1f>
>    0x4b0b43         <ovsdb_idl_db_parse_update__+0x213>
>    0x4b04a9         <ovsdb_idl_db_parse_update+0x9>
>    0x4afa0f         <ovsdb_idl_db_parse_update_rpc+0xff>
>    0x4aa554         <ovsdb_idl_process_msg+0x74>
>    0x4aa35f         <ovsdb_idl_run+0xcf>
>    0x4aec62         <ovsdb_idl_loop_run+0x12>
>    0x42043c         <main+0x117c>
>    0x7f27270a41a3   <__libc_start_main+0xf3>
>    0x404e5e         <_start+0x2e>
> 
> 3. Patches 3-5 should be a single patch in order to work.
>     Split for now to make review easier.
> 
> 4. Code might be not so elegnt in a few places.
> 
> First 5 patches implements 1:1 relation between lflow and datapath
> group, but it's still beneficial to have.  First 5 patches reduces
> DB in my case form 480 to 150 MB.
> 
> I also didn't run any performance tests.  This will be necessary, since
> flow processing is altered and this might give extra load on northd
> and ovn-controller.
> 
> P.S. It looks like it's still Friday the 13th PST.. :)
> 
> Ilya Maximets (6):
>    tests: Sort flow and database dumps.
>    ovn-sb.ovsschema: Add Logical Datapath Groups.
>    northd: Add support for logical datapath groups.
>    controller: Unfinished changes to support Logical Datapath Groups.
>    utilities: Add support for Logical Datapath Groups.
>    northd: Use same Logical Datapath Group for different flows.
> 
>   controller/lflow.c          | 105 ++++++++++------
>   controller/lflow.h          |   1 -
>   controller/ovn-controller.c |  23 +---
>   lib/ovn-util.c              |  14 +--
>   lib/ovn-util.h              |   3 +-
>   northd/ovn-northd.c         | 244 ++++++++++++++++++++++++++++++------
>   ovn-sb.ovsschema            |  22 +++-
>   ovn-sb.xml                  |  42 +++++--
>   tests/ovn-northd.at         |   6 +-
>   tests/ovn.at                |   2 +-
>   utilities/ovn-sbctl.c       |  57 ++++++---
>   utilities/ovn-trace.c       |  35 ++++--
>   12 files changed, 400 insertions(+), 154 deletions(-)
> 



More information about the dev mailing list