[ovs-dev] [ovn] branch-20.09 tests fail with OVS higher than 2.14.0

Numan Siddique numans at ovn.org
Fri Jul 16 14:57:54 UTC 2021


On Tue, Jul 13, 2021 at 2:00 PM Vladislav Odintsov <odivlad at gmail.com> wrote:
>
> Hi Dumitru, Numan,
>
> Regards,
> Vladislav Odintsov
>
> > On 12 Jul 2021, at 21:39, Numan Siddique <numans at ovn.org> wrote:
> >
> > On Fri, Jul 9, 2021 at 9:01 AM Dumitru Ceara <dceara at redhat.com> wrote:
> >>
> >> On 7/8/21 6:34 PM, Vladislav Odintsov wrote:
> >>> Hi,
> >>
> >> Hi Vladislav,
> >>
> >>>
> >>> I see a test that constantly fails when running OVN branch-20.09 against OVS versions higher than 2.14.0 (2.14.1, 2.14.2, branch-2.14):
> >>> ovn -- ensure one gw controller restart in HA doesn't bounce the master
> >>>
> >>> ## ---------------- ##
> >>> ## Tested programs. ##
> >>> ## ---------------- ##
> >>> ./testsuite.at:1: /builddir/build/BUILD/ovn-20.09.1/openvswitch-2.14.1/vswitchd/ovs-vswitchd --version
> >>> ovs-vswitchd (Open vSwitch) 2.14.1
> >>> ./testsuite.at:1: /builddir/build/BUILD/ovn-20.09.1/openvswitch-2.14.1/utilities/ovs-vsctl --version
> >>> ovs-vsctl (Open vSwitch) 2.14.1
> >>> DB Schema 8.2.0
> >>> ## ------------------ ##
> >>> ## Running the tests. ##
> >>> ## ------------------ ##
> >>> testsuite: starting at: Thu Jul  8 17:44:21 MSK 2021
> >>> testsuite: ending at: Thu Jul  8 17:44:52 MSK 2021
> >>> testsuite: test suite duration: 0h 0m 31s
> >>> ## ------------- ##
> >>> ## Test results. ##
> >>> ## ------------- ##
> >>> ERROR: 1 test was run,
> >>> 1 failed unexpectedly.
> >>> ## ------------------------ ##
> >>> ## Summary of the failures. ##
> >>> ## ------------------------ ##
> >>> Failed tests:
> >>> ovn 20.09.1 test suite test groups:
> >>> NUM: FILE-NAME:LINE     TEST-GROUP-NAME
> >>>      KEYWORDS
> >>>  91: ovn.at:12245       ovn -- ensure one gw controller restart in HA doesn't bounce the master
> >>> ## ---------------------- ##
> >>> ## Detailed failed tests. ##
> >>> ## ---------------------- ##
> >>> #                             -*- compilation -*-
> >>> 91. ovn.at:12245: testing ovn -- ensure one gw controller restart in HA doesn't bounce the master ...
> >>> creating ovn-sb database
> >>> creating ovn-nb database
> >>> starting ovn-northd
> >>> starting backup ovn-northd
> >>> adding simulator 'main'
> >>> adding simulator 'gw1'
> >>> adding simulator 'gw2'
> >>> adding simulator 'hv1'
> >>> ./ovn.at:12277: ovn_populate_arp__
> >>> stdout:
> >>> OK
> >>> OK
> >>> OK
> >>> OK
> >>> OK
> >>> OK
> >>> 194ab858-5fe5-448c-9600-f00a52a120e6
> >>> 511c7f52-8f85-4193-872b-f87c23420dfd
> >>> dc46bb8e-35f5-420c-8874-b493c843fd31
> >>> Waiting until 1 rows in sb Chassis with name=gw2...
> >>> ovn-macros.at:346: waiting until test $count = $(count_rows $db:$table $a $b $c)...
> >>> ovn-macros.at:346: wait failed after 30 seconds
> >>> sb table Chassis has the following rows. 0 rows match instead of expected 1:
> >>> _uuid               : a74cd080-1302-4224-9590-462017d88783
> >>> encaps              : [393b112b-329d-4db1-9926-cd06124a6f2b, df12c311-1563-4202-a197-02686d835867]
> >>> external_ids        : {datapath-type="", iface-types="dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-enable-lflow-cache="true", ovn-monitor-all="false"}
> >>> hostname            : bldrvm02
> >>> name                : hv1
> >>> nb_cfg              : 0
> >>> other_config        : {datapath-type="", iface-types="dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-enable-lflow-cache="true", ovn-monitor-all="false"}
> >>> transport_zones     : []
> >>> vtep_logical_switches: []
> >>> _uuid               : 780ad589-47d9-4658-b1fb-e0ec96f96ad0
> >>> encaps              : [2bed8f4c-dc55-444d-a259-b12c04a63b62, 752086e4-6372-4840-9f92-fd4a8df3ba21]
> >>> external_ids        : {datapath-type="", iface-types="dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", ovn-bridge-mappings="phys:br-phys", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-enable-lflow-cache="true", ovn-monitor-all="false"}
> >>> hostname            : bldrvm02
> >>> name                : gw1
> >>> nb_cfg              : 0
> >>> other_config        : {datapath-type="", iface-types="dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", ovn-bridge-mappings="phys:br-phys", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-enable-lflow-cache="true", ovn-monitor-all="false"}
> >>> transport_zones     : []
> >>> vtep_logical_switches: []
> >>> ./ovs-macros.at:222: hard failure
> >>> 91. ovn.at:12245: 91. ovn -- ensure one gw controller restart in HA doesn't bounce the master (ovn.at:12245): FAILED (ovs-macros.at:222)
> >>>
> >>>
> >>> I also tried to build OVN with OVS 2.15 and it doesn’t build at all because of the renaming of "slave" to "member" in OVS.
> >>>
> >>> Some questions here:
> >>
> >> I'm not a maintainer (maintainers in cc) but I'll try to answer some of
> >> your questions.
> >>
> >>>
> >>> 1. As I understand it, changes in the OVS branch introduced a regression in OVN. Shouldn't OVN periodically run the GitHub Actions workflow for all supported branches, not only for master, and against supported OVS branches and tags?
> >>
> >> We run github actions and tests for supported branches, e.g.:
> >>
> >> https://github.com/ovn-org/ovn/actions?query=branch%3Abranch-20.09
>
> Yes, but these checks are:
> 1. running against a single OVS tag, 2.14.0 (which luckily doesn't trigger test failures);
> 2. not periodic. Periodic testing of, for instance, branch-20.09 against the supported OVS (in this case branch-2.14) could have found the problem earlier and automatically.
>
> Also, I see that ovsrobot checks PW patch series against the latest OVS (I guess, the master branch), and in my opinion that is also not correct.
>
> So my proposal is:
> 1. Modify the GH Actions workflow so that it runs tests against all supported OVS branches (not tags).
> 2. Add a weekly schedule to the GH Actions workflows of all maintained branches that runs OVN against all supported OVS tags.

(2) sounds like a good idea to me.  I think we do need to differentiate
between the OVS version used for OVN builds and the actual OVS version
used at run time.

I would suggest adding a periodic GitHub job which compiles OVN with
the included OVS submodule but runs the tests against each OVS branch.
I'm not sure how tricky that would be to do.  Is it possible?  If we
compile OVN using an OVS branch, then there can be compilation errors
for OVS versions < 2.14 (I think).
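
To make the idea concrete, the job matrix could look roughly like the
Python sketch below.  This is only an illustration: the ref lists and
the build_matrix() helper are made up for this mail and are not taken
from the existing CI.

```python
from itertools import product

# Illustrative values only (assumptions, not the real CI configuration).
OVN_BRANCHES = ["branch-20.09"]
OVS_RUNTIME_REFS = ["2.14.0", "2.14.1", "2.14.2", "branch-2.14"]


def build_matrix(ovn_branches, ovs_runtime_refs):
    """Return one job description per (OVN branch, runtime OVS ref) pair.

    OVN is always compiled against the OVS sources it ships with; only
    the OVS used to run the tests changes from job to job.
    """
    return [
        {"ovn": ovn, "ovs_build": "submodule", "ovs_runtime": ovs}
        for ovn, ovs in product(ovn_branches, ovs_runtime_refs)
    ]


if __name__ == "__main__":
    for job in build_matrix(OVN_BRANCHES, OVS_RUNTIME_REFS):
        print(job)
```

Each entry would become one periodic job; only the runtime OVS ref
changes between them, so a build failure and a runtime regression are
easy to tell apart.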



>
> I’ve done a small POC draft for the master branch here: https://github.com/odivlad/ovn/commit/81d42bc7daf21a9a7b71004179063efd08bcd42f
> In Actions it looks like this: https://github.com/odivlad/ovn/actions/runs/1027318307 (don’t look at the failed cases, I need some help there, but it shows my idea). Currently I don’t understand why the current build runs tests for some configurations, runs none for others, and runs a different set of tests for yet others. Why are some builds done without SSL support and some with jemalloc? Why not all combinations? On macOS we run only one configuration, without tests. Is it supported only for the build phase? If somebody explains this to me, I can try to improve the CI here...

I'm not sure about the exact reason.  I'm fine if you want to run
tests as well for these build-only configs.

>
> >>
> >>> 2. Is there any place where OVN versions are listed against the OVS versions they support, like OVS does for kernel versions and DPDK?
> >>
> >> This is hard.  More recent branches contain OVS as a git submodule and
> >> that is the version we run tests against in CI, that is, the version
> >> known to be working fine at least for build requirements.
> >>
>
> For the latest branch, yes, but for other branches the limitations should be well known. As an OVN user I have to understand which OVS to take to run OVN with. In other words: which OVS is tested with each OVN version, and which should and should not work. Is it possible to add a table like this:

I think it is a good idea.  Do you want to propose a patch?


>
> +-------+-----------------+-----------------+---------------+---------------+
> |  OVN  | OVS runtime min | OVS runtime max | OVS build min | OVS build max |
> +-------+-----------------+-----------------+---------------+---------------+
> | 20.03 | 2.13.0          | 2.13.4          | 2.13.0        | 2.13.4        |
> | 20.06 | 2.13.0          | 2.13.4          | 2.13.0        | 2.13.4        |
> | 20.09 | 2.13.0          | 2.14.2          | 2.14.0        | 2.14.2        |
> | 20.12 | ?               | ?               | ?             | ?             |
> | 21.03 | ?               | ?               | ?             | ?             |
> | 21.06 | ?               | ?               | ?             | ?             |
> +-------+-----------------+-----------------+---------------+---------------+
>
> Information for this table can be picked from GH Actions runs against different OVS versions.
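
Once the data is collected, checking a given OVN/OVS pair against such
a table is simple.  A minimal Python sketch, assuming the runtime
columns of the draft table above are correct (they have not been
verified by CI); the helper names are hypothetical:

```python
def parse_version(text):
    """Turn '2.14.1' into (2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Runtime min/max copied from the draft table above (unverified).
# Rows still marked '?' are left out on purpose.
RUNTIME_RANGES = {
    "20.03": ("2.13.0", "2.13.4"),
    "20.06": ("2.13.0", "2.13.4"),
    "20.09": ("2.13.0", "2.14.2"),
}


def runtime_supported(ovn_version, ovs_version):
    """Return True or False, or None when the table has no data."""
    bounds = RUNTIME_RANGES.get(ovn_version)
    if bounds is None:
        return None
    low, high = (parse_version(bound) for bound in bounds)
    return low <= parse_version(ovs_version) <= high


if __name__ == "__main__":
    print(runtime_supported("20.09", "2.14.1"))
    print(runtime_supported("20.12", "2.15.0"))
```

Versions are compared as integer tuples rather than strings, so that
for example 2.9.0 sorts before 2.13.0.
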
>
> >>> 3. About the failing test: I found that such a failure was already observed in OVN (https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/376362.html) and there was a fix in OVS, but now it reproduces again. I tried to apply the patch http://patchwork.ozlabs.org/project/ovn/patch/1603185512-8070-1-git-send-email-dceara@redhat.com/ and it solved the problem. Now it looks like the only OVS tag with which branch-20.09 can be built is 2.14.0.
> >>
> >> Maybe we should backport this to 20.09 then?
> >
> > The patch is backported now.  Hope this would solve your issue.
> >
>
> Yes, the patch is here: https://patchwork.ozlabs.org/project/ovn/patch/20210709131043.51831-1-odivlad@gmail.com/
>
> >>
> >>> 4. It is not clear which OVN versions are supported/LTS. Has the OVN project made such decisions?
> >>
> >> As far as I know, OVN mentions LTS in the documentation but until now,
> >> since the OVS-OVN split there was no release tagged as LTS.
> >
> > My personal opinion would be to support the last release or maybe the
> > last two releases.  The reason being we release every 3 months and
> > this makes it harder to support all the branches.
> >
> > Maintainers do try to backport fixes to all possible branches  (as far
> > as they can be applied cleanly without requiring to resolve
> > conflicts).
> >
>
> According to the documentation, at most two branches are maintained: the latest and the LTS. If there is no LTS release, I suppose the second-latest branch takes its place. The release cycle is three months, so formally OVN maintains only the two latest releases, for half a year each. That doesn’t match reality.
> Maybe this should be updated in the documentation?

Agree.

>
> >
> >>
> >>> 5. It is also not clear what the upgrade policy is. Should an administrator upgrade through each major OVN version, or can some versions be skipped, especially now that, since the 20.12 release, there is version pinning between ovn-controller and ovn-northd? I couldn’t find an answer to this question in the OVN documentation.
> >>
> >> Version pinning just ensures that there's no dataplane disruption when
> >> upgrading between versions that change the DB schema.  I don't think
> >> there's a limitation about upgrading to the next major OVN version
> >> (unless there's a bug somewhere).
> >
> > Agree.  You should be able to jump versions and upgrade.
>
> Thanks for clarifying!
> Should this be mentioned in public upgrade documentation?
I agree.

Thanks
Numan

>
> >
> > Thanks
> > Numan
> >
> >>
> >>>
> >>> Sorry for a lot of questions, but there is some misunderstanding, which I need to resolve...
> >>>
> >>> Regards,
> >>> Vladislav Odintsov
> >>>
> >>
> >> Regards,
> >> Dumitru
> >>
> >> _______________________________________________
> >> dev mailing list
> >> dev at openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

