[ovs-dev] [RFC] Federating the 0-day robot, and improving the testing

Aaron Conole aconole at bytheb.org
Wed Sep 12 15:29:08 UTC 2018


"Eelco Chaudron" <echaudro at redhat.com> writes:

> On 11 Sep 2018, at 17:51, Aaron Conole wrote:
>
>> "Eelco Chaudron" <echaudro at redhat.com> writes:
>>
>>> On 6 Sep 2018, at 10:56, Aaron Conole wrote:
>>>
>>>> As of June, the 0-day robot has tested over 450 patch series.
>>>> Occasionally it spams the list (apologies for that), but the
>>>> majority of the time it has caught issues before they made it to
>>>> the tree - so it's accomplishing the initial goal just fine.
>>>>
>>>> I see lots of ways it can improve.  Currently, the bot runs on a
>>>> light system.  It takes ~20 minutes to complete a set of tests,
>>>> including all the checkpatch and rebuild runs.  That's not a big
>>>> issue.  BUT, it does mean that the machine isn't able to perform
>>>> all the kinds of regression tests that we would want.  I want to
>>>> improve this so that various contributors can bring their own
>>>> hardware and regression tests to the party.  That way, various
>>>> projects can detect potential issues before they ever land on the
>>>> tree, and we could flag functional changes earlier in the process.
>>>>
>>>> I'm not sure of the best way to do that.  One thing I'll be doing
>>>> is updating the bot to push a series that successfully builds and
>>>> passes checkpatch to a special branch on a github repository, to
>>>> kick off travis builds.  That will give us more complete regression
>>>> coverage, and we could be confident that a series won't break
>>>> something major.  Beyond that, I'm not sure how to notify the
>>>> various alternate test infrastructures so that they can kick off
>>>> their own tests using the patched sources.
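
To make the branch-push glue concrete, it could be as small as the
sketch below (rough python; the "ci-mirror" remote and the branch
naming scheme are made up, and error handling is omitted):

    #!/usr/bin/env python3
    # Sketch: apply a series and push it to a scratch branch so that
    # travis picks it up.  Remote and branch names are hypothetical.
    import subprocess

    def push_series_for_ci(series_id, mbox_path):
        branch = "series_%s" % series_id      # hypothetical naming
        subprocess.check_call(["git", "checkout", "-b", branch,
                               "origin/master"])
        # 'git am' applies the patches from the series mbox
        subprocess.check_call(["git", "am", mbox_path])
        # pushing to the mirror repository is what triggers travis
        subprocess.check_call(["git", "push", "ci-mirror", branch])
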
>>>>
>>>> My goal is to get really early feedback on patch series.  I've
>>>> sent this out to the folks I know are involved in testing and test
>>>> discussions in the hopes that we can talk about how best to get
>>>> more CI happening.  The open questions:
>>>>
>>>> 1. How can we notify various downstream consumers of OvS of these
>>>>    0-day builds?  Should we just rely on people rolling their own?
>>>>    Should there be a more formalized framework?  How will these
>>>>    other test frameworks report any kind of failures?
>>>>
>>>> 2. What kinds of additional testing do we want to see the robot
>>>>    include?
>>>
>>> First of all, thanks for the 0-day robot; I really like the idea…
>>>
>>> One thing I feel would really help is some basic performance
>>> testing, like a PVP test for the kernel/dpdk datapath.  This would
>>> make it easy to identify performance-impacting patches as they
>>> happen, rather than people figuring out after a release why their
>>> performance has dropped.
>>
>> Yes - I hope to pull in the work you've done on ovs_perf to have
>> some kind of baseline.
>>
>> For this to make sense, I think we also need a bunch of hardware
>> that we can benchmark on (hint hint to some of the folks in the CC
>> list :).  Not for absolute numbers, but at least to detect
>> significant changes.
>>
>> I'm also not sure how to measure a 'problem.'  Do we run a test
>> pre-series, and then run it post-series?  In that case, we could
>> slowly degrade performance over time without anyone noticing.  Do we
>> take the baseline from the previous release, and compare?  That
>> might make more sense, but I don't know what other problems come
>> with it.  What thresholds do we use for saying something is a
>> regression?  How do we report it to developers?
>
> I guess both, in an ideal world, and maybe add a weekly baseline for
> master :)
>
> Having a graph of this would be really nice.  However, this might be
> a whole project in itself, i.e. performance runs on all commits to
> master…

I consider that a useful thing to have (a dashboard with information
on the series tested, etc.).  But I agree, it's another whole project.
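
On the threshold question above, though, we wouldn't need the whole
project to get started.  Even something as simple as "flag the series
if the patched run drops more than N% below a stored baseline" would
catch the worst offenders.  A rough sketch, where the 5% threshold
and the example numbers are invented for illustration:

    # Sketch of a regression check: compare a patched PVP run against
    # a stored baseline and flag drops beyond a threshold.  The 5%
    # default and the numbers below are invented.
    def check_regression(baseline_mpps, patched_mpps, threshold=0.05):
        drop = (baseline_mpps - patched_mpps) / baseline_mpps
        if drop > threshold:
            return "FAIL: %.1f%% below baseline" % (drop * 100)
        return "PASS: within %.0f%% of baseline" % (threshold * 100)

    # e.g. a weekly baseline from master vs. the patched series run:
    print(check_regression(baseline_mpps=14.2, patched_mpps=12.9))

Using a weekly refreshed baseline rather than the immediately
preceding run would also avoid the slow-drift problem mentioned above.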

>>>>    Should the test results be made available in general on some
>>>>    kind of public-facing site?  Should it just stay as a "bleep
>>>>    bloop - failure!" marker?
>>>>
>>>> 3. What other concerns should be addressed?

