[ovs-dev] [no-slow 2/6] ofproto-dpif: Reorganize upcall handling.

Gregory Rose gvrose8192 at gmail.com
Fri Jan 5 16:05:09 UTC 2018


On 1/3/2018 9:37 AM, Gregory Rose wrote:
> On 1/2/2018 11:42 AM, Justin Pettit wrote:
>>> On Dec 28, 2017, at 3:22 PM, Gregory Rose <gvrose8192 at gmail.com> wrote:
>>>
>>> SFAICT it emulates exactly what the system-traffic.at test 001 
>>> does.  And it works fine... /shrug.
>>>
>>> What distribution, kernel, etc are you using for your check-kmod 
>>> testing?  I'll try to emulate that
>>> exactly and then see if I can get similar results.
>> I'm using Ubuntu 16.04 with kernel 4.4.0-104-generic.  I sent you a 
>> link on our Slack channel to the internal tester that runs different 
>> OSs.  It fails a few of the tests, but they're the same ones that fail on 
>> master.  (We need to address those, but they shouldn't be related to 
>> my patches.)
>>
>> --Justin
>>
>>
>

I've created a script that runs outside of the m4sh autotest framework
and exactly emulates what test 001 of the 'make check-kmod' suite
performs.

When I run my test with this patch applied, I consistently see the
following warning message:

2018-01-05T15:53:14.440Z|00001|ofproto_dpif_upcall(handler1)|WARN|invalid user cookie of type 0 and size 4

But the ping test does succeed.  Without this patch applied, the
message does not occur.
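
As a quick way to confirm whether a given run hit this warning,
something like the following could be pointed at the ovs-vswitchd log.
This is only a sketch: the function name count_cookie_warnings is made
up for illustration, and it assumes the log is the file named
'logfile' that the attached script passes to --log-file.

```shell
#!/bin/bash
# Hypothetical helper (not part of OVS): count the upcall cookie
# warnings recorded in an ovs-vswitchd log file.
count_cookie_warnings() {
    # Default to "logfile", the name the attached script uses.
    local log="${1:-logfile}"
    # grep -c prints the number of matching lines.  It exits non-zero
    # when nothing matches, so "|| true" keeps that from looking like
    # a failure.  In a basic regex, "|" is a literal character.
    grep -c 'ofproto_dpif_upcall.*|WARN|invalid user cookie' "$log" || true
}
```

Calling "count_cookie_warnings logfile" after the script finishes
should print 0 on a run without the patch and a nonzero count with it,
if the behavior I describe here holds on your system as well.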

I've attached the script.  I'm not sure why the test on the internal
server you pointed me to does not show this message, but as mentioned
before, I can reproduce it reliably on several different VMs with both
Ubuntu- and RHEL-based distributions and various kernels.

For now that's about as far as I can take my investigation, since I
have a few other things I need to work on.  If you can think of another
test I should run or something else for me to check into, let me know.

Thanks,

- Greg

> Justin,
>
> I have done more testing last night and this morning and have a couple 
> of findings.
>
> First, the tests themselves *all* succeed.  However, they are marked 
> as failed because of warnings that
> occur during OVS_TRAFFIC_VSWITCHD_STOP in system-traffic.at.  If I 
> comment out
> OVS_TRAFFIC_VSWITCHD_STOP then the test runs successfully.
>
> AT_SETUP([datapath - ping between two ports])
> OVS_TRAFFIC_VSWITCHD_START()
>
> AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
>
> ADD_NAMESPACES(at_ns0, at_ns1)
>
> ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
> ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
>
> NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
> 3 packets transmitted, 3 received, 0% packet loss, time 0ms
> ])
> NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
> 3 packets transmitted, 3 received, 0% packet loss, time 0ms
> ])
> NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
> 3 packets transmitted, 3 received, 0% packet loss, time 0ms
> ])
>
> dnl OVS_TRAFFIC_VSWITCHD_STOP
> AT_CLEANUP
>
> ## ------------------------------ ##
> ## openvswitch 2.8.90 test suite. ##
> ## ------------------------------ ##
>   1: datapath - ping between two ports               ok
>
> ## ------------- ##
> ## Test results. ##
> ## ------------- ##
>
> 1 test was successful.
>
> I'm now debugging the OVS_TRAFFIC_VSWITCHD_STOP macro and trying to 
> determine what
> is causing the problem.  Here are the log messages:
>
> 2018-01-03T17:30:52.340Z|00039|netdev_linux|WARN|ovs-p1: removing policing failed: No such device
> 2018-01-03T17:30:52.341Z|00040|ofproto|WARN|br0: cannot get STP status on nonexistent port 2
> 2018-01-03T17:30:52.341Z|00041|ofproto|WARN|br0: cannot get RSTP status on nonexistent port 2
> 2018-01-03T17:30:52.343Z|00042|bridge|INFO|bridge br0: deleted interface ovs-p1 on port 2
> 2018-01-03T17:30:52.346Z|00043|bridge|WARN|could not open network device ovs-p1 (No such device)
> 2018-01-03T17:30:52.360Z|00044|bridge|INFO|bridge br0: deleted interface ovs-p0 on port 1
> 2018-01-03T17:30:52.364Z|00045|bridge|WARN|could not open network device ovs-p0 (No such device)
> 2018-01-03T17:30:52.367Z|00046|bridge|WARN|could not open network device ovs-p1 (No such device)
>
> It is the WARN messages from the OVS_TRAFFIC_VSWITCHD_STOP part of the
> test that are causing all tests to fail.
>
> Again, I see this on multiple systems.  They are all VMs though so I'm 
> wondering if the internal test that
> you are referring to was run on bare metal?
>
> Thanks,
>
> - Greg
>

-------------- next part --------------
#!/bin/bash
# The OVS database schema must be present in the current directory.
if [ ! -f vswitch.ovsschema ]; then
    echo "No schema file found - please copy vswitch.ovsschema to this directory"
    exit 1
fi
# Start from a clean log, load the kernel module, and bring up the daemons.
rm -f logfile
modprobe openvswitch
touch .conf.db.~lock~
ovsdb-tool create conf.db vswitch.ovsschema
ovsdb-server conf.db --detach --no-chdir --pidfile --log-file --remote=punix:/usr/local/var/run/openvswitch/db.sock
ovs-vsctl --no-wait init
ovs-vswitchd --detach --no-chdir --pidfile --log-file=logfile -vvconn -vofproto_dpif -vunixctl
ovs-vsctl -- add-br br0 -- set Bridge br0 protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15 fail-mode=secure
ovs-ofctl add-flow br0 "actions=normal"
# Create two namespaces and attach each one to br0 through a veth pair.
ip netns add at_ns0
ip netns exec at_ns0 sysctl -w net.netfilter.nf_conntrack_helper=0
ip netns add at_ns1
ip netns exec at_ns1 sysctl -w net.netfilter.nf_conntrack_helper=0
ip link add p0 type veth peer name ovs-p0
ip link set p0 netns at_ns0
ip link set dev ovs-p0 up
ovs-vsctl add-port br0 ovs-p0 -- set interface ovs-p0 external-ids:iface-id="p0"
ip netns exec at_ns0 ip addr add 10.1.1.1/24 dev p0
ip netns exec at_ns0 ip link set dev p0 up
ip link add p1 type veth peer name ovs-p1
ip link set p1 netns at_ns1
ip link set dev ovs-p1 up
ovs-vsctl add-port br0 ovs-p1 -- set interface ovs-p1 external-ids:iface-id="p1"
ip netns exec at_ns1 ip addr add 10.1.1.2/24 dev p1
ip netns exec at_ns1 ip link set dev p1 up
# Ping with increasing payload sizes; the grep/sed stands in for FORMAT_PING.
ip netns exec at_ns0 ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | grep "transmitted" | sed 's/time.*ms$/time 0ms/'
ip netns exec at_ns0 ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.2 | grep "transmitted" | sed 's/time.*ms$/time 0ms/'
ip netns exec at_ns0 ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.2 | grep "transmitted" | sed 's/time.*ms$/time 0ms/'
# Tear down: stop the daemons, then remove links, namespaces, and the datapath.
ovs-appctl --timeout=10 -t ovs-vswitchd exit --cleanup
ovs-appctl --timeout=10 -t ovsdb-server exit
ip link del ovs-p0
ip link del ovs-p1
ip netns del at_ns0
ip netns del at_ns1
ovs-dpctl del-dp ovs-system
rm -f conf.db
modprobe -r openvswitch
# Show any warnings that ovs-vswitchd logged during the run.
grep WARN logfile

