[ovs-discuss] Errors in test-controller

Vasu Dasari vdasari at gmail.com
Thu Feb 27 12:34:01 UTC 2014


Hi,

I am seeing an endless stream of debug log messages from test-controller
in one negative test scenario.

1. Launch test-controller:
       test-controller ptcp: -v
2. From another shell, just do the following:

vdasari@ovs:~/mininet$ telnet 127.0.0.1 6633
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
^]
telnet> c
Connection closed.

3. We can see a lot of log messages which look like these:
2014-02-27T11:53:48Z|21819|poll_loop|DBG|wakeup due to 0-ms timeout at ../lib/vconn.c:916 (0% CPU usage)
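
For anyone who wants to script step 2 instead of using telnet, here is a minimal sketch using plain POSIX sockets. The helper name connect_and_close() is made up for this example; the address and port are the ones from the telnet session above. It opens a TCP connection and closes it without sending any bytes, so no OpenFlow hello is ever sent.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: opens a TCP connection to ip:port and closes it
 * again without sending a single byte.
 * Returns 0 on success, -1 on any failure. */
static int connect_and_close(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    assert(ip != NULL);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        return -1;              /* Not a dotted-quad IPv4 address. */
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (connect(fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }

    /* Close right away: the peer sees EOF instead of a hello. */
    return close(fd) == 0 ? 0 : -1;
}
```

With test-controller listening as in step 1, calling connect_and_close("127.0.0.1", 6633) should be equivalent to the telnet steps.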

This is my analysis so far:
1. The user just connects and closes the connection; no "hello" handshake
takes place.
2. When the controller accepts the connection, the vconn state machine moves
to the send_hello state and then expects to receive a hello.
3. Instead of a hello, the controller receives a close event, and the vconn
state ultimately moves to VCS_DISCONNECTED via the following code path.
(Ignore the line numbers: my tree also contains my debug logs, so they might
not match the code in git.)

(gdb) where
#0  vcs_recv_hello (vconn=0x813b460) at ../lib/vconn.c:506
#1  0x0808c3dd in vconn_connect (vconn=0x813b460) at ../lib/vconn.c:565
#2  0x0808b96f in vconn_run (vconn=0x813b460) at ../lib/vconn.c:272
#3  0x08081ab1 in rconn_run (rc=0x813b780) at ../lib/rconn.c:595
#4  0x0804ceef in lswitch_run (sw=0x813b880) at ../lib/learning-switch.c:258
#5  0x0804baa1 in main (argc=2, argv=0xbffff2f4) at ../tests/test-controller.c:176

Because of this sequence, the vconn state moves to VCS_DISCONNECTED silently.
rconn_run() has no idea that the vconn has moved to the disconnected state,
so lswitch does not know about it either. In the main while loop in main(),
poll_block() then returns immediately, as it is polling on an invalid socket,
and the "wakeup ..." log message above is generated via the
lswitch_wait()/rconn_recv_wait()/vconn_wait() code path. This repeats
continuously without ever yielding the CPU.
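
To make the loop concrete, here is a toy model of the sequence described above. It is not the Open vSwitch code: every toy_* name is made up for illustration, and the model assumes that the wait step asks for an immediate wakeup once the connection is disconnected, which would be consistent with the "0-ms timeout" in the log but is not something I have confirmed in the sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical, not Open vSwitch APIs. */
enum toy_vconn_state {
    TOY_VCS_RECV_HELLO,
    TOY_VCS_CONNECTED,
    TOY_VCS_DISCONNECTED
};

struct toy_vconn {
    enum toy_vconn_state state;
    bool peer_closed;           /* True once the peer has closed. */
};

struct toy_rconn {
    struct toy_vconn *vconn;
    bool active;                /* The only field the liveness check reads. */
};

/* Waiting for hello on a closed connection ends in DISCONNECTED, and
 * nothing is reported to the caller. */
static void toy_vconn_run(struct toy_vconn *vc)
{
    if (vc->state == TOY_VCS_RECV_HELLO && vc->peer_closed) {
        vc->state = TOY_VCS_DISCONNECTED;
    }
}

/* The rconn layer runs the vconn but learns nothing about the result. */
static void toy_rconn_run(struct toy_rconn *rc)
{
    toy_vconn_run(rc->vconn);
}

static bool toy_rconn_is_alive(const struct toy_rconn *rc)
{
    return rc->active;
}

/* Returns true if the next poll should return immediately (assumption). */
static bool toy_vconn_wait(const struct toy_vconn *vc)
{
    return vc->state == TOY_VCS_DISCONNECTED;
}

/* Runs up to 'iterations' passes of the main loop and returns how many
 * of them would have woken up immediately instead of blocking. */
static int toy_count_immediate_wakeups(struct toy_rconn *rc, int iterations)
{
    int wakeups = 0;

    assert(iterations >= 0);
    for (int i = 0; i < iterations && toy_rconn_is_alive(rc); i++) {
        toy_rconn_run(rc);
        if (toy_vconn_wait(rc->vconn)) {
            wakeups++;
        }
    }
    return wakeups;
}
```

In this model, a connection whose peer has closed during the hello wait makes every pass of the loop wake up immediately, while toy_rconn_is_alive() keeps returning true, so nothing ever cleans the connection up.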

From this I infer that the design expects closed connections to be cleaned
up via the lswitch_is_alive() code path alone. That function checks only
rconn_is_alive(), and rconn_is_alive() looks only at its own state to decide
whether the rconn is valid.

As a solution, I believe one of the following should happen:
1. rconn_is_alive() should also check the vconn state when deciding whether
the rconn is alive.
2. When vconn_connect() returns failure, vconn_run() should propagate it to
rconn_run() so that rconn can set its state accordingly.
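
The two options above could look roughly like the following. This is only a sketch on simplified stand-in types, not a patch: all demo_* names are hypothetical, and the choice of ECONNRESET as the reported error is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the Open vSwitch API. */
enum demo_vconn_state { DEMO_VCS_RECV_HELLO, DEMO_VCS_DISCONNECTED };

struct demo_vconn {
    enum demo_vconn_state state;
    bool peer_closed;           /* True once the peer has closed. */
};

struct demo_rconn {
    struct demo_vconn *vconn;
    bool active;
};

/* Option 2, lower half: the vconn layer returns 0 while the connection
 * is usable and a positive errno value once it is disconnected. */
static int demo_vconn_run(struct demo_vconn *vc)
{
    if (vc->state == DEMO_VCS_RECV_HELLO && vc->peer_closed) {
        vc->state = DEMO_VCS_DISCONNECTED;
    }
    return vc->state == DEMO_VCS_DISCONNECTED ? ECONNRESET : 0;
}

/* Option 2, upper half: the rconn layer records the failure so that its
 * own state reflects the dead connection. */
static void demo_rconn_run_propagating(struct demo_rconn *rc)
{
    assert(rc->vconn != NULL);
    if (demo_vconn_run(rc->vconn) != 0) {
        rc->active = false;
    }
}

/* Option 1: the liveness check consults the vconn state in addition to
 * the rconn's own state. */
static bool demo_rconn_is_alive_checking_vconn(const struct demo_rconn *rc)
{
    return rc->active
           && rc->vconn != NULL
           && rc->vconn->state != DEMO_VCS_DISCONNECTED;
}
```

Either variant lets the liveness check fail after the peer closes, so the existing cleanup path gets a chance to run. Option 1 keeps the change local to the liveness check, while option 2 keeps the rconn state itself accurate.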

Please let me know your thoughts.

Thanks,
-Vasu