[ovs-dev] [PATCH] utilities: Implement ovs-vlan-test script

Ethan Jackson ethan at nicira.com
Wed Dec 15 23:30:41 UTC 2010


Note on the testing of this patch:

I tested it in the basic case where there are no VLAN
connectivity problems.  I also simulated the known failure conditions
with OpenFlow rules that drop VLAN traffic, and the script
detected those conditions.

I tried the script on an igb driver which drops MTU-sized packets and
it detected that error.  I did not test the script with a
bnx2x driver and don't intend to.
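
For context on the MTU-sized case, here is a minimal sketch of how the
probe payloads in the patch below are sized and decoded.  It is written
for Python 3, whereas the patch itself targets Python 2.  The names
build_payload and parse_key are illustrative and do not appear in the
patch.  The 28 bytes are the IPv4 header (20 bytes, assuming no IP
options) plus the UDP header (8 bytes), so a requested size of 1500
fills a standard 1500-byte MTU.

```python
# Sketch only: mirrors send_packet() and the parsing loop in udp_receiver().
IP_UDP_HEADER_LEN = 28  # 20-byte IPv4 header + 8-byte UDP header


def build_payload(key, length):
    """Return a UDP payload so that the whole IP packet is `length` bytes."""
    payload_len = length - IP_UDP_HEADER_LEN
    digits = str(key).encode('ascii')
    if len(digits) > payload_len:
        # Assumption: the sketch rejects oversized keys explicitly.
        raise ValueError('key does not fit in a %d-byte packet' % length)
    # The test ID comes first as ASCII digits; the rest is NUL padding.
    return digits + b'\x00' * (payload_len - len(digits))


def parse_key(payload):
    """Return the test ID from a payload, or None if it is not a probe."""
    digits = payload.split(b'\x00', 1)[0]
    # bytes.isdigit() is False for an empty prefix and for non-digit bytes.
    return int(digits) if digits.isdigit() else None


for size in [50, 500, 1000, 1500]:  # sizes used by VlanClient.run()
    payload = build_payload(7, size)
    assert len(payload) == size - IP_UDP_HEADER_LEN
    assert parse_key(payload) == 7
```

The receiver ignores any datagram whose leading bytes are not a decimal
test ID, which is why parse_key returns None rather than raising.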

I tested the script with e1000e but was not able to reproduce the
checksum offloading problem other people have had.  Before I merge
this upstream I will figure that out.

Ethan

On Wed, Dec 15, 2010 at 3:23 PM, Ethan Jackson <ethan at nicira.com> wrote:
> This patch implements a script which may be used to check for
> connectivity issues caused by bugs in Linux drivers relating to
> VLAN traffic.
> ---
>  utilities/automake.mk        |   10 +-
>  utilities/ovs-vlan-test.1.in |   56 ++++++
>  utilities/ovs-vlan-test.in   |  441 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 506 insertions(+), 1 deletions(-)
>  create mode 100644 utilities/ovs-vlan-test.1.in
>  create mode 100755 utilities/ovs-vlan-test.in
>
> diff --git a/utilities/automake.mk b/utilities/automake.mk
> index 9a334e3..51c3f8e 100644
> --- a/utilities/automake.mk
> +++ b/utilities/automake.mk
> @@ -11,7 +11,10 @@ bin_SCRIPTS += utilities/ovs-pki utilities/ovs-vsctl
>  if HAVE_PYTHON
>  bin_SCRIPTS += utilities/ovs-pcap utilities/ovs-tcpundump
>  endif
> -noinst_SCRIPTS += utilities/ovs-pki-cgi utilities/ovs-parse-leaks
> +noinst_SCRIPTS += \
> +       utilities/ovs-pki-cgi \
> +       utilities/ovs-parse-leaks \
> +       utilities/ovs-vlan-test
>
>  EXTRA_DIST += \
>        utilities/ovs-appctl.8.in \
> @@ -30,6 +33,8 @@ EXTRA_DIST += \
>        utilities/ovs-pki.in \
>        utilities/ovs-tcpundump.1.in \
>        utilities/ovs-tcpundump.in \
> +       utilities/ovs-vlan-test.in \
> +       utilities/ovs-vlan-test.1.in \
>        utilities/ovs-vsctl.8.in
>  DISTCLEANFILES += \
>        utilities/ovs-appctl.8 \
> @@ -47,6 +52,8 @@ DISTCLEANFILES += \
>        utilities/ovs-pki.8 \
>        utilities/ovs-tcpundump \
>        utilities/ovs-tcpundump.1 \
> +       utilities/ovs-vlan-test \
> +       utilities/ovs-vlan-test.1 \
>        utilities/ovs-vsctl.8
>
>  man_MANS += \
> @@ -61,6 +68,7 @@ man_MANS += \
>        utilities/ovs-pcap.1 \
>        utilities/ovs-pki.8 \
>        utilities/ovs-tcpundump.1 \
> +       utilities/ovs-vlan-test.1 \
>        utilities/ovs-vsctl.8
>
>  utilities_ovs_appctl_SOURCES = utilities/ovs-appctl.c
> diff --git a/utilities/ovs-vlan-test.1.in b/utilities/ovs-vlan-test.1.in
> new file mode 100644
> index 0000000..7153a59
> --- /dev/null
> +++ b/utilities/ovs-vlan-test.1.in
> @@ -0,0 +1,56 @@
> +.TH ovs\-vlan\-test 1 "December 2010" "Open vSwitch" "Open vSwitch Manual"
> +.
> +.SH NAME
> +\fBovs\-vlan\-test\fR \- check Linux drivers for problems with VLAN traffic
> +.
> +.SH SYNOPSIS
> +\fBovs\-vlan\-test\fR [\fIoptions\fR] \fIcontrol_ip\fR \fIvlan_ip\fR
> +.
> +.SH DESCRIPTION
> +The \fBovs\-vlan\-test\fR program may be used to check Linux kernel drivers for
> +problems sending VLAN traffic which may occur when running Open vSwitch.
> +.
> +.SS "Client Mode"
> +An \fBovs\-vlan\-test\fR client may be run on a host to check for VLAN
> +connectivity problems.  The client must be able to establish HTTP connections
> +with an \fBovs\-vlan\-test\fR server located at the specified \fIcontrol_ip\fR
> +address.  UDP traffic sourced at \fIvlan_ip\fR should be tagged and directed out
> +the interface whose connectivity is being tested.
> +.
> +.SS "Server Mode"
> +To conduct tests, an \fBovs\-vlan\-test\fR server must be running on a host
> +known not to have VLAN connectivity problems.  The server must have a
> +\fIcontrol_ip\fR on a non-VLAN network which clients can establish connectivity
> +with.  It must also have a \fIvlan_ip\fR address on a VLAN network which
> +clients will use to test their VLAN connectivity.  Multiple clients may test
> +against a single \fBovs\-vlan\-test\fR server concurrently.
> +.
> +.SH OPTIONS
> +.
> +.IP "\fB\-s\fR"
> +.IQ "\fB\-\-server\fR"
> +Run in server mode.
> +.
> +.IP "\fB\-h\fR"
> +.IQ "\fB\-\-help\fR"
> +Display a help message.
> +.
> +.IP "\fB\-V\fR"
> +.IQ "\fB\-\-version\fR"
> +Display the Open vSwitch version of this program.
> +.
> +.SH EXAMPLES
> +.TP
> +\fBovs\-vlan\-test\fR \-s 0.0.0.0:80 1.2.3.4
> +Runs an \fBovs\-vlan\-test\fR server listening for client control traffic on
> +port 80 of any address and VLAN traffic on the default port of 1.2.3.4.
> +.
> +.TP
> +\fBovs\-vlan\-test\fR 5.6.7.8 1.2.3.4:99
> +Runs an \fBovs\-vlan\-test\fR client with a control server located at the
> +default port of 5.6.7.8 and a local VLAN IP of 1.2.3.4 port 99.
> +.
> +.SH SEE ALSO
> +.
> +.BR ovs\-vswitchd (8),
> +.BR ovs\-ofctl (8)
> diff --git a/utilities/ovs-vlan-test.in b/utilities/ovs-vlan-test.in
> new file mode 100755
> index 0000000..61b6f3d
> --- /dev/null
> +++ b/utilities/ovs-vlan-test.in
> @@ -0,0 +1,441 @@
> +#! @PYTHON@
> +#
> +# Copyright (c) 2010 Nicira Networks.
> +#
> +# Licensed under the Apache License, Version 2.0 (the "License");
> +# you may not use this file except in compliance with the License.
> +# You may obtain a copy of the License at:
> +#
> +#     http://www.apache.org/licenses/LICENSE-2.0
> +#
> +# Unless required by applicable law or agreed to in writing, software
> +# distributed under the License is distributed on an "AS IS" BASIS,
> +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> +# See the License for the specific language governing permissions and
> +# limitations under the License.
> +
> +import BaseHTTPServer
> +import getopt
> +import httplib
> +import os
> +import threading
> +import time
> +import signal #Causes keyboard interrupts to go to the main thread.
> +import socket
> +import sys
> +
> +print_safe_lock = threading.Lock()
> +def print_safe(s):
> +    print_safe_lock.acquire()
> +    print(s)
> +    print_safe_lock.release()
> +
> +def start_thread(target, args):
> +    t = threading.Thread(target=target, args=args)
> +    t.setDaemon(True)
> +    t.start()
> +    return t
> +
> +#Caller is responsible for catching socket.error exceptions.
> +def send_packet(key, length, dest_ip, dest_port):
> +
> +    length -= 28 #L3 L4 headers.
> +
> +    packet   = [chr(0) for _ in range(length)]
> +    data_str = str(key)
> +
> +    for i in range(len(data_str)):
> +        packet[i] = data_str[i]
> +
> +    packet = ''.join(packet)
> +
> +    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
> +    sock.sendto(packet, (dest_ip, dest_port))
> +    sock.close()
> +
> +#UDP Receiver
> +class UDPReceiver:
> +    def __init__(self, vlan_ip, vlan_port):
> +        self.vlan_ip        = vlan_ip
> +        self.vlan_port      = vlan_port
> +        self.recv_callbacks = {}
> +        self.udp_run        = False
> +
> +    def recv_packet(self, key, success_callback, timeout_callback):
> +
> +        event = threading.Event()
> +
> +        def timeout_cb():
> +            timeout_callback()
> +            event.set()
> +
> +        timer = threading.Timer(5, timeout_cb)
> +        timer.daemon = True
> +
> +        def success_cb():
> +            timer.cancel()
> +            success_callback()
> +            event.set()
> +
> +        # Start the timer first to avoid a timer.cancel() race condition.
> +        timer.start()
> +        self.recv_callbacks[key] = success_cb
> +        return event
> +
> +    def udp_receiver(self):
> +
> +        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
> +        sock.settimeout(1)
> +
> +        try:
> +            sock.bind((self.vlan_ip, self.vlan_port))
> +        except socket.error, e:
> +            print_safe('Failed to bind to %s:%d with error: %s'
> +                    % (self.vlan_ip, self.vlan_port, str(e)))
> +            os._exit(1) #sys.exit only exits the current thread.
> +
> +        while self.udp_run:
> +
> +            try:
> +                data, _ = sock.recvfrom(4096)
> +            except socket.timeout:
> +                continue
> +            except socket.error, e:
> +                print_safe('Failed to receive from %s:%d with error: %s'
> +                    % (self.vlan_ip, self.vlan_port, str(e)))
> +                os._exit(1)
> +
> +            data_str = ''
> +            for i in range(len(data)):
> +                if data[i] == chr(0):
> +                    break
> +                data_str += data[i]
> +
> +            if not data_str.isdigit():
> +                continue
> +
> +            key = int(data_str)
> +
> +            if key in self.recv_callbacks:
> +                self.recv_callbacks[key]()
> +                del self.recv_callbacks[key]
> +
> +    def start(self):
> +        self.udp_run = True
> +        start_thread(self.udp_receiver, ())
> +
> +    def stop(self):
> +        self.udp_run = False
> +
> +#Server
> +vlan_server = None
> +class VlanServer:
> +
> +    def __init__(self, server_ip, server_port, vlan_ip, vlan_port):
> +        global vlan_server
> +
> +        vlan_server = self
> +
> +        self.server_ip   = server_ip
> +        self.server_port = server_port
> +
> +        self.recv_response = '%s:%d:' % (vlan_ip, vlan_port)
> +
> +        self.result      = {}
> +        self.result_lock = threading.Lock()
> +
> +        self._test_id      = 0
> +        self._test_id_lock = threading.Lock()
> +
> +        self.udp_recv = UDPReceiver(vlan_ip, vlan_port)
> +
> +    def get_test_id(self):
> +        self._test_id_lock.acquire()
> +
> +        self._test_id += 1
> +        ret = self._test_id
> +
> +        self._test_id_lock.release()
> +        return ret
> +
> +    def set_result(self, key, value):
> +
> +        self.result_lock.acquire()
> +
> +        if not key in self.result:
> +            self.result[key] = value
> +
> +        self.result_lock.release()
> +
> +    def recv(self, test_id):
> +        self.udp_recv.recv_packet(test_id,
> +                lambda : self.set_result(test_id, 'Success'),
> +                lambda : self.set_result(test_id, 'Timeout'))
> +
> +        return self.recv_response + str(test_id)
> +
> +    def send(self, test_id, data):
> +        try:
> +            ip, port, size = data.split(':')
> +            port = int(port)
> +            size = int(size)
> +        except ValueError:
> +            self.set_result(test_id, 'Server failed to parse send request')
> +            return
> +
> +        def send_thread():
> +            send_time = 10
> +            for _ in range(send_time * 2):
> +                try:
> +                    send_packet(test_id, size, ip, port)
> +                except socket.error, e:
> +                    self.set_result(test_id, 'Failure: ' + str(e))
> +                    return
> +                time.sleep(.5)
> +
> +            self.set_result(test_id, 'Success')
> +
> +        start_thread(send_thread, ())
> +
> +        return str(test_id)
> +
> +    def run(self):
> +        self.udp_recv.start()
> +        try:
> +            BaseHTTPServer.HTTPServer((self.server_ip, self.server_port),
> +                    VlanServerHandler).serve_forever()
> +        except socket.error, e:
> +            print_safe('Failed to start control server: %s' % str(e))
> +            self.udp_recv.stop()
> +
> +        return 1
> +
> +class VlanServerHandler(BaseHTTPServer.BaseHTTPRequestHandler):
> +    def do_GET(self):
> +
> +        #Guarantee three arguments.
> +        path = (self.path.lower().lstrip('/') + '//').split('/')
> +
> +        resp = 404
> +        body = None
> +
> +        if path[0] == 'start':
> +            test_id = vlan_server.get_test_id()
> +
> +            if path[1] == 'recv':
> +                resp = 200
> +                body = vlan_server.recv(test_id)
> +            elif path[1] == 'send':
> +                resp = 200
> +                body = vlan_server.send(test_id, path[2])
> +        elif (path[0] == 'result'
> +                and path[1].isdigit()
> +                and int(path[1]) in vlan_server.result):
> +            resp = 200
> +            body = vlan_server.result[int(path[1])]
> +        elif path[0] == 'ping':
> +            resp = 200
> +            body = 'pong'
> +
> +        self.send_response(resp)
> +        self.end_headers()
> +
> +        if body:
> +            self.wfile.write(body)
> +
> +#Client
> +class VlanClient:
> +
> +    def __init__(self, server_ip, server_port, vlan_ip, vlan_port):
> +        self.server_ip_port = '%s:%d' % (server_ip, server_port)
> +        self.vlan_ip_port   = "%s:%d" % (vlan_ip, vlan_port)
> +        self.udp_recv       = UDPReceiver(vlan_ip, vlan_port)
> +
> +    def request(self, resource):
> +        conn = httplib.HTTPConnection(self.server_ip_port)
> +        conn.request('GET', resource)
> +        return conn
> +
> +    def send(self, size):
> +
> +        def error_msg(e):
> +            print_safe('Send size %d unsuccessful: %s' % (size, str(e)))
> +
> +        try:
> +            conn = self.request('/start/recv')
> +            data = conn.getresponse().read()
> +        except (socket.error, httplib.HTTPException), e:
> +            error_msg(e)
> +            return False
> +
> +        try:
> +            ip, port, test_id = data.split(':')
> +            port    = int(port)
> +            test_id = int(test_id)
> +        except ValueError:
> +            error_msg("Received invalid response from control server")
> +            return False
> +
> +        send_time = 5
> +
> +        for _ in range(send_time * 4):
> +
> +            try:
> +                send_packet(test_id, size, ip, port)
> +                resp = self.request('/result/%d' % test_id).getresponse()
> +                data = resp.read()
> +            except (socket.error, httplib.HTTPException), e:
> +                error_msg(e)
> +                return False
> +
> +            if resp.status == 200 and data == 'Success':
> +                print_safe('Send size %d successful' % size)
> +                return True
> +            elif resp.status == 200:
> +                error_msg(data)
> +                return False
> +
> +            time.sleep(.25)
> +
> +        error_msg('Timeout')
> +        return False
> +
> +    def recv(self, size):
> +
> +        def error_msg(e):
> +            print_safe('Receive size %d unsuccessful: %s' % (size, str(e)))
> +
> +        resource = '/start/send/%s:%d' % (self.vlan_ip_port, size)
> +        try:
> +            conn    = self.request(resource)
> +            test_id = conn.getresponse().read()
> +        except (socket.error, httplib.HTTPException), e:
> +            error_msg(e)
> +            return False
> +
> +        if not test_id.isdigit():
> +            error_msg('Invalid response %s' % test_id)
> +            return False
> +
> +        success = [False] #Primitive datatypes can't be set from closures.
> +
> +        def success_cb():
> +            success[0] = True
> +
> +        def failure_cb():
> +            success[0] = False
> +
> +        self.udp_recv.recv_packet(int(test_id), success_cb, failure_cb).wait()
> +
> +        if success[0]:
> +            print_safe('Receive size %d successful' % size)
> +        else:
> +            error_msg('Timeout')
> +
> +        return success[0]
> +
> +    def server_up(self):
> +
> +        def error_msg(e):
> +            print_safe('Failed control server connectivity test: %s' % str(e))
> +
> +        try:
> +            resp = self.request('/ping').getresponse()
> +            data = resp.read()
> +        except (socket.error, httplib.HTTPException), e:
> +            error_msg(e)
> +            return False
> +
> +        if resp.status != 200:
> +            error_msg('Invalid status %d' % resp.status)
> +        elif data != 'pong':
> +            error_msg('Invalid response %s' % data)
> +
> +        return resp.status == 200 and data == 'pong'
> +
> +    def run(self):
> +
> +        if not self.server_up():
> +            return 1
> +
> +        self.udp_recv.start()
> +
> +        success = True
> +        for size in [50, 500, 1000, 1500]:
> +            success = self.send(size) and success
> +            success = self.recv(size) and success
> +
> +        self.udp_recv.stop()
> +
> +        if success:
> +            print_safe('OK')
> +            return 0
> +
> +        return 1
> +
> +def usage():
> +    print_safe("""\
> +%(argv0)s: Test vlan connectivity
> +usage: %(argv0)s server vlan
> +
> +The following options are also available:
> +  -s, --server                run in server mode
> +  -h, --help                  display this help message
> +  -V, --version               display version information\
> +""" % {'argv0': sys.argv[0]})
> +
> +def main():
> +
> +    try:
> +        options, args = getopt.gnu_getopt(sys.argv[1:], 'hVs',
> +                                          ['help', 'version', 'server'])
> +    except getopt.GetoptError, geo:
> +        print_safe('%s: %s\n' % (sys.argv[0], geo.msg))
> +        return 1
> +
> +    server = False
> +    for key, _ in options:
> +        if key in ['-h', '--help']:
> +            usage()
> +            return 0
> +        elif key in ['-V', '--version']:
> +            print_safe('ovs-vlan-test (Open vSwitch) @VERSION@')
> +            return 0
> +        elif key in ['-s', '--server']:
> +            server = True
> +        else:
> +            print_safe('Unexpected option %s. (use --help for help)' % key)
> +            return 1
> +
> +    if len(args) != 2:
> +        print_safe('Expecting two arguments. (use --help for help)')
> +        return 1
> +
> +    try:
> +        server_ip, server_port = args[0].split(':')
> +        server_port = int(server_port)
> +    except ValueError:
> +        server_ip = args[0]
> +        server_port = 80
> +
> +    try:
> +        vlan_ip, vlan_port = args[1].split(':')
> +        vlan_port = int(vlan_port)
> +    except ValueError:
> +        vlan_ip   = args[1]
> +        vlan_port = 15213
> +
> +    if server:
> +        return VlanServer(server_ip, server_port, vlan_ip, vlan_port).run()
> +    else:
> +        return VlanClient(server_ip, server_port, vlan_ip, vlan_port).run()
> +
> +if __name__ == '__main__':
> +    main_ret = main()
> +
> +    # Python can throw exceptions if threads are running at exit.
> +    for th in threading.enumerate():
> +        if th != threading.currentThread():
> +            th.join()
> +
> +    sys.exit(main_ret)
> --
> 1.7.3.3
>
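
As a quick illustration of the argument handling in main() above, the
sketch below parses an "ip[:port]" argument with the same defaults the
patch uses (80 for the control server, 15213 for the VLAN UDP port).
It is written for Python 3, and parse_endpoint is a hypothetical helper
name; the patch inlines this logic twice instead.

```python
DEFAULT_CONTROL_PORT = 80   # default in main() for the control server
DEFAULT_VLAN_PORT = 15213   # default in main() for the VLAN UDP port


def parse_endpoint(arg, default_port):
    """Split 'ip[:port]' into (ip, port), using default_port if absent."""
    try:
        ip, port = arg.split(':')
        return ip, int(port)
    except ValueError:
        # No colon, more than one colon, or a non-numeric port: as in the
        # patch, the whole argument is then treated as the address.
        return arg, default_port


# The two man page examples, server first and then client.
print(parse_endpoint('0.0.0.0:80', DEFAULT_CONTROL_PORT))
print(parse_endpoint('1.2.3.4', DEFAULT_VLAN_PORT))
print(parse_endpoint('5.6.7.8', DEFAULT_CONTROL_PORT))
print(parse_endpoint('1.2.3.4:99', DEFAULT_VLAN_PORT))
```

One consequence of this approach is that a malformed port is not
reported; the argument silently falls back to the default port with the
unparsed string as the address.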
>
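
The control protocol in VlanServerHandler.do_GET above is a handful of
GET paths.  The sketch below, in Python 3, shows how a request path is
split and which branch it would reach.  The classify function and its
return labels are illustrative only and are not part of the patch.  In
the patch, the result branch additionally requires the test ID to be
present in the server's result table and returns 404 otherwise.

```python
def split_path(path):
    """Split a request path into at least three components, as do_GET does."""
    # Appending '//' guarantees that parts[0], parts[1] and parts[2] exist.
    return (path.lower().lstrip('/') + '//').split('/')


def classify(path):
    """Return a label for the branch do_GET would take (labels are mine)."""
    parts = split_path(path)
    if parts[0] == 'start' and parts[1] == 'recv':
        return 'start-recv'
    if parts[0] == 'start' and parts[1] == 'send':
        # parts[2] carries 'ip:port:size' for the server-to-client test.
        return 'start-send:' + parts[2]
    if parts[0] == 'result' and parts[1].isdigit():
        return 'result:' + parts[1]
    if parts[0] == 'ping':
        return 'ping'
    return 'not-found'


for example in ['/ping', '/start/recv', '/start/send/1.2.3.4:99:1500',
                '/result/3', '/']:
    print(example, '->', classify(example))
```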



