[ovs-discuss] vxlan port was deleted while using “ovs-ctl restart” to do ovs hot upgrading

王小伟 wangxw12421 at gmail.com
Wed Mar 17 01:45:22 UTC 2021


Hi all,

I'm using ovs-ctl (Open vSwitch v2.11.2) to hot-upgrade OVS, following
the guide
[https://docs.openvswitch.org/en/latest/intro/install/general/?highlight=hot%20upgrade#hot-upgrading].
In my situation only the userspace daemon needs to be upgraded, so the
ovsdb restart commands are commented out. The upgrade workflow looks
like this:

save_flows;
#stop_ovsdb;
#start_ovsdb;
stop_forwarding;
flow_restore_wait;
start_forwarding;
restore_flows;
flow_restore_complete;

During the hot upgrade, the vxlan port belonging to our bridge was
deleted and then re-created a few moments later. While the port is
gone, all packets sent to it are dropped, which is not what we expect.

As far as I can tell, the vxlan port may be deleted in the following
three situations:
1. ovs-ctl executes "stop_forwarding";
2. ovs-vswitchd creates the ofproto ("ofproto_create");
3. ovs-vswitchd runs "ofproto_type_run".

In the first situation, if I stop ovs-vswitchd with "kill -9" instead
of a graceful exit via "ovs-appctl -p pid exit", the vxlan port is not
deleted. That is useful, but then I ran into the second situation.

In the second situation, the function call chain in ovs-vswitchd is:
"bridge_run -> bridge_reconfigure -> ofproto_create -> construct ->
open_dpif_backer"

"open_dpif_backer" compares the port names configured in ovsdb with
the port names read from the kernel via netlink. Before creating a
vxlan port, ovs-vswitchd calls "netdev_vport_get_dpif_port", which
adds the suffix "_sys_4789" to the name. So the vxlan port is named
"vxlan" in ovsdb but "vxlan_sys_4789" in the kernel. vxlan_sys_4789 is
therefore treated as an unconfigured port and deleted. This seems
unreasonable. May I ask what the rationale is?
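
To make the mismatch concrete, here is a minimal Python sketch of the
comparison as I understand it. It is only a model of the behaviour
described above, not the OVS code: dpif_port_name() and stale_ports()
are hypothetical helpers, and the port list is made up for
illustration.

```python
def dpif_port_name(cfg_name, port_type, dst_port=None):
    """Hypothetical stand-in for the renaming done before port creation."""
    if port_type == "vxlan":
        # Assumption: the kernel-side name is derived from type and UDP port.
        return f"vxlan_sys_{dst_port}"
    return cfg_name


def stale_ports(configured, datapath_names, translate):
    """Return datapath ports that match no configured port."""
    if translate:
        wanted = {dpif_port_name(*p) for p in configured}
    else:
        wanted = {p[0] for p in configured}
    return sorted(set(datapath_names) - wanted)


# (name in ovsdb, type, destination port)
configured = [("vxlan", "vxlan", 4789), ("eth1", "system", None)]
# names as reported by the kernel datapath
datapath = ["vxlan_sys_4789", "eth1"]

naive = stale_ports(configured, datapath, translate=False)
mapped = stale_ports(configured, datapath, translate=True)
print(naive)   # ['vxlan_sys_4789']
print(mapped)  # []
```

Comparing the raw ovsdb names flags vxlan_sys_4789 as stale, while
translating the configured names first leaves nothing to delete.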

To avoid the second situation I modified the comparison logic for the
vxlan type; with this change, adding the vxlan port in
"bridge_add_ports" is also skipped. This causes the third situation.
The function call chain in ovs-vswitchd is:
"bridge_run -> bridge_run__ -> ofproto_type_run -> type_run"

"type_run" finds that the vxlan_sys_4789 port is not in the
ofproto->backer->tnl_backers hmap (because "bridge_add_ports" was
skipped), so it calls the netlink interface to create the
vxlan_sys_4789 port. Because a port named "vxlan_sys_4789" already
exists in the kernel, ovs-vswitchd deletes it.
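
The sequence I observe in the third situation can be modelled like
this. Again this is only a sketch of the behaviour described above,
not the real "type_run": type_run_step() is a hypothetical function
and the two sets stand in for the tnl_backers hmap and the kernel port
list.

```python
def type_run_step(port_name, tnl_backers, kernel_ports):
    """Return the datapath actions taken for one tunnel port (model only)."""
    actions = []
    if port_name in tnl_backers:
        # Port is already tracked by the backer: nothing to do.
        return actions
    if port_name in kernel_ports:
        # A port with this name already exists in the kernel, so it is
        # removed before the new one is created.
        actions.append(("delete", port_name))
        kernel_ports.discard(port_name)
    actions.append(("create", port_name))
    kernel_ports.add(port_name)
    tnl_backers.add(port_name)
    return actions


kernel = {"vxlan_sys_4789"}

# Case 1: tnl_backers is empty because adding the port was skipped.
untracked = type_run_step("vxlan_sys_4789", set(), kernel)
# Case 2: the port is already tracked in tnl_backers.
tracked = type_run_step("vxlan_sys_4789", {"vxlan_sys_4789"}, kernel)

print(untracked)
print(tracked)
```

In this model the delete/create pair only happens when the port is
missing from tnl_backers, which is why I suspect skipping
"bridge_add_ports" is what triggers it.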

I'm confused by how ovs-vswitchd handles the vxlan port, and I'm not
sure whether all of the above is by design.

My questions are:
- Is re-creating the vxlan port (delete first, then create) done for a
reason? Can it be avoided when no configuration has changed?

- Why does ovs-vswitchd rename the vxlan port to "vxlan_sys_4789",
while other internal ports keep their original names?

- Is there a way to hot-upgrade the ovs-vswitchd daemon without packet
loss?

Thanks a lot!
Xiaowei Wang

