[ovs-dev] [RFC 3/3] netdev-dpdk: Unlink vhost user socket before creation.

Ansis Atteka ansisatteka at gmail.com
Fri Feb 5 20:19:00 UTC 2016


On 5 February 2016 at 00:53, Aaron Conole <aconole at redhat.com> wrote:

> Hi Ansis,
>
> Ansis Atteka <ansisatteka at gmail.com> writes:
> > On 2 February 2016 at 17:56, Daniele Di Proietto <diproiettod at vmware.com
> >
> > wrote:
> >
> >> If ovs-vswitchd crashes, it will not be able to recreate the same
> >> vhost user ports, since the socket will still be in the file system.
> >>
> >> This commit introduces an unlink() before creation to remove an eventual
> >> preexisting vhost user socket.
> >>
> >> This mimics the behavior of the OpenFlow management socket for a bridge.
> >>
> >> Signed-off-by: Daniele Di Proietto <diproiettod at vmware.com>
> >> ---
> >> There have been security concerns about the introduction of a similar
> >> mechanism, because the port name is basically untrusted input.
> >>
> >> I decided to post this anyway since:
> >> * The previous commit introduces a check to discard port names with '/'
> >> * This is how the OpenFlow management socket is creation is handled.
> >>   As pointed out by Ben, though, an OpenFlow socket always ends in
> '.mgmt'
> >>   while a vhost user socket doesn't have any suffix.
> >>
> >
> > How about introducing something like:
> > unlink_only_if_is_orphaned_socket(...)?
> >
> > ls -la is able to indicate that inode is socket (the s prefix):
> > aatteka at aatteka-PowerEdge-T110:~/Git/openvswitch$ ls -la
> > /var/run/openvswitch/
> > srwxr-x---  1 root root    0 Feb  3 14:43 db.sock
> >
> > And ss -p is able to confirm existence of the other end of the socket:
> > aatteka at aatteka-PowerEdge-T110:~/Git/openvswitch$ sudo ss -p | grep
> > "db.sock"
> > u_str  ESTAB      0      0      /var/run/openvswitch/db.sock 38934130
> >         * 38933374 users:(("ovsdb-server",pid=3658,fd=18))
> >
> > I did not look into "ss -p" internals so I can't comment about
> portability
> > or complexity if you want to pursue this path. Also, I believe it still
> > does not solve a minor leak problem if ovs-vswitchd was abruptly killed
> and
> > the bridge did not show up in OVSDB on the next ovs-vswitchd startup
> hence
> > relics of old socket remained there. I don't have a clean solution for
> this
> > unless there is convention that OVS can wipe out all contents of a
> > dedicated directory where Unix Domain Sockets are created.
>
> IMHO, this is all just guessing games - testing this way is non-atomic.
> There
> must be a way to have DPDK (or even if there is some non-dpdk vhost-user
> library, really) know that this socket is orphaned. I guess, I'm saying
> it's ****** that OVS is left holding the bag of deciding that it should
> clean up.
>
> I think if we are choosing to unlink though, this approach is the
> sanest, frought with possible edge cases as it is. :)
>

I think this problem should be viewed from two angles:

1. Reduce leakage of inodes

While there is upper limit of leaked inodes I think we should be good (ie.
abrupt crashes in a loop must not repeatedly leak the same inodes)


2. Reduce possibility of unlinking things that OVS was not supposed to.
Consider following examples:

a) ovs-vsctl add-port br-int "ovsdb-server.pid" ...
b) ovs-vsctl add-port br-int "db.sock" ...

Assuming DPDK user sockets are created in /var/run/openvswitch, then a) can
be solved always and b) can be solved most of the time with the
unlink_if_is_orphan_check(). Though there still could be name collisions
blocking one of the inode creators in such cases.


Other alternative solution are:
1. add .dpdk postfix to such socket names
2. always use dedicated directory just for DPDK sockets


Also, I don't think it is legit security assumption that libvirt (or
whoever calls add-port) will do any sort of output validation before
handing of port names to OVS through "ovs-vsctl add-port ..." command.
Implications of this are that I agree with Ilya that a) and b) should be
solved.


>
> >> What do you guys think?
> >> ---
> >>  lib/netdev-dpdk.c | 5 +++++
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >> index 6b33c02..e839eaa 100644
> >> --- a/lib/netdev-dpdk.c
> >> +++ b/lib/netdev-dpdk.c
> >> @@ -694,6 +694,11 @@ netdev_dpdk_vhost_user_construct(struct netdev
> >> *netdev_)
> >>      snprintf(netdev->vhost_id, sizeof(netdev->vhost_id), "%s/%s",
> >>               vhost_sock_dir, name);
> >>
> >> +    if (unlink(netdev->vhost_id) && errno != ENOENT) {
> >> +        VLOG_WARN("unlinking \"%s\": %s\n",
> >> +                  netdev->vhost_id, ovs_strerror(errno));
> >> +    }
> >> +
> >>      err = rte_vhost_driver_register(netdev->vhost_id);
> >>      if (err) {
> >>          VLOG_ERR("vhost-user socket device setup failure for socket
> %s\n",
> >> --
> >> 2.1.4
> >>
> >> _______________________________________________
> >> dev mailing list
> >> dev at openvswitch.org
> >> http://openvswitch.org/mailman/listinfo/dev
> >>
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
>



More information about the dev mailing list