[ovs-dev] [PATCH v2 2/2] netdev-dpdk: Support user-defined socket attribs

Tue Jun 14 02:46:10 UTC 2016

On 13 June 2016 at 14:36, Aaron Conole <aconole at redhat.com> wrote:

> Daniele Di Proietto <diproiettod at vmware.com> writes:
>
> > On 10/06/2016 10:51, "Aaron Conole" <aconole at redhat.com> wrote:
> >
> >>Aaron Conole <aconole at redhat.com> writes:
> >>
> >>> Christian Ehrhardt <christian.ehrhardt at canonical.com> writes:
> >>>
> >>>> On Tue, May 24, 2016 at 4:10 PM, Aaron Conole <aconole at redhat.com> wrote:
> >>>>
> >>>>> Daniele Di Proietto <diproiettod at vmware.com> writes:
> >>>>>
> >>>>> > Hi Aaron,
> >>>>> >
> >>>>> > I'm still a little bit nervous about calling chown on a (partially)
> >>>>> > user controlled file name.
> >>>>>
> >>>>> I agree, that always seems scary.
> >>>>>
> >>>>> > Before moving forward I wanted to discuss a couple of other options:
> >>>>> >
> >>>>> > * Ansis (in CC) suggested using -runas parameter in qemu.  This way
> >>>>> > qemu can open the socket as root and drop privileges before starting
> >>>>> > guest execution.
> >>>>>
> >>>>> I'm not sure how to do this with libvirt, or via the OpenStack Neutron
> >>>>> plugin.  I also don't know if it would be an acceptable workaround for
> >>>>> users.  Additionally, I recall there being something of a "don't even
> >>>>> know if this works" around it.  Maybe Christian or Ansis (both in CC)
> >>>>> can expound on it.
> >>>>>
> >>>>
> >>>> Hi,
> >>>> IIRC we kind of agreed that long term a proper MAC will be much better,
> >>>> but most involved people needed something to get it working like "now".
> >>>> Since they are complementary (other than the fix removing a bit of the
> >>>> urgency for more MAC) it was kind of the least bad option.
> >>>>
> >>>> You have to be aware that I brought up the discussion on dev at dpdk.org
> >>>> - see [1] and [2].
> >>>> But this will take time and eventually still be the application's task to
> >>>> "do something" - no matter if via API or via the chmod's right now.
> >>>> So Aaron is trying to get something that works now until the long term
> >>>> things are in place, which I appreciate.
> >>>>
> >>>> FYI - I was even more in a hurry; as it was clear that OVS-2.5 won't get
> >>>> this in time, I run with [3] for now.
> >>>> I never intended to suggest that, but with the discussion in place, one
> >>>> could ask if you (Aaron) want to pick up that instead.
> >>>> That would keep OVS free for now until DPDK made up the API (see [2]) for
> >>>> socket ownership control and this then could be implemented in OVS?
> >>>>
> >>>> (I hope) In some months/years we will all be happy to drop this bunch of
> >>>> interim solutions; nevertheless we need it for now.
> >>>>
> >>>> [1]: http://dpdk.org/dev/patchwork/patch/12222/
> >>>> [2]: http://dpdk.org/ml/archives/dev/2015-December/030326.html
> >>>> [3]: https://git.launchpad.net/~ubuntu-server/dpdk/commit/?h=ubuntu-xenial-to-dpdk2.2&id=f3c7aa1b2ddea8e092ad4a89e41a0e19d01ed4e7
> >>>>
> >>>> [...]
> >>>>
> >>>>
> >>>>> I think originally we quickly discussed 4 possible solutions (and
> >>>>> hopefully I captured them correctly):
> >>>>>
> >>>>> 1. OVS downgrades to the ovs user, and kvm runs under the ovs
> >>>>>    group.  I don't actually like this solution because kvm could then
> >>>>>    pollute the ovs database.
> >>>>>
> >>>>> 2. OVS runs as some user and sets the user/group ownership of the socket
> >>>>>    via chown/chmod where permissions come from the database (the
> >>>>>    original context had ovs running as root - but as I described above
> >>>>>    it doesn't need to be root provided ovs+DPDK can start without root).
> >>>>>
> >>>>> 3. OVS runs as some user, kvm starts as root, opens the socket and
> >>>>>    downgrades.  IIRC, this doesn't actually work, or it may have
> >>>>>    implications on other projects.  I don't remember exactly what was
> >>>>>    not as great about this solution, TBH.
> >>>>>
> >>>>> 4. OVS and KVM run as whatever users; MAC is used to enforce the
> >>>>>    layering between them.
> >>>>>
> >>>>> I think solution 2 and solution 4 don't actually interfere with each
> >>>>> other, and can be used to a complementary effect (if implemented
> >>>>> properly) so that the MAC layer enforces access, but even without MAC,
> >>>>> the DAC layer can provide appropriate whitelisting behavior.
> >>>>>
> >>>>
> >>>> I also remember several complex changes needed for #1 and #3 that
> >>>> would always end up with huge effort and a high risk of not being
> >>>> accepted.
> >>>> Probably that is what you refer to with "implications on other projects".
> >>>>
> >>>> Also keep in mind the position of dpdk out of the last few discussions
> >>>> which I'd like to summarize as "dpdk got this path from an app, so this
> >>>> app OWNS that path".
> >>>
> >>> I'd like to continue on, but I am not sure what the concerns are right
> >>> now.  Is it possible to enumerate them point by point so that I can
> >>> understand them?  I think there are two outstanding concerns right now:
> >>>
> >>> 1. the proposed approach is not good enough (vis-a-vis DAC vs. MAC)
> >>>
> >>> 2. the proposed approach would be better implemented in the utility
> >>>    that wants access to the sockets (vis-a-vis the libvirt discussion)
> >>>
> >>> Am I understanding the concerns correctly?
> >>
> >>Ping?
> >
> > I found another theoretical problem with the chmod approach, let me try to
> > explain:
> >
> > There's an extremely small race window between the socket creation and the
> > chmod which could theoretically be exploited to change the owner of a socket
> > (e.g. ovnsb_db.sock) in ovs rundir, by controlling the name of the port:
> >
> > 1. There's no southbound database running, because it's not yet been
> >    started or because it's being restarted.
> > 2. The user creates a vhost port, naming it ovnsb_db.sock.
> >    rte_vhost_driver_register() succeeds and creates a socket in the file
> >    system.
> > 3. The southbound database is started, it removes ovnsb_db.sock and
> >    recreates it.
> > 4. Now OVS changes the owner and the permission of what it thinks is a
> >    vhost-user socket.
> >
> > If 3 manages to get between 2 and 4, we have a problem. It's a pretty small
> > window, and it's unlikely that an attacker can control when the southbound
> > database is restarted.
> >
> > I feel like I'm nitpicking, but I'm not sure how serious the security
> > impact of what I'm describing is.
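
The four steps above come down to a path-based chown()/chmod() acting on whatever the name refers to at the moment of the call. One partial mitigation is to remember the device and inode of the socket and re-check them before changing attributes. The sketch below is illustrative only: `sock_identity_record`, `sock_identity_matches`, `bind_at` and `demo_replaced_socket` are hypothetical names, not OVS or DPDK functions, and the scratch directory under /tmp is an assumption. The check only narrows the window; a swap can still happen between the check and the attribute change, or before the identity is first recorded.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/* Identity of a filesystem object, independent of the name used to reach it. */
struct sock_identity {
    dev_t dev;
    ino_t ino;
};

/* Records which socket currently sits at 'path'.  Returns false if the path
 * cannot be examined or does not name a socket. */
bool
sock_identity_record(const char *path, struct sock_identity *id)
{
    struct stat st;

    assert(path != NULL && id != NULL);
    if (lstat(path, &st) != 0 || !S_ISSOCK(st.st_mode)) {
        return false;
    }
    id->dev = st.st_dev;
    id->ino = st.st_ino;
    return true;
}

/* Returns true only if 'path' still names the socket recorded in 'id'. */
bool
sock_identity_matches(const char *path, const struct sock_identity *id)
{
    struct sock_identity now;

    return sock_identity_record(path, &now)
           && now.dev == id->dev && now.ino == id->ino;
}

/* Demo helper (hypothetical): binds a new Unix socket at 'path'. */
static int
bind_at(const char *path)
{
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0) {
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof addr.sun_path, "%s", path);
    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Replays steps 2-4 in a scratch directory.  Returns true if the check
 * accepts the original socket and rejects the one that replaced it. */
bool
demo_replaced_socket(void)
{
    char dir[] = "/tmp/sockid-XXXXXX";
    char port[64], other[64];
    struct sock_identity id;
    bool ok;

    if (!mkdtemp(dir)) {
        return false;
    }
    snprintf(port, sizeof port, "%s/port.sock", dir);
    snprintf(other, sizeof other, "%s/other.sock", dir);

    int fd1 = bind_at(port);    /* Step 2: the port's socket is created. */
    int fd2 = bind_at(other);   /* Stand-in for another daemon's socket. */
    ok = fd1 >= 0 && fd2 >= 0
         && sock_identity_record(port, &id)
         && sock_identity_matches(port, &id)
         && rename(other, port) == 0            /* Step 3: name is reused. */
         && !sock_identity_matches(port, &id);  /* Step 4 would be refused. */

    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    unlink(port);
    unlink(other);
    rmdir(dir);
    return ok;
}
```

Both sockets are kept open while the name is swapped, so the two inodes exist at the same time and cannot share a number.
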
> >
> > I suggested an alternative approach, and I've tried implementing a quick
> > POC on top of your patch:
> >
> > ---8<---
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 24ebb41..d7adc66 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -30,6 +30,7 @@
> >  #include <sys/types.h>
> >  #include <sys/stat.h>
> >  #include <getopt.h>
> > +#include <sys/fsuid.h>
> >
> >  #include "chutil.h"
> >  #include "dirs.h"
> > @@ -891,6 +892,17 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
> >       */
> >      snprintf(dev->vhost_id, sizeof(dev->vhost_id), "%s/%s",
> >               vhost_sock_dir, name);
> > +    uid_t orig_u = geteuid();
> > +    gid_t orig_g = getegid();
> > +    if (vhost_sock_def_owner) {
> > +        uid_t u;
> > +        gid_t g;
> > +        if (!ovs_strtousr(vhost_sock_def_owner, &u, NULL, &g, false)) {
> > +            VLOG_INFO("UID: %d GID: %d", u, g);
> > +            setfsuid(u);
> > +            setfsgid(g);
> > +        }
> > +    }
> >
> >      err = rte_vhost_driver_register(dev->vhost_id);
> >      if (err) {
> > @@ -903,16 +915,12 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
> >          err = vhost_construct_helper(netdev);
> >      }
> >
> > -    ovs_mutex_unlock(&dpdk_mutex);
> > -    if (!err && vhost_sock_def_owner &&
> > -        (err = ovs_chown(dev->vhost_id, vhost_sock_def_owner))) {
> > -        VLOG_ERR("vhost-user socket device ownership change failed.");
> > +    if (vhost_sock_def_owner) {
> > +        setfsuid(orig_u);
> > +        setfsgid(orig_g);
> >      }
> >
> > -    if (!err && vhost_sock_def_perms &&
> > -        (err = ovs_chmod(dev->vhost_id, vhost_sock_def_perms))) {
> > -        VLOG_ERR("vhost-user socket device permission change failed.");
> > -    }
> > +    ovs_mutex_unlock(&dpdk_mutex);
> >
> >      return err;
> >  }
> >
> > ---8<---
> >
> > Compared to the chmod approach this has some limitations:
> >
> > 1. It doesn't support changing permissions, only the owner.  This
> >    could be done with umask, but I couldn't find any system call
> >    to change the umask for a single thread.
> > 2. Unless vhost-sock-dir is owned by the target owner, the socket
> >    cannot be created.  I'm not sure whether this is a reasonable
> >    limitation for the use cases you have in mind.
> > 3. setfsuid() is Linux specific and somewhat deprecated according
> >    to the manpage:
> >
> >    "Thus, setfsuid() is nowadays unneeded and should be avoided
> >    in new applications"
> >
> >    I haven't used seteuid, because it changes the euid of the whole
> >    process and that may interfere with other operations on OVS.
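
Regarding limitation 1: umask() does control the mode of a socket node created by bind(), but, as noted, it applies to the whole process. A minimal sketch of the save/set/restore pattern follows, assuming the caller performs the bind() itself; in the vhost-user case the bind happens inside rte_vhost_driver_register(), so the umask calls would have to wrap that call instead. `bind_unix_socket_with_umask` and `demo_socket_mode` are hypothetical names, and the resulting mode of 0777 & ~mask is the Linux behaviour.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/* Binds a new Unix socket at 'path' with the process umask temporarily set to
 * 'mask'.  Returns the fd, or -1 with errno set.
 *
 * umask() is process-wide: a file created by another thread between the two
 * umask() calls below is created under 'mask' as well. */
int
bind_unix_socket_with_umask(const char *path, mode_t mask)
{
    struct sockaddr_un addr;
    mode_t old_mask;
    int fd, error;

    assert(path != NULL);
    if (strlen(path) >= sizeof addr.sun_path) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    old_mask = umask(mask);
    error = bind(fd, (struct sockaddr *) &addr, sizeof addr) ? errno : 0;
    umask(old_mask);            /* Restore before doing anything else. */

    if (error) {
        close(fd);
        errno = error;
        return -1;
    }
    return fd;
}

/* Binds a socket under umask 0027 in a scratch directory and returns the
 * permission bits of the resulting node, or -1 on failure. */
int
demo_socket_mode(void)
{
    char dir[] = "/tmp/umasksock-XXXXXX";
    char path[64];
    struct stat st;
    int fd, mode = -1;

    if (!mkdtemp(dir)) {
        return -1;
    }
    snprintf(path, sizeof path, "%s/port.sock", dir);
    fd = bind_unix_socket_with_umask(path, 0027);
    if (fd >= 0) {
        if (lstat(path, &st) == 0 && S_ISSOCK(st.st_mode)) {
            mode = st.st_mode & 0777;
        }
        close(fd);
        unlink(path);
    }
    rmdir(dir);
    return mode;
}
```

The permissions are in place from the instant the node exists, so no later chmod() is needed; the cost is the process-wide side effect described in the comment.
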
>
> Thanks for this PoC, and explanation.  I agree, there is a race, and I'd
> like to work on trying to solve the problem.
>
> > If these limitations are unacceptable, I can see how we can use
> > chmod.  After all, as you point out, it's probably better to do it
> > in OVS than in some script.
>
> I think fchmod and fchown may actually be the correct calls to have, and
> will refactor these chown/chmod utility functions as such, which (I
> believe) avoids the race as you describe.
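
The property being relied on here is that fchmod()/fchown() act on the object a descriptor refers to, not on whatever a name points at later. The sketch below shows that with regular files only, as an illustration rather than the planned refactoring; `open_regular_nofollow` and `demo_fd_survives_replacement` are hypothetical names. Two assumptions would need checking before applying it to vhost-user sockets: in the quoted diff rte_vhost_driver_register() returns only an error code, so the caller has no descriptor to operate on, and whether fchmod()/fchown() on a socket descriptor change the filesystem entry is platform-dependent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Opens 'path' without following a final symlink and checks that it is a
 * regular file.  Returns the fd, or -1 with errno set. */
int
open_regular_nofollow(const char *path)
{
    struct stat st;
    int fd, error;

    assert(path != NULL);
    fd = open(path, O_RDONLY | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0) {
        return -1;
    }
    if (fstat(fd, &st) != 0) {
        error = errno;
    } else if (!S_ISREG(st.st_mode)) {
        error = EINVAL;
    } else {
        return fd;
    }
    close(fd);
    errno = error;
    return -1;
}

/* Creates an empty file with mode 0600; returns true on success. */
static bool
create_0600(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    bool ok = fd >= 0 && fchmod(fd, 0600) == 0;

    if (fd >= 0) {
        close(fd);
    }
    return ok;
}

/* Opens a file, replaces its name with a different file, then changes the
 * mode through the descriptor.  Returns true if only the originally opened
 * file was affected. */
bool
demo_fd_survives_replacement(void)
{
    char dir[] = "/tmp/fdmode-XXXXXX";
    char path[64], repl[64];
    struct stat by_fd, by_path;
    bool ok = false;
    int fd = -1;

    if (!mkdtemp(dir)) {
        return false;
    }
    snprintf(path, sizeof path, "%s/victim", dir);
    snprintf(repl, sizeof repl, "%s/replacement", dir);

    if (create_0600(path) && create_0600(repl)
        && (fd = open_regular_nofollow(path)) >= 0
        && rename(repl, path) == 0      /* The name now means another file. */
        && fchmod(fd, 0640) == 0        /* Applies to the file we opened. */
        && fstat(fd, &by_fd) == 0 && lstat(path, &by_path) == 0) {
        ok = (by_fd.st_mode & 0777) == 0640
             && (by_path.st_mode & 0777) == 0600
             && by_fd.st_ino != by_path.st_ino;
    }
    if (fd >= 0) {
        close(fd);
    }
    unlink(path);
    unlink(repl);
    rmdir(dir);
    return ok;
}
```

With a path-based chmod() in place of the fchmod() call, the replacement file would have received the new mode, which is the situation described in the race above.
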
>
> > Thanks for your patience in solving this problem,
>
> Thanks for your reviews!
>
> > Daniele
>
> Here's an elephant of a question, though.  Would it make sense to try
> and work towards some kind of scheme whereby OvS is aware of the various
> unix sockets it creates, and allows setting the permissions, ownership,
> etc. in a common way?  I'm not committed to finding / solving that
> problem, but would it even be acceptable / appropriate?
>

Just my 2 cents... Besides QEMU, which runs as a non-root user and wants to
connect to the DPDK server sockets created by OVS, you might also have an
OpenFlow controller that runs locally as a non-root user and wants to
connect to an OpenFlow server socket created by OVS. So, yes, I think it may
be good to generalize this feature in a common way instead of making it
DPDK-specific (or at least try to design it to be extensible in the future).

However, the OVSDB socket is slightly different from an OpenFlow or DPDK user
socket, because having access to an OVSDB socket would effectively allow
"write-ups". What this means is that processes intended to run at a
lower security level (i.e. an OVSDB client in this case) could grant
themselves access to at least some unauthorized resources by tricking OVS
into calling chown() on a socket that it was not supposed to. Though, IMHO,
if properly documented, this feature would still add some value.

>
> -Aaron