[ovs-dev] [RFC] ovn: minimize the impact of a compromised chassis
Russell Bryant
rbryant at redhat.com
Tue Aug 23 21:20:04 UTC 2016
On Tue, Aug 23, 2016 at 5:05 PM, Darrell Ball <dlu998 at gmail.com> wrote:
>
>
> On Mon, Aug 22, 2016 at 1:08 PM, Lance Richardson <lrichard at redhat.com>
> wrote:
>
>> > From: "Ben Pfaff" <blp at ovn.org>
>> > To: "Russell Bryant" <russell at ovn.org>
>> > Cc: "Lance Richardson" <lrichard at redhat.com>, "ovs dev" <
>> dev at openvswitch.org>, "Russell Bryant" <rbryant at redhat.com>
>> > Sent: Monday, August 22, 2016 1:22:43 PM
>> > Subject: Re: [ovs-dev] [RFC] ovn: minimize the impact of a compromised
>> chassis
>> >
>> > On Mon, Aug 22, 2016 at 01:14:03PM -0400, Russell Bryant wrote:
>> > > On Mon, Aug 22, 2016 at 12:30 PM, Ben Pfaff <blp at ovn.org> wrote:
>> > >
>> > > > On Tue, Aug 16, 2016 at 09:30:21AM -0400, Lance Richardson wrote:
>> > > > > As described in ovn/TODO, these are the two main approaches that
>> > > > > could be used to minimize the impact of a compromised chassis on
>> > > > > the rest of an OVN network:
>> > > > >
>> > > > > 1) Implement a role- or identity-based access control mechanism
>> > > > >    for ovsdb-server and use it to limit ovn-controller write
>> > > > >    access to tables in the southbound database.
>> > > > >
>> > > > > or
>> > > > >
>> > > > > 2) Disallow all write access to the southbound database by
>> > > > >    ovn-controller (as an optional mode or unconditionally) and
>> > > > >    provide alternative mechanisms for updating the southbound
>> > > > >    database for entries that are currently updated by
>> > > > >    ovn-controller.
>> > > > >
>> > > > > It is believed that option (1) would require somewhat more effort
>> > > > > than (2), and, because it would involve significant modifications
>> > > > > to ovsdb-server, would also be more likely to add risk and burden
>> > > > > to non-OVN users. Additionally, option (2) will likely place fewer
>> > > > > requirements on alternative databases (such as etcd), so the
>> > > > > following implementation discussion only considers option (2).
>> > > >
>> > > > I've always pushed back against adding granular access control
>> > > > mechanisms to OVSDB because I didn't believe it was likely that
>> > > > anything that was simple enough to be in the "spirit of OVSDB" (heh)
>> > > > was also going to be sufficient to fit a real use case. However, if
>> > > > we do now have specific requirements for OVN, then I'd invite
>> > > > descriptions of what access control mechanism would be sufficient.
>> > > > If it's simple and general enough, then implementing it in OVSDB
>> > > > might totally make sense.
>> > > >
>> > > > I don't think that the "risk and burden" of a simple and general
>> > > > mechanism is a real issue.
>> > >
>> > >
>> > > I think that push back makes sense.
>> > >
>> > > The proposal here was to take route #2. The only OVSDB feature
>> > > required in that case is to accept read-only connections, which could
>> > > be on a per-socket basis. This seems much simpler all around, as long
>> > > as we can all get on board with ovn-controller as a read-only client.
>> >
>> > I'm not actually saying we should choose #1. I'm saying a couple of
>> > things. First, changing OVSDB is not a huge deal; we do it when it
>> > makes sense. Second, that it is possible that our specific application
>> > here is a better place to start for OVSDB access control than a blanket
>> > "we need access control for OVSDB" that I've heard a couple of times.
>> >
>>
>> Based on my own narrow view of the world, I think option #1 would need:
>>
>> - The ability for ovsdb-server to associate a role/identity with each
>>   client connection. For simplicity this could be a binary "privileged"
>>   vs. "non-privileged" association, perhaps using per-role SSL
>>   certificates for TLS connections and treating unix socket connections
>>   as "privileged".
>> - A mechanism for mapping a role/identity to access rights on a per-table
>>   and per-column basis.
>> - A mechanism for enforcing access rights on a per-table or per-column
>>   basis, in some cases also considering the identity of the client that
>>   created the row.
>>
>> This infrastructure would be applied to OVN to implement the following:
>> - These tables would be read-only for non-privileged clients:
>>   SB_Global, Logical_Flow, Multicast_Group, Datapath_Binding,
>>   Address_Set, DHCP_Options, and DHCPv6_Options.
>>
>> - The Chassis and Encap tables would allow insertions by non-privileged
>>   clients and updates to existing rows only for the clients that
>>   inserted them.
>>
>> - The Port_Binding table would be writable only by privileged clients
>>   (ovn-northd) except for the "Chassis" column which should be writable
>>   by any non-privileged client (note that this doesn't do a lot to
>>   minimize harm from a compromised chassis).
>>
>> - The MAC_Binding table should be writable by any non-privileged client
>>   (which also doesn't do much to minimize harm from a compromised
>>   chassis).
>>
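To make the option #1 policy quoted above a bit more concrete, here is a
rough Python sketch of how I read it. The function name, its arguments, and
the "insert"/"update"/"delete" operation names are made up for illustration;
none of this is an existing ovsdb-server interface, and denying deletes to
non-privileged clients on Chassis/Encap is my assumption, since the list
above does not mention deletes.

```python
# Hypothetical sketch only: models the per-table/per-column policy described
# above. Nothing here corresponds to a real ovsdb-server API.

READ_ONLY_TABLES = {
    "SB_Global", "Logical_Flow", "Multicast_Group", "Datapath_Binding",
    "Address_Set", "DHCP_Options", "DHCPv6_Options",
}
OWNER_TABLES = {"Chassis", "Encap"}


def may_write(table, column, op, privileged, client_id=None, row_owner=None):
    """Return True if the client may perform `op` on `table`/`column`.

    op is one of "insert", "update", "delete" (illustrative names).
    row_owner is the identity of the client that inserted the row, if known.
    """
    if privileged:
        # Privileged clients (e.g. ovn-northd) are unrestricted.
        return True
    if table in READ_ONLY_TABLES:
        return False
    if table in OWNER_TABLES:
        if op == "insert":
            return True
        # Updates only by the client that inserted the row.
        # Assumption: deletes by non-privileged clients are denied.
        return op == "update" and row_owner is not None and client_id == row_owner
    if table == "Port_Binding":
        # Only the chassis column is writable by non-privileged clients.
        return op == "update" and column == "chassis"
    if table == "MAC_Binding":
        return True
    # Unknown tables default to read-only for non-privileged clients.
    return False
```

Even in this toy form it shows the weak spots: any non-privileged client
passes the Port_Binding chassis check and the MAC_Binding check.
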
>> > > Are you interested in looking closer at what #1 would look like, with
>> > > details of what the access control policy would look like?
>> >
>> > It'll probably be obvious, or close to obvious, what would be needed for
>> > #1 once we talk through what #2 needs.
>> >
>>
>> Here's a slightly more detailed breakdown of the work needed for option
>> #2:
>>
>> ovsdb-server: Add support for "read-only" connections. Perhaps something
>> like "--remote ptcp:read-only:<port>[:<ip>]" and variations on that
>> theme for other connection types.
>>
>> ovn-controller: Implement new approach for Chassis and Encap tables:
>>   - Remove code from ovn-controller for creating rows in these tables.
>>   - Document how administrators create rows using ovn-sbctl in the
>>     ovn-controller man page.
>>   - Update all tests to manually create Chassis/Encap rows.
>>
>> ovn-controller: Implement new approach for chassis column in
>> Port_Binding table:
>>   - Remove the code to update the chassis column from ovn-controller.
>>   - Add new key to options column of Logical_Switch_Port in the
>>     OVN_Northbound database to specify chassis binding.
>>   - Change ovn-northd to update Port_Binding table in southbound db
>>     based on chassis option from Logical_Switch_Port in northbound db.
>>   - Write upgrade helper script that sets chassis option for existing
>>     Logical_Switch_Ports based on current values in Port_Binding table
>>     of southbound db.
>>   - Document OVN upgrade procedure, including the use of the upgrade
>>     helper script.
>>
>> ovn-controller: Rework MAC_Binding table:
>>   - Propose details of chassis-local mac bindings storage. The two main
>>     options are:
>>       + In ovn-controller memory (simple, but cache reset on
>>         ovn-controller restart).
>>       + In Open_vSwitch database (more work, as we need cache
>>         invalidation logic added).
>>   - Change ovn-controller to use local store for learned mac bindings.
>>   - Remove code for updating MAC_Binding table from ovn-controller.
>>
>
> Regarding Option 2:
>
> Most distributed systems that share a common management plane would try
> to share mac bindings via the common management plane, even if each node
> maintains its own cache.
>
What specific systems are you referring to here?

> Throwing that out entirely because of a fear of a compromised chassis
> seems out of proportion to the potential problem. There can be 1000s of
> chassis that are part of the same logical network, with packet flows
> needing the same binding.
>
It's not a fear. It's a legitimate security issue.

> Furthermore, the risk of a compromised chassis may be very low in many
> use cases. The "one known target environment" alluded to in the problem
> description should not "rule all" by default.
>
The group that raised this to me was OpenShift (a Kubernetes-based
platform). It's a showstopper for them, as I would expect for other
container-based systems.

The same issue applies to OpenStack, though it's not quite as pressing an
issue there, as other OpenStack components have similar problems anyway.

> Perhaps allowing ovn-controller to write to a candidate mac binding table
> (with some limitations as well) and having northd (possibly as background
> work) detect a consensus on the binding from more than X controller
> client sessions and then populate the actual mac binding table might
> mitigate the exploit concern. Only northd would be able to write to the
> actual mac binding table.
>
> If there is no consensus on the binding yet, then the default is for the
> interested controller to issue the ARP request and use the local
> controller cache. This includes the degenerate case where there is only
> one controller interested in that particular mac binding.
>
That sounds like a potential improvement for dynamic mac bindings, at
least. We still have Chassis, Encap, and Port_Binding to deal with. It
would also require more complex RBAC capabilities to be added to ovsdb,
which I was hoping to avoid.
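
For discussion, the consensus step quoted above could be sketched roughly as
follows. This is only my reading of the idea, not a design: the function
name, the tuple layout of a candidate entry, and the tie handling are all
assumptions, and a real version would live in ovn-northd and read a
candidate table that does not exist today.

```python
from collections import defaultdict


def promote_bindings(candidates, threshold):
    """Pick MAC bindings that enough distinct controllers agree on.

    candidates: iterable of (controller_id, logical_port, ip, mac) tuples,
                standing in for rows of a hypothetical candidate table.
    threshold:  the "X" from the proposal; a binding is promoted only when
                more than `threshold` distinct controllers report the same
                MAC for a given (logical_port, ip).
    Returns a dict {(logical_port, ip): mac} of promoted bindings.
    """
    # votes[(port, ip)][mac] = set of controllers that reported that MAC
    votes = defaultdict(lambda: defaultdict(set))
    for controller, port, ip, mac in candidates:
        votes[(port, ip)][mac.lower()].add(controller)

    promoted = {}
    for key, by_mac in votes.items():
        # Take the MAC with the most distinct reporters. Ties resolve to the
        # first one seen, which a real design would have to handle better.
        mac, voters = max(by_mac.items(), key=lambda kv: len(kv[1]))
        if len(voters) > threshold:
            promoted[key] = mac
    return promoted
```

Keys that do not reach the threshold are simply left out of the result, which
corresponds to the fallback case where each controller keeps using ARP and
its local cache.
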
--
Russell Bryant