[ovs-dev] Proposed OVN live migration workflow

Russell Bryant russell at ovn.org
Thu Mar 16 18:23:44 UTC 2017


On Thu, Mar 16, 2017 at 1:39 PM, Ben Pfaff <blp at ovn.org> wrote:
> On Mon, Feb 27, 2017 at 11:12:07AM -0500, Russell Bryant wrote:
>> This is a proposed update to the VM live migration workflow with OVN.
>>
>> Currently, when doing live migration, you must not add the iface-id to the
>> port for the destination VM until migration is complete.  Otherwise, while
>> migration is in progress, ovn-controller on two different chassis will
>> fight over the port binding.
>>
>> This workflow is problematic for libvirt-based live migration (at
>> least), because libvirt creates an identical VM on the destination
>> host, including all of its configuration, such as the OVS port
>> iface-id.  This results in ovn-controller on the two hosts fighting
>> over the port binding for the duration of the migration.
>>
>>
>> Proposed new workflow for a migration from host A to host B:
>>
>> 1) The CMS sets a new option on Logical_Switch_Port called
>> "migration-destination".  The value would be the chassis name of the
>> destination chassis of the upcoming live migration (host B in this
>> case).  (The CMS side of this is sketched after step 4 below.)
>>
>> 2) While this option is set, if host B claims the port binding, host A will
>> not try to re-claim it.
>>
>> 3) While this option is set, if host B sees the new port appear, it will
>> not immediately update the port binding.  Instead, it will set up flows
>> watching for a GARP from the VM (a sketch of such flows is included
>> below).  GARP packets would be forwarded to ovn-controller.  All other
>> packets would be dropped.  If a GARP is seen, then host B will update
>> the port binding to reflect that the port is now active on host B.
>>
>> At least for KVM VMs, qemu already generates a GARP when migration
>> completes.  I'm not familiar with Xen or other virtualization
>> technologies, but this seems like it would be a common requirement for
>> VM migration.
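>>
>> To make step 3 concrete, the flows on host B might look roughly like
>> the following (a hand-written sketch only; ovn-controller would
>> generate the real thing, and the port number and IP address here are
>> made up):
>>
>>     # Punt gratuitous ARPs (sender IP == target IP) from the VM's
>>     # port up to ovn-controller.
>>     ovs-ofctl add-flow br-int "priority=100,in_port=5,arp, \
>>         arp_spa=10.0.0.5,arp_tpa=10.0.0.5,actions=controller"
>>     # Drop everything else from that port until a GARP is seen.
>>     ovs-ofctl add-flow br-int "priority=90,in_port=5,actions=drop"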
>>
>> 4) When the migration is either completed or aborted, the CMS will remove
>> the "migration-destination" option from the Logical_Switch_Port in
>> OVN_Northbound.  At this point, ovn-controller will resume normal
>> behavior.  If for some reason a GARP was not seen, host B would update the
>> port binding at this point.
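>>
>> For steps 1 and 4, the CMS side could be as simple as something like
>> this (a sketch; "lsp0" and "hv-b" stand in for the logical switch
>> port and host B's chassis name):
>>
>>     # Before initiating the migration from host A to host B:
>>     ovn-nbctl set Logical_Switch_Port lsp0 \
>>         options:migration-destination=hv-b
>>
>>     # Once the migration completes or aborts:
>>     ovn-nbctl remove Logical_Switch_Port lsp0 \
>>         options migration-destination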
>
> This seems like a reasonable approach to me.  I've spent a few minutes
> trying to think of problems with it.  It adds a little bit of
> complexity, but not enough to worry me.  It doesn't seem to add races
> that cause a problem.  It still works even if the GARPs are lost or
> never sent.
>
> However, I find myself wondering how aware the hypervisors are that a
> migration is taking place.  If the source and destination hypervisors
> could note locally that a migration is going on for a given VM, then
> they could handle this situation without needing anything to happen in
> the OVN databases.  For example, consider the Interface table
> integration documented in Documentation/topics/integration.rst.  Suppose
> we added external_ids:migration-status, documented as follows:
>
>     This field is omitted except for a VM that is currently migrating.
>     For a VM that is migrating away from this chassis, this field is set
>     to "outgoing".  For a VM that is migrating to this chassis, this
>     field is set to "incoming".
>
> If we had this, then we could use the following workflow for a
> migration from A to B:
>
> 1) CMS integration sets migration-status appropriately on A ("outgoing")
> and B ("incoming").  (A sketch of this is included after step 4 below.)
>
> 2) While migration-status is "outgoing", A will not try to reclaim a
> port claimed by a different chassis.
>
> 3) While migration-status is "incoming", B will not grab the port
> binding unless and until it sees a GARP.
>
> 4) When the migration is completed successfully, A's port gets destroyed
> and B's migration-status gets removed, so at this point B claims it in
> case it didn't see a GARP.  If the migration fails, B's port gets
> destroyed and A's migration-status gets removed, so at this point A
> reclaims it if necessary.
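>
> Concretely, the integration on each hypervisor might do something like
> this (a sketch; "tap0" stands in for the VM's interface name):
>
>     # On A, when the migration starts:
>     ovs-vsctl set Interface tap0 external_ids:migration-status=outgoing
>
>     # On B, on the newly created, otherwise identical interface:
>     ovs-vsctl set Interface tap0 external_ids:migration-status=incoming
>
>     # On completion or failure, the surviving side clears the key:
>     ovs-vsctl remove Interface tap0 external_ids migration-status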
>
> As I wrote up the above, though, I found myself thinking about how hard
> it is to update all the hypervisor integrations and make them correct in
> the corner cases.  I know that's tough from experience.  (Also, this
> would be the first change to the integration spec in years.)  I don't
> know whether updating the CMSes is harder or easier.  But maybe you are
> planning to do it yourself (at least for Neutron?), in which case I think
> that the CMS-based approach is probably the right one.

Thank you for the feedback!

One reason I prefer a CMS-driven approach is that it fits the security
policy question of which hypervisors are allowed to update port
bindings.  Having this defined by the CMS gives us an easy way to
specify which chassis may update a given port binding.

I was imagining that for security purposes we would add an
"options:chassis" key to Logical_Switch_Port that the CMS can set to
define the current chassis allowed to bind the port.  Our ACL would
be extended so that a chassis identified by options:chassis OR
options:migration-destination would be allowed to update the port
binding.  The capability to express that was reflected in the ovsdb
ACL proposal.
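
With both keys in place, the CMS could express a migration roughly like
this (again just a sketch using the proposed option names; "lsp0",
"hv-a", and "hv-b" are placeholders):

    # Host A currently owns the binding; host B may claim it while
    # the migration is in progress.
    ovn-nbctl set Logical_Switch_Port lsp0 \
        options:chassis=hv-a options:migration-destination=hv-b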

-- 
Russell Bryant

