[ovs-discuss] Scalability Questions in High-Density Virtualisation Environments

Benoit ML ben42ml at gmail.com
Fri Sep 2 08:17:42 UTC 2011


Hello,

I have the same kind of question ...
Moreover, most physical switches don't support more than 1024 VLANs...

Ideally, I'm thinking about a new approach: build a virtual mesh network
between the hypervisors and use Open vSwitch (for VM connectivity) and OpenFlow (to deal
with the network flows).
If you know of software to build a Layer 2 mesh that can work with Open vSwitch, my
ears are wide open ;)
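
For the OpenFlow side of it, this is roughly what I have in mind on each
hypervisor (just a sketch; the bridge name br-mesh and the controller address
are hypothetical):

  # create a bridge that the VMs attach to
  ovs-vsctl add-br br-mesh
  # hand flow decisions over to an external OpenFlow controller
  ovs-vsctl set-controller br-mesh tcp:192.0.2.10:6633

The controller would then decide how traffic flows between VMs; building the
Layer 2 mesh links between the hypervisors is the part I'm still missing
software for.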

In the end, another more realistic and nearer-term approach is to use GRE
tunnels.
This involves having a central server act as the GRE tunnel endpoint for the
hypervisors and do the switching between the virtual and physical worlds.  Of course,
the virtual network then has a star topology.  The central Open vSwitch instance is
really critical and a potential bottleneck (use a 10 Gbps network).
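
A rough sketch of the configuration I mean, with hypothetical addresses
(10.0.0.1 for the central server, 10.0.0.11 for one hypervisor) and
hypothetical bridge/port names:

  # on each hypervisor: a bridge for the private networks, plus one GRE
  # tunnel pointing at the central server
  ovs-vsctl add-br br-priv
  ovs-vsctl add-port br-priv gre0 -- set interface gre0 type=gre \
      options:remote_ip=10.0.0.1

  # on the central server: one GRE port per hypervisor on the hub bridge
  ovs-vsctl add-br br-hub
  ovs-vsctl add-port br-hub gre-hv1 -- set interface gre-hv1 type=gre \
      options:remote_ip=10.0.0.11

Since each hypervisor only tunnels to the hub there is no Layer 2 loop, but all
inter-hypervisor traffic crosses the central server, hence the 10 Gbps link.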


Regards,


2011/9/1 Leland Vandervort <leland at dev.discpro.org>

>
> Hi All,
>
> I have a couple of questions concerning the scalability of OVS when used in
> high-density virtualisation environments.
> To elaborate further, a bit of background.
>
> Most virtualisation implementations do not go far enough in reality to
> abstract the virtual from the physical network environments.  As an example,
> take an environment using Xen/XenServer, with up to a few hundred physical
> nodes, each with up to 50 or 60 virtual machines, and assume that each
> VM has a single VIF, which of course uses a MAC address.  Let’s say for
> argument’s sake that we have 200 physical nodes, each with 50 VMs.  That
> makes a total of 10,000 MAC addresses for the VMs, plus 200 MAC addresses
> for the physical nodes, plus X more MAC addresses if you are using any
> kind of network-attached storage.
>
> Given that the majority of physical network implementations place the
> physical nodes connected to physical access switches in the datacenter,
> which in turn feed into larger distribution switches, there is a risk with
> CAM table overflow.  For example, some hosting providers use relatively
> inexpensive switches at the access layer, and some manufacturers have a CAM
> limit of anywhere between 6000 and 16000 MAC addresses.  In the case of CAM
> overflow, the switch has no choice but to flood, essentially making it
> behave like a hub, with the inherent network performance degradation that
> entails.
>
> Admittedly, one way of overcoming this is to extend the routing protocol
> domain down to the individual nodes.  In this way the physical switched
> environment sees only the MAC addresses of the nodes and any associated
> storage, and not those of the individual VMs.  (For example, a node would
> advertise its connected VMs to the distribution router via OSPF and would
> accept only a default route received from the distribution layer, so the
> individual node's kernel routing table would be quite manageable.)  However,
> given that OVS is not a layer3 device, what is the scalability limit with
> OVS for such a high-density environment?  Next imagine that some of the VMs
> have multiple VIFs — so the problem is exacerbated.
>
> Another way to alleviate this could be through the use of TRILL in
> combination with RBridges; I'm wondering if there are any plans to implement this
> in OVS at some point in the future.
>
> I am, nevertheless, interested in the Ethernet over GRE implementation in
> OVS as well, as this facilitates the creation of private networks between
> VMs, across different nodes.  Again, however, I question the scalability of
> this approach, especially in such a high-density environment.  I understand
> that theoretically one could create a few “thousand” tunnels
> between the various nodes, each carrying the private LANs for the different
> VMs, but how well does this approach actually scale?  Imagine, for example, a
> customer with 40 VMs scattered across different nodes.  These 40 VMs all
> need to “talk” to each other via a “backend” private network.  How would
> this be setup, whilst still avoiding potential layer-2 loops, or avoiding
> any one node becoming a “bottleneck” hub to multiple tunnel endpoints?
>
> For this one, I’m really interested since 802.1Q is not really feasible due
> to its limitation to 4096 VLANs  (and certain equipment manufacturers
> imposing even further limitations, such as Cisco’s 1005 VLANs on many
> switching platforms) -- Q-in-Q isn’t really a viable option as it
> overcomplicates matters.
>
> So I guess in summary, my questions are (to cut a long story short):
>
>
>    - What is the scalability of a large OVS deployment and what
>    abstraction measures are taken to avoid the “virtual” domain having CAM
>    exhaustion risks on the “physical” infrastructure?
>    - What is the scalability of the Ethernet over GRE implementation and
>    what measures are taken to avoid loops and/or bottlenecks?
>    - Are there any plans potentially to incorporate TRILL and/or RBRIDGE
>    features into OVS at some point in the future?
>
>
>
> Many thanks for any ideas, comments, suggestions, answers, and
> cheeseburgers :)
>
>
> Leland
>
>
>
>
> _______________________________________________
> discuss mailing list
> discuss at openvswitch.org
> http://openvswitch.org/mailman/listinfo/discuss
>
>


-- 
Benoit

