
Oracle VM Server use case: multiple networks, multipathed SAN storage


This page discusses a more complex, hopefully enterprise-grade, virtualization environment. Some comparisons are made between the offerings of popular virtualization products. Your specific needs, and results, are very likely to vary.


Summary


I have been involved in a project to migrate a set of old/legacy systems to VMs running on far fewer physical systems. This is the classic justification for virtualization.

While deploying the VM-based infrastructure, I found that there was very little documentation across all current VM vendors that discussed the integration of VM host systems in "enterprise" environments. This page is an attempt to rectify this. Note that just like "International Community", there is no real agreement as to what "enterprise" means.

For the purposes of this wiki entry, I define my Enterprise virtualization environment as being one where preferably:
  1. there is no single point of failure, making services resilient against individual component downtime due to failures, upgrades, servicing and the like
  2. systems reside on many separate networks/subnets, normally divided by functional requirements and the security profiles related to these requirements.
  3. storage is centralized onto a SAN, with a high availability (read 2+ paths) interconnect that provides block storage. I will discuss FCP, but iSCSI is also possible.

So we naturally would prefer to have all of the above, but factors such as time/budget/capability will probably mean that a business' computing infrastructure relies on some non-ideal configurations. This isn't necessarily a bad thing, as long as the risks of doing so are known and acceptable to the business.


Before Virtualization


The pre-virtualization environment is typically one where individual machines are assigned to a primary role, such as database, mail, web etc. While individual machines can provide multiple services, the dependencies and complexity of some applications mean that this isn't done frequently. The resulting computing infrastructure is then frequently one where there are a lot of servers running at low to medium utilization. These will have very different connectivity profiles to network and storage resources. Some machines will be very well connected, and have no single point of failure, while others will be running very old hardware, with a single power supply, and with a dodgy old network cable.

The figure below attempts to illustrate this. The enterprise has some high availability network and storage assets, but they aren't used by all machines.

Figure: servers, pre-virtualization


pre-virtualization environment

  1. storage, is either:
    1. local disk / direct attached (DAS)
    2. SAN provided, single path
    3. SAN provided, multiple paths.
  2. network connectivity variables:
    1. interfaces:
      1. single link
      2. bonded (2+) link, using 802.3ad or a proprietary protocol such as Cisco's EtherChannel
    2. vlans:
      1. single (native) network
      2. VLAN trunk, allowing the host to connect to multiple networks from one interface
  3. server systems probably vary greatly in architecture, number of processors, hardware redundancy, memory
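
The variables above can be captured in a small inventory record, which makes it easy to list where each machine falls short of the three criteria from the summary. This is only a minimal Python sketch: the class, the field names and the example host are my own illustrative assumptions, not part of any vendor tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerProfile:
    """Connectivity profile of one server (all field names are hypothetical)."""
    name: str
    storage: str          # "das" for local/direct attached disk, "san" for SAN disk
    storage_paths: int    # number of paths to SAN storage; 0 for local disk
    nic_links: int        # 1 = single link, 2+ = bonded link
    vlan_trunk: bool      # True if the interface carries a VLAN trunk
    power_supplies: int


def redundancy_gaps(server: ServerProfile) -> List[str]:
    """List where a server falls short of the criteria in the summary."""
    gaps = []
    if server.storage != "san":
        gaps.append("storage is not on the SAN")
    elif server.storage_paths < 2:
        gaps.append("single SAN path")
    if server.nic_links < 2:
        gaps.append("single network link")
    if server.power_supplies < 2:
        gaps.append("single power supply")
    return gaps


# Example: the old machine with one power supply and a dodgy network cable.
legacy = ServerProfile("legacy-mail", "das", 0, 1, False, 1)
for gap in redundancy_gaps(legacy):
    print(f"{legacy.name}: {gap}")
```

Running a check like this over the whole inventory before migration gives a rough list of which risks are being carried over, and which are being removed, by the move to the virtualized hosts.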


Virtualized Server Environment Functionality


Moving to a virtualized environment, we will be taking advantage of storage and networking functionality that is generally already available in existing Enterprise IT environments.

storage area network functionality


SAN disk supports several different ways of using multiple paths, notably:
  1. active/standby: the oldest method. If the link drops or too many I/O errors occur on the active path, I/O switches to the standby (secondary) path after a timeout. It is the simplest to implement and is supported by most enterprise storage systems and storage software. Its disadvantages are relatively slow failover times and the possible need for manual intervention to force a fail-back when the active (primary) path is restored.
  2. active/active round-robin: I/O is balanced across all paths in a round-robin manner. This is more difficult to orchestrate on both the storage systems and the storage software. It generally detects and mitigates a path failure more quickly and gracefully than active/standby, and does not require any intervention when the failed path is restored. Its disadvantage is that, depending on the storage vendor, it may send I/O requests through sub-optimal or indirect paths to the disk.
  3. active/active round-robin with Asymmetric Logical Unit Access (ALUA). ALUA allows the storage system to indicate I/O path preference in an active/active configuration. This allows the host to send the majority of its requests through the most optimal and direct path to the disk, while providing quick and graceful failover to the suboptimal paths in the event of a failure. Its disadvantage is that only more recent storage systems and storage software are capable of properly sending and exploiting ALUA information. Note that a firmware upgrade may add this capability. ALUA is supported by NetApp's Data ONTAP in versions 7.2.0 and above, by Solaris' native MPxIO in release 10 update 2 and later, by Enterprise Linux (both Red Hat's and Oracle's, via the dm-multipath package) in release 5.0 and above, and by Windows Server in release 2003 (preferably R2) with a compatible vendor-supplied MPIO library.
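
To make the difference between the three policies concrete, here is a toy Python model of how a host might pick a path for each I/O request. It is a sketch under my own assumptions (the Path class, the path names and both functions are hypothetical) and it is not how dm-multipath, MPxIO or any vendor MPIO library is actually implemented.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Path:
    name: str
    up: bool = True
    optimized: bool = True  # ALUA: the storage system reports this path as preferred


def active_standby(paths: List[Path]) -> Path:
    """Use the first healthy path in priority order; the others are standby."""
    for path in paths:
        if path.up:
            return path
    raise RuntimeError("all paths are down")


def active_active(paths: List[Path], use_alua: bool = False) -> Iterator[Path]:
    """Rotate I/O over the healthy paths.

    With use_alua=True only the optimized paths are used while at least one
    of them is up; the non-optimized paths are the fallback.
    """
    index = 0
    while True:
        healthy = [p for p in paths if p.up]
        if not healthy:
            raise RuntimeError("all paths are down")
        preferred = [p for p in healthy if p.optimized] if use_alua else []
        candidates = preferred or healthy
        yield candidates[index % len(candidates)]
        index += 1


# Two ports on the controller that owns the LUN, two on its partner.
paths = [Path("ctrlA-0"), Path("ctrlA-1"),
         Path("ctrlB-0", optimized=False), Path("ctrlB-1", optimized=False)]

plain = active_active(paths)
alua = active_active(paths, use_alua=True)
print([next(plain).name for _ in range(4)])  # uses all four paths
print([next(alua).name for _ in range(4)])   # stays on the ctrlA paths
```

Marking both ctrlA paths as down and calling next(alua) again returns a ctrlB path without any other change, which is the quick and graceful failover described in point 3.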

Virtualization vendors differ greatly on their support for connectivity to storage networks. A quick overview - please feel free to update this as other products join the fray, or as later releases add features.
  1. Oracle VM 2.1 supports active/active + ALUA via the dm-multipath package.
  2. RHEL, Oracle EL, and CentOS (all generally derivatives of the same codebase) support active/active + ALUA via the dm-multipath package.
  3. Citrix XenEnterprise 4.0 cannot use multiple paths; single paths are fine. This is expected to be addressed in release 4.1.
  4. VMware ESX 3.0.1 and above support active/active + ALUA.
  5. Microsoft's Hyper-V (beta - unreleased!), as with recent releases of Windows Server, appears to support active/active + ALUA with a compatible vendor-supplied MPIO library.
  6. Sun's xVM (beta - unreleased!), as with recent releases of Solaris 10, appears to support active/active + ALUA with the native MPxIO storage software.

ethernet network functionality


I am aiming for a design which is straightforward to implement and configure. Other configurations are possible based on the use of routing protocols. The criteria below will make a set of virtual machine hosts unavailable in the (generally very rare) event of a switch failure. This is expected to be compensated for by migrating critical virtual machines over to the set of hosts that still has connectivity.
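
As a rough illustration of that compensation step, the Python sketch below works out which critical virtual machines are stranded behind a failed switch and spreads them over the hosts that still have connectivity. The host, switch and VM names, and the function itself, are hypothetical; it assumes the VM disks live on the shared SAN so that any surviving host can run them, and it ignores capacity limits.

```python
from typing import Dict, List, Tuple


def plan_relocation(host_switch: Dict[str, str],
                    critical_vms: Dict[str, str],
                    failed_switch: str) -> List[Tuple[str, str]]:
    """Return (vm, target_host) pairs for critical VMs behind the failed switch.

    host_switch maps each VM host to the switch it is connected to.
    critical_vms maps each critical VM to the host it currently runs on.
    """
    survivors = sorted(h for h, sw in host_switch.items() if sw != failed_switch)
    if not survivors:
        raise RuntimeError("no host has connectivity")
    stranded = sorted(vm for vm, host in critical_vms.items()
                      if host_switch[host] == failed_switch)
    # Spread the stranded VMs over the surviving hosts in turn.
    return [(vm, survivors[i % len(survivors)]) for i, vm in enumerate(stranded)]


host_switch = {"vmhost1": "sw-a", "vmhost2": "sw-a",
               "vmhost3": "sw-b", "vmhost4": "sw-b"}
critical_vms = {"db1": "vmhost1", "mail1": "vmhost2", "web1": "vmhost3"}

for vm, target in plan_relocation(host_switch, critical_vms, "sw-a"):
    print(f"move {vm} to {target}")
```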

Ethernet network features which, to the best of my knowledge, are supported by the current versions of all virtualization vendors' offerings:
  1. support for multiple, segregated networks - or virtual local area networks (VLAN). Interfaces may be configured to connect to one network natively, or to a VLAN trunk interface running the IEEE 802.1q VLAN tagging protocol. 802.1q tags individual ethernet frames with an identifier which indicates which network the frame belongs to.
  2. network interface bonding or teaming. Multiple ethernet interfaces are aggregated to increase a link's throughput and reliability by balancing frames across multiple paths. Generally uses IEEE's 802.3ad, or Cisco's EtherChannel protocol. Generally efficient to use across 2-4 ports. If more than 4 ports are required, the cost of a higher bandwidth port generally makes it more efficient to upgrade to this higher bandwidth (e.g. instead of 8x GigE, use 2x 10GigE). Aggregated links will probably be used to connect machines that require high availability, and to provide a VLAN trunk between network switches.
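
Both features can be shown in a few lines of Python. The first two functions build and read the 4-byte 802.1q tag (the 0x8100 tag protocol identifier, followed by 3 priority bits, 1 drop-eligible bit and the 12-bit VLAN ID). The last function is only my simplified assumption of how a bond might spread frames over its member links by hashing MAC addresses; real bonding drivers and switches each have their own hash policies. Frames here carry no checksum (FCS), and the MAC addresses are made up.

```python
import struct
from typing import Optional

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1q tagged frame


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1q tag after the destination and source MAC addresses."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    tci = (priority << 13) | vlan_id  # drop-eligible bit (bit 12) left at 0
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def vlan_of(frame: bytes) -> Optional[int]:
    """Return the VLAN ID of a tagged frame, or None for an untagged frame."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_8021Q:
        return None
    (tci,) = struct.unpack("!H", frame[14:16])
    return tci & 0x0FFF


def pick_link(src_mac: bytes, dst_mac: bytes, n_links: int) -> int:
    """Illustrative hash: the same MAC pair always uses the same member link."""
    return (src_mac[-1] ^ dst_mac[-1]) % n_links


dst = bytes.fromhex("0011223344aa")
src = bytes.fromhex("001122334455")
untagged = dst + src + struct.pack("!H", 0x0800) + b"payload"

tagged = add_vlan_tag(untagged, vlan_id=100)
print(vlan_of(untagged), vlan_of(tagged))
print("member link:", pick_link(src, dst, n_links=2))
```

Hashing on the address pair, rather than alternating frame by frame, keeps each conversation on one link so that its frames are not reordered.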

On the virtualization front, network interface card (NIC) vendors are bringing in more features that can increase efficiency and performance, such as:
  1. TCP/IP offload engines (TOE), which reduce the burden on a machine's CPU for some parts of the network communication. These require compatible drivers. Note that early implementations tended to be buggy, and hampered features such as VLAN trunking and interface bonding. Most vendors have remedied this with firmware upgrades.
  2. Virtualization-targeted network cards, such as Sun's Crossbow project (unreleased!)


bringing everything together - the Enterprise virtualization platform use case


From the discussion above, we are aiming to deploy a feature rich virtualization platform that supports the most flexible storage and networking features so far discussed. These assist implementors in retaining maximal flexibility over their environments.


When deploying a set of virtualized host machines, it's best to aim for similar sets of hardware for each group. I recommend:
  1. 2 or more socket systems, with CPUs that have as many cores and as high a clock speed as possible. This is because all known vendors charge per-socket. With 2 and 4 socket systems being mainstream, the purchaser will often have to decide if it's more efficient to purchase 2x 2-socket systems or 1x 4-socket system. Upfront price should not be the only consideration! Remember that these systems have to be racked, powered, and cooled!
  2. 2 or more gigabit network interfaces. Many shops like to deploy a separate (3rd) management interface. If using iSCSI, I strongly recommend dedicating an additional and separate 2 interfaces for storage access.
  3. if using Fibre Channel SAN attachment, 2 FCP ports.
  4. redundant power supplies
  5. remote management interface (IBM RSA, Dell DRAC, HP iLO, Sun ALOM, etc).
  6. mirrored (2+) system disks; this is optional if you have a SAN
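
The 2x 2-socket versus 1x 4-socket decision in point 1 comes down to simple arithmetic once the running costs are included. The Python sketch below shows the shape of that calculation. Every figure in it is a made-up placeholder, not a real price; substitute your own quotes, licence terms, and rack, power and cooling costs.

```python
def total_cost(servers: int, sockets_per_server: int, hardware_price: float,
               license_per_socket: float, yearly_running_cost: float,
               years: int) -> float:
    """Purchase, per-socket licensing and running costs for a group of hosts.

    yearly_running_cost is per server (rack space, power, cooling).
    """
    per_server = (hardware_price
                  + sockets_per_server * license_per_socket
                  + yearly_running_cost * years)
    return servers * per_server


# Placeholder figures only: both options have 4 sockets in total, so the
# per-socket licence cost is identical and the difference comes from the
# hardware price and the cost of running one box instead of two.
two_small = total_cost(servers=2, sockets_per_server=2, hardware_price=5000,
                       license_per_socket=1000, yearly_running_cost=800, years=3)
one_large = total_cost(servers=1, sockets_per_server=4, hardware_price=12000,
                       license_per_socket=1000, yearly_running_cost=1200, years=3)
print(f"2x 2-socket: {two_small:.0f}   1x 4-socket: {one_large:.0f}")
```

Cost is not the only input: two smaller machines also leave somewhere to migrate virtual machines to when one host fails, which a single large machine does not.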

Remember that a risk of virtualization is the placing of multiple different functions on one physical machine. While this can be mitigated by migrating virtual machines in the event of a hardware failure, it's generally best to spend a bit more money on high quality redundant hardware to limit this risk.


Configuration, with Oracle VM Server 2.1


Theory is nice, but practical implementations are better. Follow the links below for Oracle VM Server 2.1 details - note that this is user submitted documentation, and in no way guaranteed to be endorsed or supported by Oracle!

The interesting thing about this virtualization environment is that it does not lock you in to one vendor. Customers such as myself love having the option to move to another vendor with relatively low cost and effort if a more suitable product is made available, or the existing vendor simply "drops the ball".

To the best of my knowledge, much of what is described here is possible with (listed in the order that they come to mind):
  1. Oracle VM Server 2.1, via the Xen Hypervisor v3.1.1
  2. Red Hat Enterprise Linux (RHEL), or Oracle Enterprise Linux 5.x, via the Xen Hypervisor v3.0.3 (heavily patched to support hardware and features made available in later releases). caveat: This older version of the hypervisor is arguably still less feature rich than the 3.1.x series.
  3. SUSE Linux Enterprise Server 10 SP1 by Novell, via the Xen Hypervisor v3.0.x. I have never used this product, though its virtualization engine and tools are roughly equivalent to RHEL.
  4. VMware ESX 3.0.2. caveat: Depending on your relationship with EMC, this solution will probably be the most expensive of all the options in this list. Nevertheless, depending on your environment this additional cost may be a mostly negligible factor.
  5. Microsoft Virtual Server, and Hyper-V. Virtual Server is arguably at the bottom of the pack when it comes to management, guest operating system, hardware assisted virtualization, and advanced network & storage feature support. This is very likely to change with the release of Hyper-V, currently targeted for 180 days after Windows Server 2008 - or 2008 Q3.
  6. Sun xVM. Unreleased, based on the Xen Hypervisor running a Solaris dom0. Release targeted for late Q2 - early Q3 2008.
  7. Citrix XenEnterprise 4.0. caveat: XE release 4.0 does not support multipath storage; this is expected in release 4.1.

Disk images and virtual machine configurations can generally be converted between the above formats using vendor supplied tools, or utilities included with the QEMU processor emulator.
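
For disk images, the relevant QEMU utility is qemu-img, whose convert subcommand takes the input format with -f and the output format with -O. The Python sketch below simply wraps that command line; the file names are hypothetical, and whether your qemu-img build can read a particular vendor's format is something to verify before relying on it. It does not convert virtual machine configuration files.

```python
import subprocess
from typing import List


def qemu_img_convert_cmd(src: str, src_fmt: str, dst: str, dst_fmt: str) -> List[str]:
    """Build the qemu-img command line for converting one disk image."""
    return ["qemu-img", "convert", "-f", src_fmt, "-O", dst_fmt, src, dst]


def convert_image(src: str, src_fmt: str, dst: str, dst_fmt: str) -> None:
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.run(qemu_img_convert_cmd(src, src_fmt, dst, dst_fmt), check=True)


# Example: a VMware disk (vmdk) to a raw image, as commonly used by Xen guests.
# The file names are placeholders.
print(" ".join(qemu_img_convert_cmd("legacy-web.vmdk", "vmdk",
                                    "legacy-web.img", "raw")))
```

Always convert a copy of the image with the guest shut down, and keep the original until the converted guest has booted on the new platform.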

Finally, it's good to see that Microsoft, Citrix (Xen), Novell and Sun have signed various bilateral agreements that should allow for greatly improved interoperability between virtualization platforms and virtualized guest machines in 2008.

Again, in all cases, it's generally the customer that wins with more flexibility, and better performance.

Oracle VM Server configuration: multiple networks, multipathed SAN storage


Latest page update: made by martin_foster, Jan 16 2008, 10:27 PM EST



