Part 4 – vSphere 5 vMotion
by Jim Hannan
In Part 4 of vSphere 5 Advantages for a VBCA, I turn my focus to vMotion and its enhancements in vSphere 5. VMware vMotion offers the ability to live-migrate virtual machines from one ESX host to another.
It amazes me to think how long vMotion has been around. Most of us remember our first demo of a vMotion migration. vMotion, originally introduced in ESX 2.0 with vCenter 1.0, offered a feature unlike any other in the industry. For many organizations, it was the quintessential reason to virtualize their workloads.
Care to guess what interactive application was used by VMware to demo the first migrations?
How vMotion Works
The majority of the work performed by vMotion is copying the virtual machine's memory over to the destination ESX host. This memory migration can be broken down into the following three phases:
Phase 1: Guest Trace Phase
Consider this an accounting step. vMotion needs to account for each memory page in the virtual machine and for any pages that change during the migration. Traces are placed on each memory page so that vMotion can detect memory modifications while the migration is in progress.
Phase 2: Pre-copy Phase
The pre-copy phase is done in iterations. First, a full sweep and copy of all memory blocks is done. The second iteration only copies the blocks that have changed since the first iteration. From here, additional iterations may be run depending on the ability of previous iterations to keep up with the changed blocks.
Phase 3: Switchover Phase
The switchover phase is the final step in the migration. This is the cutover step, which quiesces the source VM and resumes the destination VM. This step typically occurs in less than one second.
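The interplay between the iterative pre-copy and the final switchover can be illustrated with a small simulation. This is a toy model with hypothetical numbers, not actual VMkernel behavior:

```python
# Toy model of vMotion's iterative pre-copy (Phase 2) leading into
# switchover (Phase 3). All figures are illustrative only.

def precopy(total_mb, dirty_rate_mb_s, copy_rate_mb_s,
            switchover_threshold_mb=500, max_iters=10):
    """Return the MB copied in each pre-copy iteration and the final remainder."""
    to_copy = total_mb          # iteration 1: full sweep of all memory
    iterations = []
    for _ in range(max_iters):
        iterations.append(to_copy)
        seconds = to_copy / copy_rate_mb_s
        # pages dirtied while this iteration ran become the next iteration's work
        to_copy = min(total_mb, dirty_rate_mb_s * seconds)
        if to_copy <= switchover_threshold_mb:
            break               # small enough to quiesce and copy during switchover
    return iterations, to_copy

iters, final = precopy(total_mb=16384, dirty_rate_mb_s=200, copy_rate_mb_s=1000)
print(iters)   # each iteration copies far less than the one before
print(final)   # the remainder copied during the sub-second switchover
```

As long as the copy rate comfortably exceeds the dirty rate, each iteration shrinks and the remainder quickly becomes small enough for a sub-second cutover.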
All three of these phases are greatly optimized with the vSphere 5 vMotion enhancements.
- Multiple network adapters: the VMkernel can transparently load balance a single vMotion operation across multiple network adapters.
- The round-trip latency limit for vMotion networks has increased from 5 milliseconds to 10 milliseconds. This, in House of Brick's opinion, is a forward-looking enhancement that will natively allow for long-distance vMotion.
- Improved memory tracing (Phase 1): According to VMware documentation, the tracing mechanism has been optimized to place traces faster.
- Improvements that allow vMotion to effectively use the full bandwidth of a 10GbE interface. This may come as a surprise to some people, but many applications struggle with this, mostly due to the CPU overhead that TCP/IP processing introduces into the network stack. From my observations, this is no longer an issue with vMotion.
- Stun During Page Send (SDPS): ensures the vMotion will succeed even when guest memory is changing faster than it can be copied. This new feature should be viewed with caution for latency-sensitive VBCA applications. Let's take a look at what SDPS does and what it means for your VBCA in the next section.
SDPS for VBCAs
In some rare cases, a VM will have memory changes that occur faster than the vMotion iteration can keep up with. In these cases, SDPS will slow the virtual machine down enough to allow the vMotion to complete the migration. This “slow down” could cause unwanted application latency. In the case of Oracle RAC and its interconnect, this could cause node eviction.
The SDPS feature can be disabled, but this is not recommended by HoB. Instead, we recommend that vSphere administrators leave this feature enabled and build a vMotion network capable of moving memory blocks fast enough to prevent an overrun.
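Conceptually, SDPS injects tiny stuns into the guest's vCPUs whenever the page dirty rate outruns the vMotion network. The sketch below is my own rough model with made-up numbers, not VMware's actual algorithm:

```python
# Conceptual sketch of Stun During Page Send (SDPS).
# Hypothetical parameters; NOT VMware's actual implementation.

def sdps_delay_us(dirty_rate_mb_s, transmit_rate_mb_s, step_us=10, max_us=1000):
    """If the guest dirties memory faster than vMotion can transmit it,
    return a per-scheduling-slice vCPU stun (microseconds) that grows
    with the size of the overrun; otherwise return zero."""
    if dirty_rate_mb_s <= transmit_rate_mb_s:
        return 0  # pre-copy is converging on its own; no stun needed
    overrun = dirty_rate_mb_s / transmit_rate_mb_s
    return min(max_us, int(overrun * step_us))

print(sdps_delay_us(800, 1000))   # converging: guest runs at full speed
print(sdps_delay_us(2000, 1000))  # 2x overrun: guest is briefly stunned each slice
```

The latency risk for a VBCA comes from exactly these injected stuns, which is why a vMotion network fast enough to keep the dirty rate below the transmit rate sidesteps the issue entirely.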
VMware recently benchmarked vSphere 5 vMotion against vSphere 4.1. One of the many tests VMware ran to compare vMotion performance was a database workload test. Using a SQL Server VM running the open source DVD Store Version 2 (DS2) workload, VMware generated an RDBMS workload that ran during the vMotion operations. As you can see in the graph below, one test running with two NICs for vMotion was approximately 42% faster than vSphere 4.1 with one NIC.
By Jim Hannan
In Part 3 of vSphere 5 Advantages for VBCA, I turn my focus to Storage vMotion and its enhancements in vSphere 5. Storage vMotion capabilities were first introduced in vSphere 3 as a way to migrate from VMFS2 to VMFS3. It was re-introduced in vSphere 3.5 (after high demand from the VMware user base) as a supported way for administrators to move virtual machines from one datastore to another. The latest vSphere 5 release has undergone multiple enhancements to speed up Storage vMotion times.
Storage vMotion can be separated into two components: data movers and data mirroring. Data movers read blocks from the original location and copy them to a new destination. Data mirroring writes data to both the original VMDK location and the new location. The guest does not get a write confirm until the data has been written in both locations.
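The mirror-mode write path described above can be sketched in a few lines. This is a simplified model for illustration; real mirroring happens inside the hypervisor's I/O stack:

```python
# Simplified model of Storage vMotion mirror mode: while the copy is in
# flight, every guest write goes to BOTH the source and destination VMDK,
# and the guest receives its write acknowledgement only after both complete.

class MirroredDisk:
    def __init__(self):
        self.source = {}        # stand-in for the original VMDK location
        self.destination = {}   # stand-in for the new VMDK location

    def write(self, block, data):
        self.source[block] = data        # write to the original location
        self.destination[block] = data   # mirror to the new location
        return "ack"                     # only now is the guest's write confirmed

disk = MirroredDisk()
disk.write(42, b"redo")
print(disk.source[42] == disk.destination[42])  # True: both copies stay consistent
```

Because every in-flight write lands in both places, there is no "dirty block" backlog to chase, which is what makes mirroring superior to the snapshot and changed-block-tracking approaches of earlier releases.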
The ESX hypervisor chooses one of three data mover mechanisms, and the choice greatly affects the speed of a Storage vMotion. The available data movers are:
- fsdm is the slowest of the data movers; it functions more like an OS copy mechanism, copying files from one datastore to the next.
- fs3dm is a faster mechanism. In his book, “VMware vSphere 5 Clustering Technical Deepdive”, Duncan Epping states that fs3dm moves data through fewer layers than traditional fsdm, making it faster and more efficient.
- fs3dm-hw is a VAAI hardware offload; it offloads the copy to the SAN or NAS layer. I recently had the opportunity to use fs3dm-hw and it was impressively fast, 30-50% faster than fs3dm.
The hypervisor chooses the data mover based on a specific set of conditions. If the source and destination datastores are on the same SAN, use the same VMFS block size, and the SAN/NAS is VAAI capable, then it uses fs3dm-hw.
If all of the above conditions are met but the SAN is not VAAI capable, then it falls back to fs3dm. If the block sizes are different, then the hypervisor uses the oldest (and slowest) mechanism, fsdm.
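Those selection rules can be summarized in a few lines. This is my paraphrase of the conditions as described above, not actual ESX code:

```python
# Paraphrase of the data mover selection rules; NOT actual ESX source code.

def choose_data_mover(same_array, same_block_size, vaai_capable):
    """Pick the Storage vMotion data mover per the conditions above."""
    if same_array and same_block_size:
        # hardware offload to the array when VAAI is available
        return "fs3dm-hw" if vaai_capable else "fs3dm"
    # mismatched block sizes (or different arrays) fall back to the legacy mover
    return "fsdm"

print(choose_data_mover(True, True, True))    # fastest: VAAI hardware offload
print(choose_data_mover(True, True, False))   # software fs3dm
print(choose_data_mover(True, False, True))   # mismatched block size: slow fsdm
```

Note how a mismatched block size alone is enough to force the slowest path, which is exactly why the block-size recommendation below matters.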
House of Brick Best Practice
An in-place upgrade from VMFS3 to VMFS5 will retain the original VMFS block size. In VMFS5, VMware has standardized on a 1 MB block. HoB strongly recommends that customers move to the standard block size for the reasons mentioned above.
What Does This Mean For Your VBCA?
If you ask VMware Support about snapshots with a database, you will typically get a response like “It’s not a good idea”. We do not agree with that generic statement. We do, however, encourage our customers to be aware of the risks involved in creating a snapshot on a VBCA. In particular, these databases typically have high I/O. If the I/O is high enough during a snapshot, you can suffer from an overrun scenario where data changes outpace the snapshot deletion.
How does this happen? When snapshots become very large they can be difficult to delete. During a deletion, the data may be changing faster than the snapshot can be committed. This causes an overrun. I have seen cases where the snapshot deletion was barely keeping up with data changes and the snapshot could not be deleted until the data change rate slowed. In this particular case, the snapshot deletion completed 8 hours after it was issued.
I bring this up to underscore the enhancements that VMware has made to Storage vMotion. In earlier versions (3.5), Storage vMotion relied on snapshots, which may not have been practical for all databases or Business Critical Applications. In vSphere 4 it relied on changed block tracking; more approachable, but still not optimal. Currently, vSphere 5's data mirroring truly makes it feasible to use Storage vMotion for VBCAs.
Storage vMotion Dirty Block Copy Mechanism Timeline
- vSphere 3.5 used snapshots
- vSphere 4 used changed block tracking
- vSphere 5 uses mirroring
Additional Benefits of Storage vMotion – Renaming a Virtual Machine and its VMDKs
For administrators, it can be frustrating to rename a virtual machine but have its VMDKs keep the original name. With Storage vMotion you can rename the virtual machine and its VMDKs in one operation and without downtime.
By David Klee (@kleegeek)
As database administrators, we are always searching for tools and technologies that can help us improve our lives, our jobs, our systems, and our processes. When a database server is virtualized, a whole new set of tools is available for database administrators to embrace. VMware vSphere 5.0 is the latest and greatest release from VMware, and House of Brick considers it the finest server platform in the world. The tooling and features included in this release allow for some of the greatest power and flexibility in the industry.
These core features allow DBAs more flexibility within the environment and, in practice, will help reduce downtime.
vMotion. vMotion allows for the migration of an in-use virtual machine from one physical server to another. If maintenance needs to be performed on a physical host, or if the host is overcommitted on resources and you need to move the virtual machine to a new host with less resource contention, it is as simple as a drag-and-drop and then two clicks. The VM stays up, the services all continue to work, and no downtime is experienced.
Storage vMotion. Storage vMotion gives the administrator the ability to migrate one or more virtual hard drives from one SAN LUN to another. This occurs transparently to the application. If a database administrator sees that a local drive is about to fill up and the underlying LUN does not have enough free space, the virtual disk can be relocated to a new LUN that has enough free space. At that point, the virtual disk can be grown, and the guest operating system can seamlessly extend the local partition. All of this can be done without downtime or hassle.
Snapshots. How many times have you had to develop, document, and test a minor upgrade roll-back plan? What if your fool-proof rollback plan could be as simple as three clicks? A snapshot is a point-in-time recovery point for a server. Take a snapshot with just a few clicks and the contents of the memory are stored. Any changes to the virtual hard drives are written to a delta disk. If something goes awry in the maintenance window, simply revert the VM to the state it was in when the snapshot was taken. Voilà – instant rollback! If the maintenance succeeded, simply commit the changes with two more clicks.
Storage I/O Control. If certain virtual machines are consuming substantially more storage resources than others and begin to negatively affect the performance of key virtual machines, a governor can be placed on the VM’s storage usage. It will let the administrator precisely limit a VM’s storage usage so that it does not affect any of its neighbors.
VMware Tools. Believe it or not, a component of the seemingly innocuous VMware Tools package should be considered one of the most important tools for database administrators. VMware Tools installs two new Perfmon Performance Objects along with their associated counters. These counters include:
- VM Memory
  - Memory Active in MB
  - Memory Ballooned in MB
  - Memory Limit in MB
  - Memory Mapped in MB
  - Memory Overhead in MB
  - Memory Reservation in MB
  - Memory Shared in MB
  - Memory Shared Saved in MB
  - Memory Shares
  - Memory Swapped in MB
  - Memory Used in MB
- VM Processor
  - % Processor Time
  - Effective VM Speed in MHz
  - Host processor speed in MHz
  - Limit in MHz
  - Reservation in MHz
These counters should be collected with Perfmon and kept for historical purposes. If a user reports system performance problems, these stats can be analyzed and overlaid with vCenter statistics to determine if the host was under duress at the time of the reported problem. It can help rule out one layer in the stack and save time in troubleshooting.
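As a hypothetical example of putting these collected counters to work, the sketch below parses a Perfmon CSV export (the column names are abbreviated for the example) and flags samples where ballooning was active, which usually indicates host memory pressure at that time:

```python
import csv
import io

# Hypothetical Perfmon CSV export containing two of the VM Memory counters
# installed by VMware Tools. A real export would come from Perfmon/relog.
sample = io.StringIO(
    "Time,Memory Active in MB,Memory Ballooned in MB\n"
    "10:00,2048,0\n"
    "10:05,3072,512\n"
    "10:10,2560,0\n"
)

# Collect the timestamps where the balloon driver had reclaimed guest memory.
ballooned = [
    row["Time"]
    for row in csv.DictReader(sample)
    if float(row["Memory Ballooned in MB"]) > 0
]
print(ballooned)  # samples worth overlaying against vCenter host statistics
```

Overlaying flagged intervals like these against vCenter's host-level statistics is exactly the troubleshooting workflow described above.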
vNUMA. vSphere 5 now extends the physical host's NUMA topology into the virtual machine. A performance boost can be achieved by properly aligning the virtual machine's configuration with the NUMA structure of the physical host.
In addition to these core features, new product offerings from VMware can add even more features and tools into the DBA’s world.
vFabric Data Director. VMware’s new vFabric Data Director brings database-as-a-service to your internal cloud. Databases are self-contained and provisioned in a self-service manner. It can reduce database sprawl while accelerating application development lifecycles.
vCloud Director. The pinnacle of VMware technology, vCloud Director, allows for an entire IT ecosystem to be rapidly provisioned in a self-service portal. Database and application servers can be deployed from a pre-built VM catalog quickly and seamlessly. The application development lifecycle is tremendously shortened by this self-service flexibility.
If you are interested in exploring these topics, they are covered more in depth here in the Solution Architects blog in the current series of posts by Jim Hannan entitled vSphere 5 Advantages for VBCA.
By Jim Hannan
This blog post focuses on VMFS-5 enhancements in vSphere and how they improve Virtual Business Critical Applications (VBCA) performance and scalability. VMFS-5 offers several improvements from the previous version VMFS-3.
How VMFS-5 has improved in scalability and performance:
- Newly created VMFS-5 datastores use a single unified block size of 1MB. There is no more choosing among 1MB, 2MB, 4MB, or 8MB blocks during VMFS creation. The standard 1MB block size allows for improved performance; more on this later.
- VMFS-5 uses GUID Partition Table (GPT) rather than MBR, which allows for pass-through RDM (RDM-P or RDM Physical) files up to 60TB. RDM-V has a maximum size limit of 2TB.
- VMFS-5 does not use SCSI-2 reservations; instead it uses the Atomic Test and Set (ATS) VAAI primitive. SCSI-2 reservations and ATS are used when, among other things, a lock is needed. A lock is required to perform certain operations, like creating a VMDK or creating and deleting snapshots. The ATS mechanism allows for more efficient locking and improved performance. ATS requires the VMFS-5 file system and a VAAI-enabled SAN.
- VMFS-5 uses SCSI_READ16 and SCSI_WRITE16 commands for I/O (VMFS-3 used SCSI_READ10 and SCSI_WRITE10). Of the new features, this is the technology improvement I am least familiar with. My understanding of SCSI_READ10 vs. SCSI_READ16 is that SCSI_READ16 offers far larger addressable capacity: it uses a 64-bit Logical Block Addressing (LBA) field, while SCSI_READ10's 32-bit LBA field tops out around 2TB. Obviously, a VMFS datastore cannot grow anywhere near the 64-bit limit, but you can see the potential.
Reference: vSphere 5 FAQ: VMFS-5
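As a quick back-of-the-envelope check of what a 64-bit LBA field can address (assuming the traditional 512-byte sector):

```python
# Capacity addressable by SCSI READ(16)/WRITE(16): 2**64 logical blocks.
SECTOR_BYTES = 512                         # assuming classic 512-byte sectors

max_bytes = 2**64 * SECTOR_BYTES           # 2**73 bytes in total
zebibytes = max_bytes / 2**70
print(zebibytes)                           # about eight zettabytes of addressable storage

# READ(10)'s 32-bit LBA field, by contrast, tops out at 2 TiB:
tib_read10 = 2**32 * SECTOR_BYTES / 2**40
print(tib_read10)
```

The 2 TiB ceiling of the 10-byte commands is the practical reason the 16-byte commands matter for large modern datastores.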
Upgrading to VMFS-5
HoB recommends creating new VMFS-5 datastores rather than performing in-place upgrades of VMFS-3 to VMFS-5. When upgrading a VMFS volume you only partially benefit from the new features. As an example, VMFS-3 volumes upgraded to VMFS-5 will retain the original block size instead of the new 1MB block size. This affects features like Storage vMotion and ATS. In the case of ATS, the ESX hypervisor will revert to the slower SCSI-2 reservations. With Storage vMotion, a slower data mover mechanism called fs3dm will be used instead of the faster fs3dm-hw. In my next blog, Part 3 – Storage vMotion Enhancements for VBCA, I will address Storage vMotion and its data movers (fsdm, fs3dm, and fs3dm-hw).