vSphere (and others) LAB storage

Some of you may know I have been building and using a vSphere lab for a number of years now, as most VMware professionals do. Recently Nexenta, the SAN platform I've been using for a couple of years, removed/disabled VAAI support from their software because of some issues, so I decided to try the other popular option, FreeNAS, since it's been rapidly maturing.

For the most part my 3 Nexenta SANs have been running fine until an HDD dies, at which point the SAN locks up and requires some coaxing, and perhaps a power cycle, to come back alive. With the recent changes to the platform, namely removing VAAI, I decided it was time to give FreeNAS another try.

For those of you involved in some way with VMware vSphere, you know that VAAI was a very important advancement in storage function and management. It provides primitives that let the storage controller do the work itself and send only progress updates to the hosts, cutting down on latency and storage fabric utilization. Nexenta used to provide 3 of the commonly used primitives and 1 of the uncommonly used ones. https://v-reality.info/2011/08/nexentastor-3-1-adds-second-generation-vaai/
They removed VAAI in the recent patches (4.0.3FP2) due to "kernel panic issues". What they failed to realize is that this is a SIGNIFICANT change to a storage infrastructure. It's easy to introduce VAAI into a traditional non-VAAI design, but once a storage architecture is designed around VAAI it's nearly impossible to go back. FreeNAS 9.3 supports 5 primitives, so you get a bonus one. http://www.ixsystems.com/whats-new/freenas-93-features-support-for-vmware-vaai/
One primitive in particular, ATS, allows us to make LUNs much larger, since locking for a VMDK operation happens at the file level instead of across the entire LUN. Without it, having more than 10 or 15 VMs in a LUN was a problem because the host would lock the entire LUN for a single file operation, impacting the rest of the VMs. FreeNAS also includes Warn&Stun, which gives the host more intelligence about thin-provisioned VMs and reduces crashes.
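To make the locking difference concrete, here is a toy Python model. It is only a sketch: `ToyLun`, its methods and the host and VMDK names are all made up for illustration, and it does not talk to a real array or implement real SCSI commands. It just contrasts one host reserving the whole LUN with an atomic test-and-set on a single lock record.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyLun:
    """Toy in-memory LUN (hypothetical); not a real SCSI or VMFS implementation."""
    reserved_by: Optional[str] = None  # holder of the whole-LUN reservation
    lock_records: Dict[str, Optional[str]] = field(default_factory=dict)  # per-VMDK lock owner

    def reserve(self, host: str) -> bool:
        """Legacy style: one host reserves the entire LUN."""
        if self.reserved_by not in (None, host):
            return False  # another host holds the LUN, everyone else waits
        self.reserved_by = host
        return True

    def release(self, host: str) -> None:
        """Drop the whole-LUN reservation if this host holds it."""
        if self.reserved_by == host:
            self.reserved_by = None

    def atomic_test_and_set(self, vmdk: str, expected: Optional[str], new: Optional[str]) -> bool:
        """ATS style: swap one lock record only if it still holds `expected`."""
        if self.lock_records.get(vmdk) != expected:
            return False  # someone else changed this record first
        self.lock_records[vmdk] = new
        return True


if __name__ == "__main__":
    lun = ToyLun()
    # Whole-LUN reservation: esx-b is blocked even if it wants a different VMDK.
    print(lun.reserve("esx-a"), lun.reserve("esx-b"))
    lun.release("esx-a")
    # ATS: two hosts lock two different VMDKs without blocking each other.
    print(lun.atomic_test_and_set("vm1.vmdk", None, "esx-a"))
    print(lun.atomic_test_and_set("vm2.vmdk", None, "esx-b"))
```

In the toy model only a second host going after the same VMDK loses the race, which is why more VMs can share one LUN once ATS is available.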

FreeNAS has also been making many other improvements to the platform. A major one was moving the iSCSI target software from user space to kernel space. In some 'seat of the pants' tests against earlier releases, this seemed to provide a nice 30% improvement in performance.

Installing FreeNAS 9.3 is as simple as it's always been: a couple presses of <ENTER> and it's installing. One nice feature is the ability to install to USB, which Nexenta cannot do. However, make sure you create swap on a disk once you have it installed. Being BSD based rather than OpenSolaris based, you have a much wider array of hardware choices. Going from Nexenta to FreeNAS you should have no issues. The community forums and docs provide some good direction on hardware and firmware versions. For example, with the standard LSI HBAs you know to use the P16 firmware version. The other cool feature is that FreeNAS does not limit you to 18TB of raw storage.

I've been running FreeNAS as the main lab storage SAN for a couple of days now and I'm rather impressed with its performance and stability. I couldn't always say this about Nexenta...


Backup of vCenter and other vSphere components

One of the many questions that has come up through the years while deploying VMware virtual environments is “How do I back up vCenter?”. The response is the typical next-gen IT answer, “It depends”. These dependencies range from how large your environment is to what your organization’s maintenance process looks like. Another contributing factor can be other departments within your organization that maintain databases. In my experience with small to large businesses, clients have many different operational procedures, and many have dedicated SQL teams to manage this process. All of these contribute to the scenario within your own organization.

To begin the search I considered how VMware currently addresses the issue, but this did not turn up any real meat in terms of official support or KB articles. Considering they have their own backup product and still do not provide much guidance in this area, I am led to believe they recognize the thousand different ways this can be accomplished. Next I searched the different backup vendor sites, and this led to the same lack of ‘official’ information. What I did find came from other blogs or lists and, as you can guess, the opinions varied as much as the search terms I was typing into Google. Since there are many ways to accomplish this goal, I wanted information directly from supportable channels to have a good base for this endeavor.

Plan B…

What would be required if my entire virtual environment were trashed and I had to rebuild from scratch? The key requirement would be a backup that saves not only the vCenter database but also the ESXi configs and the specific build numbers. If build numbers are not at least noted, then firmware compatibility or specific vSphere builds may introduce issues into the environment. It’s easy to stand up a fresh environment that is fully patched, but this can break stuff.
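As a rough illustration of noting build numbers, the Python sketch below keeps a small manifest of host builds and flags any host whose rebuilt build no longer matches. The host names, build numbers and function names are all made up, and how you collect the current builds (PowerCLI, the vSphere API, or by hand) is left out.

```python
import json
from typing import Dict, List


def dump_manifest(builds: Dict[str, str]) -> str:
    """Serialize a {host name: ESXi build number} map so it can be stored off-box."""
    return json.dumps(builds, indent=2, sort_keys=True)


def load_manifest(text: str) -> Dict[str, str]:
    """Read a manifest previously produced by dump_manifest."""
    return json.loads(text)


def find_build_drift(recorded: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return the hosts whose current build differs from the recorded one (or is missing)."""
    return sorted(host for host, build in recorded.items() if current.get(host) != build)


if __name__ == "__main__":
    # Hypothetical values noted before the failure.
    recorded = {"esx01.lab.local": "1000001", "esx02.lab.local": "1000001"}
    saved = dump_manifest(recorded)

    # Hypothetical values after a rebuild with a newer, fully patched image.
    current = {"esx01.lab.local": "1000001", "esx02.lab.local": "1000002"}
    print(find_build_drift(load_manifest(saved), current))
```

Any host the check reports is one where driver and firmware compatibility should be looked at again before workloads go back on it.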

Let’s consider what specifics we need to account for. The components of a typical vSphere environment are vCenter and its database, ESXi hosts, datastore connectivity and network connectivity. If there are other services such as vRealize Operations or vRealize Log Insight, these can be saved and recovered either with a replication technology such as vSphere Replication or with a backup technology such as vSphere Data Protection or Veeam. We can also use these tools to protect vCenter; however, we do not have a guarantee of database consistency.

Starting with vSphere and the database, if you are using the VCSA we can refer to the KB articles
For vCenter 6

This appears to improve the process by adding an online method of saving the database. If you are using a Microsoft SQL Server embedded with vCenter, your experience may vary when using standard backup tools with VSS-aware MSSQL plugins. A sure method is to use SQL Server Management Studio to perform the SQL backups. This will use the appropriate VSS provider for consistency; you then back up the exported DB backup file. Upon recovery this file can be imported into a fresh vCenter deployment. If the MSSQL server is dedicated the same method can be used, although this architecture has proven more reliable when using the standard backup processes. Below are some references for MSSQL backups.

MS SQL Database backups
Migrate MSSQL Express (unsupported) to SQL Standard (supported)
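For the export step, a minimal sketch of the T-SQL involved might look like the following. The Python helper only builds the statement text, which you could then run from Management Studio or sqlcmd. The database name `VCDB` and the backup path are assumptions, so substitute whatever your vCenter install actually uses.

```python
def build_backup_statement(database: str, backup_file: str) -> str:
    """Build a full-backup T-SQL statement for one database.

    The names passed in are the caller's; nothing here is specific to vCenter.
    """
    db = database.replace("]", "]]")       # escape the identifier for [..] quoting
    path = backup_file.replace("'", "''")  # escape single quotes in the path literal
    # INIT overwrites existing backup sets in the file; CHECKSUM validates pages while backing up.
    return f"BACKUP DATABASE [{db}] TO DISK = N'{path}' WITH INIT, CHECKSUM;"


if __name__ == "__main__":
    # Hypothetical database name and target path.
    print(build_backup_statement("VCDB", r"D:\Backups\VCDB.bak"))
```

The resulting .bak file is what the regular backup job then picks up.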

Next we need to save the config for the ESXi hosts. Yes, this config can be saved as well. Be sure to also save any drivers you may have added outside the standard patches. I’ve noticed that over time specific driver versions become unavailable, so it is important to save these as they may have a dependency on the respective card’s firmware version. This matters especially for newer CNAs, 10G and FC adapters, which have dependencies between firmware and driver versions.

Backup ESXi host config
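One way to script the host config backup is over SSH with the host's own `vim-cmd` tool. This is a sketch, assuming SSH is enabled on the hosts and key-based login is set up. The Python function names and the host name are made up, while the two `vim-cmd hostsvc/firmware` calls are the ones ESXi ships with. By default it only prints what it would run.

```python
import subprocess
from typing import List


def build_backup_commands(host: str, user: str = "root") -> List[List[str]]:
    """Return the SSH command lines that flush and then bundle the host config."""
    target = f"{user}@{host}"
    return [
        # Write any pending config changes to disk first.
        ["ssh", target, "vim-cmd", "hostsvc/firmware/sync_config"],
        # Create the config bundle; the host prints where it can be downloaded.
        ["ssh", target, "vim-cmd", "hostsvc/firmware/backup_config"],
    ]


def run_backup(host: str, user: str = "root", dry_run: bool = True) -> List[str]:
    """Run (or, with dry_run, just describe) the backup commands for one host."""
    outputs = []
    for cmd in build_backup_commands(host, user):
        if dry_run:
            outputs.append(" ".join(cmd))
        else:
            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
            outputs.append(result.stdout.strip())
    return outputs


if __name__ == "__main__":
    # Hypothetical host name; dry run only prints the commands.
    for line in run_backup("esx01.lab.local"):
        print(line)
```

Keep the downloaded bundle together with the driver packages and the noted build number for that host, since a config bundle is generally expected to go back onto the same build it was taken from.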

This provides ESXi build references for use in manually creating baselines for recovery for your current ESXi build level.
References for manually creating Update Manager baselines.

Another best practice is to keep a current exported config of your vSphere dVswitches. This is the only critical piece that would cause downtime in the event of a catastrophic failure. Sure, you would lose some configs and some historical data, but these are not critical to the functionality of the virtual machines running on the hosts. Obviously this is very simplistic, and other monitoring, automation, and compliance systems do need to be considered in the grand scheme of the design, but it provides a second backup type for this very critical information if all else fails.

Export dVswitch config

In the case where an SLA must be maintained for this data and other management systems, a dedicated management cluster becomes the reference and preferred architecture. This removes the circular backup dependency created when any backup system attempts to quiesce the vCenter database. It also provides a solid architecture where a highly converged or hyper-converged design is implemented. When management systems are integrated with the hardware being managed, there are times when manual juggling is required, removing some of the automation SDDC provides. Updating, patching, maintenance, and unplanned failures often require this juggling effort. For example, if vCenter is running on a host that decides it’s time to reject a stick of RAM and PSODs while automation tasks are occurring, those tasks are impacted while vCenter is non-functional. Here is a link with some great reference designs.

Bottom line… Although many vendors provide tools to ensure these management applications are recoverable, prudence is still required when merging these technologies together. The community forums of each vendor typically provide real-world experience and are a valuable support tool. However, always reference release notes and documentation, as these provide the officially supported architecture, behavior and tips for dependable operation.