Storage and VMware vSAN design tips

These days storage, even local storage, is complex to understand given all the different options, which range from storage-class memory to spinning disks. So this raises the question: how do we choose what to attach to our servers? Companies like Dell with its VxRail product provide a jointly engineered solution, so no matter what your requirements are, an architecture can reliably be created. Whether the use case is VDI, common server workloads, or databases with heavy I/O, a successful solution can be built. vSAN ReadyNodes, or simply picking parts off the VMware HCL, are also good options; however, the success of those solutions depends on the engineering prowess of the architect.

Storage is one of those critical pieces of infrastructure. It is the last link in the data path that lets us listen to downloaded music, view favorite family and holiday photos, and run our apps on a daily basis. If a CPU or a memory stick dies, or even if a network cable breaks, typically no data is actually lost. If a drive dies, however, all of our memories and at least that day's productivity are gone.

Desktops were typically backed up to external tape or disk; today, backups are usually sent to some type of remote or cloud resource. Servers can use larger variants of these same resources, but because failed server hardware carries more risk and expense, a little extra caution and effort should be placed on the storage architecture, including the quality and built-in redundancy of the design.

The other consideration is performance. With so many drive choices, we have many different types of drives to weigh for both performance and reliability. Our desktops and laptops typically use SATA SSDs or NVMe drives, and servers are now typically designed with these as well. Below is a memory and drive performance table that displays latency alongside a 'human relatable' translation. ( #geeks #> ls -lh ) Most of this information was retrieved from Frank Denneman's AMD EPYC Naples vs Rome and vSphere CPU Scheduler Updates. I like how he correlated everything from 1 CPU cycle all the way to an SSD I/O. I added a typical 15K disk drive for additional impact in the comparison.

Memory and Drive Latency
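
To get a feel for how that translation works, here is a minimal sketch of the scaling trick. The latency figures are assumed typical values for illustration only, not the exact numbers from Denneman's table: the idea is to stretch one CPU cycle (~0.3 ns on a ~3 GHz core) to one human second and see how long everything else then takes.

```python
# Illustrative latencies (assumed rough values, not figures from the
# original table). Scaling: one CPU cycle (~0.3 ns) becomes 1 second.
CPU_CYCLE_NS = 0.3

latencies_ns = {
    "CPU cycle":     0.3,
    "DRAM access":   100,         # ~100 ns
    "NVMe SSD read": 25_000,      # ~25 us
    "SATA SSD read": 100_000,     # ~100 us
    "15K disk I/O":  5_000_000,   # ~5 ms seek + rotational latency
}

for name, ns in latencies_ns.items():
    human_s = ns / CPU_CYCLE_NS            # scaled to "human" seconds
    print(f"{name:14s} {ns:>12,.1f} ns -> {human_s:>14,.0f} s "
          f"(~{human_s / 86_400:,.1f} days)")
```

At this scale a DRAM access takes minutes, an SSD read takes days, and a single 15K-RPM disk I/O takes the better part of a year, which is why adding the spinning disk makes the comparison so striking.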

Next I would like to delve into VMware vSAN. Because many of our datacenters are now turning to hyper-converged architectures that run vSAN, I thought I'd hit on some of the salient points.

Disk groups, and how many to use per host, should be a key consideration when architecting for vSAN. Another is all-flash versus hybrid. As flash-based storage becomes less expensive, hybrid arrays make less sense to implement. vSAN limits the feature set of hybrid compared to all-flash: hybrid is not capable of erasure coding (RAID-5/6), compression, or deduplication. Hybrid designs will consume all the cache you provide, using 70% for read caching and 30% for write caching. The recommended cache-tier size is 10% of the capacity tier. However, a direct relationship exists between cache-tier capacity and the host memory consumed: increasing the cache tier will cause an increase in host memory consumed.
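
The hybrid sizing rules above are easy to capture in a few lines. This is a rough sketch of the rule of thumb only (the function name and the 8 TB example are mine, not from any VMware tooling):

```python
def hybrid_cache_sizing(capacity_tier_gb):
    """Rule-of-thumb vSAN hybrid cache sizing: cache tier ~10% of the
    capacity tier, split 70% read cache / 30% write cache."""
    cache_gb = capacity_tier_gb / 10
    return {
        "cache_total_gb": cache_gb,
        "read_cache_gb": cache_gb * 7 / 10,
        "write_cache_gb": cache_gb * 3 / 10,
    }

# Example: a host with 8 TB of raw capacity-tier storage
print(hybrid_cache_sizing(8_000))
# -> {'cache_total_gb': 800.0, 'read_cache_gb': 560.0, 'write_cache_gb': 240.0}
```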

All-flash typically makes more sense considering cost, heat, performance, and reliability. All-flash is a little different in the case of features and cache. Specifically, 100% of the cache is dedicated to write caching; however, only 600GB per disk group is used. Larger cache drives are still supported and will enhance endurance due to wear leveling. Keep in mind the goal is to flush cache to the capacity tier, and thus protect the data. Read caching is not necessary: flash drives do not have mechanical limits, so I/O against the capacity tier can occur rapidly. For performance, and to limit the amount of memory consumed away from VMs, I prefer the Optane (375GB) drives matched with either SAS or SATA SSD capacity drives. VMware recommends architecting the cache tier with faster drives than the capacity tier. For example, when leveraging NVMe drives in the capacity tier, Optane is recommended in the cache tier.
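
A quick sketch of the 600GB write-buffer behavior described above (the helper name and example drive sizes are mine; capacity beyond the cap is not wasted, since the controller can spread wear across it):

```python
WRITE_BUFFER_CAP_GB = 600  # per-disk-group limit noted above

def all_flash_cache(cache_drive_gb):
    """Return (usable write buffer, extra endurance capacity) for an
    all-flash disk group: 100% write cache, capped at 600 GB."""
    usable = min(cache_drive_gb, WRITE_BUFFER_CAP_GB)
    spare = max(cache_drive_gb - usable, 0)
    return usable, spare

print(all_flash_cache(375))   # 375GB Optane -> (375, 0)
print(all_flash_cache(1600))  # 1.6TB NVMe -> (600, 1000): 1TB for wear leveling
```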

Another consideration is that when using NVMe drives, Dell VxRail systems require dual processors. Check the vendor specifications for NVMe and other vSAN guidance, as different drive technologies may impose other server host requirements. I also prefer to use at least 2 disk groups per host, especially in production, because if a cache drive fails, the entire disk group fails. Using 2 disk groups per host increases the availability of the architecture.
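
The availability argument for multiple disk groups comes down to simple fractions. A minimal sketch (function name and capacities are mine), assuming equally sized disk groups:

```python
def capacity_lost_on_cache_failure(disk_groups, capacity_per_group_tb):
    """A cache-drive failure takes its whole disk group offline, so with
    N equal disk groups per host, only 1/N of that host's capacity is lost."""
    total_tb = disk_groups * capacity_per_group_tb
    lost_tb = capacity_per_group_tb  # the failed group
    return lost_tb / total_tb

print(capacity_lost_on_cache_failure(1, 4.0))  # 1.0 -> entire host offline for vSAN
print(capacity_lost_on_cache_failure(2, 4.0))  # 0.5 -> half the host's capacity
```

With two disk groups, a single cache-drive failure leaves half the host's capacity serving data while vSAN rebuilds, instead of taking the whole host out of the storage pool.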

Ultimately, isn't that what we are after? Availability, reliability, and performance.


My HomeLab

Current Lab configuration

vSphere 6.7 P1
vSAN all-flash, FTT=1, RAID-5 (3+1 erasure coding)
vRA 7.6
vROps 7.5
vRLI 4.5
NSX 6.4.6
VLC 3.9.1

Total of 6 VMware Hosts

Supermicro X9DR3-F (Ebay for $200 each)
Supermicro 2U Chassis, 8 hot-swap 3.5"
128GB RAM each ($160)
Dual Intel(R) Xeon(R) CPU E5-2650L 8C @ 1.80GHz ($140)
Dell H310 LSI 2008 HBA (flashed to IT mode and Q-depth 600) ($40)
Emulex OneConnect OCe11102 Dual port 10Gb NIC ($40)
WD Raptor 300GB - Boot Drive
Misc Cables ($40)

Supermicro X9DRI-F+ (Ebay for $160 each)
Supermicro 2U Chassis, 8 drive hot-swap 3.5"
128GB RAM each ($160)
Dual Intel(R) Xeon(R) CPU E5-2650L v2 10C @ 1.70GHz ($140)
Dell H310 LSI 2008 HBA (flashed to IT mode and Q-depth 600) ($40)
Emulex OneConnect OCe11102 Dual port 10Gb NIC ($40)
WD Raptor 300GB - Boot Drive
Misc Cables ($40)

3 hosts are based on each design.

3x E5-2650L based hosts = $1,650
3x E5-2650L v2 based hosts = $1,530

vSAN Storage
Cache Tier
Intel SSDSC2BX40 400GB (5)
Samsung NVMe 960 (1)

Capacity Tier
Samsung SSD 860 EVO 1TB (16)
Intel SSDSC2BX40 400GB (3)
Crucial CT240M50 240GB (6)
Crucial CT480M50 480GB (1)
Crucial M4-CT256M4SSD2 256GB (1)
OCZ-Agility3 250GB (2)
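
As a rough check of what FTT=1 with RAID-5 (3+1) yields from this capacity tier, the arithmetic can be sketched as follows. Drive sizes are taken from the model numbers above in marketing GB, and the estimate ignores vSAN overhead, slack space, and base-2 conversion:

```python
# Raw capacity of the lab's all-flash capacity tier (GB, approximate).
capacity_tier_gb = (
    16 * 1000   # Samsung 860 EVO 1TB
    + 3 * 400   # Intel SSDSC2BX40 400GB
    + 6 * 240   # Crucial CT240M50
    + 480       # Crucial CT480M50
    + 256       # Crucial M4
    + 2 * 250   # OCZ Agility3
)
# FTT=1 with RAID-5 erasure coding writes 4 components for every 3 of
# data, so usable space is ~3/4 of raw.
usable_gb = capacity_tier_gb * 3 / 4
print(f"raw: {capacity_tier_gb} GB, usable (RAID-5 3+1): {usable_gb:.0f} GB")
# -> raw: 19876 GB, usable (RAID-5 3+1): 14907 GB
```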

Storage SAN / NAS

FreeNAS - 69.6TB
X8DTH-6F - ($400)
Supermicro 4U Chassis, 36 drive hot-swap 3.5"
Dual Intel Xeon L5630 4C 2.13GHz ($50)
48GB RAM ($75)
Boot Drive
10K SAS 500GB
Disk Group 1 - RAIDZ2 - 19TB
7x various 3TB 7K RPM disks
ARC Cache 40GB
Disk Group 2 - RAIDZ2 - 18.1TB
10x various 2TB 7K RPM disks
Disk Group 3 - RAIDZ2 - 32.5TB
9x various 4TB 7K RPM disks
ARC Cache 60GB

2x IBM G8124-E - 24 port 10Gb SFP+ ($850)
4x 1Gb SFP transceivers ($80)
Cisco SG300-28 ($528)
Cisco SG200-26P ($250)