3.12.2019

Fix MSDTC for VRA Install Wizard Validation

Did you use a template to create the IaaS servers for VRA? This is a quick post on how to resolve the errors from the VRA validator step. Perhaps like you I had some trouble locating a concise KB article or post on an easy way to resolve these issues.

Reset the CID/SID of the Server

Log into the IaaS and DB servers as Administrator.

Opening REGEDIT can show what the current CID/SID values are. They are located under:
HKEY_CLASSES_ROOT\CID\(CID)\Description\(Default)

Open a PowerShell prompt as administrator and run the commands below:

Uninstall MSDTC
msdtc -uninstall

Reboot
shutdown -r -t 0

Re-install MSDTC (log in with the same permissions as above)
msdtc -install

Note: the msdtc command does not print any output when running these commands.
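The reset sequence above can also be wrapped in a small script. Below is a sketch in Python (used only for illustration here); it is dry-run by default since msdtc and shutdown only exist on Windows, and the reboot in the middle means the re-install realistically happens in a fresh session after the machine comes back up.

```python
import subprocess

def msdtc_reset_commands():
    """Return the MSDTC reset sequence described above, in order."""
    return [
        ["msdtc", "-uninstall"],        # remove the local DTC (prints nothing)
        ["shutdown", "-r", "-t", "0"],  # reboot immediately
        ["msdtc", "-install"],          # re-install after the reboot (prints nothing)
    ]

def run_reset(dry_run=True):
    # dry_run=True only echoes the commands; set False on the Windows host itself.
    for cmd in msdtc_reset_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

run_reset()  # echoes the three commands without touching the system
```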

Open the Firewall

Enable the firewall rules for WMI and DTC on both computers by using the Netsh utility:

netsh advfirewall firewall set rule group="Windows Management Instrumentation (WMI)" new enable=yes
netsh advfirewall firewall set rule group="Distributed Transaction Coordinator" new enable=yes


Testing

Basic checking can be done by opening the Component Services MMC. You should see something similar to the screenshot below.
Component Services MMC for MS DTC


Run the Test-Dtc cmdlet to check the state of MSDTC. Below are some example tests that can be run to verify local, and both local and remote, DTC connectivity.

Test MSDTC on the local computer
Test-Dtc -LocalComputerName "$env:COMPUTERNAME" -Verbose

Test MSDTC on the local computer and a remote computer
Test-Dtc -LocalComputerName "$env:COMPUTERNAME" -RemoteComputerName "remote-server" -ResourceManagerPort 17100 -Verbose

The same command is used to test a computer that blocks inbound or outbound transactions; the difference appears as warnings in the output rather than in the command itself.


This is the result if the first test partially fails. The local-and-remote tests will also show the CIDs for the communicating systems, and the REGEDIT location above will show the UIS and XA values contained in the CID subkeys. From this output you will be able to determine whether the CIDs are unique, as another method of validating the registry values.

PS C:\Windows\system32> Test-Dtc -LocalComputerName "$env:COMPUTERNAME" -Verbose
VERBOSE: ": Firewall rule for "RPC Endpoint Mapper" is enabled."
VERBOSE: ": Firewall rule for "DTC incoming connections" is enabled."
VERBOSE: ": Firewall rule for "DTC outgoing connections" is enabled."
VERBOSE: IN-SQL02: AuthenticationLevel: Mutual
VERBOSE: IN-SQL02: InboundTransactionsEnabled: False
WARNING: "IN-SQL02: Inbound transactions are not allowed and this computer cannot participate in network transactions."
VERBOSE: IN-SQL02: OutboundTransactionsEnabled: False
WARNING: "IN-SQL02: Outbound transactions are not allowed and this computer cannot participate in network transactions."
VERBOSE: IN-SQL02: RemoteClientAccessEnabled: False
VERBOSE: IN-SQL02: RemoteAdministrationAccessEnabled: False
VERBOSE: IN-SQL02: XATransactionsEnabled: False
VERBOSE: IN-SQL02: LUTransactionsEnabled: True


This is the result when things look good for the installer to proceed.

PS C:\Windows\system32> Test-Dtc -LocalComputerName "$env:COMPUTERNAME" -Verbose
VERBOSE: ": Firewall rule for "RPC Endpoint Mapper" is enabled."
VERBOSE: ": Firewall rule for "DTC incoming connections" is enabled."
VERBOSE: ": Firewall rule for "DTC outgoing connections" is enabled."
VERBOSE: IN-SQL02: AuthenticationLevel: Mutual
VERBOSE: IN-SQL02: InboundTransactionsEnabled: True
VERBOSE: IN-SQL02: OutboundTransactionsEnabled: True
VERBOSE: IN-SQL02: RemoteClientAccessEnabled: True
VERBOSE: IN-SQL02: RemoteAdministrationAccessEnabled: True
VERBOSE: IN-SQL02: XATransactionsEnabled: False
VERBOSE: IN-SQL02: LUTransactionsEnabled: True
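Since duplicate CIDs from a cloned template are the root cause here, the uniqueness check can be sketched as a simple set comparison. The snippet below uses Python purely for illustration, and the GUIDs are made up:

```python
# Compare the CID GUIDs reported for two machines (from Test-Dtc output or
# the HKEY_CLASSES_ROOT\CID registry location). Cloned templates show the
# same CIDs on both machines, which is what the msdtc reset fixes.
def cids_are_unique(local_cids, remote_cids):
    """True when no CID appears on both machines."""
    return not (set(local_cids) & set(remote_cids))

# Hypothetical values for illustration only:
local  = {"a1b2c3d4-0000-0000-0000-000000000001",
          "a1b2c3d4-0000-0000-0000-000000000002"}
remote = {"a1b2c3d4-0000-0000-0000-000000000001",  # duplicate: cloned VM
          "a1b2c3d4-0000-0000-0000-000000000003"}

print(cids_are_unique(local, remote))  # False: the msdtc reset is still needed
```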


Summary

This is only one example of how to resolve these errors. If you deployed from a template with a customization spec that selected "Generate New Security ID (SID)", your experience might be different.

8.23.2018

IBM 10G Switch - The Home Lab Gem

I came across the IBM G8124 while providing some pre-sales architecture to some of my clients. As home labbers, it's difficult to find a modern datacenter switch we can afford. Most 10G switches are over $1000 unless you look at the used market, where most switches are old and power hungry. The G8124 is easy to locate on eBay, and prices have been dropping as it gets a little older. Because most of us are using SDN (software-defined networking), these switches work very well in low-cost lab situations where 10G offers some really nice benefits. When paired with some Emulex OCE11102 dual-port 10G NICs, it's possible to get a full 10G network for less than $500.




The G8124 is considered a top-of-rack switch that maintains incredibly low port-to-port latency, about 600 nanoseconds. It also supports Virtual Fabrics and L3 routing with OSPF. This switch offers some really nice features in a home lab where VMware vSAN, NSX, vRA, and other goodies want to be learned. If this solution is of interest, note that for connectivity to the rest of your network you will need either a 10G interface in your existing switch or a 1G SFP interface. The 2 x 1G copper interfaces are strictly for out-of-band management. Many IBM systems feature two dedicated management interfaces that must be on a different network than any SVI, and each management interface is required to reside on a different network as well. It is possible to use only a single management interface, or to manage the switch through one of the SVIs. While I wouldn't recommend this in a production environment, for a lab, have at it; knock yourself out.

The config language is a little different from Cisco's, but not very difficult to get past if you are familiar with the concepts. Documentation and firmware are still available from IBM. Below are some links for these, along with model information.

Firmware and Docs
https://www.ibm.com/support/home/search-results/5422459/IBM_RackSwitch_G8124,_8124E_-_7309,_0446,_1455_7309?docOnly=true&sortby=-dcdate_sortrange&ct=rc

Model Info
https://lenovopress.com/tips0787

Example code (shortened to remove redundancy)

!
version "7.11.9"
switch-type "IBM Networking Operating System RackSwitch G8124-E"
iscli-new
!
system timezone 145
! America/US/Eastern
system daylight
!
ssh enable
!
snmp-server location "CloudRoom"
snmp-server read-community "HNET"
!
no system bootp
no system dhcp mgta
no system dhcp mgtb
no system default-ip
hostname "10gNET"
no hostname prompt
system idle 60
!
!
no access telnet enable
!
!
interface port 1
        switchport mode trunk
        switchport trunk allowed vlan 1,3,5,10,70-71,80-85,98-102,201-209,250,252,301-339
        bpdu-guard
        flowcontrol send on
        flowcontrol receive on
        exit
!
interface port 11
        switchport access vlan 98
        bpdu-guard
        exit
!
interface port 12
        switchport mode trunk
        switchport trunk allowed vlan 1,3,5,10,70-71,80-85,98-102,201-209,250,252
        exit
!
interface port 15
        switchport access vlan 202
        flowcontrol send on
        flowcontrol receive on
        exit
!
interface port 16
        switchport access vlan 201
        flowcontrol send on
        flowcontrol receive on
        exit
!
interface port MGTA
        shutdown
        exit
!
interface port MGTB
        shutdown
        exit
!
vlan 10
        name "LAB1"
!
vlan 70
        name "VLAN 70"
!
vlan 201
        name "iSCSI-201"
!
vlan 202
        name "iSCSI-202"
!
vlan 205
        name "ESXi-vMotion"
!
vlan 206
        name "ESXi-FT"
!
vlan 250
        name "Home-NET"
!
vlan 252
        name "GuestNET"
!
portchannel 13 lacp key 100
portchannel 14 lacp key 101
!
!
!
spanning-tree mst configuration
        name "local"
        exit
!
spanning-tree mode disable
!
no spanning-tree pvst-compatibility
spanning-tree stp 1 vlan 1
spanning-tree stp 1 vlan 3
!
!
logging host 1 address 192.168.98.48 DATA
!
interface port 13
        lacp mode active
        lacp key 101
        no lacp suspend-individual
!
interface port 14
        lacp mode active
        lacp key 101
        no lacp suspend-individual
!
interface port 23
        lacp mode active
        lacp key 100
        no lacp suspend-individual
!
interface port 24
        lacp mode active
        lacp key 100
        no lacp suspend-individual
!
interface ip 1
        ip address 100.64.254.254 255.255.255.0
        enable
        exit
!
interface ip 3
        vlan 3
        exit
!
interface ip 70
        ip address 192.168.70.254
        vlan 70
        enable
        exit
!
!
ip bootp-relay server 1 address 192.168.98.21
ip bootp-relay server 2 address 192.168.98.22
ip bootp-relay information enable
ip bootp-relay enable
!
!
ip igmp snoop vlan 1
ip igmp enable
ip igmp snoop enable
!
ip igmp snoop igmpv3 enable
!
ip route 0.0.0.0 0.0.0.0 100.64.254.1
ip route 192.168.251.0 255.255.255.0 192.168.250.245
ip route 192.168.8.0 255.255.248.0 192.168.250.250
!
router ospf
        enable
!
        area 0 enable
!
interface ip 1
        ip ospf enable
!
ntp enable
ntp primary-server 192.168.98.21 DATA
ntp secondary-server 192.168.98.22 DATA
!

end

6.19.2015

vSphere (and others) LAB storage

Some of you may know I have been building and using a vSphere lab for a number of years now, as most VMware professionals do. Recently the SAN platform I've been using for a couple of years, Nexenta, removed/disabled VAAI support from their software because of some issues, so I decided to try the other popular option, FreeNAS, since it has been rapidly maturing.

For the most part my 3 Nexenta SANs have been running fine, until an HDD dies, at which time the SAN would lock up and require some coaxing, and perhaps a power cycle, to come back alive. With some of the recent changes to the platform, removing VAAI, I decided it was time to give FreeNAS another try.

For those of you involved in some way with VMware vSphere, you know that VAAI was a very important advancement in storage function and management. It provides primitives that allow the storage controller to do the work, sending only progress updates to the hosts, cutting down on latency and storage-fabric utilization. Nexenta used to provide 3 of the commonly used primitives and 1 of the uncommonly used ones. https://v-reality.info/2011/08/nexentastor-3-1-adds-second-generation-vaai/
They removed VAAI in the recent 4.0.3FP2 patches due to "kernel panic issues". What they failed to realize is that this is a SIGNIFICANT change to a storage infrastructure. It's easy to introduce VAAI into a traditional non-VAAI design, but once a storage architecture is designed around VAAI it's nearly impossible to go back. FreeNAS 9.3 supports 5 primitives, so you get a bonus one. http://www.ixsystems.com/whats-new/freenas-93-features-support-for-vmware-vaai/
One particular primitive, ATS, allows us to make LUNs much larger, since locking happens at the VMDK file level instead of across the entire LUN. Before ATS, going beyond 10 or 15 VMs in a LUN was risky because the host would lock the entire LUN for a single file operation, impacting the rest of the VMs. Further, FreeNAS also includes Warn & Stun, which gives the host more intelligence about a thin-provisioned VM, reducing crashes.
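To put rough numbers on the ATS benefit, here is a back-of-the-envelope sketch (Python used only for illustration; the VM totals and per-LUN limits are assumptions for the example, not VMware guidance):

```python
import math

def luns_needed(total_vms, vms_per_lun):
    """Round up: you always need a whole LUN for the remainder."""
    return math.ceil(total_vms / vms_per_lun)

# Assumed example: 150 VMs, ~15 per LUN when LUN-level locking hurts,
# a much higher density once ATS moves locking to the file level.
without_ats = luns_needed(150, 15)
with_ats    = luns_needed(150, 50)
print(without_ats, with_ats)  # 10 3
```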

FreeNAS has also been making many other improvements to the platform. One major one was the migration of the iSCSI target software from user space to kernel space. After some 'seat of the pants' tests compared to earlier releases, this seemed to provide a nice 30% improvement in performance.

Installing FreeNAS 9.3 is as simple as it's always been: a couple of presses of <ENTER> and it's installing. One nice feature is the ability to install to USB, which Nexenta cannot do. However, make sure you create swap on a disk once you have it installed. Being BSD-based rather than OpenSolaris-based, you have a much wider array of hardware choices, so going from Nexenta to FreeNAS you should have no issues. The community forums and docs provide good direction for hardware and firmware versions; for example, with the standard LSI HBAs you know to use the P16 firmware version. The other cool feature is that FreeNAS does not limit you to 18TB of raw storage.


I've now been running FreeNAS as the main lab storage SAN for a couple of days now and I'm rather impressed with its performance and stability. Nexenta, I couldn't always say this...

5.29.2015

Backup of vCenter and other vSphere components


One of the many questions through the years while deploying a VMware virtual environment has been “How do I back up vCenter?”. The response is the typical next-gen IT answer: “It depends”. Some of these dependencies range from how large your environment is to what your organization’s maintenance process looks like. Other contributing factors can be leveraging other departments within your organization that maintain databases. In my experience from small to large business, clients have many different operational procedures, and many have dedicated SQL teams to manage this process. All of these can contribute to various scenarios within your own organization.

To begin the search I considered how VMware would currently address the issue; however, this did not turn up any real meat in terms of official support or KB articles. Considering they have their own backup product and do not provide much guidance in this area, I believe they recognize the thousand different ways this can be accomplished. Next I searched around the different backup vendor sites, and this led to the same lack of ‘official’ information. The information I did find came from other blogs and lists, and as you can guess, opinions varied as much as the search results I was typing into Google. Considering there are many ways to accomplish this goal, I wanted to find information directly through supportable channels to have a good base for this endeavor.

Plan B…


What would be required if my entire virtual environment were trashed and I had to rebuild from scratch? The key requirement would be a backup that saves the vCenter database, but also the ESXi configs and the specific build numbers. If build numbers are not at least noted, then firmware compatibility or specific vSphere builds may introduce issues into the environment. It’s easy to stand up a fresh environment that is fully patched, but this can break stuff.

Let’s consider what specifics we need to account for. The components of a typical vSphere environment are vCenter and its database, the ESXi hosts, datastore connectivity, and network connectivity. If there are other services such as vRealize Operations or vRealize Log Insight, these can be saved and recovered either with a replication technology such as vSphere Replication or with a backup technology such as vSphere Data Protection or Veeam. We can also use these tools to protect vCenter; however, we do not have a guarantee of database consistency.

Starting with vCenter and its database, if using the VCSA we can refer to these KB articles:
http://kb.vmware.com/kb/2034505
For vCenter 6
http://kb.vmware.com/kb/2091961

This appears to improve the process by adding an online method of saving the database. If you are using a Microsoft SQL server embedded with vCenter, your experience may vary using standard backup tools with MSSQL VSS-aware plugins. A sure method is to leverage SQL Server Management Studio to perform SQL backups: this uses the appropriate VSS provider for consistency, and you then back up the exported DB backup file. Upon recovery this file can be imported into a fresh vCenter deployment. If the MSSQL server is dedicated, the same method can be used; this architecture has also shown to be more reliable when performing backups with the standard backup processes. Below are some references for MSSQL backups.

MS SQL Database backups
https://support.microsoft.com/en-us/kb/2019698
Migrate MSSQL Express (unsupported) to SQL Standard (supported)
http://kb.vmware.com/kb/1028601

Next we need to save the config for the ESXi hosts. Yes, this config can be saved as well. Be sure to save any drivers you may have added outside the standard patches. I’ve noticed over time that specific versions of drivers become unavailable, so it is important to save these, as they may have a dependency on the respective card’s firmware version. This is important due to newer CNAs, 10G, and FC adapters and the dependency between firmware and driver versions.

Backup ESXi host config
http://kb.vmware.com/kb/2042141

This provides ESXi build references for use in manually creating baselines for recovery for your current ESXi build level.
http://kb.vmware.com/kb/1014508
References for manually creating update manager baselines.
http://kb.vmware.com/kb/1019545

Another best practice is to keep a current export of your vSphere dvSwitch configs. This is the only critical piece in the event of a catastrophic failure that would cause downtime. Sure, you would lose some configs and some historical data, but these are not critical to the functionality of the virtual machines running on the hosts. Obviously this is very simplistic, and other monitoring, automation, and compliance systems do need to be considered in the grand scheme of the design, but this provides a second backup type for this very critical information if all else fails.

Export dVswitch config
http://kb.vmware.com/kb/2034602

In the case where an SLA must be maintained for this data and other management systems, a dedicated management cluster becomes the reference and preferred architecture. This removes the circular dependency created when a backup system attempts to quiesce the vCenter database it depends on. It also provides a solid architecture where a highly converged or hyper-converged architecture is implemented. When management systems are integrated with the hardware being managed, there are times when manual juggling is required, removing some of the automation the SDDC provides. Updating, patching, maintenance, and unplanned failures often require this juggling effort. For example, if vCenter is running on a host that decides it’s time to reject a stick of RAM and PSODs while automation tasks are occurring, those tasks will be impacted while vCenter is non-functional. Here is a link with some great reference designs.
http://blogs.vmware.com/vsphere/2014/12/creating-vmware-software-defined-datacenter-reference-architecture.html

Bottom line… Since many vendors provide tools to ensure these management applications are recoverable, prudence is still required while merging these technologies together. The community forums of each vendor typically provide real-world experience and are a valuable support tool. However, always reference release notes and documentation, as these provide the officially supported architecture, behavior, and tips for dependable operation.

5.11.2011

Cloud apps

Been a little while since my last post. Well... Time to come back after spending some time at a new job.

Some cool things I've come across. For one, I'm writing this from my phone (the little things in life). I watched a video from Google I/O; you should check it out. Also, VMware announced a new cloud platform. This should lend itself to those attempting to create a private cloud beyond simply running virtual servers.

11.08.2010

Virtual backups

A short note about backing up your VMs.

One of the next (and sometimes forgotten) issues after you have virtualized your life is: how do you save it? You could keep performing backups the same way you have for years; however, I would recommend staggering them, as if they all start at the same time you run the risk of creating I/O contention on your SAN. You also have an alternative method: since your virtual servers now live in essentially files, or possibly an LVM-style partition depending on the technology you are using, let's take advantage of this situation.
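The staggering idea above is just an offset schedule. A minimal sketch (Python for illustration; the job names and the 45-minute gap are made-up assumptions — size the gap to your backup window and SAN throughput):

```python
from datetime import datetime, timedelta

def stagger(jobs, first_start, gap_minutes):
    """Assign each backup job a start time offset from the previous one."""
    return {job: first_start + timedelta(minutes=i * gap_minutes)
            for i, job in enumerate(jobs)}

schedule = stagger(["fileserver", "mailserver", "sql01"],
                   datetime(2010, 11, 8, 22, 0), 45)
for job, start in schedule.items():
    print(job, start.strftime("%H:%M"))
# fileserver 22:00, mailserver 22:45, sql01 23:30
```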

Using methods provided by traditional solutions, such as Backup Exec with the VMware agent, or looking at newer offerings such as Veeam or PHDVirtual, you can achieve successful backups more easily than sticking with agent-per-server (virtual server in this case) methods. The newer software that specifically supports VMware or XenServer is agentless and is gaining features that can equal or even exceed what physical server backups are capable of. What is missing in the physical server world, compared to the virtual world, is low-level visibility beneath the volume where the data or files you are concerned about reside. On one side we are dealing with platters inside a physical disk; on the virtual side we can easily see a layer under the operating system's disk. Some of what is built into VMware, and to a lesser extent other solutions, allows us to deal with this data intelligently.

Bottom line - if you are having trouble getting good, reliable backups in the physical world, perhaps virtualization can assist, along with the other cost-cutting reasons.

10.26.2010

I/O (part 2)

In part 2 of I/O we will consider how to observe some pain points in your overall storage design. These concepts can be applied to any technology once you understand them. The areas of concern include every connection between the application running in the operating system and the spinning platters inside the disk drive. Here I will speak specifically to iSCSI, as it is becoming increasingly common in storage networks.

Servers
Let's begin right at the server where the application or files are presented from. There are some things to tune here, but nothing that will make a significant difference. If using a physical server, ensure the NIC(s) you are connecting to the storage with are 1Gb server-type network cards. Most popular ones these days support some type of TCP offloading, and the associated drivers are better quality in the supported OSs. If this machine is virtual, the VM itself will not be performing the iSCSI translation; VMware will be handling this piece. If you find yourself needing to use an iSCSI initiator from within a VM, use a dedicated vmxnet3 virtual NIC if supported. To check whether I/O is the issue, use PerfMon or iostat (with respect to OS) and look at queue depth, length, or hold time. These measurements indicate whether the OS is holding SCSI requests waiting to be processed. One potential solution, depending on the root cause, is to enable MPIO, as this can assist with performance issues and also provide iSCSI redundancy.

Virtual Host
The next link in the chain is normally VMware, XenServer, or some other virtualization technology. In a physical environment this can obviously be skipped. In a virtual host environment some of the same rules apply, keeping in mind you now have many servers using the same iSCSI connections. In a local storage environment you had a direct path between the controller and the disk drive over a 68-pin or SAS cable, typically capable of more than 1Gb/sec. Now you have many servers sharing perhaps a single 1Gb connection to their respective disks, plus the latencies introduced by the other components. Evaluating performance here can be done with a similar approach, by checking disk latency and queue depth. Make sure latency is less than 50ms and queue depth is less than 50. If running an application like a SQL database, some vendors have much stricter latency limits of between 2ms and 10ms. Technologies such as MPIO, better network cards, updated drivers, and fully patched hosts can help provide the desired performance. Also, dedicated iSCSI interfaces should be one of the first things considered in a properly designed host.
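The thresholds above can be encoded as a simple check against whatever PerfMon, iostat, or esxtop reports. A sketch in Python (illustration only; the 50ms/50-deep general limits and the 10ms application limit come from the paragraph above, while the function name and output format are my own):

```python
def storage_health(latency_ms, queue_depth, sql_workload=False):
    """Flag observed latency/queue against the rule-of-thumb limits."""
    problems = []
    limit = 10 if sql_workload else 50  # stricter limit for SQL-style apps
    if latency_ms >= limit:
        problems.append("latency %sms >= %sms" % (latency_ms, limit))
    if queue_depth >= 50:
        problems.append("queue depth %s >= 50" % queue_depth)
    return problems or ["ok"]

print(storage_health(12, 8))                     # fine for general workloads
print(storage_health(12, 8, sql_workload=True))  # flagged for a database
```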

Infrastructure
The switch infrastructure can also play a significant role in overall performance and is often overlooked. The basic rule is to use a good-quality switch with plenty of port buffering. This ensures packets flow through without becoming blocked as the buffers fill. The symptom can be the VM and the host showing high latency while the SAN shows low overall utilization and no signs of stress. The switch itself may not show high CPU or any other stress, as it may not have a lot of traffic on all ports, or the configuration may not include CPU-intensive tasks. To ensure the switch is not asked to perform these other functions or pass non-iSCSI traffic, it is recommended to use dedicated switches. In some designs or budgets this may not be possible, so make sure the switch you are using is good quality. Some examples include the HP 2910al or the Cisco 3750. Obviously there are many full-Gb switches on the market, even in the sub-$200 range, that may be fine for lab/test situations, but I would caution against using them in a production network, as they may not have enough buffering to maintain a non-blocking state.

Storage
Considering storage, this is one area that is not as clear. Due to the amount and diversity of technology these vendors use, one must understand the architecture and hardware involved. Typically most vendors will have some method to measure CPU, memory utilization (often local cache), disk queue depth, and latency. Virtualized systems will almost always perform better when RAID 10 or RAID 50 sets are chosen over RAID 5. SAS, SCSI, or FC 10Krpm and 15Krpm disks will always outperform SATA and 7Krpm SAS disks. Increasing the number of spindles can also prove beneficial; however, as SAN vendors use different technology, this may not help as much as it used to. One consideration here is whether the disk controller can handle many disks in a large RAID set. Recently Intel and others have shown processors are becoming so fast that software-based RAID can outperform hardware-based RAID sets. Also, as you design your disk system, do not count parity disks (or the equivalent of a disk) in your write I/O calculations, as writing the parity stripe will actually increase write time. Read times will improve, but keep in mind, especially in virtualized environments, that the platters are housing blocks that simply contain more blocks of data. Each time the virtual OS writes a file, it changes a block in the .vmdk file (VMware example), which changes a block on the VMFS partition, which in turn changes a block on whatever filesystem the SAN uses to store data. In the world of virtualization this can itself be virtualized, not sitting directly on platters. ;-)
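The write-penalty arithmetic behind preferring RAID 10/50 over RAID 5 can be sketched as below (Python for illustration; the penalty factors are the commonly cited back-end I/O counts per front-end write, and real arrays with write cache and different stripe sizes will behave differently):

```python
# Commonly cited back-end I/Os generated per front-end write:
PENALTY = {"RAID10": 2, "RAID5": 4, "RAID50": 4, "RAID6": 6}

def effective_iops(disks, iops_per_disk, write_fraction, raid):
    """Front-end IOPS the set can sustain, given the RAID write penalty."""
    raw = disks * iops_per_disk
    return raw / ((1 - write_fraction) + write_fraction * PENALTY[raid])

# Assumed example: 8 x 15Krpm disks at ~150 IOPS each, 100% write worst case.
print(round(effective_iops(8, 150, 1.0, "RAID10")))  # 600
print(round(effective_iops(8, 150, 1.0, "RAID5")))   # 300
```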

Enjoy!