There are many reasons a guest VM can have an unstable network connection. Here, I’ll detail one of the lesser-known causes. The issue was undocumented when we first encountered it but is now listed as a known issue with the versions of VMware Tools current as of this writing.
It’s a well-known and well-documented practice to ensure the various Guest Introspection components are not installed unless the services are in use. What you may not know is that previous versions of the VM Tools either inadvertently installed these services or failed to uninstall them properly.
The result of this installer error is orphaned installations of one or both of the Guest Introspection services. The VM Tools installer will not see them as installed in its database and will, therefore, not attempt to remove or upgrade them on subsequent upgrades. The issue is detailed in KB Article 78016, albeit under a somewhat cryptic name and description.
The orphaned service installations can sit dormant on a VM guest for quite some time, without issue. In this case, the VM was being upgraded from Tools version 10.2.5 to 11.0.5.
The environment consisted of two (2) test VMs which comprise an application cluster. Permission was given to upgrade both the VM Tools and VM Hardware on the inactive node of the cluster. The automated VM Tools upgrade, via the “Upgrade VMware Tools…” vCenter menu option, appeared to execute flawlessly and reboot the guest VM’s OS, as expected.
Shortly after the guest VM was returned to service, application administrators reported seeing cluster communication failures in their application logs.
After extensive troubleshooting, it was determined the network instability was triggered by the activation of the File Guest Introspection filter driver. Because this behavior is well outside the norm for the File Guest Introspection components, a ticket was opened with VMware GSS. The unusual behavior was demonstrated to GSS and the issue was immediately escalated to VMware’s Development team.
After some advanced log collection and analysis at the direction of VMware Development, an interesting request came back: if not in use, remove Network Guest Introspection. This was quite puzzling as we didn’t use and, therefore, didn’t install Network Guest Introspection.
Even after a complete removal and installation of the VM Tools, the request from development came back the same: if not in use, remove Network Guest Introspection.
Finally, a remote support session with VMware GSS and Development revealed the root cause of the issue: an orphaned installation of the Network Guest Introspection components. The services and kernel-mode drivers were present but the VM Tools installer had no record of them.
Once the Network Guest Introspection service was disabled, the File Guest Introspection filter driver was enabled without issue. This removed any doubt the orphaned Network Guest Introspection components were to blame.
Now we have two new challenges:
- How to remove the orphaned components.
- How to determine where the orphaned components were present.
Removing the orphaned components is fairly easy but tedious. You must first install the Network Guest Introspection components with your current VM Tools installer. This overwrites the orphaned components with an installation the installer has a record of, which allows them to be uninstalled with the same VM Tools installer. In short: install, then remove, the Network Guest Introspection components to ensure a proper removal.
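The install-then-remove sequence can be sketched as a pair of silent-install command lines. This is a minimal illustration, not the procedure VMware GSS used: the installer path `D:\setup64.exe` is a hypothetical location for a mounted Tools ISO, and the feature name `NetworkIntrospection` and the `/S /v "/qn ..."` form are assumptions based on the documented VMware Tools silent-install options. Verify both against the documentation for your Tools version. The sketch only builds and prints the commands; it does not run them.

```python
# Sketch: build the two silent-install command lines for the
# "install, then remove" workaround. Nothing is executed here.

# Hypothetical path to the VMware Tools installer on a mounted Tools ISO.
SETUP_EXE = r"D:\setup64.exe"

# Assumed installer feature name for Network Guest Introspection.
FEATURE = "NetworkIntrospection"


def build_tools_command(action, feature=FEATURE, setup_exe=SETUP_EXE):
    """Return a command line that adds or removes a single Tools feature.

    action is the MSI property to set: "ADDLOCAL" installs the feature,
    "REMOVE" uninstalls it.
    """
    if action not in ("ADDLOCAL", "REMOVE"):
        raise ValueError("action must be 'ADDLOCAL' or 'REMOVE'")
    # /qn      : no installer UI
    # REBOOT=R : suppress the automatic reboot so it can be scheduled
    msi_args = f"/qn REBOOT=R {action}={feature}"
    return f'{setup_exe} /S /v "{msi_args}"'


def reinstall_then_remove_commands():
    """Commands in the order described above: install first, then remove."""
    return [build_tools_command("ADDLOCAL"), build_tools_command("REMOVE")]


if __name__ == "__main__":
    for command in reinstall_then_remove_commands():
        print(command)
```

Because the reboot is suppressed in this sketch, a restart may still be needed before the driver changes take effect. Run the commands against a test VM first.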
Please Note: As with any scripts found online, the scripts referenced in this post may be disruptive and must be understood and run against test assets first.
To detect where these orphaned components may be present, a quick script was developed. The script detects the orphaned components and can optionally disable them by stopping and disabling the Network Introspection Windows service. The script, Disable-NetworkIntrospection, is available on GitHub.
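The detection logic amounts to a comparison: a component is orphaned when its service exists on the guest but the installer has no record of the matching feature. The sketch below shows that comparison in Python. It is not the Disable-NetworkIntrospection script, and it may work differently from it. The service names `vnetflt` and `vnetWFP` are assumptions about how the Network Introspection driver is registered. Confirm them on an affected guest. How to read the installer's recorded features is left out, so `recorded_features` is supplied by the caller.

```python
import subprocess

# Assumed Windows service names for the Network Introspection driver,
# mapped to the assumed installer feature name. Verify before use.
CANDIDATE_SERVICES = {
    "vnetflt": "NetworkIntrospection",
    "vnetWFP": "NetworkIntrospection",
}


def service_exists(name):
    """Windows only: True if 'sc.exe query' finds a service with this name."""
    result = subprocess.run(
        ["sc.exe", "query", name], capture_output=True, text=True
    )
    return result.returncode == 0


def find_orphans(present_services, recorded_features,
                 service_map=CANDIDATE_SERVICES):
    """Return services that exist but whose feature the installer lacks."""
    return sorted(
        name
        for name in present_services
        if name in service_map and service_map[name] not in recorded_features
    )


def disable_commands(orphans):
    """Build (but do not run) the sc.exe commands to stop and disable each."""
    commands = []
    for name in orphans:
        commands.append(["sc.exe", "stop", name])
        # sc.exe expects "start=" and its value as separate arguments.
        commands.append(["sc.exe", "config", name, "start=", "disabled"])
    return commands
```

On a Windows guest, `present_services` could be built with `{n for n in CANDIDATE_SERVICES if service_exists(n)}`. Keeping the comparison separate from the system calls lets the logic be checked on any machine before it touches a VM.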
For a more permanent fix, review VMware’s internally developed script attached to KB Article 78016.