I was recently tasked with an issue where our CIM probe was failing during CIM requests to the new VMware ESXi 6.5 servers we had deployed. We were getting connection rejected failures from our probes, which resulted in no valuable data being returned. We started following the breadcrumbs, which led us back to the ESXi host. We opened the UI, checked the health monitor, and found it was showing “No sensor data available”. The first thing we checked was whether the sfcbd-watchdog service was running, and it was not. By default, this service was turned off, or so we thought! We turned on the service and the UI reported that the service was now running.
Even after several refreshes of the UI it still showed running, but we still received a connection rejected. We rebooted the ESXi host, and after it came back we tested the connections again and they were still failing. We reopened the web UI and looked at the services again, and there was our watchdog service, stopped. We had set the service to autostart with the host, so this led us to believe it must be dying at some point.
The best way to see what a service doesn’t like is to log in to the ESXi host using SSH, start the process manually, and see what its output is. A quick /etc/init.d/sfcbd-watchdog start showed us that the service was “Administratively disabled”.
After digging around Google for some reference to this message, we came across a blurb about setting an option to allow the CIM manager to run.
The command esxcli system wbem set --enable true followed by /etc/init.d/sfcbd-watchdog start allowed the sfcb-HTTPS-Daem process to start. This process is the TCP listener that takes CIM requests from probes like ours and returns the health of the hardware.
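The two steps above can be wrapped in a small script. This is only a minimal sketch, not something we ran during the original troubleshooting: the function name enable_cim and the ESXCLI and WATCHDOG variables are hypothetical, introduced so the two commands can be swapped out when trying the script away from a host. Note that --enable is written with two plain hyphens.

```shell
#!/bin/sh
# Sketch: enable the WBEM/CIM service, then start the watchdog.
# ESXCLI and WATCHDOG are hypothetical overrides; on a real host the
# defaults below are the two commands named in the article.
ESXCLI="${ESXCLI:-esxcli}"
WATCHDOG="${WATCHDOG:-/etc/init.d/sfcbd-watchdog}"

enable_cim() {
    # Lift the "Administratively disabled" state first.
    "$ESXCLI" system wbem set --enable true || return 1
    # Then start the watchdog service.
    "$WATCHDOG" start
}

# Usage on the ESXi host (uncomment to run):
# enable_cim
```

Stopping at the first failure keeps the watchdog from being started while the service is still administratively disabled.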
You should get output like the following:
/etc/init.d/sfcbd-watchdog start
sfcbd-init: Getting Exclusive access, please wait…
sfcbd-init: Exclusive access granted.
sfcbd-init: Request to start sfcbd-watchdog, pid 69438
sfcbd-config[69448]: No third party cim providers installed
sfcbd-init: snmp has not been enabled.
sfcbd-init: starting sfcbd
sfcbd-init: Waiting for sfcb to start up.
sfcbd-init: Program started normally.
Invoking lsof -nPV | awk '{count[$2]++}END{for(i in count)print count[i], i}' | sort -n in the SSH console will produce a list of running processes, each with a count of its lsof entries, minus all the junk. You can use this list of processes to determine what is running on the ESXi host.
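To make the awk part of that pipeline less opaque, here is the same counting idiom as a small function. It is a sketch: count_by_second_field is a hypothetical name, and it simply tallies how often each value appears in the second whitespace-separated column of whatever text it is fed.

```shell
#!/bin/sh
# Sketch: count how many input lines share each value of column 2,
# then sort ascending by count. The helper name is hypothetical.
count_by_second_field() {
    awk '{ count[$2]++ } END { for (i in count) print count[i], i }' | sort -n
}

# Usage on the ESXi host, with the lsof flags from the article:
# lsof -nPV | count_by_second_field
```

Any header line in the input is counted like any other line, so expect a stray entry or two at the top of the list.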
We also used esxcli network ip connection list to get a list of ports the ESXi host was listening on, to help determine whether port 5989 was active.
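If you want that port check as a yes/no answer rather than a table to read, something like the following might do. It is a sketch under two assumptions: that the connection list prints the socket state as the word LISTEN, and that the local address column ends in a colon followed by the port number. The helper name port_is_listening is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if the connection list on stdin contains a LISTEN
# row with an address ending in ":<port>". Assumes the list prints the
# state as the word LISTEN; the helper name is hypothetical.
port_is_listening() {
    awk -v port="${1:-5989}" '
        /LISTEN/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ (":" port "$")) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on the ESXi host:
# esxcli network ip connection list | port_is_listening 5989 && echo "CIM port is listening"
```

Scanning every field instead of a fixed column number avoids depending on the exact column layout, which may differ between ESXi versions.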
If you are deploying VMware ESXi 6.5 in your environments and need CIM health data, remember to enable it, and do not assume it is active just because the web UI tells you so.
Check out our ESXi Health Monitor for LabTech (Automate) here
Thanks man, this saved me trying to figure out why my monitoring system couldn’t do health checks on vSphere 6.5 hosts.
Thanks a lot! This fixed my problem with the Nagios check-esxi-hardware plugin on 6.5 hosts.
For me, esxcli system wbem set --enable true didn’t work; I had to issue:
esxcli system wbem set -e 1
I just found this (same problem, Intel server board with Xeon e3 1245v2).
I had to run:
esxcli system wbem set -e true
/etc/init.d/sfcbd-watchdog start
The quoted syntax above didn’t work (errors on both lines though the esxcli line was the problematic one from what I can see).
esxcli system wbem set --enable true
This line does work. But I’m thinking in the formatting of this page, the double hyphen was changed to a longer more solid hyphen.
Looking at the syntax for the command, either -e or --enable will work.
The CIM server sfcbd-watchdog service does not start automatically on ESXi 6. Are there any similar commands for that version?
The following command does not work in 6.0:
esxcli system wbem set --enable true (two minus signs before enable)
I tried the following commands to make this change persistent across reboots, but after every reboot the CIM server service is not started automatically, in spite of selecting start and stop with host.
chkconfig sfcbd-watchdog on
chkconfig sfcbd on
/etc/init.d/sfcbd-watchdog start
Can anyone help me create a cron job that automatically runs the following command to start the CIM server service on every reboot of ESXi?
/etc/init.d/sfcbd-watchdog start
Brilliant. Looked all over before I found this. I wish the VMware support site was as helpful. I can now monitor the hardware with Nagios
For ESXi 6.7 use:
esxcli system wbem set --enable=true
/etc/init.d/sfcbd-watchdog start
That incantation to pull running services and pipe them to awk is excessive. ESXi has ‘ps’. Running that alone gets you close to the output of your ‘lsof’ spell. I’ve been doing things like ‘ps | grep sfcbd’ or ‘ps | grep vcenter’. I don’t see an immediate benefit to the lsof method. You can also tell grep to (i)gnore case with -i. That lets you do ‘ps | grep -i http’, and the sfcbd services relating to http will show along with everything else http. grep -i is also useful for pulling storage adapter information from lspci. Do ‘lspci | grep -i mass’ and all devices listed as “Mass storage controller” will appear.
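A small sketch of that ps approach, for anyone who wants it as a reusable check. The helper name sfcb_processes is hypothetical. The bracketed pattern is a common grep trick: it still matches "sfcb", but the grep process itself (whose command line contains the literal text "[s]fcb") does not show up in its own results.

```shell
#!/bin/sh
# Sketch: filter a process listing on stdin down to the sfcb processes,
# ignoring case. The helper name is hypothetical.
sfcb_processes() {
    # [s]fcb matches "sfcb" but not the literal text "[s]fcb",
    # so the grep process does not match itself.
    grep -i '[s]fcb'
}

# Usage on the ESXi host:
# ps | sfcb_processes || echo "no sfcb processes running"
```

Because grep exits non-zero when nothing matches, the function doubles as a simple "is CIM running" test.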
Also, I’ve addressed this issue by adding a cron job to restart the CIM service every day at 2 AM. This was to address CIM data going stale in our remote monitoring utility (NCentral). I have a script that does ‘/etc/init.d/sfcbd-watchdog stop’ then ‘/etc/init.d/sfcbd-watchdog start’, and another script with a GUI/prompt that copies the restart script to /usr/sbin, copies (cp) ‘/var/spool/cron/crontabs/root’ (the cron file for user ‘root’) to a dated backup ‘/var/spool/cron/crontabs/root.old.$(date +%F)’, and then echoes (>>) a line into /var/spool/cron/crontabs/root that calls the restart script at 2 AM daily.
Save this as its own file with a .sh extension.
Script to restart the CIM service on an ESXi host:
#!/bin/sh
# sfcbd-watchdog is the name of the CIM monitoring service in ESXi
/etc/init.d/sfcbd-watchdog stop
/etc/init.d/sfcbd-watchdog start
exit
Pertinent section from my GUI/Prompt script:
touch /var/spool/cron/crontabs/root.old.$(date +%F)
printf "Created backup of cron file 'root'\n";
cp -f /var/spool/cron/crontabs/root /var/spool/cron/crontabs/root.old.$(date +%F);
printf "Copied contents of cronfile 'root' to backup file\n";
chmod +x restartCIM.sh
cp "$cwd/restartCIM.sh" /usr/bin;
cp "$cwd/restartCIM.sh" /usr/sbin;
printf "Copied restart script to /usr/sbin\n"
echo "0 2 * * * /usr/sbin/restartCIM.sh" >> /var/spool/cron/crontabs/root;
The touch command makes a new file next to the cron file, named with the filename plus the date. The cp command copies the current cron file out to the touched file. In troubleshooting I found cp wouldn’t work on its own, so I had to create the file first, possibly due to adding the date to the filename. Next we cp the above script to /usr/sbin (sbin holds system binaries, conventionally the ones meant for root). It is copied to /usr/bin as well to make it available for any potential user accounts.
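One thing the echo >> step does not guard against is running the install twice, which would add the same cron line twice. A sketch of a backup-then-append helper that skips the append when the line is already there; add_cron_line and its two arguments are hypothetical names, not part of my script.

```shell
#!/bin/sh
# Sketch: back up a crontab file, then append a line only if it is not
# already present. Helper name and arguments are hypothetical.
add_cron_line() {
    cron_file=$1
    cron_line=$2
    # Dated backup next to the original, named like the one above.
    cp -f "$cron_file" "$cron_file.old.$(date +%F)" || return 1
    # -F: fixed string (the line contains * characters), -x: whole line.
    if ! grep -qxF -e "$cron_line" "$cron_file"; then
        echo "$cron_line" >> "$cron_file"
    fi
}

# Usage on the ESXi host:
# add_cron_line /var/spool/cron/crontabs/root "0 2 * * * /usr/sbin/restartCIM.sh"
```

Matching the whole line as a fixed string matters here because the asterisks in a cron schedule would otherwise be read as regular expression operators.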
$cwd is a variable containing the working directory of the GUI script. I have ‘cwd=$(pwd) # Set $cwd to the current directory’ as one of the first lines in the GUI script so I can reference where the script is running from later on. This allows the script to be path agnostic. Copy all files for this job into a directory on ESXi, including the GUI script, and run the GUI script from that directory. The script will look for all files in the same directory as the GUI script. This allows support staff to use a “graphical” interface to do complex Linux/PowerCLI/esxcli things without having to know Linux or ESXi. Copy all files within a .zip to the datastore. SSH in, chmod +x the install script, and then run it with ./ followed by the script name. The rest is done in the script.
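One caveat with cwd=$(pwd): it records the directory the shell was in when the script was launched, which is the script’s own directory only if you cd there first. A sketch of deriving the directory from the script path instead, so the script can be started from anywhere; script_dir_of is a hypothetical helper, not something in my script.

```shell
#!/bin/sh
# Sketch: resolve the directory that contains a given script path,
# independent of the caller's working directory. Hypothetical helper.
script_dir_of() {
    # cd in a subshell so the caller's working directory is untouched.
    (cd "$(dirname "$1")" && pwd)
}

# Usage inside the install script:
# cwd=$(script_dir_of "$0")
# cp "$cwd/restartCIM.sh" /usr/sbin
```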