Did you ever wonder how to troubleshoot the target device boot process? Let's say your environment is working correctly, and one day some of your servers can't boot from the PVS server. The target device (TD) just hangs on the “Connecting to the Provisioning Services. Please wait …” message.
In PVS version 6.x you could enable logging, but this feature was removed in version 7.x. Instead of logging, Citrix gave us a great tool for troubleshooting: CDFControl. Start this tool on each PVS server you have (if you have many sites in your environment, use it only in the site where you have the problem). Check all modules and press “Start Tracing”. Then go to the TD console and reboot the server. Write down the time when the device hangs. After that, stop each trace and open it for analysis. Here are my old posts on how to do this:
And here is what I found:
CDF_ENTRY 5 Exit, Ardence::CLocalManagedVdisk::CLocalManagedVdisk
CDF_INFO 8 About to open vhd, folder = <p:\XAStore01>, diskFileName = <vDisk_XA01.22.avhd>
CDF_ERROR 1 VhdOpen failed with error: 0xE0210021. folder = <p:\XAStore01>, diskFileName = <vDisk_XA01.22.avhd>
CDF_ERROR 1 Cannot get vdisk header for device at x.x.96.60:6901 [001DD8B71DDD], for base disk id = 17, version number = 22
CDF_INFO 4 Force retry for clientdevice at x.x.96.60:6901 [001DD8B71DDD]
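A trace usually contains far more entries than the handful shown here, so it can help to filter out only the errors. The following is a minimal Python sketch, assuming the trace has already been exported to plain text and that every message starts with a CDF_<LEVEL> marker followed by a number, as in the excerpt above. The helper names (split_entries, errors_only) are hypothetical and are not part of CDFControl.

```python
import re

# Assumption: each message starts with "CDF_<LEVEL> <number> " as in the excerpt,
# and messages may be run together on one line or spread over several lines.
ENTRY_RE = re.compile(
    r"(CDF_[A-Z]+)\s+(\d+)\s+(.*?)(?=\s+CDF_[A-Z]+\s+\d+\s|$)",
    re.S,
)


def split_entries(text):
    """Split exported trace text into (level, number, message) tuples."""
    return [
        (m.group(1), int(m.group(2)), m.group(3).strip())
        for m in ENTRY_RE.finditer(text)
    ]


def errors_only(text):
    """Return only the messages logged at CDF_ERROR level."""
    return [msg for level, _, msg in split_entries(text) if level == "CDF_ERROR"]
```

Calling errors_only() on the excerpt above would return the two CDF_ERROR messages (the failed VhdOpen and the missing vDisk header), which is where the analysis starts.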
OK, so what does this mean? This target was set to boot from the vDisk_XA01 image. This image has 3 versions:
- version 22 – updates – test
- version 21 – updates – test
- version 20 – base – production
All servers should boot from the newest version, so from 22. As you know, PVS uses a VHD chain mechanism to merge all updates with the base version and serve the result to the servers. I have 4 PVS servers in one site:
- PVS01 – 20 connected devices
- PVS02 – 30 connected devices
- PVS03 – 17 connected devices
- PVS04 – 24 connected devices
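So when the TD got the bootstrap image from PVS, it was redirected to the least loaded server, PVS03. I therefore verified whether all VHD files of the vDisk_XA01 image on that server were correct. I used an MD5 tool to generate a checksum for every VHD file, and I found that the checksum of vDisk_XA01.21.avhd was different from the one on the other PVS servers (although the files appeared to be the same). The trace names the version 22 file, but because the versions form a chain, version 22 cannot be served if version 21 underneath it is damaged. When I replaced the version 21 file with the correct one and rebooted the target device, it booted correctly.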
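The checksum comparison can also be scripted instead of running an MD5 tool by hand on each server. Below is a minimal Python sketch, assuming the vDisk store of every PVS server is reachable as a directory from the machine running the script. The function names and the default file pattern are hypothetical, chosen only for this illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Compute the MD5 of a file in chunks, so large VHD files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums(store_dir, pattern="vDisk_XA01*"):
    """Map file name -> MD5 for every matching file in one store directory."""
    return {
        p.name: md5_of(p)
        for p in sorted(Path(store_dir).glob(pattern))
        if p.is_file()
    }


def mismatches(stores, pattern="vDisk_XA01*"):
    """stores maps a server name to its store directory.

    Returns {file name: {server: md5 or None}} for every file whose checksum
    differs between servers, or which is missing on at least one server.
    """
    per_server = {name: checksums(d, pattern) for name, d in stores.items()}
    all_files = set().union(*(c.keys() for c in per_server.values()))
    result = {}
    for file_name in sorted(all_files):
        seen = {name: c.get(file_name) for name, c in per_server.items()}
        if len(set(seen.values())) > 1:
            result[file_name] = seen
    return result
```

You would call mismatches() with one entry per PVS server, each pointing at that server's copy of the store (the exact paths depend on your environment). In the case described above, it would report vDisk_XA01.21.avhd, with PVS03 showing a different checksum from the other servers.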
If you see in CDFControl that the “Events lost” counter is greater than 0, stop the tracing, go to the options and increase the buffer size to 512 KB.