vRLCM – environment deployment failed
It looks like I’m getting a little bit into lifecycle managers, but this time it’s the vRealize Lifecycle Manager (vRLCM). I have been running vRLCM in my lab for quite some time now, both to play around with the different versions my customers use and to deploy test environments with new versions so I can teach myself the new and fancy stuff.
Background
This time I wanted to roll out a complete vRealize Suite 8.6 environment, so I updated my vRLCM to 8.6 and created a new environment with vRealize Log Insight 8.6 and vRealize Operations 8.6. I also reserved IPs in my IPAM, created DNS entries for the names and IPs, and created CAs for the new deployments.
Issue
During the initial setup of the environment there is also a precheck phase in which vRLCM checks all the information entered previously and verifies that everything is valid. During this phase I received one error that was not clear, so I ran the precheck a second time and it succeeded.
Once the precheck had passed, vRLCM started the deployment. Unfortunately, the whole process failed during the vrliprevalidation phase.
The only error message I received was this:
So the prevalidation phase failed although the precheck phase had been successful, which was kind of strange. Why did it fail now and not during the precheck?
As I’m not so familiar with the vRLCM log files, I ran the following command on the command line of the appliance to find out which file contains this error message:
find /var/log -type f -exec grep -H 'LCMVRLICONFIG40038' {} \;
The log file I needed turned out to be /var/log/vrlcm/vmware_vrlcm.log, which is also listed in the documentation.
Here is a snippet of the log file:
2021-10-25 12:17:50.165 INFO [pool-2-thread-14] c.v.v.l.u.SshUtils - -- host name got after resolving ... remote2019.ad.vbrain.info
2021-10-25 12:17:50.185 INFO [pool-2-thread-14] c.v.v.l.u.SshUtils - -- host name got after resolving ... remote2019.ad.vbrain.info
2021-10-25 12:17:50.186 INFO [pool-2-thread-14] c.v.v.l.p.a.s.Task - -- Injecting task failure event. Error Code : 'LCMVRLICONFIG40038', Retry : 'false', Causing Properties : '{ CAUSE :: }'
com.vmware.vrealize.lcm.common.exception.EngineException: vRLI prevalidation failed
at com.vmware.vrealize.lcm.plugin.vrli.VrliInstallValidationTask.execute(VrliInstallValidationTask.java:76) [vmlcm-vrliplugin-core-8.6.0-SNAPSHOT.jar!/:?]
at com.vmware.vrealize.lcm.automata.core.TaskThread.run(TaskThread.java:45) [vmlcm-engineservice-core-8.6.0-SNAPSHOT.jar!/:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
at java.lang.Thread.run(Unknown Source) [?:?]
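To pull the lines leading up to the error straight out of the log, grep can print some context before each match. The following is only a minimal sketch: it assumes the grep on the appliance supports the -B option (GNU grep does), and show_error_context is a helper name made up for this example.

```shell
# Print every line containing the error code plus the N lines before it.
# Arguments: error code, log file, optional number of context lines (default 2).
# show_error_context is a hypothetical helper, not part of vRLCM.
show_error_context() {
    code=$1
    logfile=$2
    before=${3:-2}
    grep -B "$before" -- "$code" "$logfile"
}

# Example call on the appliance:
# show_error_context 'LCMVRLICONFIG40038' /var/log/vrlcm/vmware_vrlcm.log
```

With the default of two context lines, this should show the two SshUtils lines with the resolved host name directly above the failure event.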
Solution
The most important part of the snippet was the two lines before the error message. The installer resolved the configured address and got back a different FQDN (remote2019.ad.vbrain.info) than the one I had defined in the setup. I checked my DNS and, unfortunately, it is always DNS: I had two different DNS names configured for the same IP address, so of course I would run into this error. The precheck was successful because of DNS round-robin. With two names on the same IP address, a lookup can return a different name from one query to the next, and during the precheck it happened to return the correct one.
I fixed it by giving the new DNS name its own, new IP address. After that I was able to deploy the environment.
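A quick way to catch this kind of conflict before deploying is to check how many names a reverse lookup returns for each reserved IP. The following is a minimal sketch, not something vRLCM provides: it assumes dig is installed on the machine you run it from, the function names are made up for this example, and LOOKUP_CMD is a hypothetical hook that lets you swap in another lookup command.

```shell
# Return the names a reverse lookup gives for one IP address, one per line.
# LOOKUP_CMD is a hypothetical override; by default dig is used.
reverse_names() {
    ip=$1
    if [ -n "${LOOKUP_CMD:-}" ]; then
        $LOOKUP_CMD "$ip"
    else
        dig +short -x "$ip"
    fi
}

# Warn if an IP address does not map to exactly one name.
check_single_ptr() {
    ip=$1
    # Strip the trailing dot of each FQDN and drop duplicates.
    names=$(reverse_names "$ip" | sed 's/\.$//' | sort -u)
    # Count the non-empty lines.
    count=$(printf '%s\n' "$names" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -eq 1 ]; then
        echo "OK: $ip -> $names"
    else
        echo "WARNING: $ip resolves to $count names:"
        printf '%s\n' "$names"
        return 1
    fi
}

# Example call (192.0.2.10 is a placeholder address):
# check_single_ptr 192.0.2.10
```

Running this for every IP reserved for the new environment should flag an address that still carries an old name next to the new one.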