vRealize Automation 7.x – Beware of the Certificates
I recently worked with a customer to fix a problem during a vRealize Automation 7.2 PoC installation. I have to admit that I had never installed a vRA Appliance PoC before, so I didn’t know how the parts work together. During the installation wizard we encountered the following error message when the installer tried to configure the WEB component:
Trusted connectivity validation failed for address “https://vra01.ad.vbrain.info/“: The request was aborted: Could not create SSL/TLS secure channel.
Failed to execute validation. Error: The request was aborted: Could not create SSL/TLS secure channel.
Issue 1:
From the error message I assumed that it had something to do with certificates. So we checked the certificates the vRA Appliance is using: the one for the WEB service (created during the installation wizard) and the one for the VAMI service (created during OVA deployment). It turned out that the name in the VAMI certificate looked like this:
FQDN.DOMAINNAME e.g. vra01.ad.vbrain.info.ad.vbrain.info
During the OVA deployment you need to enter an FQDN or IP address for the appliance hostname, plus a domain name. I think the process that creates the VAMI certificate is somehow wrong, because it uses both the FQDN and the domain name to build the certificate name.
My recommendation for future deployments is to enter only the hostname instead of the FQDN to avoid this problem.
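To illustrate how the doubled name comes about, here is a minimal Python sketch. It assumes the appliance simply joins the two OVA fields with a dot; I don't know the real implementation. The function names are made up for this example.

```python
def build_certificate_name(hostname: str, domain: str) -> str:
    """Naively join the two OVA fields (assumed behaviour, not confirmed)."""
    return f"{hostname}.{domain}"


def has_doubled_domain(name: str, domain: str) -> bool:
    """Return True if `name` ends with the domain suffix twice in a row."""
    suffix = "." + domain.lower().strip(".")
    return name.lower().rstrip(".").endswith(suffix + suffix)


domain = "ad.vbrain.info"

# FQDN entered in the hostname field: the domain shows up twice.
print(build_certificate_name("vra01.ad.vbrain.info", domain))
# Short hostname entered instead: the result is a normal FQDN.
print(build_certificate_name("vra01", domain))

print(has_doubled_domain("vra01.ad.vbrain.info.ad.vbrain.info", domain))  # True
print(has_doubled_domain("vra01.ad.vbrain.info", domain))                 # False
```

A check like `has_doubled_domain` could be run against the subject of a freshly deployed appliance before continuing with the wizard.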
We tried really hard to replace this certificate, which we thought could be the problem, with the certificate that had been generated for the WEB service. After that, the IaaS agent couldn’t contact the vRA Appliance any more, so we decided to install the agent again. During the installation I saw that it needs the fingerprint of the certificate used by the vRA VAMI. That explained why the connection had stopped working: we had changed the certificate, and with the new certificate came a new fingerprint. After working on this issue for some hours we decided to revert to the last snapshot and start from “scratch”. I also learned from a friend that the certificate in his lab looks the same and that he had no issues there. That was the final hint we needed to move away from this theory.
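The fingerprint point is easy to see in code: a fingerprint is just a hash over the certificate bytes, so any new certificate yields a new value. The sketch below assumes a SHA-1 fingerprint, which is what Windows displays as the thumbprint; which hash the agent installer actually uses is an assumption on my side. The function names are hypothetical.

```python
import hashlib
import ssl


def thumbprint(der_cert: bytes) -> str:
    """SHA-1 thumbprint of a DER-encoded certificate, as uppercase hex."""
    return hashlib.sha1(der_cert).hexdigest().upper()


def fetch_thumbprint(host: str, port: int = 5480) -> str:
    """Download the server certificate (without validating it) and hash it."""
    pem = ssl.get_server_certificate((host, port))
    return thumbprint(ssl.PEM_cert_to_DER_cert(pem))


# Stand-in byte strings instead of real certificates, so this runs offline.
old_cert = b"old certificate bytes"
new_cert = b"new certificate bytes"
print(thumbprint(old_cert) == thumbprint(new_cert))  # False
```

Calling `fetch_thumbprint("vra01.ad.vbrain.info")` before and after a certificate replacement would show the two different values, which is why an agent pinned to the old fingerprint stops trusting the appliance.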
Issue 2:
We still didn’t know where the problem was coming from, but we had missed something in the initial error message. We had assumed that the VAMI certificate could be the problem, but reading the error message once more we saw that the VAMI port 5480 wasn’t mentioned.
We connected through RDP to our IaaS Windows VM, opened an Internet Explorer window and entered https://vra01.ad.vbrain.info to see which certificate comes back. To our surprise we got the following error message from IE:
This Page can’t be displayed
Turn on TLS 1.0, TLS 1.1, and TLS 1.2 in Advanced settings and try connecting to https://vRAApplianceServerName again. If this error persists, it is possible that this site uses an unsupported protocol or cipher suite such as RC4 (link for the details), which is not considered secure. Please contact your site administrator.
This was very weird, because we had also checked the VAMI interface and didn’t get this message there. I checked the certificates used by the VAMI and the vRA interface with the following commands:
openssl s_client -connect vra01.ad.vbrain.info:5480
openssl s_client -connect vra01.ad.vbrain.info:443
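If you want to compare both ports in one go, the same check can be sketched with Python's standard `ssl` module. This is only an illustration: the negotiated cipher depends on what the client offers, so Python (OpenSSL) may report something different from what the Windows SCHANNEL client on the IaaS server would get. The helper names are my own.

```python
import socket
import ssl


def unverified_context() -> ssl.SSLContext:
    """TLS client context that accepts self-signed certificates."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # must be switched off before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def negotiated_cipher(host: str, port: int, timeout: float = 5.0):
    """Return (cipher name, protocol version, secret bits) for one handshake."""
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with unverified_context().wrap_socket(raw, server_hostname=host) as tls:
            return tls.cipher()


# Example (needs network access to the appliance):
#   for port in (5480, 443):
#       print(port, negotiated_cipher("vra01.ad.vbrain.info", port))
```

Certificate validation is switched off on purpose here, because the appliance uses self-signed certificates. Don't reuse that context for anything but diagnostics.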
Here are the differences:
VAMI Certificate
vRA Certificate
As you can see, the two certificates are using different ciphers. After we found out that the ciphers differ, the customer gave me the decisive hint. He told me that the security department had implemented a new GPO for new Windows servers which allows only a subset of the available ciphers. The reason for that was to eliminate the POODLE vulnerability. In addition, the security department defined the following certificate requirements:
- Protocol
  - TLS 1.2 (TLS 1.0 and TLS 1.1 for specific applications)
- Cipher
  - AES 128/128 and AES 256/256
- Hashes
  - SHA, SHA256, SHA384, SHA512
- Keys
  - Diffie-Hellman and ECDH
The ciphers allowed on the IaaS server we were using were:
- TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P521
- TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384
- TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P521
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P384
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256
- TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P521
- TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P384
- TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P256
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA_P521
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA_P384
- TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA_P256
You can find the information in the following registry key:
HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Cryptography\Configuration\SSL\00010002
When I added the TLS_RSA_WITH_AES_256_CBC_SHA cipher to the existing ones in my test environment, I had no problem accessing the website.
In addition, the following settings were also made in the registry:
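A small Python sketch makes the mismatch visible. It assumes the policy stores the suites as one comma-separated string and that the `_P256`/`_P384`/`_P521` endings are curve suffixes that can be dropped for comparison. The list below is a shortened copy of the one above, and the function names are hypothetical.

```python
CURVE_SUFFIXES = ("_P256", "_P384", "_P521")


def strip_curve(suite: str) -> str:
    """Drop a trailing curve suffix such as _P256 from a suite name."""
    for suffix in CURVE_SUFFIXES:
        if suite.endswith(suffix):
            return suite[: -len(suffix)]
    return suite


def parse_suite_list(value: str) -> set:
    """Parse a comma-separated suite list into a set of base suite names."""
    return {strip_curve(s.strip()) for s in value.split(",") if s.strip()}


# Shortened copy of the GPO list; the comma-separated layout is an assumption.
policy_value = (
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P521,"
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256,"
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA_P384"
)
allowed = parse_suite_list(policy_value)

print("TLS_RSA_WITH_AES_256_CBC_SHA" in allowed)        # False
print("TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA" in allowed)  # True
```

Every allowed suite uses ECDHE key exchange, so a server that only offers a plain TLS_RSA_* suite has nothing in common with this client.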
- HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers
  - enable (REG-DWORD 0xffffffff, decimal 4294967295)
    - AES 128/128
    - AES 256/256
  - disable (REG-DWORD 0x00000000, decimal 0)
    - NULL
    - DES 56/56
    - RC2 128/128
    - RC2 40/128
    - RC2 56/128
    - RC4 128/128
    - RC4 40/128
    - RC4 56/128
    - RC4 64/128
    - Triple DES 168
- HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Hashes
  - enable (REG-DWORD 0xffffffff, decimal 4294967295)
    - SHA
    - SHA256
    - SHA384
    - SHA512
  - disable (REG-DWORD 0x00000000, decimal 0)
    - MD5
- HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\KeyExchangeAlgorithms
  - enable (REG-DWORD 0xffffffff, decimal 4294967295)
    - Diffie-Hellman
    - ECDH
  - disable (REG-DWORD 0x00000000, decimal 0)
    - PKCS
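After removing the cipher limit registry key, the disabled PKCS registry key alone still led to the same behaviour as the GPO. After removing that key as well, everything worked fine. This fits the earlier finding: in SCHANNEL the PKCS key controls RSA key exchange, which a TLS_RSA_* cipher suite such as TLS_RSA_WITH_AES_256_CBC_SHA depends on.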
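The three states we went through can be modelled in a few lines of Python. This is a deliberately simplified model and not how SCHANNEL works internally. It only looks at the key exchange part of a suite name, and it assumes the vRA web endpoint needs TLS_RSA_WITH_AES_256_CBC_SHA, which matches the test above. All names are made up for the example.

```python
# Simplified mapping from suite prefix to the KeyExchangeAlgorithms subkey.
PREFIX_TO_KEY_EXCHANGE = (
    ("TLS_ECDHE_", "ECDH"),
    ("TLS_DHE_", "Diffie-Hellman"),
    ("TLS_RSA_", "PKCS"),
)


def key_exchange_for(suite: str) -> str:
    """Return the key exchange subkey name a suite depends on."""
    for prefix, algorithm in PREFIX_TO_KEY_EXCHANGE:
        if suite.startswith(prefix):
            return algorithm
    raise ValueError(f"unknown key exchange for {suite}")


def usable(suite, allowed_suites, enabled_key_exchanges) -> bool:
    """allowed_suites=None models 'no cipher suite restriction configured'."""
    if allowed_suites is not None and suite not in allowed_suites:
        return False
    return key_exchange_for(suite) in enabled_key_exchanges


needed = "TLS_RSA_WITH_AES_256_CBC_SHA"
gpo_suites = {"TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA"}
gpo_key_exchanges = {"Diffie-Hellman", "ECDH"}  # PKCS disabled

print(usable(needed, gpo_suites, gpo_key_exchanges))     # False: full GPO
print(usable(needed, None, gpo_key_exchanges))           # False: PKCS still off
print(usable(needed, None, gpo_key_exchanges | {"PKCS"}))  # True
```

The second line is the interesting one: lifting the suite restriction alone doesn't help as long as the key exchange the suite relies on stays disabled.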
Conclusion
It looks like the vRA 7.x appliance generates different self-signed certificates with different ciphers. In a normal environment with no cipher restrictions this is not a problem. It only becomes a problem when a customer restricts the available ciphers for security reasons. The wrongly generated VAMI certificate is also something that needs to be fixed, and I will hand this over to our vRA team. To be clear, this is NOT a big issue, for two reasons:
- Hopefully you always use your own PKI to create certificates for your production environments. And with that you have defined your certificate requirements.
- Even if you run into this issue during a test or PoC installation, you can still replace the self-signed certificates with ones dedicated to the PoC.
Finally a big thank you to Wolfgang and Herbert who helped me through this troubleshooting.