This morning I received an email via a colleague from an AE admin at another company. He was asking for help getting SSO working with the Automation Engine. I wrote up a summary of recommendations and findings based on my experience with this topic, and I thought it might be worthwhile to post it here.
Single sign-on with the Automation Engine: recommendations & findings
- I found that single sign-on does not work reliably in AE systems running on multiple nodes. I was able to get single sign-on working reliably only when the Automation Engine server was running on just one node. (See the detailed discussion below about SSO on multi-node AE servers.)
- Be sure to use the Oracle version of Java, not IBM or some other flavor.
- Don’t forget to install the two JARs of the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy in $JAVA_HOME/lib/security. This must be done for both the JRE running the JWP and the JRE running the UC4 GUI. The single sign-on documentation did not make this clear before, but it has since been updated.
- Similarly, don’t forget to
install the proper JDBC driver in the AE server lib directory. The JWP installation documentation describes this pretty well. One note: LDAP-based connection strings like jdbc:oracle:thin:@ldap://oraclenameserver… do not work with the JWP. We had to stick with a basic connection string like jdbc:oracle:thin:@oracleserver... (We opened a product enhancement request for this: PMPER-454: JWP should support LDAP-based Oracle JDBC connection strings in ucsrv.ini.)
- Initially, to aid
troubleshooting, start the JWP from the command line, with Kerberos debugging enabled:
java ... -Dsun.security.krb5.debug=true -jar ucsrvjp.jar ...
This will allow you to watch the Kerberos communication between the JWP and the KDC, and pinpoint the underlying causes of many problems. Once you have gotten things working, you can start the JWP via the service manager.
- If you see the error “U0003127
Logon error: Access denied” when turning on the integrated authentication check box in the login window, there are two possible fixes:
  - Start the UI as an administrative user; or
  - Enable AllowTGTSessionKey in Windows.
This limitation is arguably due to a bug in the underlying SSO framework from Oracle: JDK-6722928.
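Two of the checks in the list above can be automated before starting the JWP. The following is a minimal sketch, not an Automic tool: the class name SsoPreflight and its methods are hypothetical. It prints the Java vendor, reports the maximum AES key length the JRE allows (128 typically indicates that the unlimited-strength policy files are not in effect), and flags LDAP-based Oracle JDBC connection strings. It relies only on standard JDK calls.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import javax.crypto.Cipher;

// Hypothetical preflight helper; run it with the same JRE that runs the JWP or the UI.
final class SsoPreflight {

    // Maximum AES key length permitted by the installed JCE policy files.
    static int maxAesKeyLength() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("AES is not available in this JRE", e);
        }
    }

    // True if the policy allows AES keys longer than 128 bits.
    static boolean hasUnlimitedCrypto() {
        return maxAesKeyLength() > 128;
    }

    // True if the connection string uses the LDAP form, which the post reports
    // does not work with the JWP.
    static boolean isLdapJdbcUrl(String url) {
        return url.toLowerCase(Locale.ROOT).startsWith("jdbc:oracle:thin:@ldap");
    }

    public static void main(String[] args) {
        System.out.println("java.vendor = " + System.getProperty("java.vendor"));
        System.out.println("java.home   = " + System.getProperty("java.home"));
        System.out.println("Max AES key length = " + maxAesKeyLength());
        System.out.println("Unlimited-strength crypto in effect: " + hasUnlimitedCrypto());
        // Optional argument: the JDBC connection string you plan to put in ucsrv.ini.
        if (args.length > 0) {
            System.out.println("LDAP-based JDBC URL: " + isLdapJdbcUrl(args[0]));
        }
    }
}
```

Run it once with the JRE used by the JWP and once with the JRE used by the UC4 GUI, since both need the policy files.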
SSO on multi-node AE servers
The Automic SSO documentation claims that SSO will work in AE systems running on more than one node. However, I was never able to figure out how to make it work reliably.
Firstly, the Automic documentation fails to mention something very important:
1. A separate service user must be defined for each node on which the Automation Engine runs.
A service user (or technical user) must be created to run the JWP.
The JWP, running as this user, connects to the KDC to authenticate. The KDC
will not be able to find an SPN defined on the service user, unless the userPrincipalName
of the currently logged-in user is also set to the SPN being used to authenticate.
Andreas at Automic Development confirmed that it won’t work to
use the same service user on both nodes, because the UPN must match the SPN.
This means that if you run the AE on two nodes, then you must create two separate
users. Each service user must be associated with just one AE node. The userPrincipalName
attribute of each user must be set to the same thing as the servicePrincipalName.
For example:
We opened problem ticket PRB00119215 with Automic about this omission from the documentation. They have promised to update the documentation to make it clear that a separate service/technical user must be defined for each AE node.
2. Even with a separate service user defined on each node, SSO may not work reliably in multi-node AE systems.
The reason, I believe, is that the JWP does not select the SPN it
uses to authenticate with the KDC based on the hostname of the node where the
JWP is running, but instead based on the node where the CP to which the UI
connected is running.
I’ll explain this in a bit more detail. When the User Interface
connects to the AE, it connects to a communications process (CP). Which CP the UI connects to
is somewhat unpredictable. (It depends on the order of addresses in the CP
list in the uc4config.xml file.)
During single sign-on, the process works like this:
1. User Interface connects to CP
2. CP connects to JWP
3. JWP authenticates with KDC
If all CPs and WPs are running on the same node, it’s simple and
will work fine every time.
If, however, the Automation Engine processes are running on multiple nodes, as depicted in the figure to the left, then about half of the time, the CP will connect to a JWP running on a different node. E.g.:
1. User Interface connects to CP on uc4a
2. CP on uc4a connects to JWP on uc4b
3. JWP on uc4b tries to authenticate with KDC using an SPN like UC4/uc4a.mycompany.com@MYREALM
I suspect that because of the above problem, SSO will not work reliably in systems with more than one AE node, even if a unique service user is defined for each AE node.
Automic is investigating this in PRB00111313.
I will update this discussion thread as soon as I have news from Automic. One possible way of fixing this problem would be to force the CP to connect to a JWP running on the same node.
(This would mean that at least one JWP would have to be running on any node running a CP.)
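The suspected behavior in point 2 can be modeled in a few lines. This is only a sketch of my hypothesis, not of the actual AE implementation: it assumes the JWP picks its SPN from the node of the CP that received the UI connection, and that authentication succeeds only when that SPN belongs to the node the JWP itself runs on. The class and method names are made up, and the node names come from the example above.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the suspected SPN mismatch; not Automic code.
final class SsoRoutingSketch {

    static String spnFor(String node) {
        return "UC4/" + node + ".mycompany.com@MYREALM";
    }

    // Hypothesis: the SPN is derived from the CP's node, but only the service
    // user on the JWP's own node can authenticate with its own SPN.
    static boolean authSucceeds(String cpNode, String jwpNode) {
        String spnUsed = spnFor(cpNode);
        String spnOwnedByJwpUser = spnFor(jwpNode);
        return spnUsed.equals(spnOwnedByJwpUser);
    }

    // Counts failing CP/JWP pairings. With sameNodeOnly set, a CP may only
    // hand the request to a JWP on its own node (the fix suggested above).
    static int countFailures(List<String> nodes, boolean sameNodeOnly) {
        int failures = 0;
        for (String cp : nodes) {
            for (String jwp : nodes) {
                if (sameNodeOnly && !cp.equals(jwp)) {
                    continue; // pairing is not allowed under the suggested fix
                }
                if (!authSucceeds(cp, jwp)) {
                    failures++;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("uc4a", "uc4b");
        System.out.println("Failures, any CP to any JWP: " + countFailures(nodes, false));
        System.out.println("Failures, same-node only:    " + countFailures(nodes, true));
    }
}
```

With two nodes there are four CP/JWP pairings, and in this model the two cross-node pairings fail, which matches the "about half of the time" behavior I observed. Restricting each CP to a JWP on its own node removes both failing pairings.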