We run three Policy Servers. Our Web Agents connect to them in the order defined in the Host Configuration Object (HCO): PS1, PS2, PS3, with the Failover option set to Yes.
We found that the Web Agents sometimes do not fail back to Policy Servers that have recovered.
To illustrate, suppose the following Policy Servers are defined in the HCO:
PS1 -- (suppose it is currently down)
When the Web Agent initializes, it first tries to connect to PS1. Because PS1 is down, the connection fails and the Web Agent then tries PS2, which is available. As expected, initialization completes correctly and the Web Agent sends its requests to PS2. When PS1 recovers and is up again, the Web Agent does not fail back to PS1 and keeps sending requests to PS2.
Why does this happen, and how can I fix it?
Web Agent: versions earlier than 12.52 SP1 CR06
During initialization, the Web Agent marks PS1 as "Failed" because that Policy Server is not available. The Web Agent then connects successfully to PS2 and marks it as "Active". Because it has already found an "Active" Policy Server (PS2), the Web Agent does not try to connect to PS3 and marks it as "Disabled".
The Web Agent Connection Service thread monitors only the Active, Inactive, and Disabled Policy Servers; it never reconsiders a "Failed" one. That is why the Web Agent does not try to contact PS1 again.
This behavior changes starting with 12.52 SP1 CR06. The Web Agent now marks unavailable Policy Servers as "Disabled" instead of "Failed", so the Connection Service thread tries to connect to PS1 again once it becomes available.
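The state handling described above can be sketched as a small toy model. This is only an illustration of the logic in this article, not the Web Agent's actual code: the names `State`, `initialize`, `pick_server`, and the `fixed` flag are all hypothetical, and the Inactive state is omitted for brevity.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "Active"
    DISABLED = "Disabled"
    FAILED = "Failed"


def initialize(servers, reachable, fixed):
    """Walk the HCO list in order until one Policy Server answers.

    fixed=False models agents earlier than 12.52 SP1 CR06,
    fixed=True models 12.52 SP1 CR06 and later (hypothetical flag).
    """
    states = {}
    active_found = False
    for name in servers:
        if active_found:
            # Not attempted: an Active server has already been found.
            states[name] = State.DISABLED
        elif name in reachable:
            states[name] = State.ACTIVE
            active_found = True
        else:
            # Unreachable: Failed before the fix, Disabled after it.
            states[name] = State.DISABLED if fixed else State.FAILED
    return states


def pick_server(servers, states, reachable):
    """Return the first reachable server the monitor thread still considers.

    Entries marked Failed are skipped, as described in this article.
    """
    for name in servers:
        if states[name] is not State.FAILED and name in reachable:
            return name
    return None


servers = ["PS1", "PS2", "PS3"]

# PS1 is down while the agent initializes.
old = initialize(servers, {"PS2", "PS3"}, fixed=False)
new = initialize(servers, {"PS2", "PS3"}, fixed=True)

# PS1 comes back online.
everything_up = {"PS1", "PS2", "PS3"}
print(pick_server(servers, old, everything_up))  # PS2: no failback
print(pick_server(servers, new, everything_up))  # PS1: failback works
```

In the model, the only difference between the two runs is the state assigned to PS1 at initialization, which is enough to decide whether PS1 is ever looked at again.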
As a workaround, the first Policy Server (PS1) can be added to the HCO a second time as a fourth entry, PS4. Then, when PS2 and PS3 are marked as "Failed", the Web Agent tries to connect to the PS1 instance through the PS4 entry, provided PS1 is back online.
Restarting the Web Agent also resolves the issue, provided PS1 is back online.
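A minimal sketch of why the duplicate entry helps, using the same kind of toy model of the pre-CR06 behavior. Everything here is an assumption made for illustration: the host names `host1` to `host3`, the `initialize` and `fail_over` helpers, and the way failover walks the list are not taken from the product.

```python
def initialize(entries, up_hosts):
    """Pre-CR06 toy model: entries tried before the first reachable one become Failed."""
    states = {}
    active = None
    for label, host in entries:
        if active is not None:
            states[label] = "Disabled"  # not attempted
        elif host in up_hosts:
            states[label] = "Active"
            active = label
        else:
            states[label] = "Failed"
    return states


def fail_over(entries, states, up_hosts):
    """Mark a lost Active entry as Failed, then try the remaining entries in order.

    Entries already marked Failed are never retried.
    """
    for label, host in entries:
        if states[label] == "Active":
            if host in up_hosts:
                return label  # still connected, nothing to do
            states[label] = "Failed"
    for label, host in entries:
        if states[label] == "Failed":
            continue
        if host in up_hosts:
            states[label] = "Active"
            return label
        states[label] = "Failed"
    return None


# PS4 is a second entry that points at the same host as PS1 (hypothetical names).
with_ps4 = [("PS1", "host1"), ("PS2", "host2"), ("PS3", "host3"), ("PS4", "host1")]
without_ps4 = with_ps4[:3]

# PS1's host is down at startup; later it recovers while PS2 and PS3 go down.
for entries in (with_ps4, without_ps4):
    states = initialize(entries, {"host2", "host3"})
    print(fail_over(entries, states, {"host1"}))
```

With the extra entry the model reaches PS1's host again through PS4; without it, no entry is left to try. Note that, as in the text above, this only takes effect once PS2 and PS3 have failed.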
From the cumulative release notes for 12.52 SP1 CR06:
00216581 DE143166 Web Agent is not failing back to the first Policy Server and requests are not processed successfully when starting the first Policy Server.
KD : KB000006584