Experts’ opinions on AWA high availability, fault tolerance and Zero downtime upgrade features

Discussion created by IndikaPeiris611408 on Oct 19, 2016
Latest reply on Oct 24, 2016 by AlainMoisy

Dear product/community experts,

I have a use case for the AE solution with the following deployment and operational requirements, and I would like to hear the product/community experts’ opinions and advice on it.

I would also like to discuss how best to use AWA product features such as “multi-server operation”, “non-stop operations”, “NetArea” and “Zero downtime upgrade”, along with the various other configuration options, to achieve a fault-tolerant, highly available and scalable automation platform.

My primary requirements are as follows:

Note: Architecture diagrams are attached at the bottom of the discussion.

1.       Deployed to two datacentres called DC1 and DC2 (a third DC may be added in future).

2.       Deployed to two servers at the start (Phase 1), namely “AE_1A” in DC1 and “AE_1B” in DC2, as depicted in “diagram1”. The future plan (Phase 2) is to add more servers to the AE farm, namely “AE_2A” in DC1 and “AE_2B” in DC2, as depicted in “diagram2”.

3.       Should be a single AE system (namely PROD) in which all the CPs and WPs are distributed across all available AE servers in the AE farm.

4.       All the distributed CPs should be in active mode and should serve any client that connects to any CP on any AE server (diagram1 and diagram3).

5.       At any given time, an admin should be able to put all CPs on one node (or on multiple nodes) into hot-standby mode, in which those CPs no longer serve clients but are activated when the other active CPs are no longer available in the AE system (diagram2 and diagram4).

6.       All AE agents should be configured to communicate with the CPs in their own datacentre as the first preference.

7.       If an agent cannot connect to the CPs in its own datacentre, it should connect to the CPs in the other datacentre.

8.       The solution should also be compatible with the “Zero downtime upgrade” feature.


Assumptions:

1.       We have all the required product licenses.

2.       All components can communicate with each other across the datacentres (no firewall restrictions).
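
To make requirements 6 and 7 concrete, here is a minimal Python sketch of the connection order I have in mind for an agent. It only illustrates the desired behaviour; it is not AWA configuration or an AWA API, and the names `CP` and `pick_cp` as well as the CP labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CP:
    name: str        # hypothetical label, e.g. "AE_1A:CP1"
    datacentre: str  # "DC1" or "DC2"
    reachable: bool  # whether the agent can currently connect to this CP


def pick_cp(agent_dc: str, cps: List[CP]) -> Optional[CP]:
    """Return a reachable CP, preferring CPs in the agent's own datacentre."""
    reachable = [cp for cp in cps if cp.reachable]
    # sorted() is stable: CPs in the agent's datacentre (key False) come first,
    # CPs in other datacentres (key True) follow as the fallback.
    ordered = sorted(reachable, key=lambda cp: cp.datacentre != agent_dc)
    return ordered[0] if ordered else None
```

With one CP per datacentre, an agent in DC1 would pick the DC1 CP while it is reachable and only fall back to the DC2 CP otherwise. Whether NetArea can be configured to behave like this is exactly my question.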

I went through the documentation and understand the features and architecture concepts reasonably well. I am still trying to figure out how to combine these features to build a more robust solution.

As per my understanding:

To address requirements (3) and (4), we can use the multi-server operations feature.

To address requirement (5), we can use the non-stop operations feature.

To address requirement (6), we can use the NetArea feature.

To address requirement (7), I am not sure whether this is possible, going by the explanation in the documentation.

I am also not sure whether the NetArea feature and the non-stop operations feature work together in practice.
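
For requirement (5), the behaviour I expect from the hot-standby CPs can be sketched as follows. Again, this is only a Python illustration of the intended logic under my own assumptions, not how AWA implements non-stop operations; `CPState`, `serving_cps` and the mode strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CPState:
    name: str    # hypothetical label, e.g. "AE_1B:CP1"
    mode: str    # "active" or "hot-standby", as set by the admin
    alive: bool  # whether the CP process is currently running


def serving_cps(cps: List[CPState]) -> List[str]:
    """Return the names of the CPs that should be serving clients right now."""
    active = [cp.name for cp in cps if cp.alive and cp.mode == "active"]
    if active:
        # While at least one active CP is available, standby CPs stay passive.
        return active
    # No active CP is left: the running hot-standby CPs take over.
    return [cp.name for cp in cps if cp.alive and cp.mode == "hot-standby"]
```

For example, with “AE_1A” active and “AE_1B” in hot-standby, only the CPs on “AE_1A” serve clients until they become unavailable, at which point the CPs on “AE_1B” take over.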

Any opinions, advice or references would be much appreciated. Thank you in advance.



Indika Peiris

Supporting architecture diagrams