AutoSys Workload Automation


Testing primary to shadow failover but no jobs running

  • 1.  Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 09:40 AM

    I'm running WAAE with primary, shadow and tiebreaker schedulers, using dual event servers as well, and I'm currently trying to test the failover from the primary to the shadow scheduler. The failover to the shadow scheduler seems to work, and I get the right messages in the scheduler logs, but even though the shadow scheduler has taken over, it doesn't actually run any jobs. Any ideas why?



  • 2.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 10:04 AM

    That does not sound right. What messages do you see in the EP logs of the primary and shadow at the time of the flip? Have you verified that the services are actually running and that there is database connectivity?

    Cheers,

    Chris  <CJ>



  • 3.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 10:34 AM

    The DBs are definitely running; I have checked this using chk_auto_up and by connecting to both.

     

    When I stop the primary scheduler, I basically get a message saying "Primary scheduler shutdown complete" in the primary scheduler log. Then on the shadow server I get a few heartbeat entries, and then it says "the primary scheduler had been shutdown. The system is no longer running in HA mode". The shadow scheduler continues to run, but no jobs get executed by the shadow server.



  • 4.  Re: Testing primary to shadow failover but no jobs running
    Best Answer

    Posted May 10, 2017 10:39 AM

    What you are describing is a normal shutdown. To test the failover, kill the primary scheduler process (SIGKILL).
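
    As a minimal sketch of the hard kill suggested above (an illustration, not an AutoSys tool): the helper names `hard_kill` and `demo_on_child` are hypothetical, and looking up the primary scheduler's PID is left out because it depends on the installation. The demo targets a throwaway child process rather than a real scheduler.

```python
import os
import signal
import subprocess
import sys


def hard_kill(pid: int) -> None:
    """Send SIGKILL (POSIX only); the target gets no chance to shut down cleanly."""
    os.kill(pid, signal.SIGKILL)


def demo_on_child() -> int:
    """Spawn a throwaway sleeper process, hard-kill it, and return its exit status."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    hard_kill(child.pid)
    # On POSIX, Popen reports death by signal N as a return code of -N.
    return child.wait()
```

    Because SIGKILL cannot be caught, the killed process never runs its normal shutdown path, which is what separates this test from stopping the service.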



  • 5.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 10:50 AM

    So I ran the command "sendevent -E STOP_DEMON -v FAILOVER" after looking this up, and this time it seems to have worked. Many thanks for this.
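
    For anyone scripting this, here is a hedged sketch that wraps the exact command quoted in this post. The function names and the `dry_run` switch are assumptions made for illustration; the arguments are copied from the post as written and have not been checked against any particular AutoSys release.

```python
import subprocess
from typing import List


def failover_command() -> List[str]:
    """Return the command from the post as an argument list (no shell quoting needed)."""
    return ["sendevent", "-E", "STOP_DEMON", "-v", "FAILOVER"]


def request_failover(dry_run: bool = True) -> List[str]:
    """Build the command and, unless dry_run is set, execute it.

    The default is a dry run so the sketch is safe to call on a machine
    without AutoSys installed.
    """
    cmd = failover_command()
    if not dry_run:
        # check=True raises CalledProcessError if sendevent exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd
```

    Calling `request_failover()` only returns the argument list; pass `dry_run=False` on a host where `sendevent` is on the PATH to actually send the event.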



  • 6.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 11:53 AM

    Mark, they seriously need to fix the failover. I do not and will not agree that a service stop should NOT fail over.

    It should. The machine going down is the machine going down...

    I had to change the stop process to issue the sendevent. This is not working as designed, because AutoSys was designed to have the shadow take over whenever the primary was not heard from...

    I know all the discussion, partly, but you will NOT convince me this was the right direction. We will need to agree to disagree...

     

    Thank you 

    Steve C.



  • 7.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 11:27 AM

    How did you failover?

    By killing the PID or by stopping the service? You do know that stopping the service DOES NOT trigger a failover.

    Check my thread … on failover

     

     

    Steve C.



  • 8.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 11:56 AM

    Chris, this is the problem with the change in the shutdown process. You will not get a failover if someone powers down the machine. :-(

    Unless you do what I did and change the K process. :-|



  • 9.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 12:02 PM

    But at the same time, we do need an option to restart schedulers quickly without re-syncing the DBs.



  • 10.  Re: Testing primary to shadow failover but no jobs running

    Posted May 11, 2017 05:40 AM

    So initially I was just stopping the scheduler from the admin tool. This was not working, but after running the sendevent command the failover worked as expected. Basically I was trying to test the failover of the primary server because we had previously shut it down and didn't see the shadow take over. This seems to tie in with what you're saying about it not failing over if someone powers down the machine. However, after getting this working with the sendevent command, I tried turning off the primary scheduler server again, and this time I did see the shadow server take over. I'm not sure why, but it is working, so everything seems to be good.



  • 11.  Re: Testing primary to shadow failover but no jobs running

    Posted May 10, 2017 12:07 PM

    Iriney, please do not confuse EP FAILOVER with DB FAILOVER; those are two separate processes.