DX NetOps

  • 1.  RCPD Synchronization failing.

    Posted Feb 08, 2017 05:02 AM

    - New employee on old install.

     

    I noticed on the administration page of OneClick that a landscape's secondary status was set to 'Not ready'. After checking, I realised that the processes on the server had crashed. After a restart the database was corrupt and I had to restore a previous version. This is when it became fun.

     

    The only database available was the current corrupt one; otherwise there were only compressed saves, the most recent dating back to March 2016. To resolve the problem I copied the latest database save from the primary server to the backup server (is this OK to do?).

     

    After realising that the synchronization between the two servers had been failing for a long time, I decided to try to force it using rcpd. Here is the output from both servers.

     

    Primary server (Automatic)

    Feb 07 23:30:36 : rcpd started

     command:  SEND
        host:  #########
        file:  /usr/Spectrum/SS-DB-Backup/db_20170207_2330.SSdb
        rcpd:  0xcafe
       procd:  0xfeeb
    compress:  1

    Feb 07 23:30:36 : Successfully connected to remote rcpd.  Initiating file transfer...
    Feb 07 23:30:36 : Starting file transfer using 1048576 byte application buffer, 1048576 byte TCP socket buffer.
    Feb 07 23:31:23 : /usr/Spectrum/SS-DB-Backup/db_20170207_2330.SSdb: has successfully been copied over to #########
    Feb 07 23:31:23 : Waiting for remote rcpd to process the database file...
    Feb 07 23:31:25 : The remote rcpd failed during processing. Check the RCPD.OUT file on the remote machine for details.
    Feb 07 23:31:25 : Final status is -1

     

    Primary Server (CLI)

    Feb 08 10:19:54 : rcpd started

     command:  SEND
        host:  ########
        file:  /usr/Spectrum/SS-DB-Backup/db_20170802_0918.SSdb
        rcpd:  0xcaff
       procd:  0xfeeb
    compress:  1

    Feb 08 10:19:54 : Waiting for remote processd to startup rcpd...
    Feb 08 10:20:24 : Waiting for remote processd to startup rcpd...
    Feb 08 10:20:54 : Waiting for remote processd to startup rcpd...
    Feb 08 10:21:24 : Waiting for remote processd to startup rcpd...
    Feb 08 10:21:54 : Waiting for remote processd to startup rcpd...
    Feb 08 10:22:24 : There is no remote rcpd running on #########  Exiting

     

    Secondary Server (Automatic)

    Feb 07 23:34:25 : rcpd started

     command:  RECV
        file:  .ft_save_file.SSdb
        rcpd:  0xcafe

    file name: .ft_save_file.SSdb.gz
    file size: 13517526
     compress: 1
    peer host: #######

    Feb 07 23:34:25 : Starting file transfer using 1048576 byte TCP receive socket buffer.
    Feb 07 23:35:12 : Successfully received .ft_save_file.SSdb.gz
    Feb 07 23:35:12 : Uncompressing the database file...success.
    Feb 07 23:35:14 : Stopping the SpectroSERVER...FAILED. Response ticket :
    DATE OF THIS REQUEST

     

    Secondary Server (CLI)

    Feb 08 10:23:54 : rcpd started

     command:  RECV
        file:  .ft_save_file.SSdb
        rcpd:  0xcaff

    Feb 08 10:26:54 : Waited for 3 minutes but did not accept a connection from the primary rcpd.
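
A quick way to triage output like the four logs above is to pull out only the timestamped lines that report a failure. The following is a minimal Python sketch, not a Spectrum tool: the function name `find_failures` and the marker list are illustrative assumptions, and the markers are simply phrases copied from the logs in this post, so other rcpd failure messages would be missed.

```python
import re

# Timestamped rcpd lines look like "Feb 07 23:31:25 : message".
LINE_RE = re.compile(r"^\s*([A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2}) : (.*)$")

# Phrases taken from the logs in this thread; this is not an exhaustive list.
FAILURE_MARKERS = (
    "failed",
    "There is no remote rcpd running",
    "did not accept a connection",
)


def find_failures(log_text):
    """Return (timestamp, message) pairs for lines that look like failures."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip headers such as "command:  SEND" and blank lines
        stamp, message = match.groups()
        lowered = message.lower()
        if any(marker.lower() in lowered for marker in FAILURE_MARKERS):
            hits.append((stamp, message))
    return hits


# Excerpt from the "Secondary Server (Automatic)" log above.
sample = """\
Feb 07 23:35:12 : Successfully received .ft_save_file.SSdb.gz
Feb 07 23:35:12 : Uncompressing the database file...success.
Feb 07 23:35:14 : Stopping the SpectroSERVER...FAILED. Response ticket :
"""

for stamp, message in find_failures(sample):
    print(stamp, "->", message)
```

On the excerpt this reports only the "Stopping the SpectroSERVER...FAILED" line, which is the step where the automatic transfer breaks on the secondary.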

     

    I have tried with the default ports and custom ports and the results are the same.

     

    Any ideas?



  • 2.  Re: RCPD Synchronization failing.

    Posted Feb 08, 2017 06:41 AM

    Update: I noticed that there was a huge time discrepancy between the two servers; the backup server was not synchronising correctly with the NTP server. I have fixed that, but I still have the same problem.
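
The clock difference can be estimated from the rcpd logs themselves by comparing the timestamps each server wrote for the same step of the automatic transfer. This is a rough Python sketch under two assumptions: that the paired log lines really describe the same moment (they are only approximately simultaneous), and that the year is 2017, since the log omits it. The helper names are illustrative, and `%b` expects English month abbreviations.

```python
from datetime import datetime


def parse_stamp(stamp, year=2017):
    """Parse an rcpd timestamp such as 'Feb 07 23:30:36' (the log omits the year)."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def skew_seconds(primary_stamp, secondary_stamp):
    """Secondary minus primary, in seconds, for two stamps of the same event."""
    delta = parse_stamp(secondary_stamp) - parse_stamp(primary_stamp)
    return delta.total_seconds()


# "rcpd started" on each side of the automatic transfer.
print(skew_seconds("Feb 07 23:30:36", "Feb 07 23:34:25"))  # 229.0

# "has successfully been copied over" vs "Successfully received".
print(skew_seconds("Feb 07 23:31:23", "Feb 07 23:35:12"))  # 229.0
```

Both pairs give the same figure, which suggests the secondary's clock was running a little under four minutes ahead at the time of the transfer.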



  • 3.  Re: RCPD Synchronization failing.

    Posted Feb 08, 2017 09:35 AM

    Hi Peter,

     

    I have seen this happen from time to time in my environment, which consists of 18 primary landscapes with 18 secondaries. Typically all that is needed is a restart of processd on the secondary, then a retry of the Online Database backup from the primary server. If that doesn't work, then I would suggest a ticket to support.

     

    Jeremy



  • 4.  Re: RCPD Synchronization failing.

    Broadcom Employee
    Posted Feb 08, 2017 11:06 AM

    Is the RCPD.OUT still showing the “Stopping the SpectroSERVER” error?

     

    If so, what’s the last message in the VNM.OUT? Is it waiting on model activates?

     

    If so, when you cycle processd as Jeremy noted, add the following to the .vnmrc before starting the SS:

     

    mdlact_debug=true

     

    This debug will show what model activation is causing it to not stop.
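
For anyone scripting that change, here is a minimal Python sketch that sets a `key=value` line in a .vnmrc-style file without duplicating it. The helper `ensure_setting` is hypothetical, and the path in the usage comment is an assumption based on the /usr/Spectrum install root seen in the logs; check where the .vnmrc lives on your system and keep a copy before editing.

```python
from pathlib import Path


def ensure_setting(vnmrc_path, key, value):
    """Set key=value in a .vnmrc-style file, replacing an existing entry for key.

    Returns True if the file was changed, False if it already had the setting.
    """
    path = Path(vnmrc_path)
    lines = path.read_text().splitlines() if path.exists() else []
    wanted = f"{key}={value}"
    found = False
    changed = False
    for i, line in enumerate(lines):
        # Compare only the part before the first "=" so other keys are untouched.
        if line.split("=", 1)[0].strip() == key:
            found = True
            if line.strip() != wanted:
                lines[i] = wanted
                changed = True
    if not found:
        lines.append(wanted)
        changed = True
    if changed:
        path.write_text("\n".join(lines) + "\n")
    return changed


# Example call (assumed path, adjust to your install):
# ensure_setting("/usr/Spectrum/SS/.vnmrc", "mdlact_debug", "true")
```

Running it a second time leaves the file alone, so it is safe to call from a restart script. Remove the setting again once the debugging is done.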

     

    Cheers

    Jay



  • 5.  Re: RCPD Synchronization failing.

    Posted Feb 09, 2017 04:01 AM

    Hi Jason,

     

    Yes, I am still seeing the Stopping the SpectroSERVER...FAILED error in the RCPD.OUT of the secondary server.

     

    The VNM.OUT shows no errors and reports that the server is ready, but its last entry dates back to the 7th. Do you want me to restart the server?

     

    Thanks.



  • 6.  Re: RCPD Synchronization failing.

    Posted Feb 09, 2017 03:58 AM

    Hi Jeremy,

     

    First of all, thanks for the suggestion, I appreciate it. However, I restarted processd and it didn't seem to change anything.



  • 7.  Re: RCPD Synchronization failing.

    Posted Feb 09, 2017 04:43 AM

    I thought I had it solved after the backup succeeded, but I then noticed that the SS was not running. Once it was restarted, the sync failed again.

     

    On top of this, the Archive Manager won't run now; it says that it has been shut down because the SS does not contain a User model for this user.

     

    For future info, this Tip is not bad: https://communities.ca.com/docs/DOC-231159959



  • 8.  Re: RCPD Synchronization failing.

    Broadcom Employee
    Posted Feb 09, 2017 07:52 AM

    What was the last message in the VNM.OUT?

     

    It might be time to open a support ticket…



  • 9.  Re: RCPD Synchronization failing.

    Posted Feb 10, 2017 01:39 AM

    OK, I've fixed the issue, but I'm not 100% sure which action resolved it. Funnily enough, I found another remote server with the same problem and was able to resolve that one too.

     

    Original Server

    Instead of using /etc/init.d/processd restart, I stopped the process using processd.pl, rebooted the server and then manually started the SS, restoring the last database backup from the primary server. The backup then synchronised correctly.

     

    Remote Server

    This server was the backup for the primary MLS server. Apparently the CORBA service had crashed, which left the server in a terminated state with an 'ungraciously shut database'. I force-removed the locks, restored the latest good database and then started the SS. The sync failed, at which point I rebooted the server, and then it seemed to work.

     

    Like the French say: Dans le doute? Reboot! (When in doubt, reboot!)



  • 10.  Re: RCPD Synchronization failing.

    Broadcom Employee
    Posted Feb 10, 2017 07:20 AM

    Glad to hear you got it resolved…

    Have a nice weekend ☺