HA for net_connect

Document created by Thomas GENTILHOMME on Mar 1, 2017
Version 1

Hi,

 

I recently started a new probe that brings high availability to the net_connect probe. It can be useful if you split your ping profiles between multiple net_connect probes (on the active & passive hubs, for example). If the active hub goes down, you will not lose any ping monitoring.

 

This probe works like the HA probe. That one was created to manage a single node, but this time you have to install the probe on both hubs (bi-directional HA).

 

The probe works in several steps:

 

  1. Read the CFG, create folders, create the default state field.
  2. Check the availability of the local & remote hubs.
  3. If the remote is available:
    1. Are we under HA? If yes, push the original CFG to the local net_connect and deactivate HA (+ create a backup of the local CFG).
    2. Save the local & remote net_connect CFG under the storage directory.
  4. If the remote is not available (hub down or net_connect down):
    1. If we are not under HA:
      1. Merge the remote CFG into the local CFG.
      2. Push this new CFG to the local net_connect.
      3. Activate HA.
    2. If we are under HA: nothing to do.
  5. Write the state file to disk (including the HA state).
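
To make the branching in steps 3 to 5 easier to follow, here is a minimal sketch in Python (the probe itself is built on perluim, so this is not the real code). The action names are hypothetical labels, and taking the backup before the push in step 3.1 is an assumption, since the order is not stated above.

```python
from dataclasses import dataclass, field


@dataclass
class Cycle:
    """Outcome of one probe run: the new HA flag and the actions taken."""
    ha_active: bool
    actions: list = field(default_factory=list)


def run_cycle(ha_active: bool, remote_available: bool) -> Cycle:
    """Decide what a single timed run should do (steps 3 to 5)."""
    actions = []
    if remote_available:
        if ha_active:
            # Step 3.1: fail back to the original configuration.
            # Assumption: the backup is taken before the push.
            actions += ["backup_local_cfg", "push_original_cfg", "deactivate_ha"]
            ha_active = False
        # Step 3.2: keep fresh copies of both CFGs for a future failover.
        actions.append("save_local_and_remote_cfg")
    elif not ha_active:
        # Step 4.1: take over the profiles of the remote net_connect.
        actions += ["merge_remote_into_local", "push_merged_cfg", "activate_ha"]
        ha_active = True
    # Step 4.2 (already under HA) needs no action.
    # Step 5: the state file is written on every run.
    actions.append("write_state_file")
    return Cycle(ha_active, actions)
```

Since the HA flag is read back from the state file on the next run, a second run with the remote still down only rewrites the state file.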

 

Be careful if you mix this probe with any kind of net_connect provisioning mechanism, as the two can collide. I'm working on a way to manage this (a daemon probe with callbacks).

 

Configuration

 

The configuration is pretty easy. Just fill in nim_addr (the address of the remote hub). netconnect_online triggers HA when the remote net_connect is offline (not really good practice in a normal situation, but it may be useful for you...).
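
As a rough illustration of how netconnect_online could feed into the availability check, here is a small Python sketch. The function name and its boolean parameters are hypothetical, and the way the real probe evaluates this key may differ.

```python
def remote_available(hub_up: bool, netconnect_up: bool, netconnect_online: bool) -> bool:
    """Return True when the remote side counts as available."""
    if not hub_up:
        # A remote hub that is down always counts as unavailable.
        return False
    if netconnect_online and not netconnect_up:
        # Only with netconnect_online set does an offline
        # remote net_connect count as unavailable.
        return False
    return True
```

With netconnect_online left off, only the state of the remote hub decides whether HA is triggered.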

 

Don't set nim_login and nim_password if you are packaging the probe in Nimsoft (they are not needed).

 

The pre-release works as a timed probe.

 

Framework & Download

 

The probe has been created with the upcoming perluim 4.2 (LTS version). This version brings some sugar, code fixes and improvements. It can be downloaded from this git branch:

GitHub - fraxken/perluim at R4.2 (stable release coming Tuesday).

 

The pre-release can be downloaded here : 

Release Warran 1.0 · fraxken/netconnect_ha · GitHub 

 

Roadmap for the official release : 

 

- HA for profile groups.
- Avoid collisions when merging CFGs (e.g. the same group or the same profile server).
- Create two distinct versions (timed probe & daemon probe). The daemon probe will handle collisions properly with a provisioning mechanism / probe.
- First API documentation for the netconnect_ha core class.
- Publish with the official perluim R4.2 stable LTS release.
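
To give an idea of what collision handling during a merge could look like, here is a minimal Python sketch. It assumes profiles are keyed by name in a plain dictionary and that the local profile wins on conflict. Both are assumptions for illustration, not how netconnect_ha behaves today.

```python
def merge_profiles(local: dict, remote: dict):
    """Merge remote profiles into a copy of the local ones.

    Returns the merged profiles and the names that collided.
    Assumption: on a name collision the local profile is kept.
    """
    merged = dict(local)
    collisions = []
    for name, profile in remote.items():
        if name in merged:
            # Same profile name on both sides: keep local, report it.
            collisions.append(name)
        else:
            merged[name] = profile
    return merged, collisions


# Hypothetical profiles, for illustration only.
local_cfg = {"web01": {"ip": "10.0.0.1"}}
remote_cfg = {"web01": {"ip": "10.0.0.9"}, "db01": {"ip": "10.0.0.2"}}
merged, collisions = merge_profiles(local_cfg, remote_cfg)
```

Returning the collisions instead of silently dropping them leaves room to log them or raise an alarm.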

 

Coming next week (Wednesday).

 

If you have any comments or suggestions, let me know.

 

Best Regards,

Thomas
