PM Vertica monitoring improvements

Idea created by raphael.franck on Jan 25, 2018
    New
    Score 15

    Hi all, Lutz_Holzbecher, Dan_Holmes,

     

    Customers would probably benefit from improved out-of-the-box monitoring for HPE Vertica if trap mapping were added to Spectrum.

     

    What have we done in total?

    - configure standard net-snmp agent on Vertica hosts

    - add device models to Spectrum for all Vertica hosts

    - add standard device thresholds, filesystem thresholds and process monitoring in Spectrum

    - add trap mapping to Spectrum based on VERTICA-MIB

    - configure Vertica to send traps on conditions worth being notified about

     

    The VERTICA-MIB does not specify any pollable attributes. There is no dedicated SNMP agent, nor any integration with the default operating system SNMP agent (net-snmp). Only a single trap type is defined in that MIB.

     

    Trap sending is configured as follows, according to the HPE online documentation (needs to be verified for newer Vertica versions):

     

    Version 7.0.x:
    https://my.vertica.com/docs/7.0.x/HTML/Content/Authoring/AdministratorsGuide/Monitoring/Vertica/ConfiguringEventTrappingForSNMP.htm

     

    $ /opt/vertica/bin/vsql <DB-NAME>

     

    => SELECT SET_CONFIG_PARAMETER('SnmpTrapsEnabled', 1 );
    => SELECT SET_CONFIG_PARAMETER('SnmpTrapDestinationsList', 'host_name1 port1 CommunityString1,hostname2 port2 CommunityString2' );
    => SELECT SET_CONFIG_PARAMETER('SnmpTrapEvents', 'Low Disk Space, Read Only File System, Loss of K Safety, Current Fault Tolerance at Critical Level, Too Many ROS Containers, WOS Over Flow, Node State Change, Recovery Failure, Stale Checkpoint');

     

    Version 7.1.x:
    https://my.vertica.com/docs/7.1.x/HTML/Content/Authoring/AdministratorsGuide/Monitoring/Vertica/ConfiguringEventTrappingForSNMP.htm

     

    $ /opt/vertica/bin/vsql

     

    => ALTER DATABASE <DB-NAME> SET SnmpTrapsEnabled = 1;
    => ALTER DATABASE <DB-NAME> SET SnmpTrapDestinationsList = 'host_name1 port1 CommunityString1,hostname2 port2 CommunityString2';
    => ALTER DATABASE <DB-NAME> SET SnmpTrapEvents = 'Low Disk Space, Read Only File System, Loss of K Safety, Current Fault Tolerance at Critical Level, Too Many ROS Containers, WOS Over Flow, Node State Change, Recovery Failure, Stale Checkpoint';
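For clusters with several trap receivers it may be easier to generate these statements than to type them. The following is only a minimal sketch that assembles the statements in either of the two syntaxes quoted above; the function names, the `legacy` flag, and the example host are hypothetical, and the output should be checked against the documentation for your Vertica version before running it in vsql.

```python
def destinations(targets):
    """Join (host, port, community) triples into 'host port community,...'."""
    parts = []
    for host, port, community in targets:
        for value in (str(host), str(port), str(community)):
            # Keep the sketch simple: reject characters that would break
            # the space/comma separated list or the SQL string literal.
            if not value or any(ch in value for ch in " ,'"):
                raise ValueError(f"unsupported value: {value!r}")
        parts.append(f"{host} {port} {community}")
    return ",".join(parts)


def trap_statements(db_name, targets, events, legacy=False):
    """Return the three statements that enable SNMP traps.

    legacy=True  uses the SET_CONFIG_PARAMETER form (7.0.x example).
    legacy=False uses the ALTER DATABASE form (7.1.x example).
    """
    settings = [
        ("SnmpTrapsEnabled", "1"),
        ("SnmpTrapDestinationsList", f"'{destinations(targets)}'"),
        ("SnmpTrapEvents", f"'{', '.join(events)}'"),
    ]
    if legacy:
        return [
            f"SELECT SET_CONFIG_PARAMETER('{name}', {value});"
            for name, value in settings
        ]
    return [
        f"ALTER DATABASE {db_name} SET {name} = {value};"
        for name, value in settings
    ]


if __name__ == "__main__":
    # Hypothetical receiver and database name, for illustration only.
    for statement in trap_statements(
        "mydb",
        [("nms1.example.com", 162, "public")],
        ["Low Disk Space", "Node State Change"],
    ):
        print(statement)
```

The printed statements can be pasted into a vsql session; with `legacy=True` the database name is not used, matching the 7.0.x example where the database is selected when starting vsql.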

     

    The attachment contains sample configuration files for getting the trap mapping done in Spectrum. The event IDs used are based on a specific developer ID (not CA). Although you technically can change these, you should not need to. The EventDisp file goes beyond a plain trap-to-event mapping in order to produce fault-specific alarm titles and to clear alarms automatically on "ok-traps".
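To illustrate the intended behaviour (not the EventDisp syntax itself), here is a small sketch of fault-specific alarm titles combined with clearing on "ok-traps". It assumes each trap can be reduced to a host, an event type, and a flag saying whether the condition has cleared; the class, its method, and the title format are hypothetical and are not taken from the attached files or from Spectrum.

```python
from dataclasses import dataclass, field


@dataclass
class AlarmTable:
    """Tracks open alarms keyed by (host, event type)."""

    open_alarms: dict = field(default_factory=dict)

    def handle_trap(self, host, event_type, cleared):
        """Raise or clear an alarm; return the affected title, or None."""
        key = (host, event_type)
        if cleared:
            # An "ok-trap" removes only the alarm for the same host and
            # event type; other alarms on that host stay open.
            return self.open_alarms.pop(key, None)
        # The event type becomes part of the title, so every fault
        # produces its own distinguishable alarm.
        title = f"Vertica: {event_type} on {host}"
        self.open_alarms[key] = title
        return title


if __name__ == "__main__":
    table = AlarmTable()
    table.handle_trap("node1", "Low Disk Space", cleared=False)
    table.handle_trap("node1", "Node State Change", cleared=False)
    table.handle_trap("node1", "Low Disk Space", cleared=True)
    print(sorted(table.open_alarms.values()))
```

Keying on both host and event type is what lets one "ok-trap" clear exactly one alarm instead of everything raised for that host.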

     

    Regards,

    Raphael

    Attachments