- UIM v8.51
- nas/AE version 8.56 or higher
In this document, we discuss alarm enrichment in the context of a specific use case. This is just one example of how to use the alarm enrichment functionality.
The end result of this configuration will dynamically modify the following UIM alarm attributes, custom_2 through custom_4:
custom_4 --> ci_description "Configuration Item Description"
custom_3 --> met_description "Metric Description Data"
custom_2 --> target "Target data from S_QOS_DATA table"
CMDB Data Source
The alarm_enrichment probe can be configured to read data from various data sources. Each data source is referred to as a “CMDB” (Configuration Management Database), for example, CA Service Desk Manager or other products. In this example, we use the UIM database as the CMDB data source. Currently, only JDBC-compliant SQL database sources are supported.
Each data source is defined as:
- JDBC connect string
- user login
- database password
- query to extract the data from the cmdb
Every data source allows a user-defined name to be referenced in the enrichment rules. Each enrichment_rule can reference one data source. A data source can be used by many enrichment rules.
Once you have defined the CMDBs/data sources, you must define at least one enrichment rule.
Each enrichment rule defines a matching condition that selects the alarms the rule applies to. The rule also defines what enrichment should be performed and from which data source the additional information for the alarm should be read. When an alarm is processed by the alarm_enrichment probe, it is copied to a new event where:
- the message identifier NimId is modified to ensure it is still unique
- the fields qsize, md5sum and subject are removed from the incoming alarm
- all fields starting with "hop" are copied with "original_" prepended to the field name, so that the field "hop0" becomes "original_hop0" in the outgoing alarm.
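The three steps above can be sketched in Python as follows. This is only an illustration of the described behavior, not the probe's implementation: the alarm is modeled as a plain dictionary, the key name "nimid" and the suffix used to keep the identifier unique are assumptions, and "hop" fields are treated as renamed rather than duplicated.

```python
import uuid


def copy_alarm_for_enrichment(alarm):
    """Build a new event from an incoming alarm, following the three steps above."""
    event = {}
    for key, value in alarm.items():
        if key in ("qsize", "md5sum", "subject"):
            continue  # these fields are removed from the incoming alarm
        if key.startswith("hop"):
            event["original_" + key] = value  # hop0 -> original_hop0
        else:
            event[key] = value
    # Hypothetical uniqueness scheme: the real identifier format is not described here.
    event["nimid"] = "{}-{}".format(alarm["nimid"], uuid.uuid4().hex[:8])
    return event
```

Calling copy_alarm_for_enrichment on a dictionary that contains "hop0" and "qsize" returns a copy with "original_hop0", without "qsize", and with a different "nimid".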
The alarm is then matched against the configured alarm enrichment rules. An overwrite rule defines an alarm attribute, e.g., custom_2, and a value to which the alarm attribute should be set, e.g., target. Once an alarm has been processed against the alarm enrichment rules, it is passed on to the nas probe for further processing.
At a minimum, you need one routing rule (routing-rule) to forward your alarms to your Alarm Server (nas).
There might be a situation where you would want to create more than one routing rule, e.g., send alarms to a different ‘receiver.’
It is highly recommended to test and develop the alarm enrichment in a sandbox environment following good change control processes prior to implementing in a production environment.
CMDB Data Source and Query Prep
- Ensure the data source is reachable (via host:port)
- Ensure the data sources you are using are ready for the number of requests the alarm_enrichment probe is making to get alarm information
- Test your query against a single CI metric ID (as in this scenario) to make sure it works as expected and returns the expected results within a reasonable time frame. For MS SQL Server, speak with your DBA about “Execution Plan” and “Client Statistics” while running the query. You may need an index or maintenance plan to improve the speed of the query. MS SQL query below:
select case when d.target is null then
dev.dev_name + ':' + ci.ci_name else d.target end as target,
ccim.ci_metric_id,ccim.ci_metric_type,ccimd.met_description,ccid.ci_description
from CM_CONFIGURATION_ITEM_METRIC ccim (nolock)
inner join CM_CONFIGURATION_ITEM ci (nolock) on ci.ci_id=ccim.ci_id
inner join CM_DEVICE dev (nolock) on ci.dev_id=dev.dev_id
inner join CM_CONFIGURATION_ITEM_METRIC_DEFINITION ccimd (nolock) on ccim.ci_metric_type = ccimd.met_type
inner join CM_CONFIGURATION_ITEM_DEFINITION ccid (nolock) on ccid.ci_type = ccimd.ci_type
left join S_QOS_DATA d (nolock) on ccim.ci_metric_id=d.ci_metric_id where ccim.ci_metric_id = 'M014F2FE1FEFC1F60130A3EAC07D4C938'
- Test an abbreviated version of the query, e.g., the query below without its final ‘where’ clause containing the ? variable, and make sure you get results quickly:
select case when d.target is null then dev.dev_name + ':' + ci.ci_name else d.target end as target,ccim.ci_metric_id,ccim.ci_metric_type,ccimd.met_description,ccid.ci_description from CM_CONFIGURATION_ITEM_METRIC ccim (nolock) inner join CM_CONFIGURATION_ITEM ci (nolock) on ci.ci_id=ccim.ci_id inner join CM_DEVICE dev (nolock) on ci.dev_id=dev.dev_id inner join CM_CONFIGURATION_ITEM_METRIC_DEFINITION ccimd (nolock) on ccim.ci_metric_type = ccimd.met_type inner join CM_CONFIGURATION_ITEM_DEFINITION ccid (nolock) on ccid.ci_type = ccimd.ci_type left join S_QOS_DATA d (nolock) on ccim.ci_metric_id=d.ci_metric_id where ccim.ci_metric_id = ?
- Keep an eye on latency to make sure your data source can return results quickly. When you run the query or the population query, the first row should be returned within a matter of seconds.
- When accessing large and busy databases consider running a ‘shadow’ database for read-only query purposes. A shadow database is basically a mirror of the production database you can use for testing/dev purposes.
- Create and use a separate database user for the connection string to the data source, which makes troubleshooting easier if there is a problem.
Sample JDBC connect URLs
connection_url = jdbc:oracle:thin:@//172.17.4.12:1521/ORCL
connection_url = jdbc:sqlserver://172.17.8.12:1433;DatabaseName=CA_UIM;
connection_url = jdbc:mysql://172.17.0.12:3306/choslm
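To check that a data source is reachable via host:port before configuring the probe, a small script along these lines may help. It is a sketch only: the function names are made up for this example, and the regular expression only handles URLs shaped like the samples above, not every JDBC URL form.

```python
import re
import socket


def jdbc_host_port(url):
    """Extract (host, port) from a JDBC URL shaped like the samples above."""
    match = re.search(r"(?://|@)([^:/;@]+):(\d+)", url)
    if match is None:
        raise ValueError("no host:port found in {!r}".format(url))
    return match.group(1), int(match.group(2))


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_reachable(*jdbc_host_port(connection_url)) returns False if the database listener cannot be reached from the machine running the script. A successful TCP connection only shows that the port is open; it does not verify the login or the query.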
Alarm Enrichment - Raw Configure
The alarm_enrichment probe is configured using the Raw Configure option in the nas probe. The configuration settings for this probe are stored in the nas configuration file. Memory settings for the alarm_enrichment probe are maintained in the startup->opt section of the alarm_enrichment Raw Configure option. We recommend a min/max of at least 2048/4096 respectively.
The alarm_enrichment configuration settings are contained in the enrichment-source, enrichment-rules, and routing-rules sections of the raw configuration for the nas probe. The alarm_enrichment probe subscribes to "alarm" messages, modifies the alarm, and submits a new message to the nas with a modified subject of "alarm2". The nas probe subscribes to the "alarm2" messages.
Note: The alarm_enrichment probe processes the enrichment and routing rules in alphanumeric order. You can control the order in which the rules are processed by using a naming convention for the section names that dictates the order.
Users are allowed to change the subject (queue) names. By default, the alarm_enrichment probe uses the "alarm" subject and forwards messages to the "alarm2" subject for the nas probe.
Warning: If the subject name is changed, any existing content in the queues will be lost.
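As an illustration of why a naming convention matters for rule order, the Python snippet below sorts section names as plain strings. That the probe compares names exactly this way is an assumption, and the section names are invented for the example.

```python
def processing_order(section_names):
    """Order section names as a plain string sort would (an assumption about the probe)."""
    return sorted(section_names)


# Unpadded numeric names do not sort numerically as strings.
print(processing_order(["1", "2", "10"]))
# ['1', '10', '2']

# Zero-padded prefixes keep the intended order (hypothetical section names).
print(processing_order(["020_route_to_nas", "100_fallback", "010_enrich_from_cmdb"]))
# ['010_enrich_from_cmdb', '020_route_to_nas', '100_fallback']
```

Under this assumption, a fixed-width numeric prefix is a simple way to make the processing order obvious from the section names.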
population_query is the pre-population, non-targeted query that is executed on startup of the probe and at regular intervals. There should not be a "?" in this query, as no ID substitution will occur. The results of this query are placed in the alarm_enrichment cache for quick retrieval. The following example gathers name, ip, and os_type. Name and ip are used to help match the alarm and os_type is used for updating custom_4.
query is the targeted version of the population_query. It is executed if the required data is not found in the AE cache. Specify a "?" at the end of the query; the ID of the item to look up is substituted for it when the query runs.
Example query which simply returns name, ip and os_type data:
select name,ip,os_type from cm_computer_system
Note that for large databases with large tables that are being queried, the pre-population query may be left empty for better performance.
Note also that if millions of items are stored in the AE cache, cache initialization can take a very long time, so the AE queue may take some time to drain through to the nas.
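The interplay between the population query, the targeted query, and the cache can be sketched as follows. This is a simplified model, not the probe's code: it uses Python's built-in sqlite3 module as a stand-in for a JDBC source, the class and attribute names are invented, and the targeted query "where name = ?" is an assumed counterpart to the example population query.

```python
import sqlite3


class EnrichmentCache:
    """Cache keyed by one column, filled in bulk and topped up by targeted lookups."""

    def __init__(self, conn, population_query, targeted_query, key_column, cache_misses=True):
        conn.row_factory = sqlite3.Row
        self.conn = conn
        self.population_query = population_query  # must not contain "?"
        self.targeted_query = targeted_query      # ends with "... = ?"
        self.key_column = key_column
        self.cache_misses = cache_misses          # models cache_enrichment_query_misses
        self.rows = {}
        self.targeted_lookups = 0

    def prepopulate(self):
        """Run at startup and again at each pre-population interval."""
        if self.population_query:  # may be left empty for very large tables
            for row in self.conn.execute(self.population_query):
                self.rows[row[self.key_column]] = dict(row)

    def lookup(self, key):
        """Return the cached row, falling back to the targeted query on a cache miss."""
        if key in self.rows:
            return self.rows[key]
        self.targeted_lookups += 1
        row = self.conn.execute(self.targeted_query, (key,)).fetchone()
        result = dict(row) if row is not None else None
        if result is not None or self.cache_misses:
            self.rows[key] = result  # a cached None prevents rerunning the query
        return result
```

In this model, an empty population_query means every first lookup goes to the database, which trades startup time for per-alarm latency, as the notes above describe.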
The alarm_enrichment 'bulk_size' variable controls how many messages the probe reads at a time. If AE is able to read the configured number of messages successfully, it continues to read that many. However, if it cannot handle the number of messages you set, e.g., 1200, it automatically decreases the value to 100. We recommend keeping it set to a value of 100.
Listed below is an example nas/alarm_enrichment configuration (nas.cfg) from a lab environment.
cache_enrichment_query_misses = yes
enrichment_loglevel = 3
enrichment_logsize = 50000
enrichment_cache_prepopulation_interval_in_seconds = 21600
enrichment_logfile = alarm_enrichment.log
enrichment_subject = alarm
debug = 3
subject = alarm2
logfile = nas.log
bulk_read_size = 100
active = true
connection_url = jdbc:sqlserver://abcd-1234.LAB.COM:1433;DatabaseName=CA_UIM;loginTimeout=1800;
user = <omitted>
password = <omitted>
query = select case when d.target is null then dev.dev_name + ':' + ci.ci_name else d.target end as target,ccim.ci_metric_id,ccim.ci_metric_type,ccimd.met_description,ccid.ci_description from CM_CONFIGURATION_ITEM_METRIC ccim (nolock) inner join CM_CONFIGURATION_ITEM ci (nolock) on ci.ci_id=ccim.ci_id inner join CM_DEVICE dev (nolock) on ci.dev_id=dev.dev_id inner join CM_CONFIGURATION_ITEM_METRIC_DEFINITION ccimd (nolock) on ccim.ci_metric_type = ccimd.met_type inner join CM_CONFIGURATION_ITEM_DEFINITION ccid (nolock) on ccid.ci_type = ccimd.ci_type left join S_QOS_DATA d (nolock) on ccim.ci_metric_id=d.ci_metric_id where ccim.ci_metric_id = ?
exclusive_enrichment = no
match_alarm_field = met_id
match_alarm_regexp = [\d\D]+
use_enricher = os_enricher
lookup_by_alarm_field = met_id
udata.custom_4 = [cmdb.ci_description]
udata.custom_3 = [cmdb.met_description]
udata.custom_2 = [cmdb.target]
nas.cfg in Raw Configure Mode:
- enrichment source->cmdbs->os_enricher (database connection)
In all cases match on alarm field-> met_id
Use a regexp to select which alarms are processed. [\d\D]+ is a ‘catch-all’ expression that matches all alarms.
If preferred, you can specify specific probes using an OR operator, e.g., (cdm|ntevl|netapp)
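The two expressions above can be tried out with Python's re module before they are placed in the configuration. This is a sketch under assumptions: the probe's regular expression engine is not Python's, whole-value matching is assumed, and applying the probe list to a probe-name field rather than met_id is also an assumption.

```python
import re

CATCH_ALL = re.compile(r"[\d\D]+")           # any non-empty value
PROBE_LIST = re.compile(r"(cdm|ntevl|netapp)")  # one of the listed probe names


def rule_matches(pattern, field_value):
    """True when the whole field value matches the pattern (full-match semantics assumed)."""
    if field_value is None:
        return False
    return pattern.fullmatch(field_value) is not None


print(rule_matches(CATCH_ALL, "M014F2FE1FEFC1F60130A3EAC07D4C938"))  # True
print(rule_matches(CATCH_ALL, ""))                                   # False
print(rule_matches(PROBE_LIST, "cdm"))                               # True
print(rule_matches(PROBE_LIST, "processes"))                         # False
```

Note that in this model the catch-all expression still requires at least one character, so an alarm whose matched field is empty or missing would not match.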
Overwrite rules: overwrite the custom_2, custom_3, and custom_4 alarm attributes with the results of the query.
- Custom_2 alarm field is overwritten with target data from S_QOS_DATA (target)
- Custom_3 alarm field is overwritten with metric description data from CM_CONFIGURATION_ITEM_METRIC_DEFINITION (met_description)
- Custom_4 alarm field is overwritten with configuration item description from CM_CONFIGURATION_ITEM_DEFINITION (ci_description)
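The overwrite rules above can be pictured as simple placeholder substitution. The Python sketch below is an illustration only: the function name, the handling of the "udata." prefix, and the sample row values are assumptions made for the example, not the probe's actual behavior or real data.

```python
import re

PLACEHOLDER = re.compile(r"\[cmdb\.([A-Za-z0-9_]+)\]")


def apply_overwrite_rules(alarm, cmdb_row, rules):
    """Return a copy of the alarm with each rule's target set from the query result row."""
    enriched = dict(alarm)
    for target, template in rules.items():
        # Assumption: "udata.custom_2" addresses the alarm attribute custom_2.
        field = target[len("udata."):] if target.startswith("udata.") else target
        enriched[field] = PLACEHOLDER.sub(
            lambda m: str(cmdb_row.get(m.group(1), "")), template
        )
    return enriched


rules = {
    "udata.custom_4": "[cmdb.ci_description]",
    "udata.custom_3": "[cmdb.met_description]",
    "udata.custom_2": "[cmdb.target]",
}
# Made-up row standing in for one result of the targeted query.
row = {"target": "web01:CPU", "met_description": "CPU Usage", "ci_description": "System.CPU"}
enriched = apply_overwrite_rules({"met_id": "M01", "custom_2": ""}, row, rules)
```

After the call, enriched holds custom_2, custom_3, and custom_4 filled from the row, while the original alarm dictionary is left unchanged.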
Running the aforementioned query in MS SQL Server studio yields these results:
If everything is configured properly, the database connection succeeds and the query runs, and the AE and nas queues are processing alarm messages, the custom fields should populate within about a minute in a small-to-medium environment and possibly take several minutes in a large environment. Check the hub Status tab to make sure the AE and nas queues are sending messages.
Shown below is an example of UIM alarms in the IM alarm sub-console with custom_2 through custom_4 fields populated by the query:
Other nas.cfg notes:
cache_enrichment_query_misses = yes
#caches the query misses so AE does not rerun the query. Applicable to nas v8.56 and higher.
enrichment_cache_prepopulation_interval_in_seconds = 21600
#If using pre-population, run the prepopulation query every 6 hours to refresh the cache.