jonhcw

NIS database and SLAs

Blog Post created by jonhcw on Apr 18, 2015

The ability to provide SLAs is an important aspect of a monitoring tool. In UIM you can configure quite versatile and flexible SLAs, but unfortunately the interfaces for managing them leave room for improvement. I do a lot of daily SLA management directly in the database, as it is much easier than in SLM, especially when dealing with a large number of SLAs. In this post, I am going to explain the basic structure of an SLA from the perspective of the database. In further entries I will demonstrate some management tips in the form of queries and discuss automatic creation of SLAs, SLA calculation specifics and some other custom development.

 

What I write here is all based on my personal experience and investigation; none of it comes from support or other CA staff.



The structure of an SLA

Here is a simplified diagram of the tables required for an SLA. It aims to explain the basic structure of an SLA, so I'm leaving out things such as SLA alarms and grouping of SLAs. I will explain in another entry how the ownership and grouping of SLAs work.

sla-structure.png

Yellow rows are primary keys

 

Definitions

This reads from left to right, much like an SLA in the SLM tool. First, you have the SLA. Within an SLA you have a number of SLOs, and within each SLO you have QoS objects, which are stored in the S_QOS_CONSTRAINTS table. There's not much special about these definition tables, but they are the core of it.

  • S_SLA_DEFINITION contains mostly rather obvious things: whether it's a weekly or monthly SLA, alarm levels, calculation links, SLA compliance percentage, etc.
  • S_SLO_DEFINITION contains less data, plus one peculiarity that I haven't been able to figure out. It has a compliance percentage column (like the SLA definition) whose behaviour and effect I haven't been able to track down. Sometimes the value is the same as the SLA's, but most often it is 100.00 (100%). It even varies between two otherwise identical SLAs that were created with the SLA wizard at the same time. I haven't witnessed this field having any effect on the actual calculations or reporting (the SLA table entry is used to color the SLA in GUIs, after all). Moreover, I don't think I've seen this value shown anywhere in the SLM tool.
  • S_QOS_CONSTRAINTS table is somewhat more interesting than the two previous ones. This table contains all the QoS entries that have been defined for all the SLOs. There are two interesting things to consider about it:
    • It contains a column named dirty. This is another field that I haven't been able to figure out. It seems, however, that the sla_engine or some other component updates it automatically now and again, so I haven't concerned myself with it much. It would be interesting to know what it means, though. Something to look into when (if) one of those quiet moments comes.
    • QoS is referred to by QoS name (for example QOS_DHCP_RESPONSE), source and target. This could be explained by historical reasons (met_id hasn't always existed). It might be difficult to change now, and not all probes support the TNT model yet: they do not send dev_id and met_id. Whereas that might be somewhat acceptable when it only means, for example, that things are left out of USM groups, it would hardly be acceptable if you couldn't create SLAs based on that data. This does, however, complicate things such as keeping data up to date.
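
To make the left-to-right structure concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in for the NIS database. Only the table names, the dirty column and the qos/source/target triple come from the description above; every other column name (sla_id, slo_id, constraint_id, name, compliance_percentage) and all the sample rows are assumptions made for illustration, so check the real schema before adapting it.

```python
import sqlite3

# Stand-in database. Table names are from the post; the key columns
# (sla_id, slo_id, constraint_id) and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S_SLA_DEFINITION (
    sla_id INTEGER PRIMARY KEY, name TEXT, compliance_percentage REAL);
CREATE TABLE S_SLO_DEFINITION (
    slo_id INTEGER PRIMARY KEY, sla_id INTEGER, name TEXT,
    compliance_percentage REAL);
CREATE TABLE S_QOS_CONSTRAINTS (
    constraint_id INTEGER PRIMARY KEY, slo_id INTEGER,
    qos TEXT, source TEXT, target TEXT, dirty INTEGER);
""")
conn.execute("INSERT INTO S_SLA_DEFINITION VALUES (1, 'DHCP service', 99.5)")
conn.execute("INSERT INTO S_SLO_DEFINITION VALUES (10, 1, 'Response time', 100.0)")
conn.executemany(
    "INSERT INTO S_QOS_CONSTRAINTS VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, 10, "QOS_DHCP_RESPONSE", "robot01", "dhcp01", 0),
        (101, 10, "QOS_DHCP_RESPONSE", "robot02", "dhcp01", 0),
    ],
)


def qos_for_sla(conn, sla_id):
    """Walk SLA -> SLO -> QoS constraint for one SLA."""
    return conn.execute(
        """
        SELECT sla.name, slo.name, q.qos, q.source, q.target
        FROM S_SLA_DEFINITION sla
        JOIN S_SLO_DEFINITION slo ON slo.sla_id = sla.sla_id
        JOIN S_QOS_CONSTRAINTS q ON q.slo_id = slo.slo_id
        WHERE sla.sla_id = ?
        ORDER BY q.source
        """,
        (sla_id,),
    ).fetchall()


rows = qos_for_sla(conn, 1)
for row in rows:
    print(row)
```

Note that each QoS constraint is identified by its name, source and target rather than by a metric id, which is why the join stops at S_QOS_CONSTRAINTS instead of continuing to a metric table.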

 

 

 

Calculations

Each definition table has a corresponding S_***_CALCULATION table. Each element must have a calc_id defined in its corresponding table, which means that all elements must have calculation rules. They are defined as follows:

  • S_SLA_CALCULATION entries are what you see in the new SLA dialogue's "Calculation Method" dropdown list: Average, Best, Sequential, Weight, Worst.
  • S_SLO_CALCULATION entries are what you see in the sla_engine probe's General -> Plugins tab.
  • S_QOS_CONSTRAINTS entries are what you see in the calculation profiles section of SLM, under both SLO Calculations and QoS Calculations. You see these options in the SLO window in SLM as well as in the QoS window, depending on your choices.

 

As the above description of S_QOS_CONSTRAINTS in particular suggests, the table a setting is stored in doesn't necessarily match the SLM window the setting is entered in. I will talk about this in more detail when I cover automating SLA creation.
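
Since every element needs a calc_id, one useful sanity check when editing the database by hand is to look for definitions whose calc_id is missing or points at nothing. The sketch below shows the idea for SLAs only, again on an in-memory SQLite stand-in. The calculation names are the ones listed above; the column names (sla_id, name, and the assumption that calc_id is the key of S_SLA_CALCULATION) and the sample rows are hypothetical.

```python
import sqlite3

# Stand-in database; column layout is assumed, not taken from the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S_SLA_CALCULATION (calc_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE S_SLA_DEFINITION (
    sla_id INTEGER PRIMARY KEY, name TEXT, calc_id INTEGER);
""")
conn.executemany(
    "INSERT INTO S_SLA_CALCULATION VALUES (?, ?)",
    [(1, "Average"), (2, "Best"), (3, "Sequential"), (4, "Weight"), (5, "Worst")],
)
conn.executemany(
    "INSERT INTO S_SLA_DEFINITION VALUES (?, ?, ?)",
    [
        (1, "DHCP service", 5),  # valid: Worst
        (2, "Web shop", None),   # calc_id missing
        (3, "Mail", 9),          # calc_id points at a non-existent row
    ],
)


def slas_without_valid_calc(conn):
    """Return SLAs whose calc_id is NULL or does not match any calculation."""
    return conn.execute(
        """
        SELECT d.sla_id, d.name
        FROM S_SLA_DEFINITION d
        LEFT JOIN S_SLA_CALCULATION c ON c.calc_id = d.calc_id
        WHERE c.calc_id IS NULL
        ORDER BY d.sla_id
        """
    ).fetchall()


broken = slas_without_valid_calc(conn)
print(broken)
```

The LEFT JOIN keeps every definition row and leaves the calculation side empty where nothing matches, so one query catches both the NULL case and the dangling reference.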

 

Time constraints

UIM uses three different tables for operating periods. The contents of S_OPERATING_PERIOD are what you see under Operating Periods in SLM. The contents of S_TIME_SPECIFICATION are what you see when you open an operating period entry: it contains, in numeric format, the weekdays and times that the operating period is specified for. D_OPERATING_PERIOD is the table that you don't see in SLM but that is, by my reckoning, the most important one. It contains the oper_id and expression columns. The expression is basically the operating period in the form of an SQL query, or more accurately the filter part of the query (what comes after WHERE). It is used to fetch the QoS data for the SLA calculation, so it is an important field when working with automation.
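
As a rough illustration of how a stored filter expression can drive the data fetch, the sketch below keeps a WHERE fragment in D_OPERATING_PERIOD and appends it to a query over sample data. It runs on an in-memory SQLite stand-in, so the expression is written in SQLite syntax; the real expressions will be in the dialect of your NIS database and will look different. The qos_samples table, its columns and the sample values are all hypothetical.

```python
import sqlite3

# Stand-in database. D_OPERATING_PERIOD with oper_id and expression is from
# the post; qos_samples and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE D_OPERATING_PERIOD (oper_id INTEGER PRIMARY KEY, expression TEXT);
CREATE TABLE qos_samples (qos TEXT, sampletime TEXT, samplevalue REAL);
""")

# Assumed expression: Monday to Friday, 08:00 to 16:59 (SQLite syntax).
office_hours = (
    "CAST(strftime('%w', sampletime) AS INTEGER) BETWEEN 1 AND 5 "
    "AND time(sampletime) BETWEEN '08:00:00' AND '16:59:59'"
)
conn.execute("INSERT INTO D_OPERATING_PERIOD VALUES (1, ?)", (office_hours,))
conn.executemany(
    "INSERT INTO qos_samples VALUES (?, ?, ?)",
    [
        ("QOS_DHCP_RESPONSE", "2015-04-13 09:00:00", 12.0),  # Monday, in hours
        ("QOS_DHCP_RESPONSE", "2015-04-13 22:00:00", 50.0),  # Monday, too late
        ("QOS_DHCP_RESPONSE", "2015-04-18 10:00:00", 80.0),  # Saturday
        ("QOS_DHCP_RESPONSE", "2015-04-14 15:30:00", 14.0),  # Tuesday, in hours
    ],
)


def samples_in_period(conn, qos, oper_id):
    """Fetch sample values for one QoS, filtered by the stored expression."""
    (expression,) = conn.execute(
        "SELECT expression FROM D_OPERATING_PERIOD WHERE oper_id = ?", (oper_id,)
    ).fetchone()
    # The expression is pasted in as SQL text, so it must come from a
    # trusted table, never from user input.
    query = (
        "SELECT samplevalue FROM qos_samples "
        f"WHERE qos = ? AND ({expression}) ORDER BY sampletime"
    )
    return [value for (value,) in conn.execute(query, (qos,))]


values = samples_in_period(conn, "QOS_DHCP_RESPONSE", 1)
print(values)
```

Only the two samples that fall inside the operating period survive the filter, which is why a wrong or stale expression quietly changes the outcome of an SLA calculation.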

 

Exclusion periods

Exclusion periods aren't strictly necessary for an SLA, but I'm mentioning them here mostly for future reference. SLAs and SLOs both have their own exclusion tables, and fortunately they handle dates in very much the same way as maintenance windows (the new maintenance_mode probe for USM) do. This means that with very little effort you can transfer maintenance periods to exclusion periods. I will talk more about that in another entry.
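
Because the date handling matches, transferring a maintenance period can be as small as a single INSERT ... SELECT. The sketch below shows only the shape of such a copy on an in-memory SQLite stand-in. Both table names (maintenance_window, sla_exclusion) and all of their columns are hypothetical placeholders, not the real maintenance_mode or SLM exclusion tables.

```python
import sqlite3

# Both tables are hypothetical placeholders for the real maintenance and
# exclusion tables, whose names and columns are not covered in this post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maintenance_window (
    window_id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT,
    description TEXT);
CREATE TABLE sla_exclusion (
    sla_id INTEGER, start_time TEXT, end_time TEXT, note TEXT);
""")
conn.execute(
    "INSERT INTO maintenance_window VALUES (?, ?, ?, ?)",
    (1, "2015-04-20 22:00:00", "2015-04-21 02:00:00", "Patch night"),
)


def copy_window_to_sla(conn, window_id, sla_id):
    """Copy one maintenance window's start and end into an SLA exclusion row."""
    conn.execute(
        """
        INSERT INTO sla_exclusion (sla_id, start_time, end_time, note)
        SELECT ?, start_time, end_time, description
        FROM maintenance_window
        WHERE window_id = ?
        """,
        (sla_id, window_id),
    )


copy_window_to_sla(conn, window_id=1, sla_id=42)
excluded = conn.execute("SELECT * FROM sla_exclusion").fetchall()
print(excluded)
```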
