carl

SNMPTD - Generic Traps

Posted by carl Dec 16, 2013

Deploy the SNMPTD probe.

 

Select the settings button (top left) and tick "Enable Generic" at the bottom left of the settings window. [You might also like to bump up the log level/size while you're here in case you need to debug.]

 

 

Image 000.png

 

 

Then on the generic tab of the settings page tick the "Convert to Nimsoft alarm" box.

 

In later updates I might go a little further into the options here, but for now let's just go ahead and use the defaults.

 

 

Image 001.png

 

 

In the main config window for SNMPTD there is an icon on the menu bar for the "Trap Monitor" Image 002.png

 

I use a free Windows tool in my testing to generate traps, which the monitor picks up. If your SNMP device is correctly configured to point to the SNMPTD probe, you should see entries here when you click play (when and how many depends on the interval at which traps are being sent).

 

 

Image 004.png

 

 

You can see the top trap has a different icon to the rest. That's because I had a profile in place for "Generic" when that trap was received.

 

Again in the main config window under V2 & V3 traps create a new profile:

 

 

Image 006.png

 

Name it something like "Generic", set the "Trap Object Identifier" to . [dot, period] and tick "Convert to Nimsoft Alarm".

 

Applying the probe settings should now start converting traps into Nimsoft alarms, possibly a lot of them, so use with care.

 

Image 008.png


Tutorial Requests

Posted by carl Sep 3, 2013

Promising to make more time to create these articles and tutorials, I might even ask for help and advice from those of you who are interested in contributing. Any takers?

Feel free to post comments to this thread for requests or vote up other requests by giving kudos.

Open the NAS configuration GUI and navigate to "Auto-Operator" -> "Scripts" tab -> Right-Click and NEW -> Script

 

 

 

Image 001.png

 

 

We now have a new window with a script editor:

 

 

Image 002.png

 

 

Attached to this guide is the "NAS WHITEPAPER" PDF. You might like to have this open while we discuss some of the functions we will use.

 

The first job is to get the alarm. When we put this script to work in production, the NAS will be used to find the NIMID of an alarm. The NIMID is a unique identifier used to select a specific single alarm or chain of alarms. While we develop this script, however, we will need the NIMID of an existing alarm we can play with.

 

Let's create a test alarm so that we do not change any of our existing alarms, which might cause confusion for other users.

 

Navigate to the NAS status tab and find the "Send test alarm" icon: Image 004.png 

 

This will open a new window where we can create a test alarm.

 

Image 003.png

 

Now in our IM alarm console we can see the new alarm:

 

Image 005.png

 

You might need to drag the first column a little wider to see the NIMID:

 

Image 006.png

 

 

 

So back to our script window. We will get this alarm and store it in a LUA table.

 

a = alarm.get("VF69259077-26007")

 

 

This will create a table called "a" and store the alarm details in there. Again while developing we pass the NIMID as a string so be sure to wrap in "". Later we will just use alarm.get() and the NAS will pass the NIMID.

 

 

I have some pre-written code that provides us with a small function to print the content of a table to the screen in a human readable form. Add the following to the top of your script.

 

--Table Dumper - Used in table
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

 

Now we can use this function to print out the table:

 

tdump(a)

 

We pass our table "a" to the function. As "a" is a variable holding a table rather than a string, there is no need to wrap it in "".

 

Our script should now look like this:

 

--Table Dumper - Used in table
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

a = alarm.get("VF69259077-26007")

tdump(a)

 

Go ahead and execute this script using the "play" icon at the top of the script editor: Image 007.png

 

Below is the output:

 

----------- Executing script at 02/09/2013 09:33:48 ----------

  root:
    robot:cloud
    i18n_dsize:0
    source:10.0.0.1
    origin:cloud.teknetik.co.uk
    time_origin:2013-09-02 09:26:47
    arrival:1378110412
    supp_key:123456
    hostname:10.0.0.1
    level:4
    nimid:VF69259077-26007
    aots:0
    change_id:10778ADF18D41BE15F3F52FA26179BC5
    time_arrival:2013-09-02 09:26:52
    domain:teknetik.co.uk
    visible:1
    severity:major
    hub:cloud.teknetik.co.uk
    nimts:1378110407
    subsys:NMS
    message:Testmessage
    tz_offset:-3600
    nas:cloud.teknetik.co.uk
    user_tag1:newznab
    suppcount:0
    supp_id:DE98D080A00476F2C0FFEDF5DFD44502
    event_type:1
    sid:1

 

Now we can see the makeup of the alarm as a key/value table. 

 

Let's go ahead and change the alarm message. I am not going into too much detail on tables here as there is a separate tutorial.

 

a.message = "New Alarm Message"

 

Add the above line just above the tdump(a) call and run the script again.

 

The tdump output will now show this has been updated:

 

message:New Alarm Message

 

Before we resend this edited alarm I want to quickly cover concatenation, as I feel this will be a common request.

 

Say we want to just append some text to the alarm message:

 

a.message = a.message.." +my text"

 

This line states that a.message should be updated with the original contents of a.message, and the double dot ".." concatenates our new string to the end. Don't forget to wrap strings in "" and add your formatting spaces.

 

We can also use combinations of existing table strings or other variables:

 

myVar = "source"
a.message = a.message.." "..myVar.." "..a.source

 

Here we create a new variable holding the string "source" and use the value of the key a.source. Notice the extra concatenation and strings added to insert spaces for formatting.

 

Here is the result:

 

message:Testmessage source 10.0.0.1
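
As a quick standalone sketch of the same idea, the snippet below uses a hand-built table in place of the one alarm.get() returns (an assumption, so that it can be run in any Lua interpreter outside the NAS). The string.format line is simply an alternative way to build the same string and is not something the NAS requires.

```lua
-- Stand-in for the table that alarm.get() returns inside the NAS (assumption:
-- only the two keys used here are included)
local a = { message = "Testmessage", source = "10.0.0.1" }

-- Append text with the concatenation operator, adding the spaces by hand
local myVar = "source"
local appended = a.message .. " " .. myVar .. " " .. a.source

-- string.format builds the same string from a template
local formatted = string.format("%s %s %s", a.message, myVar, a.source)

print(appended)
print(formatted)
```

Both lines print the same text; which style you use is a matter of taste, although the template form can be easier to read once several values are involved.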

 

Page 21 of the attached NAS Whitepaper discusses all the available fields you can edit or add.

 

Once we have the changes we want to make in place we need to send the alarm back onto the NIMBUS:

 

alarm.set(a)

This will send the alarm back to the NIMBUS and the NAS will update your alarm console:

 

Image 008.png

 

 

Full script now looks like this:

 

--Table Dumper - Used in table
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

a = alarm.get("VF69259077-26007")

myVar = "source"
a.message = a.message.." "..myVar.." "..a.source
alarm.set(a)

tdump(a)

 

GREAT! We can now manipulate alarms based on auto-operator profiles. Let's take a quick look at how to create that profile. If you're familiar with auto-operator profiles you might want to skip this part and start having fun with LUA :)

Follow this link for creating NAS auto operator rules [coming soon!]

This article is intended for all users, experienced or not, and I hope it is written in such a manner as to be accessible to all. Diving right in, open the NAS configuration screen on your primary NMS hub server:

 

[If you are experienced in auto-operator profiles and LUA and just want to grab the code the script is attached at the bottom of the guide]

 

 

 

Select: Auto-Operator - Scripts -> <Right-Click> - New

 

This opens the script editor window, save this right away and call it "nexec"

 

 

 

Now copy and paste the following code into the top of the script. Don't worry if nothing makes any sense; you won't need to understand any of this code to use the tutorial. The code is a "function" that allows the printing of "tables" in a human readable format and really eases the process.

 

--Table Dumper - Used in table
function tdump(t)
  -- The inner helper must be named dmp, as that is the name called below
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

 

 

Next we need an alarm to work with. The finished script will process an alarm as it enters the NAS, but that's tedious when developing, so we use a "static" alarm. My dev environment won't change much, but in production alarms come and go, so choose your alarm carefully: if it clears or gets acknowledged it might throw you off. From the alarm console in Infrastructure Manager (IM) or UMP find an alarm and write down its ID. [TIP: Within IM you might need to widen the column with the "color" alarm severity indicator, as by default the ID is not shown]

 

 

For this example I took the first alarm in my console with an ID of "VV72149947-04467". The next step is to GET this alarm from the NAS and store its contents as a table. For this we use the alarm.get() function:

 

a = alarm.get("VV72149947-04467")

 

 

This simple one line gets the alarm, easy as that! Using the function we pasted in the last step we can print out that table and take a look at the alarm detail. Provided below is the code and STDOUT [you only need to copy the tdump(a) into your code]

 

tdump(a)

----------- Executing script at 08/10/2012 17:40:04 ----------

  root:
    robot:suse-nms6
    i18n_dsize:0
    source:esxi2.nimsoft.no
    origin:nevil-nmshub
    time_origin:2012-09-26 10:29:37
    arrival:1348651782
    hub:nevil-nmshub
    visible:1
    time_supp:2012-10-08 17:39:42
    prevlevel:4
    supp_key:"ESC:H:44454c4c-5300-1043-8054-c4c04f58344a"."Disk"."Disk Write RateESC:402"
    nimid:VV72149947-04467
    supptime:1349714382
    level:4
    hostname:esxi2.nimsoft.no
    aots:1349449180
    change_id:F895A8C9794BD483514EA6C559A9AB87
    dev_id:D0e42d2226baa63a8e29b987337ed78f5
    suppcount:3542
    domain:nevil-nmsdom
    severity:major
    nas:nevil-nmshub
    nimts:1348651777
    subsys:Disk
    message:The disk usage for esxi2.nimsoft.no.Disk Write Rate on 130.119.1.2:443 is outside expected limits (287.94 > 100)
    tz_offset:-3600
    met_id:M598929e72112f857db9e557f82e5a72d
    time_arrival:2012-09-26 10:29:42
    prid:vmware
    supp_id:53CF19BDB2CE3ED21B2B87EF2066C711
    event_type:2
    sid:1.1.1.1

 

 

The LUA table is built from pairs of keys and values; for example, the key "robot" has a value of "suse-nms6":

 

robot:suse-nms6
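
To make the key/value idea concrete, here is a minimal sketch that can be run in any Lua interpreter. The table is hand-built as a stand-in for the alarm (an assumption; only two of the keys from the output above are included), and it shows that a.robot and a["robot"] read the same value.

```lua
-- Hand-built stand-in for an alarm table (assumption: two keys only)
local a = { robot = "suse-nms6", level = 4 }

-- Dot syntax and bracket syntax read the same key
print(a.robot)
print(a["robot"])

-- pairs() walks every key/value pair; the order is not guaranteed
local count = 0
for key, value in pairs(a) do
  print(key .. ":" .. tostring(value))
  count = count + 1
end
```

The bracket form is handy when the key name is itself held in a variable.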

 

From this information we can build the "address" of the robot that generated the alarm so we can send it commands. We do this by creating a local variable called "addr" to reference later. The first line below "concatenates" (using the double dots ..) the values of the keys domain, hub and robot to provide the address. You can go ahead and remove or comment out the tdump(a) line (-- at the start of a line makes it a comment) and copy the following:

 

local addr = '/'..a.domain..'/'..a.hub..'/'..a.robot..'/'

print(addr)

 [Note: At this point of writing I had to use another alarm ID so watch for this as the robot changes]

 

Now we should have a line printed to the output section of the LUA editor which reads similar to the following:

 

----------- Executing script at 17/10/2012 09:11:10 ----------

/nevil-nmsdom/nevil-nmshub/suse-nms6/
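
If the alarm is missing one of those keys, the concatenation stops the script with an error. The sketch below wraps the address building in a small helper that checks for this first. The helper name robot_address and the sample table are hypothetical and only there for illustration; the table stands in for what alarm.get() returns so the snippet runs outside the NAS.

```lua
-- Hypothetical helper: builds the robot address from an alarm table,
-- returning nil plus a reason when a required key is missing
local function robot_address(a)
  if type(a) ~= "table" or not (a.domain and a.hub and a.robot) then
    return nil, "alarm is missing domain, hub or robot"
  end
  return "/" .. a.domain .. "/" .. a.hub .. "/" .. a.robot .. "/"
end

-- Stand-in for an alarm table (values taken from the output above)
local sample = { domain = "nevil-nmsdom", hub = "nevil-nmshub", robot = "suse-nms6" }

print(robot_address(sample))

-- A table without hub and robot yields nil and an explanation
local addr, err = robot_address({ domain = "nevil-nmsdom" })
print(addr, err)
```

Returning nil and a message, rather than raising an error, lets the calling script decide whether to log the problem or simply skip the alarm.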

 

That address will be used to communicate with the robot the alarm was sent from. The next step is to build our "call back" to the robot. The process here is quite simple even if it looks complicated to a first-time LUA scripter. Each probe has a list of built-in functions (call backs) that can be called via LUA (or another SDK). Let's take a look at some of the NEXEC call backs via the GUI interface we call the "Probe Utility".

 

Select the NEXEC probe so it is highlighted and then press CTRL+P to open the probe utility. You will see a screen like the screenshot below.

 

 

 

Notice the drop down list "Probe commandset"; this lists the call backs we can issue. Closing this, we can now create a test profile in the NEXEC probe. I am using a Linux robot for this tutorial, so if you are using MS Windows change the simple NEXEC profile to something similar for Windows. Open the config for NEXEC on your robot and create a test profile; here I just echo the word test to a file.

 

 

 

 

 Give the profile a quick test run:

 

 

 

As you can see, the profile runs successfully and "test" is echoed to the file /tmp/lua_test. Open the Probe Utility again, select the "run_profile" callback from the drop down, enter the profile name [NOTE: This is case sensitive, on Linux at least] and select the green arrow on the top tool bar to execute the profile.

 

 

 

 This should return with "command status: OK" at the bottom of the screen. Right, back to our LUA script editor.

 


As a recap, our script should look like the following with our TDUMP function, a line to grab an alarm and build the nimbus address from that alarm. [Note extra comments and removal of the print(addr)]

 

--Table Dumper - Used to "dump" the contents of a table (used for debug)
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  -- Call the inner helper here; calling tdump itself would recurse forever
  dmp(t, 1, "root")
end
--End Table dumper code

-- Get the Alarm using the NIMID
a = alarm.get("PI59815363-66418")

-- Form a variable "addr" which will contain the nimbus address of the robot we issue the callback to.
local addr = '/'..a.domain..'/'..a.hub..'/'..a.robot..'/'

-- Print the address for debug, comment out until required.
-- print(addr)

Create a variable to hold the name of the probe we want to issue a call back to:

 

probe ="nexec"

 

Create a variable to hold the name of the callback we wish to issue:

 

cmd = "run_profile"

The next step is ever so slightly more complicated (at first). When we run a call back it expects a PDS (essentially a table of key, value pairs) that holds the arguments for the probe. To create an "empty" PDS structure called "args" add the following line:

 

local args = pds.create()

Now we can add our arguments to the PDS structure. In this example we only have one argument to pass, the name of the profile. Other probe callbacks may require a number of arguments to be issued. Let's go ahead and insert an argument for the profile name we want to run:

 

pds.putString(args, "profile", "Test")

pds.putString is simply a function that adds a string to a PDS (obvious I know!). Its first argument is the PDS structure (args in this case, which is a local variable so there is no need to wrap it in quotes). Next come the key and the value, which are strings and so need to be wrapped in quotes and separated by a comma; these form the actual argument we want to send, in this case profile and Test.

 

[NOTE: When adding an integer use pds.putInt(args, "number", 1). This is for future reference and not required now]

 

Now we have all the variables we need to issue the call back. The callback is sent using the nimbus.request function. The following lines are an example:

 

output,rc = nimbus.request(addr..probe, cmd, args)
print(rc)

Let's break that down to make it easy to understand:

 

  • output - A variable that will hold the output returned from the command
  • rc - Holds the return code, which will tell us if the command was executed successfully.
  • nimbus.request() - Is the LUA SDK function that sends the callback to the probe and requires three arguments to complete.

nimbus.request requires: the nimbus address of the probe, the callback to issue, and any arguments that callback expects.

 

  • addr..probe - concatenates the two variables addr and probe to form the full nimbus address of the NEXEC probe i.e. /nevil-nmsdom/nevil-nmshub/suse-nms6/nexec
  • cmd - the variable we built with the callback "run_profile"
  • args - the pds structure holding the arguments
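
The three pieces above can be pulled together into one small function. This is only a sketch under stated assumptions: the wrapper name run_profile is hypothetical, and the stand-in pds and nimbus tables exist purely so the snippet can run outside the NAS (inside the NAS the real libraries are picked up instead). The order of the return values (output, rc) follows the example line above.

```lua
-- Stand-ins used only when the NAS libraries are absent (assumption: they
-- mimic just the three calls used in this tutorial)
local pds = pds or {
  create = function() return {} end,
  putString = function(p, key, value) p[key] = value end,
}
local nimbus = nimbus or {
  request = function(address, command, args)
    print("would call " .. command .. " on " .. address)
    return {}, 0
  end,
}

-- Hypothetical wrapper: runs the named NEXEC profile on the robot at addr
local function run_profile(addr, profile)
  local args = pds.create()
  pds.putString(args, "profile", profile)
  local output, rc = nimbus.request(addr .. "nexec", "run_profile", args)
  return rc, output
end

local rc = run_profile("/nevil-nmsdom/nevil-nmshub/suse-nms6/", "Test")
print(rc)
```

Keeping the probe name and callback inside the function means the rest of the script only has to supply the address and the profile name.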

The very last line, print(rc), prints the return code to the screen and, all being well, will return 0. But wait, what happens if it returns 4? What does that mean? Good point! Copy the following function and insert it into the script below our tdump function:

 

--Added the error codes function for extra debug
function codes(rc)
  -- Keep the lookup table local so it does not overwrite the global function "codes"
  local names = {[0]="OK",[1]="error",[2]="communication error",[3]="invalid argument",[4]="not found",[5]="already defined",
    [6]="permission denied",[7]="temporarily out of resources",[8]="out of resources",[9]="no space left",[10]="broken connection",
    [11]="command not found",[12]="login failed",[13]="SID expired",[14]="illegal MAC",[15]="illegal SID",[16]="Session id for hub is invalid",
    [17]="Expired",[18]="No valid license",[19]="Invalid license",[20]="Illegal license",[21]="Invalid operation finv",[100]="user error from this value and up"}
  -- Codes of 100 and above are user errors; any other unlisted value is reported as unknown
  local text = names[rc] or (type(rc) == "number" and rc >= 100 and names[100]) or "unknown"
  print("The return code is: "..text.."\n\n")
end

 

Now replace the last line print(rc) with:

 

codes(rc)

 When you execute the script now the output will return a human readable interpretation of the return code:

 

----------- Executing script at 17/10/2012 10:16:19 ----------    The return code is: OK

 

If it returns anything other than OK you have a problem. A common error would be "communication error", which would suggest the nimbus address is wrong; check your addr and probe variables.

 

Assuming all went well this will have triggered the "Test" profile for the NEXEC probe! Congratulations!!

 

Only one last change: when we used the alarm.get function we passed it a NIMID so it would select a specific alarm to work against. Remove this NIMID. Now when the script is run via an auto-operator profile it will use the NIMID of the alarm that triggered the script execution.

 

Your full script should now look like this:

 

--Table Dumper - Used to "dump" the contents of a table (used for debug)
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

--Added the error codes function for extra debug
function codes(rc)
  -- Keep the lookup table local so it does not overwrite the global function "codes"
  local names = {[0]="OK",[1]="error",[2]="communication error",[3]="invalid argument",[4]="not found",[5]="already defined",
    [6]="permission denied",[7]="temporarily out of resources",[8]="out of resources",[9]="no space left",[10]="broken connection",
    [11]="command not found",[12]="login failed",[13]="SID expired",[14]="illegal MAC",[15]="illegal SID",[16]="Session id for hub is invalid",
    [17]="Expired",[18]="No valid license",[19]="Invalid license",[20]="Illegal license",[21]="Invalid operation finv",[100]="user error from this value and up"}
  -- Codes of 100 and above are user errors; any other unlisted value is reported as unknown
  local text = names[rc] or (type(rc) == "number" and rc >= 100 and names[100]) or "unknown"
  print("The return code is: "..text.."\n\n")
end

-- Get the alarm that triggered the script (no NIMID is passed, as described above)
a = alarm.get()

-- Form a variable "addr" which will contain the nimbus address of the robot we issue the callback to.
local addr = '/'..a.domain..'/'..a.hub..'/'..a.robot..'/'

-- Print the address for debug, comment out until required.
-- print(addr)

probe ="nexec"
cmd = "run_profile"
local args = pds.create()
pds.putString(args, "profile", "Test")

output,rc = nimbus.request(addr..probe, cmd, args)
codes(rc)

Ensure you have a working SNMPGET probe installed, highlight the probe and press CTRL+P to open the probe utility. From the probe commandset drop down box select the "query_agent" call back and populate the appropriate fields. Here is an example:

 

 

 

I have a basic Debian Linux server running SNMPD with protocol V2. The SNMPGET probe configuration is quite simple; the only thing I want to point out here is the root and oid values. By setting them the same, SNMPGET returns a single oid value rather than walking from the root to the oid and returning multiple values. This might be obvious to experienced SNMP users, but not to everyone.

 

I suggest using the probe utility above as it's quicker to configure and experiment with values than using the Dashboard Designer interface. We will now use these values to populate a text box in Dashboard Designer with the "variable" return value from the oid we selected above, in this case "Debian-apache".

 

Create a new Dashboard in Dashboard Designer. Drag a simple text box to the canvas and select data source.

 

 

 

Populate the probe tab with the same details we used in step one using the probe utility. Don't forget to fill in the arguments section by clicking on the "arguments" button:

 

 

One last config is the advanced button:

 

 

Here we see the return from the OID, and using the drop down "Result Token" we can select which value to choose. I selected "variable" because, unlike the "value" return, it prints debian-apache without quotes.

 

 

Hit the apply button and drag the "Arrow" icon to the text box we created. Save, publish or preview the dashboard:

 

 

You should now see your SNMP value printed to screen! (Added a computer icon for added artistic value!)

 

Introduction


Converged infrastructure stacks like FlexPod for VMware represent a new standard in the delivery of IT services—a complete data center in a box. FlexPod is a pre-integrated, pre-tested, and best-of-breed set of IT solutions that accelerate private  cloud implementation, reduce total cost of ownership (TCO), and increase business agility.  
 
Nimsoft Monitor for FlexPod is the first solution to offer comprehensive, integrated monitoring coverage of all elements of the FlexPod platform. Nimsoft Monitor for FlexPod tracks all the physical, virtual, and application elements of FlexPod cloud environments. Further, it provides sophisticated monitoring while streamlining administration—offering automated discovery, configuration, and monitoring. As a result, Nimsoft Monitor enables you to get the monitoring insights you need to quickly respond to outages, and even prevent them from happening in the first place. With these unified monitoring capabilities, Nimsoft Monitor enables enterprises and service providers to take full advantage of their FlexPod investments.
 
This document defines the best practices for monitoring FlexPod platforms with Nimsoft Monitor, outlining how to set up key alarms and establish appropriate baseline thresholds. It is not intended to provide a complete list of all the elements the solution can monitor and collect. This document is organized by the major components that exist within a FlexPod platform and includes monitoring information for both hardware and software.
 
Nimsoft Monitor for FlexPod includes the following probes:

 

  • CDM (CPU, Disk, Memory) Probe (cdm)
  • Cisco Monitor Probe (cisco_monitor)
  • Cisco Unified Computing System (UCS) Probe (cisco_ucs)
  • NetApp Storage (netapp)
  • Interface Traffic Probe (interface_traffic)
  • Processes Probe (processes)
  • Cisco NX-OS Probe (cisco_nxos) and other SNMP Probes (snmptoolkit, snmpget)
  • SQL Server Probe (sql_server)
  • VMware Probe (vmware)

 

The full document can be downloaded below.

What logmon does


An important source of information for an IT operations staff is the wide variety of log-files on the systems they maintain. Checking these files manually is a very time-consuming job, and it may also be a challenge for all members of the staff to be able to interpret all types of messages in all types of logs. The Nimsoft Log File monitoring probe (logmon) can simplify the job for the systems operations staff by:

 

  • Automatically informing about error situations immediately after they have occurred.  
  • Filtering out the log-file entries that need manual action. Usually the majority of entries in a log-file are not of interest to the daily operations staff. By setting up watchers and filters in logmon, alarms are generated only for the important log-file messages.
  • Specifying a more informative alarm message by modifying the original message text, thus helping the operations staff to locate and fix the problem more easily, without requiring assistance from the system specialist.

Logmon can be configured to monitor ASCII log-files in any format. Experience has shown that very few log-files have the same layout. Some files are line-oriented (single-line files like the UNIX system log-files /usr/adm/messages), while other log-files are record-oriented (multiple-line files, like the ones produced by Oracle). The logmon probe monitors both line-oriented and record-oriented log-files effectively, using a powerful regular expression and/or pattern-matching scheme. The probe checks the log-file for new entries at user-configurable, timed intervals, keeping track of the position within the file between each run. This ensures that only one alarm is sent per log-entry, even if the log-file is truncated or wrapped in the meantime.


A single instance of the logmon probe can be configured to monitor multiple log-files. Within each log-file, logmon can be set up to look for occurrences of many different log-file entries, with each log-file entry generating a different alarm message, which may contain both text from the original log-file entry and/or user-defined text.

 

To read the rest of this document please download the attached PDF


REGEX: Part 1 - The Basics

Posted by carl Aug 29, 2012

We get a lot of cases logged with regards to Regex and its usage. For those who may not have used regex before, here is a little guide:

 

Regex is a powerful pattern matching language used as part of many of the Nimsoft Probes and Tools. The following is a brief introduction to Regex and its usage within the Nimsoft NMS software.

 

The first point to make is that Nimsoft's regex implementation is based on Perl’s implementation (other flavors use differing syntax).

 

When specifying a pattern match using REGEX in NMS we need to tell the NMS software we want to use Regex. We do this by opening and closing the syntax with / [Forward-slash] as in the following example.

 

/<regex syntax goes here>/

 

The next most basic usage is the “match all” characters statement:

 

.*            (Dot Asterisk) – Dot represents any single character (letter, digit, space or special character). Asterisk means “zero or more times”. Used together they form the expression “match anything, any number of times” (everything!)

 

|              (pipe Symbol) Is used as an OR operator

 

\              (Back-slash) Is the escape character and is used to escape special characters. As an example, if you wanted to match a literal back-slash in an expression, it has a special meaning in regex and hence has to be escaped as \\

 

\s            Matches a whitespace character, i.e. the breaks between words

 

 

 

Let’s take a look at a very simple REGEX statement to match an alarm with the message of:

 

Average (5 samples) total cpu is now 82.61%, which is above the warning threshold (75%)

 

 

/.*total\scpu.*above.*threshold.*/

 

The above expression would match an alarm that comes into the NAS which states that total cpu is above its defined threshold. Using the example and the syntax definitions, try to work out how this works.

 

 

Now let’s imagine we would like to match a similar alarm but for memory:

 

Average (5 samples) total memory is now 82.61%, which is above the warning threshold (75%)

 

 

We could write a similar expression to the previous one. However, in some cases it might make more sense to have one regex rule match both alarms. We do this by using “grouping” syntax and the | (pipe) [OR] operator.

 

()             (Open, Close Parenthesis) Wrapping syntax in parentheses creates a “group”. This is useful to us in two ways: it allows us to isolate certain parts of our syntax and hence use operators local to that part of the expression, and it lets us reference those “groups” in the NMS software (the logmon probe is a good example of addressing groups). In the following example we use a “group” to isolate part of our syntax so we can use the OR operator on just that section of the expression.

 

/.*total\s(cpu|memory).*above.*threshold.*/

 

Notice the group (cpu|memory), which essentially states: if the string matches cpu OR memory then match. Grouping this section is very important. Without the parentheses the OR would apply to everything on either side of the pipe, and the expression would say match the string:

 

total cpu

OR

Memory above threshold

 

So that’s the basics covered and you should now be able to create pattern matches based on regular expressions within the Nimsoft NMS Software. You will find these techniques especially useful when using the NAS Auto-Operators and probes such as Logmon.


MSI Robot Installation

Posted by carl Aug 29, 2012

Version 5.60 of the Nimsoft NMS server introduced RPM and MSI packages for mass deployment of robots. The MSI version differs from many MSI packages in that it requires an answer file located in the install directory. The answer file is a basic robot configuration file called nms-robot-vars.cfg and follows the format below.

 

DOMAIN=companyZdom
HUBIP=145.23.31.5
HUB=companyZhub
HUBROBOTNAME=hubsysA
ROBOTNAME=NotNecessarilyHostnameButSomethingElseIChoose
ROBOTIP=198.201.4.7
HUBPORT=48002
FIRST_PROBE_PORT=48000
SECONDARY_DOMAIN=
SECONDARY_HUB=
SECONDARY_HUBROBOTNAME=
SECONDARY_HUBIP=
SECONDARY_HUBPORT=
SECONDARY_HUB_DNS_NAME=

 

This file should be placed in the same location the MSI is copied to. At the bottom of the page is a Windows batch file created to make the install process of the MSI robot much more automated. The script completes the following:

 

1 - Creates a temp mapped drive
2 - Copies the MSI to the local server
3 - Creates the nms-robot-vars.cfg
4 - Sets the ROBOTNAME to the hostname of the device
5 - Sets the robot IP to that of the first interface in ipconfig (if someone wishes to improve on this please leave a comment below! :)
6 - Starts the MSI installation using the nms-robot-vars.cfg to build the robot configuration
7 - Deletes the robot MSI from the local machine
8 - Unmaps the network drive

Let's investigate this step by step to understand what the script is doing and how to modify it. As always, if anyone has suggestions or tips to improve the script we would love to hear from you, either via email or the comments section below.

 

Map the network drive

 

:: Edit the following line to map a temp drive to a share with the MSI file hosted there
net use r: \\<ip or UNC>\<file_share>

Create a file share that the remote servers can connect to and transfer the MSI file from. Copy the MSI file to this network share. Find the above lines in the script and change to reflect the share location.

 

Set the install path

 

:: Set the drive letter and install path then create it
set install_path=progra~1\nimsoft\

set drive_letter=c:\

Change the install path by changing the variables in the script.

 

Set up the nms-robot-vars.cfg - Domain and Hub

 

:: These variables should stay static so no need to dynamically update, just change to suit your environment
echo DOMAIN=nevil >> %drive_letter%%install_path%nms-robot-vars.cfg

 

Lines in the script such as the one above build the nms-robot-vars.cfg. Change the values to suit your environment: domain, hub name/IP, etc. These should stay fairly constant and hence do not need to be changed dynamically or updated very often.

 

 Set up the nms-robot-vars.cfg - IP and Robot name

 

::Robotname is currently set using the hostname as %HOST% variable or change to manually override
echo ROBOTNAME=%HOST% >> %drive_letter%%install_path%nms-robot-vars.cfg
::IP address of the first adaptor it finds.
echo ROBOTIP=%IP% >> %drive_letter%%install_path%nms-robot-vars.cfg

 

The above two lines set the robot IP and robot name. The IP is obtained by parsing the output of ipconfig.exe for the "IPv4 Address" line, which means it will take the first configured IP address it finds. This might not be suitable for all environments. The code below is what parses the IP from ipconfig:

 

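::Gets the IP address of the first adaptor it finds
::"if not defined" keeps only the first match; the inner for trims the leading space
set IP=
for /f "usebackq tokens=2 delims=:" %%f in (`ipconfig ^| findstr /c:"IPv4 Address"`) do (
    for /f "tokens=*" %%g in ("%%f") do if not defined IP set IP=%%g
)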
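
For readers who find the for /f syntax hard to follow, the same extraction logic can be sketched in Lua, the scripting language used elsewhere on this blog. This is only an illustration of the parsing step, not part of the install script: the sample text is a made-up ipconfig excerpt, and first_ipv4 is a name chosen here for the example.

```lua
-- Hypothetical ipconfig excerpt; real output varies with Windows version and locale.
local sample = [[
Ethernet adapter Local Area Connection:

   Connection-specific DNS Suffix  . :
   IPv4 Address. . . . . . . . . . . : 192.168.1.58
   Subnet Mask . . . . . . . . . . . : 255.255.255.0

Ethernet adapter Backup LAN:

   IPv4 Address. . . . . . . . . . . : 10.0.0.7
]]

-- Return the first IPv4 address in ipconfig-style text, or nil if none is found.
local function first_ipv4(text)
  for line in text:gmatch("[^\r\n]+") do
    -- Plain (non-pattern) search for the label, like findstr /c:
    if line:find("IPv4 Address", 1, true) then
      -- Everything after the first colon, with surrounding spaces trimmed.
      return line:match(":%s*(.-)%s*$")
    end
  end
  return nil
end

print(first_ipv4(sample))
```

Returning on the first matching line is what makes this "first adaptor wins"; a host with several interfaces would need an extra rule to pick the right one.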

 

Last but not least, the script sets the robot name to the hostname of the device. All these settings can be changed later via Infrastructure Manager once the robot is in the NMS system, but these settings will get the robot up and running with a unique identifier. Below is the code from the script to find the hostname and set it as a variable:

 

::Allows %HOST% to be used as a variable for the robotname
for /f "tokens=* delims= " %%a in ('hostname') do (set HOST=%%a)

 

The nms-robot-vars.cfg is now built and ready to be used by the MSI as an answer file.

 

Start the MSI Installation and clean up



The script will now copy the MSI file locally to the server and start the install. For a full list of switches and their meaning execute MSIEXEC from the command line. Once the file has been copied and executed by MSIEXEC the MSI itself is deleted and the network drive unmapped.

 

:: Start the MSI install from mapped drive R: uncomment relevant architecture
copy R:\nimsoft-robot-x64.msi %drive_letter%%install_path%
msiexec /i %drive_letter%%install_path%nimsoft-robot-x64.msi /qn INSTALLDIR="%drive_letter%%install_path%"
:: Wait about 60 seconds (ping is used because sleep is not a built-in Windows command)
ping -n 61 127.0.0.1 >nul

del %drive_letter%%install_path%nimsoft-robot-x64.msi

net use r: /delete
:finish

 

 

Distributing the batch file to remote servers

 

The last part of this guide discusses pushing the script we have been working on to the remote servers. For this we recommend PSEXEC.

PSEXEC is a free Microsoft tool and can be downloaded here:  http://download.sysinternals.com/Files/PsTools.zip

 

PSEXEC Usage

Open a DOS window (command prompt) and launch the PSEXEC tool:

 

Install on all computers currently logged in to your domain

psexec \\* -s \\Server\NetLogon\robot_install.bat

 

Install on a single computer

psexec \\COMPUTER_NAME -s \\Server\NetLogon\robot_install.bat

 

Install on all computers using the domain administrator credentials

psexec \\* -s -u Domain\Administrator -p Password \\Server\NetLogon\robot_install.bat

 

Install on specific computers (ALL.TXT is a text file that lists target computer names, one per line), using domain administrator credentials

psexec @ALL.TXT -s -u Domain\Administrator -p Password \\Server\NetLogon\robot_install.bat

 

Note: when doing mass deployment on multiple computers, you should monitor the response file to get a list of computers that were not deployed (were not connected when PSEXEC ran), and run PSEXEC multiple times throughout business hours to make sure all your computers get the install.

carl

LUA - Tables for Newbies

Posted by carl Aug 10, 2012

After starting out with LUA and struggling with tables and with getting my data out of them, and thanks to various people on this forum, I have started to get my head around how they work. I wrote this short article in an effort to share this beginner knowledge with others like myself.

 

This will also be in the Nimsoft KB but posting here also as this seems to be the "place" for development chatter.

 

There are probably many better ways to achieve similar results, but if, like me, you're struggling to get your head round tables, this might help get you started.

 

-------------------------------

 

 

In this article we will take a closer look at LUA tables. Most of the data we return from callbacks will be stored in LUA tables hence we need a strong understanding of table structure and how to access the data contained within.

 

First we build a basic script to return the values from the “getrobots” call back from a hub and store this data in a table. For a more in depth look at building call back requests with the nimbus.request function see other articles in this LUA series.

 

--Connect to hub probe. Edit this line with the address of the hub you wish to query
local addr = "/nevil-nmsdom/nevil-nmshub/nevil-nms/hub"
--Callback to request; getrobots lists the robots known to the hub
local command = "getrobots"
--Build the PDS arguments to send with the request
local args = pds.create()
pds.putString(args, "name", "", "detail", "")
--Send request and store data in h_resp{}
local h_resp,rc = nimbus.request(addr, command, args)

print(h_resp)

Execute this in the NAS script editor and it will produce output similar to the below:

 

 ----------- Executing script at 07/08/2012 15:07:06 ---------- 

table:0x2275b50

 

In order for us to extract anything valuable from this table we need to understand the table structure. At this point I refer you to the “LUA - tdumper” article. If you have not yet followed that article I highly suggest you do, as the rest of this article will be much easier to follow. So, assuming we have the tdumper.lua file saved in the nas/scripts/ folder, at the top of our script we can use:

 

dofile "scripts/tdumper.lua"

 

 

to include this function in our above code and replace the print function with:

 

tdump(h_resp)

 

 

Executing the script now produces a full list of the table. A snippet is shown below for reference:

 

----------- Executing script at 07/08/2012 15:14:07 ----------
root:
  domain:nevil-nmsdom
  robotlist:
    1:
      ssl_mode:0
      os_user2:
      origin:nevil-nmshub
      os_major:UNIX
      ip:192.168.1.58
      os_minor:Linux
      addr:/nevil-nmsdom/nevil-nmshub/nevil-multibot_03
      status:0
      license:1
      last_inst_change:1340754931
      created:1341306789
      offline:0
      last_change:1341306869
      lastupdate:1344348377
      autoremove:0
      os_user1:
      flags:1
      os_description:Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
      name:nevil-multibot_03
      metric_id:M64FB142FE77606C2E924DD91FFCC3BB4
      device_id:DDFF83AB8CD8BC99B88221524F9320D22
      heartbeat:900
      port:48100
      version:5.52 Dec 29 2011

 

Addressing Table Elements

 

Now we can preview the table structure we can start selecting table elements. The output you see from this request is called a “nested table” which is essentially tables stored within tables and can get quite messy for the beginner to get their head around. Programmers with a background in Perl for example will have a distinct advantage when tackling LUA tables.

 

So what do we have here? Let’s break this table up a little. To keep things simple, let’s list the main elements of the table. Start by commenting out our tdump function:

 

--tdump(h_resp)

 

 

Add the following:

 

for k,v in pairs(h_resp) do print(k.."    ",v) end

 

 

This simple code says for each table entry in the h_resp table print its key (k) and value (v) pairs. The table h_resp we created has two initial entries domain and robotlist:

 

domain     nevil-nmsdom
robotlist     table:0x22a0a60
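
To experiment with this loop away from a hub, a hand-built table of the same shape can stand in for the response. The sketch below is only that: mock_resp and its contents are made up from the dump above rather than returned by nimbus.request, and the string keys "0" and "1" are an assumption based on how the article addresses the table. It also sorts the keys before printing, because pairs() does not promise any particular order.

```lua
-- Hypothetical stand-in for h_resp, shaped like the getrobots dump above.
local mock_resp = {
  domain = "nevil-nmsdom",
  robotlist = {
    ["0"] = { name = "nevil-mysql", ip = "192.168.1.31" },
    ["1"] = { name = "nevil-multibot_03", ip = "192.168.1.58" },
  },
}

-- Collect the top-level keys; pairs() returns them in no guaranteed order,
-- so sort them to get repeatable output.
local function top_level_keys(t)
  local keys = {}
  for k in pairs(t) do
    keys[#keys + 1] = k
  end
  table.sort(keys)
  return keys
end

-- Print each key with the type of its value (string or nested table).
for _, k in ipairs(top_level_keys(mock_resp)) do
  print(k .. "    ", type(mock_resp[k]))
end
```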

 

 

 

From this return we can see the key for “domain” has a string value of “nevil-nmsdom”, the name of my domain. The robotlist key is slightly more complicated, as its “value” is actually another table, a “nested” table. To list the entries from the “robotlist” key we can just append the key name to the table name.

 

for k,v in pairs(h_resp.robotlist) do print(k.."    ",v) end

 

 

This provides us with yet another list of tables, one for each of the robots the call back found:

 

 

1    table:0x2213600
0    table:0x2392ab0
3    table:0x2394c30
2    table:0x225eb80
5    table:0x22133e0
4    table:0x22131f0
7    table:0x2275b00
6    table:0x227d6f0
8    table:0x227d490

 

 

 

We can drill down further and examine one of these tables with:

 

for k,v in pairs(h_resp.robotlist["1"]) do print(k.."    ",v) end

 

 

We can provide any of the key values listed above (quoted, as in the example) to examine that particular table; in this case we used “1”.

 

 

ssl_mode    0
os_user2
origin    nevil-nmshub
os_major    UNIX
ip    192.168.1.58
os_minor    Linux
addr    /nevil-nmsdom/nevil-nmshub/nevil-multibot_03
status    0
license    1
last_inst_change    1340754931
created    1341306789
offline    0
last_change    1341306869
lastupdate    1344590476
autoremove    0
os_user1
flags    1
os_description    Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
name    nevil-multibot_03
metric_id    M64FB142FE77606C2E924DD91FFCC3BB4
device_id    DDFF83AB8CD8BC99B88221524F9320D22
heartbeat    900
port    48100
version    5.52 Dec 29 2011
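
A point that often trips up beginners here is the difference between the dot and bracket forms, and between a string key and a number key. The following is a minimal sketch on a made-up miniature of h_resp.robotlist (the data is illustrative only, and it assumes the keys are strings, as the quoted "1" in the example suggests):

```lua
-- Hypothetical miniature of h_resp.robotlist with string keys, as in the article.
local robotlist = {
  ["1"] = { name = "nevil-multibot_03", port = 48100 },
}

-- Dot syntax is shorthand for a string key that is a valid identifier.
local robot = robotlist["1"]
print(robot.name, robot["name"])       -- same field, two spellings

-- The string "1" and the number 1 are different keys.
print(robotlist["1"] ~= nil)           -- the string key exists
print(robotlist[1] == nil)             -- the numeric key does not

-- A key held in a variable needs the bracket form.
local field = "port"
print(robot[field])
```

This is why robotlist.1 is not valid Lua and the article writes robotlist["1"] instead: the dot form only works when the key looks like a variable name.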

 

 Now all that probably seemed like a lot of work but by now we should be getting used to how to address tables and start getting at the data we want.

 

Now let’s assume we are just interested in the “name” key. We can print the value for the “name” key of this table with the following syntax:

 

print(h_resp.robotlist["1"].name)

----------- Executing script at 10/08/2012 11:56:57 ----------

nevil-multibot_04

 

If we wanted to return all the name values for each of the tables stored under “robotlist”:

 

 

for k,v in pairs (h_resp.robotlist) do print(v.name) end

----------- Executing script at 10/08/2012 13:41:30 ----------

nevil-multibot_03
nevil-mysql
nevil-multibot_02
nevil-multibot_04
debian-apache
nevil-ump
nevil-nms
nevil-windows-dc1
Nevil-MS-SQL

 

 

As we discussed, we can iterate through an entire table and its nested tables using the tdump function. However, from time to time you might only need to iterate over a single nested table. To help us understand which table to iterate over for a particular robot, edit our previous syntax to display the “key” that each robot “name” value belongs to.

 

for k,v in pairs (h_resp.robotlist) do print(k,"    "..v.name) end

----------- Executing script at 10/08/2012 13:51:49 ----------

1    nevil-multibot_03
0    nevil-mysql
3    nevil-multibot_02
2    nevil-multibot_04
5    debian-apache
4    nevil-ump
7    nevil-nms
6    nevil-windows-dc1
8    Nevil-MS-SQL
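
The keys in the output above come back out of order (1, 0, 3, 2 and so on) because pairs() makes no promise about ordering. If a tidy listing matters, one option is to collect the keys and sort them first. The sketch below uses a made-up subset of the robot list, and sorted_keys is a helper name chosen for this example; it assumes every key is a numeric string.

```lua
-- Hypothetical subset of h_resp.robotlist, keyed by strings as in the output above.
local robotlist = {
  ["1"] = { name = "nevil-multibot_03" },
  ["0"] = { name = "nevil-mysql" },
  ["2"] = { name = "nevil-multibot_04" },
}

-- Return the keys of t sorted by numeric value, so that "10" sorts after "9".
-- Assumes every key converts cleanly with tonumber().
local function sorted_keys(t)
  local keys = {}
  for k in pairs(t) do
    keys[#keys + 1] = k
  end
  table.sort(keys, function(a, b) return tonumber(a) < tonumber(b) end)
  return keys
end

-- Walk the robots in key order.
for _, k in ipairs(sorted_keys(robotlist)) do
  print(k, "    " .. robotlist[k].name)
end
```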

 

Now we can see which table index within h_resp.robotlist belongs to which robot. As an example, iterating over h_resp.robotlist["1"] would give us the information about nevil-multibot_03. We have already written the code for this in a previous example, so I will leave exploring it to you as an exercise.
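
Going the other way, from a robot name to its nested table, saves reading the index off a listing by hand. A rough sketch, again on made-up data, with find_robot being a hypothetical helper rather than anything provided by the NAS:

```lua
-- Hypothetical subset of h_resp.robotlist.
local robotlist = {
  ["1"] = { name = "nevil-multibot_03", ip = "192.168.1.58" },
  ["0"] = { name = "nevil-mysql", ip = "192.168.1.31" },
}

-- Return the key and nested table of the robot called `name`, or nil if absent.
local function find_robot(list, name)
  for k, robot in pairs(list) do
    if robot.name == name then
      return k, robot
    end
  end
  return nil
end

local key, robot = find_robot(robotlist, "nevil-mysql")
if robot then
  print(key, robot.ip)
end
```

In a real script the first argument would be h_resp.robotlist, and the nil check matters because the robot may not be registered on the hub being queried.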

 

As a final note in the article, let’s look at how we would iterate over the whole of h_resp.robotlist, and only this nested table, rather than all of h_resp. This becomes a little more technical as we need to create a new function.

 

 

function DeepPrint(a)
  if type(a) == "table" then
    for k,v in pairs(a) do
      print(k.."    ",v)
      DeepPrint(v)
    end
  end
end

 

 

I won’t go into any great depth here, but the function calls itself on every value it prints, so if a value turns out to be another nested table it will iterate over that too.
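
One possible variation on DeepPrint, offered as a sketch rather than a replacement, is to indent each level and return the lines instead of printing them directly, which makes the nesting easier to see. deep_lines is a name invented for this example, the sample data is made up, and the function assumes the table contains no cycles (a table that refers back to itself would recurse forever).

```lua
-- Variant of DeepPrint that indents by depth and returns the lines it builds.
-- deep_lines is a name chosen here for illustration; it is not part of the NAS API.
local function deep_lines(t, depth, out)
  depth = depth or 0
  out = out or {}
  -- Sort keys as strings so the output order is repeatable.
  local keys = {}
  for k in pairs(t) do keys[#keys + 1] = k end
  table.sort(keys, function(a, b) return tostring(a) < tostring(b) end)
  for _, k in ipairs(keys) do
    local v = t[k]
    local indent = string.rep("  ", depth)
    if type(v) == "table" then
      out[#out + 1] = indent .. tostring(k) .. ":"
      deep_lines(v, depth + 1, out)   -- recurse into the nested table
    else
      out[#out + 1] = indent .. tostring(k) .. ":" .. tostring(v)
    end
  end
  return out
end

-- Hypothetical sample data in place of h_resp.robotlist.
local sample = { ["0"] = { name = "nevil-mysql", port = 48000 } }
for _, line in ipairs(deep_lines(sample)) do
  print(line)
end
```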

 

Execute this function passing it the name of the table we want to iterate over:

 

DeepPrint(h_resp.robotlist)

----------- Executing script at 10/08/2012 13:59:17 ----------

1    table:0x2394df0
ssl_mode    0
os_user2
origin    nevil-nmshub
os_major    UNIX
ip    192.168.1.58
os_minor    Linux
addr    /nevil-nmsdom/nevil-nmshub/nevil-multibot_03
status    0
license    1
last_inst_change    1340754931
created    1341306789
offline    0
last_change    1341306869
lastupdate    1344603075
autoremove    0
os_user1
flags    1
os_description    Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
name    nevil-multibot_03
metric_id    M64FB142FE77606C2E924DD91FFCC3BB4
device_id    DDFF83AB8CD8BC99B88221524F9320D22
heartbeat    900
port    48100
version    5.52 Dec 29 2011
0    table:0x22135b0
ssl_mode    0
os_user2
origin    network
os_major    UNIX
ip    192.168.1.31
os_minor    Linux
addr    /nevil-nmsdom/nevil-nmshub/nevil-mysql
status    0
license    1
last_inst_change    1342178648
created    1341922451
offline    0
last_change    1344591508
lastupdate    1344603108
autoremove    0
os_user1
flags    1
os_description    Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
name    nevil-mysql
metric_id    M7CDEF667D86384966EF5876BEAB7AA52
device_id    D518C83671EA63842F7C08EEBA7DD63E2
heartbeat    900
port    48000
version    5.63 Mar  2 2012