NAS LUA Scripting: Triggering events from alarms - Part 1

By Anon Anon posted Oct 17, 2012 01:38 PM

  

This article is intended for all users, experienced or not, and I hope it is written in a way that is accessible to all. Diving right in, open the NAS configuration screen on your primary NMS hub server:

 

[If you are experienced in auto-operator profiles and LUA and just want to grab the code, the full script is at the bottom of the guide]

 

 

 

Select: Auto-Operator - Scripts -> <Right-Click> - New

 

This opens the script editor window. Save the script right away and call it "nexec".

 

 

 

Now copy and paste the following code into the top of the script. Don't worry if nothing makes any sense; you won't need to understand any of this code to use the tutorial. The code is a "function" that prints "tables" in a human readable format, which really eases the process.

 

--Table Dumper - Used to "dump" the contents of a table (used for debug)
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

 

 

Next we need an alarm to work with. The finished script will process an alarm as it enters the NAS, but that's tedious when developing, so we use a "static" alarm. My dev environment won't change much, but in production alarms come and go, so choose your alarm carefully: if it clears or gets acknowledged it might throw you off. From the alarm console in Infrastructure Manager (IM) or UMP find an alarm and write down its ID. [TIP: Within IM you might need to widen the column with the "color" alarm severity indicator, as by default the ID is not shown]

 

 

For this example I took the first alarm in my console with an ID of "VV72149947-04467". The next step is to GET this alarm from the NAS and store its contents as a table. For this we use the alarm.get() function:

 

a = alarm.get("VV72149947-04467")

 

 

This single line gets the alarm, easy as that! Using the function we pasted in the last step we can print out that table and take a look at the alarm detail. Provided below is the code and its output [you only need to copy the tdump(a) into your code]

 

tdump(a)

----------- Executing script at 08/10/2012 17:40:04 ----------

  root:
    robot:suse-nms6
    i18n_dsize:0
    source:esxi2.nimsoft.no
    origin:nevil-nmshub
    time_origin:2012-09-26 10:29:37
    arrival:1348651782
    hub:nevil-nmshub
    visible:1
    time_supp:2012-10-08 17:39:42
    prevlevel:4
    supp_key:"ESC:H:44454c4c-5300-1043-8054-c4c04f58344a"."Disk"."Disk Write RateESC:402"
    nimid:VV72149947-04467
    supptime:1349714382
    level:4
    hostname:esxi2.nimsoft.no
    aots:1349449180
    change_id:F895A8C9794BD483514EA6C559A9AB87
    dev_id:D0e42d2226baa63a8e29b987337ed78f5
    suppcount:3542
    domain:nevil-nmsdom
    severity:major
    nas:nevil-nmshub
    nimts:1348651777
    subsys:Disk
    message:The disk usage for esxi2.nimsoft.no.Disk Write Rate on 130.119.1.2:443 is outside expected limits (287.94 > 100)
    tz_offset:-3600
    met_id:M598929e72112f857db9e557f82e5a72d
    time_arrival:2012-09-26 10:29:42
    prid:vmware
    supp_id:53CF19BDB2CE3ED21B2B87EF2066C711
    event_type:2
    sid:1.1.1.1

 

 

The LUA table is built from pairs of keys and values. For example, the key "robot" has a value of "suse-nms6":

 

robot:suse-nms6
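As a small illustration of reading those key/value pairs, the sketch below uses a hand-made table in place of the real alarm.get() result, so it can be run in any Lua interpreter. Only a few fields are copied from the dump above; the key no_such_key is made up to show what a missing key looks like.

```lua
-- A hand-made table standing in for the result of alarm.get();
-- the values are copied from the dump above.
local a = {
  robot = "suse-nms6",
  hub = "nevil-nmshub",
  domain = "nevil-nmsdom",
  level = 4,
}

-- Dot and bracket notation read the same key.
print(a.robot)       -- suse-nms6
print(a["robot"])    -- suse-nms6

-- A key that is not in the table evaluates to nil rather than raising an error.
print(a.no_such_key) -- nil
```

The dot form is the one used throughout the rest of this guide.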

 

From this information we can build the "address" of the robot that generated the alarm so we can send it commands. We do this by creating a local variable called "addr" to reference later. The first line below "concatenates" (using the double dots ..) the values of the keys domain, hub and robot to produce the address. You can go ahead and remove or comment out the tdump(a) (-- at the start of a line makes it a comment) and copy the following:

 

local addr = '/'..a.domain..'/'..a.hub..'/'..a.robot..'/'
print(addr)

[Note: At this point in the writing I had to switch to another alarm ID, so watch out for this if the robot name changes]

 

Now we should have a line printed to the output section of the LUA editor which reads similar to the following:

 

----------- Executing script at 17/10/2012 09:11:10 ----------

/nevil-nmsdom/nevil-nmshub/suse-nms6/
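One thing to be aware of: in Lua, concatenating a nil value raises an error, so the address line stops the script if the alarm table is missing or lacks one of the three fields. The sketch below shows one possible guard. The helper name robot_address is hypothetical (it is not part of the NAS API), and the sample table stands in for a real alarm so the code runs in any Lua interpreter.

```lua
-- Hypothetical helper (not part of the NAS API): build the robot address
-- from an alarm table, or return nil plus a reason when a field is missing.
local function robot_address(a)
  if type(a) ~= "table" then
    return nil, "no alarm table"
  end
  for _, key in ipairs({"domain", "hub", "robot"}) do
    if a[key] == nil or a[key] == "" then
      return nil, "alarm has no '" .. key .. "' field"
    end
  end
  return "/" .. a.domain .. "/" .. a.hub .. "/" .. a.robot .. "/"
end

-- Stand-in alarm table using the values from the dump earlier in the article.
local sample = { domain = "nevil-nmsdom", hub = "nevil-nmshub", robot = "suse-nms6" }
print(robot_address(sample))  -- /nevil-nmsdom/nevil-nmshub/suse-nms6/
print(robot_address(nil))     -- nil, followed by the reason
```

For the tutorial the plain one-line concatenation is enough; a guard like this mainly helps once the script runs unattended.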

 

That address will be used to communicate with the robot the alarm was sent from. The next step is to build our "call back" to the robot. The process here is quite simple even if it looks complicated to a first time LUA scripter. Each probe has a list of built in functions (call backs) that can be called from LUA (or another SDK). Let's take a look at some of the NEXEC call backs via the GUI interface we call the "Probe Utility".

 

Select the NEXEC probe so it is highlighted and then press CTRL+P to open the probe utility. You will see a screen like the screenshot below.

 

 

 

Notice the drop down list "Probe commandset"; this lists the call backs we can issue. Closing this, we can now create a test profile in the NEXEC probe. I am using a Linux robot for this tutorial, so if you are using MS Windows change the simple NEXEC profile to something similar for Windows. Open the config for NEXEC on your robot and create a test profile; here I just echo the word test to a file.

 

 

 

 

 Give the profile a quick test run:

 

 

 

As you can see, the profile runs successfully and "test" is echoed to the file /tmp/lua_test. Open the Probe Utility again, select the "run_profile" callback from the drop down, enter the profile name [NOTE: This is case sensitive, on Linux at least] and select the green arrow on the top tool bar to execute the profile.

 

 

 

 This should return with "command status: OK" at the bottom of the screen. Right, back to our LUA script editor.

 


As a recap, our script should look like the following, with our tdump function, a line to grab an alarm and a line to build the nimbus address from that alarm. [Note the extra comments and the commented-out print(addr)]

 

--Table Dumper - Used to "dump" the contents of a table (used for debug)
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

-- Get the Alarm using the NIMID
a = alarm.get("PI59815363-66418")

-- Form a variable "addr" which will contain the nimbus address of the robot we issue the callback to.
local addr = '/'..a.domain..'/'..a.hub..'/'..a.robot..'/'

-- Print the address for debug, comment out until required.
-- print(addr)

Create a variable to hold the name of the probe we want to issue a call back to:

 

probe ="nexec"

 

Create a variable to hold the name of the callback we wish to issue:

 

cmd = "run_profile"

The next step is ever so slightly more complicated (at first). When we run a call back it expects a PDS (essentially a table of key, value pairs) that holds the arguments for the probe. To create an "empty" PDS structure called "args" add the following line:

 

local args = pds.create()

Now we can add our arguments to the PDS structure. In this example we only have one argument to pass, the name of the profile. Other probe callbacks may require a number of arguments. Let's go ahead and insert an argument for the profile name we want to run:

 

pds.putString(args, "profile", "Test")

pds.putString is simply a function that adds a string to a PDS (obvious I know!). It takes three arguments separated by commas. The first is the PDS structure (args in this case, which is a variable, so there is no need to enclose it in quotes). Next come the key and the value, which are strings and so need to be enclosed in quotes. Together they form the actual argument we want to send, in this case profile and Test.

 

[NOTE: When adding an integer use pds.putInt(args, "number", 1). This is for future reference and not required now]

 

Now we have all the variables we need to issue the call back. The callback is sent using the nimbus.request function. The following lines are an example:

 

output,rc = nimbus.request(addr..probe, cmd, args)
print(rc)

Let's break that down to make it easy to understand:

 

  • output - A variable that will hold the output returned from the command
  • rc - Holds the return code, which will tell us if the command was executed successfully.
  • nimbus.request() - Is the LUA SDK function that sends the callback to the probe and requires the three arguments to complete.

nimbus.request requires: the nimbus address of the probe, the callback to issue, and any arguments that callback expects

 

  • addr..probe - concatenates the two variables addr and probe to form the full nimbus address of the NEXEC probe i.e. /nevil-nmsdom/nevil-nmshub/suse-nms6/nexec
  • cmd - the variable we built with the callback "run_profile"
  • args - the pds structure holding the arguments
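The pieces above can be gathered into one small function. This is only a sketch: the helper name run_nexec_profile is hypothetical, and the stub pds and nimbus tables at the top are stand-ins I have assumed so the code also runs in a plain Lua interpreter. They mimic the call shapes used in this guide (always returning 0) and are skipped inside the NAS, where the real functions exist.

```lua
-- Stand-ins so the sketch also runs outside the NAS. Inside the NAS the real
-- pds and nimbus tables already exist and these stubs are skipped.
-- The stub behaviour (a plain table, always rc 0) is an assumption for the demo.
if not pds then
  pds = {
    create = function() return {} end,
    putString = function(p, key, value) p[key] = tostring(value) end,
  }
end
if not nimbus then
  nimbus = {
    request = function(target, command, arguments)
      print("would send " .. command .. " to " .. target)
      return {}, 0
    end,
  }
end

-- Hypothetical helper: run one NEXEC profile on the robot at addr.
local function run_nexec_profile(addr, profile)
  local args = pds.create()
  pds.putString(args, "profile", profile)
  local output, rc = nimbus.request(addr .. "nexec", "run_profile", args)
  return rc, output
end

local rc = run_nexec_profile("/nevil-nmsdom/nevil-nmshub/suse-nms6/", "Test")
print(rc)
```

Wrapping the call this way keeps the probe name, callback name and argument building in one place if you later want to run different profiles for different alarms.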

The very last line, print(rc), prints the return code to the screen and, all being well, will return 0. But wait, what happens if it returns 4? What does that mean? Good point! Copy the following function and insert it into the script below our tdump function:

 

--Added the error codes function for extra debug
function codes(rc)
  -- Keep the lookup table local so it does not overwrite the codes function itself
  local names = {[0]="OK",[1]="error",[2]="communication error",[3]="invalid argument",[4]="not found",[5]="already defined",
    [6]="permission denied",[7]="temporarily out of resources",[8]="out of resources",[9]="no space left",[10]="broken connection",
    [11]="command not found",[12]="login failed",[13]="SID expired",[14]="illegal MAC",[15]="illegal SID",[16]="Session id for hub is invalid",
    [17]="Expired",[18]="No valid license",[19]="Invalid license",[20]="Illegal license",[21]="Invalid operation finv",[100]="user error from this value and up"}
  -- Fall back to the raw value for codes that are not in the table
  print("The return code is: "..(names[rc] or tostring(rc)).."\n\n")
end

 

Now replace the last line print(rc) with:

 

codes(rc)

 When you execute the script now the output will return a human readable interpretation of the return code:

 

----------- Executing script at 17/10/2012 10:16:19 ----------

The return code is: OK

 

If it returns anything other than OK you have a problem. A common error is "communication error", which suggests the nimbus address is wrong; check your addr and probe variables.
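If you would rather get the text back than print it (for example to put it in a log line or to branch on it), a lookup along the following lines is one option. This is a sketch under assumptions: the name rc_text is hypothetical, the table is only a subset of the codes listed above, and the handling of values of 100 and up simply follows the "user error from this value and up" entry.

```lua
-- Subset of the return codes listed in this guide.
local RC_NAMES = {
  [0] = "OK",
  [1] = "error",
  [2] = "communication error",
  [3] = "invalid argument",
  [4] = "not found",
}

-- Hypothetical helper: turn a return code into text without printing it.
local function rc_text(rc)
  if type(rc) ~= "number" then
    return "no return code"
  end
  if rc >= 100 then
    return "user error " .. rc
  end
  -- Codes missing from the table fall back to a generic message.
  return RC_NAMES[rc] or ("unknown code " .. rc)
end

print(rc_text(0))  -- OK
print(rc_text(4))  -- not found
```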

 

Assuming all went well this will have triggered the "Test" profile for the NEXEC probe! Congratulations!!

 

One last change: when we used the alarm.get function we passed it a NIMID so it would select a specific alarm to work against. Remove this NIMID so the call reads alarm.get(). Now when the script is run via an auto-operator profile it will use the alarm that triggered the script execution.
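With the NIMID removed, the script depends on being started by an alarm. Indexing a nil value raises an error in Lua, so a line such as a.domain stops the script if no alarm table came back. The sketch below shows one way to guard against that. Note the assumption: this guide does not say what alarm.get() returns when there is no triggering alarm, so the stand-in stub here simply returns nil, and it is only used when the code runs outside the NAS.

```lua
-- Stand-in so the sketch runs outside the NAS; assumed to return nil
-- when there is no triggering alarm (an assumption, not documented here).
if not alarm then
  alarm = { get = function(nimid) return nil end }
end

local a = alarm.get()
local status
if a == nil then
  -- Nothing to work on: skip the address building and the callback.
  status = "no triggering alarm, nothing to do"
else
  status = "processing alarm " .. tostring(a.nimid)
end
print(status)
```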

 

Your full script should now look like this:

 

--Table Dumper - Used to "dump" the contents of a table (used for debug)
function tdump(t)
  local function dmp(t, l, k)
    if type(t) == "table" then
      print(string.format("%s%s:", string.rep(" ", l*2), tostring(k)))
      for k, v in pairs(t) do
        dmp(v, l+1, k)
      end
    else
      print(string.format("%s%s:%s", string.rep(" ", l*2), tostring(k), tostring(t)))
    end
  end
  dmp(t, 1, "root")
end
--End Table dumper code

--Added the error codes function for extra debug
function codes(rc)
  -- Keep the lookup table local so it does not overwrite the codes function itself
  local names = {[0]="OK",[1]="error",[2]="communication error",[3]="invalid argument",[4]="not found",[5]="already defined",
    [6]="permission denied",[7]="temporarily out of resources",[8]="out of resources",[9]="no space left",[10]="broken connection",
    [11]="command not found",[12]="login failed",[13]="SID expired",[14]="illegal MAC",[15]="illegal SID",[16]="Session id for hub is invalid",
    [17]="Expired",[18]="No valid license",[19]="Invalid license",[20]="Illegal license",[21]="Invalid operation finv",[100]="user error from this value and up"}
  -- Fall back to the raw value for codes that are not in the table
  print("The return code is: "..(names[rc] or tostring(rc)).."\n\n")
end

-- Get the alarm that triggered the auto-operator profile (no NIMID passed)
a = alarm.get()

-- Form a variable "addr" which will contain the nimbus address of the robot we issue the callback to.
local addr = '/'..a.domain..'/'..a.hub..'/'..a.robot..'/'

-- Print the address for debug, comment out until required.
-- print(addr)

probe = "nexec"
cmd = "run_profile"
local args = pds.create()
pds.putString(args, "profile", "Test")

output,rc = nimbus.request(addr..probe, cmd, args)
codes(rc)