
What logmon does


An important source of information for IT operations staff is the wide variety of log-files on the systems they maintain. Checking these files manually is very time-consuming, and not every member of staff can be expected to interpret every type of message in every type of log. The Nimsoft Log File monitoring probe (logmon) can simplify the job for systems operations staff by:

 

  • Automatically informing about error situations immediately after they have occurred.
  • Filtering out the log-file entries that need manual action. Usually the majority of entries in a log-file are not of interest to the daily operations staff. By setting up watchers and filters in logmon, alarms are generated only for the important log-file messages.
  • Specifying a more informative alarm message by modifying the original message text, thus helping the operations staff to locate and fix the problem more easily, without requiring assistance from the system specialist.

Logmon can be configured to monitor ASCII log-files in any format. Experience has shown that very few log-files have the same layout. Some files are line-oriented (single-line files like the UNIX system log-file /usr/adm/messages), while other log-files are record-oriented (multiple-line files, like the ones produced by Oracle). The logmon probe monitors both line-oriented and record-oriented log-files effectively, using a powerful regular expression and/or pattern-matching scheme. The probe checks the log-file for new entries at user-configurable, timed intervals, keeping track of the position within the file between each run. This ensures that only one alarm is sent per log-entry, even if the log-file is truncated or wrapped in the meantime.
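
The position tracking described above can be illustrated with a minimal Python sketch. This is not logmon's actual implementation. The function name `read_new_lines`, the `ERROR` watcher pattern and the truncation check (saved offset larger than the current file size) are all assumptions made for illustration.

```python
import os
import re
import tempfile


def read_new_lines(path, offset):
    """Return (new_lines, new_offset) for text appended since `offset`.

    If the file is now shorter than the saved offset, assume it was
    truncated or wrapped and start again from the beginning.
    """
    if os.path.getsize(path) < offset:
        offset = 0
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Only consume complete lines; a partial last line waits for the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


# Hypothetical watcher: only lines containing "ERROR" would raise an alarm.
WATCHER = re.compile(r"ERROR")


def demo():
    """Simulate three polling runs, the last one after a truncation."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")
        results = []
        position = 0
        writes = [
            ("w", "INFO start\nERROR disk full\n"),
            ("a", "INFO ok\nERROR link down\n"),
            ("w", "ERROR new\n"),  # "w" truncates the file
        ]
        for mode, text in writes:
            with open(log, mode) as handle:
                handle.write(text)
            lines, position = read_new_lines(log, position)
            results.append([line for line in lines if WATCHER.search(line)])
        return results


if __name__ == "__main__":
    for run, alarms in enumerate(demo(), start=1):
        print(run, alarms)
```

Each run reports only the entries written since the previous run, so no entry produces more than one alarm, and the third run recovers after the file has been truncated.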


A single instance of the logmon probe can be configured to monitor multiple log-files. Within each log-file, logmon can be set up to look for many different log-file entries, with each entry generating a different alarm message, which may contain text from the original log-file entry, user-defined text, or both.

 

To read the rest of this document please download the attached PDF

carl

REGEX: Part 1 - The Basics

Posted by carl Aug 29, 2012

We get a lot of cases logged with regards to Regex and its usage. For those who may not have used regex before, here is a little guide:

Regex is a powerful pattern matching language used as part of many of the Nimsoft Probes and Tools. The following is a brief introduction to Regex and its usage within the Nimsoft NMS software.

The first point to make is that the Nimsoft implementation of regex is based around Perl’s implementation (other flavors use differing syntax).

When specifying a pattern match using REGEX in NMS we need to tell the NMS software we want to use Regex. We do this by opening and closing the syntax with / [Forward-slash] as in the following example.

 

/<regex syntax goes here>/

 

The next most basic usage is the “match all” characters statement:

 

.*            (Dot Asterisk) - Dot represents any single character: alphanumeric, white space or special character. Asterisk means “any number of times” (including zero) and applies to whatever precedes it. Used together, the two create the expression “match anything, any number of times” (everything!)

|              (Pipe symbol) Is used as an OR operator

\              (Back-slash) Is the escape character and is used to escape special characters. As an example, a back-slash has a special meaning in regex, so to match a literal back-slash in an expression it has to be escaped like \\

\s            Matches a single white-space character, i.e. a break between words

 

 

 

Let’s take a look at a very simple REGEX statement to match an alarm with the message of:

 

Average (5 samples) total cpu is now 82.61%, which is above the warning threshold (75%)

 

 

/.*total\scpu.*above.*threshold./

 

The above expression would match an alarm that comes into the NAS which states that total cpu is above its defined threshold. Using the example and the syntax definitions, try to work out how this works.
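
To experiment with the expression outside NMS, a small Python sketch can help. Python's `re` module is used here only as a stand-in for the Perl-style engine (the simple constructs used so far behave the same way in both), and the surrounding `/` delimiters are stripped because they are NMS syntax rather than part of the regex. The helper name `nms_match` and the second, unrelated alarm text are made up for this example.

```python
import re


def nms_match(expression, message):
    """Return True if an NMS-style /regex/ expression matches the message."""
    wrapped = (len(expression) >= 2
               and expression.startswith("/")
               and expression.endswith("/"))
    if not wrapped:
        raise ValueError("expected the pattern to be wrapped in / ... /")
    # Drop the leading and trailing "/" and search anywhere in the message.
    return re.search(expression[1:-1], message) is not None


ALARM = ("Average (5 samples) total cpu is now 82.61%, "
         "which is above the warning threshold (75%)")
UNRELATED = "Robot is inactive"  # hypothetical alarm text
PATTERN = r"/.*total\scpu.*above.*threshold./"

if __name__ == "__main__":
    print(nms_match(PATTERN, ALARM))
    print(nms_match(PATTERN, UNRELATED))
```

Reading the pattern left to right: the first `.*` skips "Average (5 samples) ", `total\scpu` matches the words "total cpu" with the space between them, and each later `.*` skips ahead to the next literal word ("above", then "threshold").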

 

 

Now let’s imagine we would like to match a similar alarm but for memory:

 

Average (5 samples) total memory is now 82.61%, which is above the warning threshold (75%)

 

 

We could write an expression similar to the previous one. However, in some cases it makes more sense to have one regex rule match both alarms. We do this by using “grouping” syntax and the | (pipe) [OR] operator.

 

()             (Open, Close Parenthesis) Wrapping syntax in parentheses creates a “group”. This is useful to us in two ways: it allows us to isolate certain parts of our syntax and use operators local to that part of the expression, and it allows us to refer to those “groups” in the NMS software (the logmon probe is a good example of addressing groups). In the following example we use a “group” to isolate part of our syntax so we can use the OR operator on just that section of the expression.

 

/.*total\s(cpu|memory).*above.*threshold.*/

 

Notice the group (cpu|memory), which essentially states: if the string contains cpu OR memory at this point, then match. Grouping this section is very important. Without the parentheses the OR would apply to the whole expression, which would then say match the string:

 

total cpu

OR

memory above threshold
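
The difference between the grouped and ungrouped forms can be checked with a short Python sketch. As before, Python's `re` module stands in for the Perl-style engine and the `/` delimiters are left off. The memory alarm and the "cpu is fine" message are hypothetical texts invented for the comparison.

```python
import re

GROUPED = r".*total\s(cpu|memory).*above.*threshold.*"
# Without parentheses the | splits the WHOLE expression in two halves.
UNGROUPED = r".*total\scpu|memory.*above.*threshold.*"

CPU_ALARM = ("Average (5 samples) total cpu is now 82.61%, "
             "which is above the warning threshold (75%)")
# Hypothetical messages, not taken from a real NAS:
MEMORY_ALARM = ("Average (5 samples) total memory is now 91.20%, "
                "which is above the warning threshold (75%)")
CPU_FINE = "Average (5 samples) total cpu is now 12.00%"


def matches(pattern, message):
    """Return True if the pattern is found anywhere in the message."""
    return re.search(pattern, message) is not None


def matched_resource(message):
    """Return the text captured by the (cpu|memory) group, or None."""
    found = re.search(GROUPED, message)
    return found.group(1) if found else None


if __name__ == "__main__":
    for text in (CPU_ALARM, MEMORY_ALARM, CPU_FINE):
        print(matches(GROUPED, text), matches(UNGROUPED, text),
              matched_resource(text))
```

The ungrouped pattern accepts the "cpu is fine" message because its left half, `.*total\scpu`, is satisfied on its own, while the grouped pattern still requires "above" and "threshold". `matched_resource` also shows the second use of groups mentioned earlier: the text matched inside the parentheses can be read back.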

 

So that’s the basics covered, and you should now be able to create pattern matches based on regular expressions within the Nimsoft NMS Software. You will find these techniques especially useful when using the NAS Auto-Operators and probes such as Logmon.


MSI Robot Installation

Posted by carl Aug 29, 2012

Version 5.60 of the Nimsoft NMS server introduced RPM and MSI packages for mass deployment of robots. The MSI package differs from many MSI installers in that it requires an answer file located in the install directory. The answer file is a basic robot configuration file called nms-robot-vars.cfg and follows the format below.

 

DOMAIN=companyZdom
HUBIP=145.23.31.5
HUB=companyZhub
HUBROBOTNAME=hubsysA
ROBOTNAME=NotNecessarilyHostnameButSomethingElseIChoose
ROBOTIP=198.201.4.7
HUBPORT=48002
FIRST_PROBE_PORT=48000
SECONDARY_DOMAIN=
SECONDARY_HUB=
SECONDARY_HUBROBOTNAME=
SECONDARY_HUBIP=
SECONDARY_HUBPORT=
SECONDARY_HUB_DNS_NAME=

 

This file should be placed in the same location the MSI is copied to. At the bottom of the page is a Windows batch file created to automate most of the MSI robot install process. The script does the following:

 

1 - Creates a temp mapped drive
2 - Copies the MSI to the local server
3 - Creates the nms-robot-vars.cfg
4 - Sets the ROBOTNAME to the hostname of the device
5 - Sets the robot IP to that of the first interface in ipconfig (if someone wishes to improve on this please leave a comment below! :) )
6 - Starts the MSI installation using the nms-robot-vars.cfg to build the robot configuration
7 - Deletes the robot MSI from the local machine
8 - Unmaps the network drive

Let's investigate this step by step to understand what the script is doing and how to modify it. As always, if anyone has suggestions or tips to improve the script we would love to hear from you, either via email or the comments section below.

 

Map the network drive

 

:: Edit the following line to map a temp drive to a share with the MSI file hosted there
net use r: \\<ip or UNC>\<file_share>

Create a file share that the remote servers can connect to and transfer the MSI file from. Copy the MSI file to this network share. Find the above lines in the script and change to reflect the share location.

 

Set the install path

 

::Set the drive letter and install path then create it
set install_path=progra~1\nimsoft\

set drive_letter=c:\

Change the install path by changing the variables in the script.

 

Set up the nms-robot-vars.cfg - Domain and Hub

 

:: These variables should stay static so no need to dynamically update, just change to suit your environment
echo DOMAIN=nevil >> %drive_letter%%install_path%nms-robot-vars.cfg

 

Lines in the script such as the one above build the nms-robot-vars.cfg. You should change the values to suit your environment (domain, hub name/IP etc.). These should stay fairly constant and hence do not need to be changed dynamically or updated very often.

 

 Set up the nms-robot-vars.cfg - IP and Robot name

 

::Robotname is currently set using the hostname as %HOST% variable or change to manually override
echo ROBOTNAME=%HOST% >> %drive_letter%%install_path%nms-robot-vars.cfg
::IP address of the first adaptor it finds.
echo ROBOTIP=%IP% >> %drive_letter%%install_path%nms-robot-vars.cfg

 

The two lines above set the robot IP and robot name. The IP is obtained by parsing the output of ipconfig.exe and looking for the "IPv4 Address" line, which means the script takes the first configured IP address it finds. This may be a limitation in some environments, for example on servers with several network adapters. The code below is what parses the IP from ipconfig:

 

::Gets the IP address of the first adaptor it finds
set IP=
for /f "usebackq tokens=2 delims=:" %%f in (`ipconfig ^| findstr /c:"IPv4 Address"`) do (
    if not defined IP set IP=%%f
)
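
To see how the parsing behaves on a machine with more than one adapter without running it on a target server, the same logic can be sketched in Python against captured text. The sample ipconfig output below is invented for illustration, and the function names `ipv4_addresses` and `first_ipv4` are hypothetical. The split on the first colon mirrors the `tokens=2 delims=:` option of the batch loop.

```python
# Invented sample of plain `ipconfig` output with two adapters.
SAMPLE_IPCONFIG = """\
Windows IP Configuration

Ethernet adapter Local Area Connection:

   Connection-specific DNS Suffix  . : example.local
   IPv4 Address. . . . . . . . . . . : 198.201.4.7
   Subnet Mask . . . . . . . . . . . : 255.255.255.0

Ethernet adapter Backup LAN:

   IPv4 Address. . . . . . . . . . . : 10.0.0.12
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
"""


def ipv4_addresses(ipconfig_output):
    """Return every IPv4 address listed, in the order ipconfig prints them."""
    addresses = []
    for line in ipconfig_output.splitlines():
        if "IPv4 Address" in line:
            # Keep the text after the first colon and trim the padding.
            addresses.append(line.split(":", 1)[1].strip())
    return addresses


def first_ipv4(ipconfig_output):
    """Return the first IPv4 address found, or None if there is none."""
    found = ipv4_addresses(ipconfig_output)
    return found[0] if found else None


if __name__ == "__main__":
    print(ipv4_addresses(SAMPLE_IPCONFIG))
    print(first_ipv4(SAMPLE_IPCONFIG))
```

Listing every address before choosing one makes the limitation visible: which adapter comes first depends on the order ipconfig reports them, so an environment with several adapters may need its own selection rule.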

 

Last but not least, the script sets the robot name to the hostname of the device. All these settings can be changed later via Infrastructure Manager once the robot is in the NMS system, but these settings will get the robot up and running with a unique identifier. Below is the code from the script to find the hostname and set it as a variable:

 

::Allows %HOST% to be used as a variable for the robotname
for /f "tokens=* delims= " %%a in ('hostname') do (set HOST=%%a)

 

The nms-robot-vars.cfg is now built and ready to be used by the MSI as an answer file.
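
For checking what the finished answer file should look like, here is a minimal Python sketch that assembles the same KEY=value lines. It is an illustration only, not part of the batch script: the function name `build_answer_file` is hypothetical, the static values are copied from the sample answer file earlier in this post, and the empty SECONDARY_* keys are left out on the assumption that they are optional.

```python
import socket

# Static values copied from the sample answer file; change to suit.
STATIC_SETTINGS = {
    "DOMAIN": "companyZdom",
    "HUBIP": "145.23.31.5",
    "HUB": "companyZhub",
    "HUBROBOTNAME": "hubsysA",
    "HUBPORT": "48002",
    "FIRST_PROBE_PORT": "48000",
}

# Same key order as the sample file.
FIELD_ORDER = [
    "DOMAIN", "HUBIP", "HUB", "HUBROBOTNAME",
    "ROBOTNAME", "ROBOTIP", "HUBPORT", "FIRST_PROBE_PORT",
]


def build_answer_file(robot_name, robot_ip, static=STATIC_SETTINGS):
    """Return the text of an nms-robot-vars.cfg style answer file."""
    settings = dict(static)
    settings["ROBOTNAME"] = robot_name
    settings["ROBOTIP"] = robot_ip
    return "".join(f"{key}={settings[key]}\n" for key in FIELD_ORDER)


if __name__ == "__main__":
    # The hostname plays the role of %HOST% in the batch script.
    print(build_answer_file(socket.gethostname(), "198.201.4.7"), end="")
```

Comparing this output with the file the batch script produces is a quick way to spot differences such as the trailing space that `echo KEY=%VAR% >> file` writes before the redirection.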

 

Start the MSI Installation and clean up



The script will now copy the MSI file locally to the server and start the install. For a full list of switches and their meaning execute MSIEXEC from the command line. Once the file has been copied and executed by MSIEXEC the MSI itself is deleted and the network drive unmapped.

 

::Start the MSI install from mapped drive R: uncomment relevant architecture
copy R:\nimsoft-robot-x64.msi %drive_letter%%install_path%
msiexec /i %drive_letter%%install_path%nimsoft-robot-x64.msi /qn INSTALLDIR="%drive_letter%%install_path%"
sleep 60

del %drive_letter%%install_path%nimsoft-robot-x64.msi

net use r: /delete
:finish

 

 

Distributing the batch file to remote servers

 

The last part of this guide discusses pushing the script we have been working on to the remote servers. For this we recommend PSEXEC.

PSEXEC is a free Microsoft tool and can be downloaded here:  http://download.sysinternals.com/Files/PsTools.zip

 

PSEXEC Usage

Open a command prompt window and launch the PSEXEC tool:

 

Install on all computers in your domain

psexec \\* -s \\Server\NetLogon\robot_install.bat

 

Install on a single computer

psexec \\COMPUTER_NAME -s \\Server\NetLogon\robot_install.bat

 

Install on all computers using the domain administrator credentials

psexec \\* -s -u Domain\Administrator -p Password \\Server\NetLogon\robot_install.bat

 

Install on specific computers (ALL.TXT is a text file that lists target computer names, one per line), using domain administrator credentials

psexec @ALL.TXT -s -u Domain\Administrator -p Password \\Server\NetLogon\robot_install.bat

 

Note: when doing mass deployment on multiple computers, you should monitor the response file to get a list of computers that were not deployed (because they were not connected when PSEXEC ran), and run PSEXEC multiple times throughout business hours to make sure all your computers get the robot installed.


LUA - Tables for Newbies

Posted by carl Aug 10, 2012

After starting out with LUA and struggling with tables and with getting my data out of them, I have started to get my head around how they work, thanks to various people on this forum. I wrote this short article in an effort to share this beginner knowledge with others like myself.

 

This will also be in the Nimsoft KB but posting here also as this seems to be the "place" for development chatter.

 

There are probably many better ways to achieve similar results, but if, like myself, you're struggling to get your head round tables this might help get you started :)

 

-------------------------------

 

 

In this article we will take a closer look at LUA tables. Most of the data we return from callbacks will be stored in LUA tables hence we need a strong understanding of table structure and how to access the data contained within.

 

First we build a basic script to return the values from the “getrobots” callback from a hub and store this data in a table. For a more in-depth look at building callback requests with the nimbus.request function, see other articles in this LUA series.

 

--Connect to hub probe. Edit this line with the address of your hub you wish to query
local addr = "/nevil-nmsdom/nevil-nmshub/nevil-nms/hub"
--Command to request configuration of probe
local command = "getrobots"
--Build PDS answer file when probe_config_get asks for probe name
local args = pds.create()
pds.putString(args, "name", "", "detail", "")
--Send request and store data in h_resp{}
local h_resp,rc = nimbus.request(addr, command, args)

print(h_resp)

Execute this in the NAS script editor and it will produce output similar to the below:

 

 ----------- Executing script at 07/08/2012 15:07:06 ---------- 

table:0x2275b50

 

In order for us to extract anything valuable from this table we need to understand the table structure. At this point I refer you to the “LUA- tdumper” article. If you have not yet followed that article I highly suggest you do, as the rest of this article will be much easier to follow. So, assuming we have the tdumper.lua file saved in the nas/scripts/ folder, at the top of our script we can use:

 

dofile "scripts/tdumper.lua"

 

 

to include this function in our above code and replace the print function with:

 

tdump(h_resp)

 

 

Executing the script now produces a full list of the table. A snippet is shown below for reference:

 

----------- Executing script at 07/08/2012 15:14:07 ----------

root:
  domain:nevil-nmsdom
  robotlist:
    1:
      ssl_mode:0
      os_user2:
      origin:nevil-nmshub
      os_major:UNIX
      ip:192.168.1.58
      os_minor:Linux
      addr:/nevil-nmsdom/nevil-nmshub/nevil-multibot_03
      status:0
      license:1
      last_inst_change:1340754931
      created:1341306789
      offline:0
      last_change:1341306869
      lastupdate:1344348377
      autoremove:0
      os_user1:
      flags:1
      os_description:Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
      name:nevil-multibot_03
      metric_id:M64FB142FE77606C2E924DD91FFCC3BB4
      device_id:DDFF83AB8CD8BC99B88221524F9320D22
      heartbeat:900
      port:48100
      version:5.52 Dec 29 2011

 

Addressing Table Elements

 

Now we can preview the table structure we can start selecting table elements. The output you see from this request is called a “nested table” which is essentially tables stored within tables and can get quite messy for the beginner to get their head around. Programmers with a background in Perl for example will have a distinct advantage when tackling LUA tables.

 

So what do we have here? Let’s break this table up a little. To keep things simple, let’s list the main elements of the table. Start by commenting out our tdump function:

 

--tdump(h_resp)

 

 

Add the following:

 

for k,v in pairs(h_resp) do print(k.."    ",v) end

 

 

This simple code says for each table entry in the h_resp table print its key (k) and value (v) pairs. The table h_resp we created has two initial entries domain and robotlist:

 

domain     nevil-nmsdom
robotlist     table:0x22a0a60

 

 

 

From this output we can see the key “domain” has a string value of “nevil-nmsdom”, the name of my domain. The robotlist key is slightly more complicated as its “value” is actually another table, a “nested” table. To list the entries from the “robotlist” key we can just append the key name to the table name.

 

for k,v in pairs(h_resp.robotlist) do print(k.."    ",v) end

 

 

This provides us with yet another list of tables, one for each of the robots the call back found:

 

 

1    table:0x2213600
0    table:0x2392ab0
3    table:0x2394c30
2    table:0x225eb80
5    table:0x22133e0
4    table:0x22131f0
7    table:0x2275b00
6    table:0x227d6f0
8    table:0x227d490

 

 

 

We can drill down further and examine one of these tables with:

 

for k,v in pairs(h_resp.robotlist["1"]) do print(k.."    ",v) end

 

 

We can provide any of the integer “key” values to examine that particular table; in this case we used “1”:

 

 

ssl_mode    0
os_user2
origin    nevil-nmshub
os_major    UNIX
ip    192.168.1.58
os_minor    Linux
addr    /nevil-nmsdom/nevil-nmshub/nevil-multibot_03
status    0
license    1
last_inst_change    1340754931
created    1341306789
offline    0
last_change    1341306869
lastupdate    1344590476
autoremove    0
os_user1
flags    1
os_description    Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
name    nevil-multibot_03
metric_id    M64FB142FE77606C2E924DD91FFCC3BB4
device_id    DDFF83AB8CD8BC99B88221524F9320D22
heartbeat    900
port    48100
version    5.52 Dec 29 2011

 

 Now all that probably seemed like a lot of work but by now we should be getting used to how to address tables and start getting at the data we want.

 

Now let’s assume we are just interested in the “name” key. We can print the value for the “name” key of this table with the following syntax:

 

print(h_resp.robotlist["1"].name)

----------- Executing script at 10/08/2012 11:56:57 ----------

nevil-multibot_04

 

If we wanted to return all the name values for each of the tables stored under “robotlist”:

 

 

for k,v in pairs (h_resp.robotlist) do print(v.name) end

----------- Executing script at 10/08/2012 13:41:30 ----------

nevil-multibot_03
nevil-mysql
nevil-multibot_02
nevil-multibot_04
debian-apache
nevil-ump
nevil-nms
nevil-windows-dc1
Nevil-MS-SQL

 

 

As we discussed, we can iterate through an entire table and its nested tables using the tdump function. However, from time to time you might need to iterate over just a single nested table. To help us understand which table to iterate over for a particular robot, edit our previous syntax to display the “key” that each robot “name” value belongs to.

 

for k,v in pairs (h_resp.robotlist) do print(k,"    "..v.name) end

----------- Executing script at 10/08/2012 13:51:49 ----------

1    nevil-multibot_03
0    nevil-mysql
3    nevil-multibot_02
2    nevil-multibot_04
5    debian-apache
4    nevil-ump
7    nevil-nms
6    nevil-windows-dc1
8    Nevil-MS-SQL

 

Now we can see which table index within h_resp.robotlist belongs to which robot. As an example, iterating over h_resp.robotlist["1"] would give us the information about nevil-multibot_03. We have already written the code for this in a previous example, so I will leave exploring it as an exercise.

 

As a final note in the article let’s look at how we would iterate over the whole of h_resp.robotlist and every table nested within it. This becomes a little more technical as we need to create a new function.

 

 

function DeepPrint(a)
    if type(a) == "table" then
        for k,v in pairs(a) do
            print(k.."    ",v)
            DeepPrint(v)
        end
    end
end

 

 

I won’t go into any great depth here, but if the function sees that a value is itself a table it will iterate over that too.

 

Execute this function passing it the name of the table we want to iterate over:

 

DeepPrint(h_resp.robotlist)

----------- Executing script at 10/08/2012 13:59:17 ----------

1    table:0x2394df0
ssl_mode    0
os_user2
origin    nevil-nmshub
os_major    UNIX
ip    192.168.1.58
os_minor    Linux
addr    /nevil-nmsdom/nevil-nmshub/nevil-multibot_03
status    0
license    1
last_inst_change    1340754931
created    1341306789
offline    0
last_change    1341306869
lastupdate    1344603075
autoremove    0
os_user1
flags    1
os_description    Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
name    nevil-multibot_03
metric_id    M64FB142FE77606C2E924DD91FFCC3BB4
device_id    DDFF83AB8CD8BC99B88221524F9320D22
heartbeat    900
port    48100
version    5.52 Dec 29 2011
0    table:0x22135b0
ssl_mode    0
os_user2
origin    network
os_major    UNIX
ip    192.168.1.31
os_minor    Linux
addr    /nevil-nmsdom/nevil-nmshub/nevil-mysql
status    0
license    1
last_inst_change    1342178648
created    1341922451
offline    0
last_change    1344591508
lastupdate    1344603108
autoremove    0
os_user1
flags    1
os_description    Linux 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
name    nevil-mysql
metric_id    M7CDEF667D86384966EF5876BEAB7AA52
device_id    D518C83671EA63842F7C08EEBA7DD63E2
heartbeat    900
port    48000
version    5.63 Mar  2 2012