DX Unified Infrastructure Management


Robot restart via LUA

  • 1.  Robot restart via LUA

    Posted Sep 12, 2018 10:25 AM

    I would like to know how to restart a robot from a LUA script when the alert below is generated:

    Max. restarts reached for probe 'cdm'

     

    The alert is generated by the controller probe. I have seen that after restarting the Nimbus Watcher service, the probe gets its port and the issue is resolved. Please advise.



  • 2.  Re: Robot restart via LUA
    Best Answer

    Broadcom Employee
    Posted Sep 13, 2018 05:01 AM

    That is quite easy to do.

     

    Here is the lua code for that bit of the script…

     

    controller = r_value.addr.."/controller"
    print ("  Restarting robot: " .. r_value.name)
    out,rc = nimbus.request(controller,"_restart")
    if rc == 0 then
       print("  Robot:" .. r_value.name .. " : restart executed \n")
    else
       print("  ******* Robot:" .. r_value.name .. " : restart failed \n")
    end

     

    You just need to set up r_value.addr, which is the fully qualified robot address (/domain/hub/robot).

    cheers



  • 3.  Re: Robot restart via LUA

    Posted Sep 13, 2018 07:08 AM

    Thanks Rowan. 

     

    We have many hubs, and I would like to keep the LUA script centrally on the primary hub. Can you help me pick up the fully qualified robot address (/domain/hub/robot) automatically from the alert and restart that robot?



  • 4.  Re: Robot restart via LUA

    Broadcom Employee
    Posted Sep 13, 2018 07:55 AM

    Ok I'm feeling kind and will give you the script

     

    Try this script in your AO profile…

    You can test it by putting the nimid of an existing alarm into the alarm.get call (see the commented-out line below).

     

    --
    -- robot_restart.lua
    --
    print('Robot Restart script')
    print('====================')
    print(' ')

    --al=alarm.get("RC43670325-12873")
    al=alarm.get()

    domain=al.domain
    hub=al.hub
    robot=al.robot

    robotaddr = "/"..domain.."/"..hub.."/"..robot
    print ("  Restarting robot: " .. robotaddr)

    controller = robotaddr.."/controller"
    out,rc = nimbus.request(controller,"_stop")
    if rc == 0 then
       print("  Robot:" .. robotaddr .. " : restart executed \n")
    else
       print("  ******* Robot:" .. robotaddr .. " : restart failed \n")
    end

     



  • 5.  Re: Robot restart via LUA

    Posted Sep 25, 2018 07:35 AM

    I am getting the error below when I execute the script from the NAS editor. Please help.

     

    ----------- Executing script at 9/25/2018 7:31:39 AM ----------

    Robot Restart script
    ====================

    Error in line 11: attempt to index global 'al' (a nil value)



  • 6.  Re: Robot restart via LUA

    Broadcom Employee
    Posted Sep 25, 2018 07:40 AM

    You need to put in an actual current alarm when you run the script in the editor.

    Replace "RC43670325-12873" with one of your nimids and uncomment that line.

    Comment out the "al=alarm.get()" line while you are testing; that line gets the nimid from the auto-operator when the script runs properly.
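
    The switch between editor testing and auto-operator runs can be sketched as a small helper, so the script reports a clear message instead of failing with "attempt to index global 'al'". This is only a sketch: fetch_alarm and TEST_NIMID are hypothetical names introduced here, and the helper takes the getter as a parameter so it can also be exercised outside the NAS.

```lua
-- TEST_NIMID is a hypothetical setting: put a real nimid here only while
-- testing in the NAS editor, and leave it nil for auto-operator runs.
TEST_NIMID = nil

-- get: a function that behaves like alarm.get (optional nimid argument).
-- Returns the alarm, or nil plus an explanatory message.
function fetch_alarm(get, test_nimid)
  local al
  if test_nimid ~= nil then
    al = get(test_nimid)
  else
    al = get()
  end
  if al == nil then
    return nil, "no alarm available: set TEST_NIMID when running in the editor"
  end
  return al
end

-- Only runs where the NAS provides the alarm table.
if alarm and alarm.get then
  local al, err = fetch_alarm(alarm.get, TEST_NIMID)
  if al then
    print("  Alarm robot: /" .. al.domain .. "/" .. al.hub .. "/" .. al.robot)
  else
    print("  " .. err)
  end
end
```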



  • 7.  Re: Robot restart via LUA

    Posted Sep 25, 2018 07:44 AM

    It works now that I have put in the nimid. Is it possible to run this script for all servers without the nimid, automatically picking up the server address from the alarm and triggering the restart?



  • 8.  Re: Robot restart via LUA

    Broadcom Employee
    Posted Sep 25, 2018 07:53 AM

    When you run without the nimid, the script is triggered by the auto-operator, which passes in the alarm that triggered it; that is where the nimid comes from.

    If your criteria are broad, the script will be triggered for multiple alarms and will therefore restart multiple robots.

    Alternatively, you can run the script on a schedule and code a loop so that it restarts all robots attached to a particular hub.

    There are quite a few different options for you.

    Regards
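
    A rough sketch of the loop option for restarting every robot on a hub. It assumes each robot entry has name and addr fields (as r_value does in the earlier snippet) and that rc == 0 means success. restart_robots is a hypothetical helper, and the commented-out "getrobots" call at the end is an assumption about the hub probe that should be checked against your environment before use.

```lua
-- robots:  table of robot entries, each with .name and .addr (assumed shape)
-- request: function(addr, command) returning out, rc like nimbus.request
-- Returns a list of robot names whose restart request failed.
function restart_robots(robots, request)
  local failed = {}
  -- pairs() rather than #robots, since the keys may not be integer indexes
  for _, r_value in pairs(robots) do
    local controller = r_value.addr .. "/controller"
    local _, rc = request(controller, "_restart")
    if rc == 0 then
      print("  Robot:" .. r_value.name .. " : restart executed")
    else
      print("  ******* Robot:" .. r_value.name .. " : restart failed")
      failed[#failed + 1] = r_value.name
    end
  end
  return failed
end

-- Inside the NAS (assumption: the hub answers "getrobots" with a robotlist):
-- local robots = nimbus.request("/domain/hub/robot/hub", "getrobots")
-- restart_robots(robots.robotlist, nimbus.request)
```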

     



  • 9.  Re: Robot restart via LUA

    Posted Sep 25, 2018 09:17 AM

    Thanks Rowan. Please review this code and let me know whether it will work.

     

    al = alarm.list()

    re = "%p%a+%d*_*%a*%d*%p"
    if al ~= null then
      for i = 1,#al do
        if al[i].prid == "controller" then
          if string.match(al[i].message,"Max. restarts reached for probe") then
            probe = string.gsub(string.match(al[i].message,re),"'","")
            --print(al[i].message.."! Probe-> "..probe)

            addr = "/"..al[i].domain.."/"..al[i].hub.."/"..al[i].robot.."/".."controller"
            --printf("/"..al[i].domain.."/"..al[i].hub.."/"..al[i].robot.."/".."controller".."<->Probe="..al[i].prid)

            local args = pds.create()
            pds.putString(args,"name",probe)
            nimbus.request(addr,"_stop",args)
            pds.delete(args)
            sleep (100)
          end
        end
      end
    end



  • 10.  Re: Robot restart via LUA

    Broadcom Employee
    Posted Sep 25, 2018 11:00 AM

    There are a few problems here.

    You are mixing arrays and PDSs, so your loop is not going to work (#al=0).

     

    But my first question is: why alarm.list?

    It won't work with the AO profile, because only alarm.get() works that way, so "al" will be empty.

     

    If this is triggered by an alarm then you need to use alarm.get().

    I think you are trying to run this script for all your "Max. restarts…" alarms, in which case you don't need the loop.

     

    For loops are only needed when, for example, you are processing multiple rows from a database call.

    Also, I don't see the point of the sleep(100).

     

    Here is what I would code, if I understand your requirements correctly…

     

    -- Pattern for the quoted probe name, carried over from the script under review
    re = "%p%a+%d*_*%a*%d*%p"

    al = alarm.get ()

    if al ~= nil then
       if al.prid == "controller" then
          if string.match(al.message,"Max. restarts reached for probe") then
             print("message match")
             probe = string.gsub(string.match(al.message,re),"'","")
             print(al.message.."! Probe-> "..probe)

             addr = "/"..al.domain.."/"..al.hub.."/"..al.robot.."/".."controller"
             printf("/"..al.domain.."/"..al.hub.."/"..al.robot.."/".."controller".."<->Probe="..al.prid)

             local args = pds.create()
             pds.putString(args,"name",probe)
             nimbus.request(addr,"_stop",args)
             pds.delete(args)
             --sleep (100)
          end
       end
    end
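
    For the probe-name extraction step, here is a small standalone sketch that can be tried outside the NAS. It swaps in a simpler capture pattern ('([^']+)') in place of the pattern used in the thread, and it does a plain-text search for the message prefix so the "." is not treated as a wildcard. extract_probe is a hypothetical helper name.

```lua
-- Returns the probe name quoted in a "Max. restarts reached" message,
-- or nil when the message is a different alarm or has no quoted name.
function extract_probe(message)
  -- 4th argument true = plain substring search, no pattern matching
  if not string.find(message, "Max. restarts reached for probe", 1, true) then
    return nil
  end
  -- Capture whatever sits between the first pair of single quotes
  return (string.match(message, "'([^']+)'"))
end

print(extract_probe("Max. restarts reached for probe 'cdm'"))
```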

     

     

    Also, you should probably add some error handling after the nimbus.request call.
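
    One possible shape for that error handling, following the out,rc convention from the first snippet in this thread (rc == 0 treated as success). checked_request is a hypothetical wrapper; the request function is passed in so the sketch can be tried with a stand-in outside the NAS.

```lua
-- request: function(addr, command, args) returning out, rc like nimbus.request
-- Returns true plus the reply on success, or false plus a message on failure.
function checked_request(request, addr, command, args)
  local out, rc = request(addr, command, args)
  if rc == 0 then
    return true, out
  end
  return false, command .. " on " .. addr .. " failed (rc=" .. tostring(rc) .. ")"
end

-- Inside the NAS this would be called as, for example:
-- local ok, result = checked_request(nimbus.request, addr, "_stop", args)
-- if not ok then print("  ******* " .. result) end
```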

     

    cheers

     



  • 11.  Re: Robot restart via LUA

    Posted Sep 25, 2018 11:59 AM

    Thanks Rowan. Let me try this in my lab and let you know the results.