[askCA TRANSCRIPT] CA Unified Infrastructure Management – February 22, 2018

Document created by MelissaPotvin (Employee) on Feb 22, 2018. Last modified by MelissaPotvin (Employee) on Mar 26, 2018.
Version 3

Thank you for joining askCA for CA Unified Infrastructure Management. I want to thank the CA team, DavidLeDeaux, Larry_Atlas, and Gene_Howard, for being available to take customer questions in a live Q&A setting, and of course I want to thank our customers for joining today. We hope you found today's session helpful.

 

If you have any questions about the direction of the product, I encourage you to register for a product roadmap session (registration is required): Product Roadmaps & Feedback Sessions - CA Technologies

------------------------------------------

 

from Melissa Potvin (CA) to everyone: Hi everyone! We will get started in just a few minutes!

from Melissa Potvin (CA) to everyone: You can start asking questions now. I have support engineers here and ready to help.

from Praveen to everyone: Hi Team - Why does a probe go into an inactive state? On what basis does this workflow happen? Is there any way in UIM to send a daily report of those probes?

from Gene Howard to everyone: @Praveen

from Gene Howard to everyone: @Praveen at this time there is no report you can run out of the box on this.

from Gene Howard to everyone: Probes usually go inactive or into an error state when robots go inactive, but we would need to look at log files from the controller to investigate this.

from Lawrence Atlas (CA Technologies) to everyone: @Praveen, just for clarification, do you mean robots go inactive or probes? if probe, what probe do you see go inactive?

from Praveen to everyone: Thanks Gene, but in most cases the robot is fine and the supporting hdb and spooler probes go inactive.

from Mike Arnone to everyone: Is there an ETA on UIM 9.0? Or is there any posting of the New Features?

from Lawrence Atlas (CA Technologies) to everyone: @Mike, there is a roadmap session coming up.

from Gene Howard to everyone: @Praveen Sounds like the robot is the problem. You could run a sql_query on the nas-alarms tables daily to see which robots are problems to address.

from Lawrence Atlas (CA Technologies) to everyone: you can sign up here:

from Gene Howard to everyone: @Praveen I would make sure the problem robots are version 7.93.

from Lawrence Atlas (CA Technologies) to everyone: https://www.ca.com/us/product-roadmaps.html

from Gene Howard to everyone: @Praveen the other thing to check, if they are Linux boxes, is that the file systems are not read-only. I have seen Linux boxes restart and come up in read-only mode, and this will cause this type of issue. You need to work with the system admin to understand why this is happening.

from Praveen to everyone: @Gene - I have robots on 7.91 having this issue. Sometimes even cdm goes inactive; after validating, it comes up.

from Praveen to everyone: Thanks Gene!! ..

from Lawrence Atlas (CA Technologies) to everyone: Do the ip addresses change?

from Lawrence Atlas (CA Technologies) to everyone: This would cause the robot to go inactive, and you would need to run validate security to get it working again.

from Praveen to everyone: No Lawrence, all are static IPs. It is happening on both Windows and Linux.

from Praveen to everyone: Is there any way to convert the system uptime from seconds (the default QoS) to a number of hours or a day format?

from Mike Arnone to everyone: I know you have a PVS Probe (for the backend), but are you planning to have a Robot that will work on the actual Provisioned servers? Provisioned servers have a Master server that machines are provisioned from and the C: drive gets reset on every reboot. This sometimes causes us issues. We also explored the RSP Probe, but this probe has 0 automation & every server must be added painfully via the GUI.

from David LeDeaux (CA Technologies) to everyone: @Praveen there is no way to change the units that are sent from the probe, but you can create a custom dashboard/report that will convert to the units desired
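
The unit conversion David mentions can be sketched outside the product as follows. This is only an illustration, assuming the raw uptime value in seconds has already been exported from UIM; `format_uptime` and `uptime_hours` are hypothetical helper names, not part of any UIM API.

```python
def format_uptime(seconds: int) -> str:
    """Render an uptime given in seconds as 'Nd Nh Nm Ns'."""
    if seconds < 0:
        raise ValueError("uptime cannot be negative")
    # Peel off whole days, then hours, then minutes; the rest is seconds.
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


def uptime_hours(seconds: float) -> float:
    """Return the same uptime expressed as fractional hours."""
    return seconds / 3600.0


if __name__ == "__main__":
    print(format_uptime(93784))
    print(uptime_hours(7200))
```

A report or dashboard layer would apply the same arithmetic to each sample; the probe itself still sends seconds.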

from Praveen to everyone: @David - Can you please share the procedure? It would be helpful.

from Gene Howard to everyone: @Mike, that sounds like you need to use the cloud install for the robot with a request.cfg.

from Gene Howard to everyone: This would allow you to update your master image and not start the robot for x number of reboots, and allow the robot to come up the first time the image is deployed. The request.cfg would then pull down the configuration and probes you need to run while it is up.

from Gene Howard to everyone: If I understand your question properly.

from David LeDeaux (CA Technologies) to everyone: @Praveen can you email me at david.ledeaux@ca.com? I'll need a little time to put something together

from Naveen Kumar to everyone: Is there any plan to support thick clients as an alternative to Admin Console for multitenancy environments? Because of multiple network issues we are not able to use Admin Console, and all new probes support only Admin Console.

from Praveen to everyone: Thanks David !!

from Gene Howard to everyone: @Naveen sorry currently not. But please check the road map session and talk to the product owners on this.

from Mike Arnone to everyone: @Gene - It's not the install that's the issue, it's that the C: drive gets reset on every reboot. I believe we tested & found the cloud install did not help. UIM doesn't like that.

from Praveen to everyone: Question regarding the cluster probe:
Assume that I have the cluster probe configured on a Windows box and I want to monitor a service that is part of the cluster (e.g. the SQL service), which I have enabled in NT services. Ideally the service will be running on one node. How does the service configured in NT services move to the other node when the cluster fails over, the way it works in the cdm probe with disks?

from Mike Arnone to everyone: @Gene - even when we redirected files from the C: drive to a small persistent drive that they have, it still causes UIM to get machines cross-linked in the UIM DB.

from Gene Howard to everyone: @Mike I do know at least a couple of clients that do this and it works for them. The trick is getting the correct number of restarts to wait. It has to be tested but should resolve the issue.

from Lawrence Atlas (CA Technologies) to everyone: @Praveen, that may be an enhancement request for the processes probe.

from Gene Howard to everyone: @Mike the reason they get mixed up in the database is the nimid, which is set the first time the robot starts. With the cloud install this is not created until the machine is provisioned, and as such it will be unique.

from Praveen to everyone: For NT Services, Lawrence?

from Mike Arnone to everyone: Maybe CA needs to provide a Wiki Doc on how to get Robots to work on PVS provisioned servers... We have had mixed results.

from Lawrence Atlas (CA Technologies) to everyone: @Praveen, or NTservices probe, yes, it would be a feature request to support cluster nodes, otherwise it will just alarm when you switch nodes and the service is no longer active

from Praveen to everyone: Can you please tell me the specific config for this setting?

from Melissa Potvin (CA) to everyone: @Mike, will take your suggestion to the documentation team..

from Naveen Kumar to everyone: The cluster probe should auto-populate all shared resources; there should be no need to add them manually.

from Mike Arnone to everyone: On the Provisioned servers, Alarms for servers sometimes get cross referenced with other machines. EX: Robot Server021, 092 & 106 Is Inactive sometimes is linked to machine Server087, which doesn't exist as a CI.

from Mike Arnone to everyone: @Melissa - Thanks!

from David van Lith to everyone: I watched a presentation at CA World about a general RESTMON probe. Any news on this probe?

from Gene Howard to everyone: @David, sorry, we do not have product managers on call today. But please attend the roadmap session and they should be able to answer these questions for you.

from Gene Howard to everyone: https://www.ca.com/us/product-roadmaps.html

from Praveen to everyone: Is there any way to convert emails into alarms? Earlier there was a probe, but now it is not supported. Are there any alternatives to achieve this?

from Gene Howard to everyone: @Praveen, currently no there is not.

from David van Lith to everyone: I just joined, so maybe someone already asked, but is there a date for UIM 9?

from Lawrence Atlas (CA Technologies) to everyone: @Praveen, there appears to be a customer-built probe for email to alarm; you can find out more here: https://communities.ca.com/thread/241725365

from Lawrence Atlas (CA Technologies) to everyone: I do not know if it still works or not; it was written a long time ago.

from Lawrence Atlas (CA Technologies) to everyone: David, please sign up for the roadmap session next week. https://www.ca.com/us/product-roadmaps.html

from David van Lith to everyone: Ok thanks, I just signed up

from Mike Arnone to everyone: I would like to see a PING test built into the Hubs because "Robot servername is Inactive" is ambiguous & does not state if the Robot is Down or the Server is Down. These are 2 different priorities. Server Down is a Sev 1/2 & Robot Down is a Sev3/4. One requires someone to be woken up in the middle of the night & the other doesn't. There are LUA Scripts & other things to tinker with, but as a long-time customer who came from Unicenter NSM, this was always included in the product. Now in UIM it isn't.

from Mike Arnone to everyone: Even CA GIS internally has to "tinker" with scripts to run Ping tests to handle Robot Inactive alarms.

from Gene Howard to everyone: @Mike currently that is not possible with the hub probe. Have you checked the net_connect with a port check for the spooler to use instead of the robot inactive alerts?
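
To illustrate the port-check idea Gene raises, here is a minimal Python sketch of the triage logic an external script could apply. It is not how net_connect works internally. The spooler port 48001 and the second "OS" port (22) are assumptions that must be adjusted to the environment, and `port_open`, `classify`, and `triage` are hypothetical names.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def classify(spooler_up: bool, os_port_up: bool) -> str:
    """Separate 'robot down' from 'server down' using two reachability results."""
    if spooler_up:
        return "robot up"
    if os_port_up:
        return "robot down, server up"
    return "server unreachable"


def triage(host: str, spooler_port: int = 48001, os_port: int = 22) -> str:
    """Check the assumed spooler port, then an OS-level port (SSH here; use 3389 for RDP)."""
    return classify(port_open(host, spooler_port), port_open(host, os_port))


if __name__ == "__main__":
    print(triage("localhost"))
```

The second port stands in for the ping test: if the operating system still answers on it while the spooler port is closed, the alarm can be treated as a robot problem rather than a server outage.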

from Praveen to everyone: Is there any possibility to schedule an Excel report from USM?

from Mike Arnone to everyone: @Gene - why should we double the man-hour work to monitor a server? There is no automation to add to Net_Connect that I am aware of. We have thousands of servers in UIM. This will cause extra man-hours to maintain. DB duplicates can happen when you add machines to different Hubs than the Robot is on (yes, I know David L has an article on Discovery Correlation).

from Gene Howard to everyone: @praveen what reports are you looking for exactly?

from Praveen to everyone: @Gene basic CPU, memory, disk. I can do it in PDF but not Excel.

from Gene Howard to everyone: @Mike I do understand that it is not a perfect solution for you. But currently the product cannot do what you're asking. The product management team would have to take this up as an enhancement request to move that forward as a new feature.

from Gene Howard to everyone: @Praveen sorry, currently no. This would be an enhancement request to have this ability added.

from Praveen to everyone: When I install a robot, which system parameters does the agent check, such as the /etc/hosts entry (Windows and Linux)?

from Gene Howard to everyone: @Praveen the robot checks the OS type and version to make sure it is supported. On Linux we expect the IPv4 address and host name to be on the first line, not the 127.0.0.1 that it usually is by default.
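
A pre-install check for the Linux expectation Gene describes could look like the sketch below. It assumes the file in question is a standard hosts file such as /etc/hosts, which the transcript implies but does not state, and the function names are hypothetical. It only reports whether the first active entry is a loopback address; it does not reproduce whatever validation the robot installer itself performs.

```python
import ipaddress


def first_hosts_entry(hosts_text: str):
    """Return (ip, [names]) for the first non-comment, non-blank hosts line, or None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        return fields[0], fields[1:]
    return None


def loopback_listed_first(hosts_text: str) -> bool:
    """True if the first active entry is a loopback address such as 127.0.0.1."""
    entry = first_hosts_entry(hosts_text)
    if entry is None:
        return False
    try:
        return ipaddress.ip_address(entry[0]).is_loopback
    except ValueError:
        # The first field is not a valid IP address.
        return False


if __name__ == "__main__":
    # Path is an assumption; adjust if the hosts file lives elsewhere.
    with open("/etc/hosts", encoding="utf-8") as fh:
        if loopback_listed_first(fh.read()):
            print("Loopback is listed first; consider moving the host's IPv4 entry up.")
        else:
            print("First entry is not loopback.")
```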

from Mike Arnone to everyone: @Gene - We have about 3000 servers already in UIM. Imagine the man-hours needed to go into Net_Connect & add Every machine that has a Robot, just to get simple Ping Monitoring to avoid ambiguous Robot Inactive Alarms. I feel this is a poor design & this oversight needs to be addressed by UIM Product Mgrs. Why don't you setup a poll of customers on what their thoughts are on this issue & how high of a priority this is to them, I believe you will be surprised by the results.

from Mike Arnone to everyone: Is there a way for UIM to monitor its own Hub Queues? In the past we have had issues with "stuck" queues, and the only way to find out when you suspect a problem is to open the Hub probe and look at the tab with the queue statistics.

from Melissa Potvin (CA) to everyone: @mike, good point on the poll or can i interest you to start a new discussion thread in the community and i can help you promote it?

from Lawrence Atlas (CA Technologies) to everyone: @mike, the issue comes down to the nas queue backing up, if that is not working, you can't alarm.

from Gene Howard to everyone: @Mike so the hub should already be sending out an alarm when the queue size increases. The problem is that if it involves nas, then alarms cannot be processed and seen.

from Melissa Potvin (CA) to everyone: @mike, i will also make sure product managers see your comment.

from Mike Arnone to everyone: @Larry, an e-mail would be sufficient. Or I can setup an external process to monitor the Queues via Scheduled Task & trigger something externally.

from Gene Howard to everyone: Usually something outside of UIM has to monitor the primary UIM server to be able to get notifications at this time...

from Lawrence Atlas (CA Technologies) to everyone: https://marketplace.ca.com/shop/uim/ca-uim-hub-queue-statistics-probe.html

from Praveen to everyone: @Mike I think we used to get alerts if the queue grew beyond a specific value (>150 MB), i.e. on remote hubs.

from Mike Arnone to everyone: FYI: Our UIM is connected to SOI using UIM Connector

from David van Lith to everyone: Currently we use automatic robot and probe installations with a lot of features like the request.cfg, cloud.cfg, putting files in /nimsoft/robot/changes, regenerating device IDs, etc. It would be nice if there were some best practice documents about these installations, also for cloned servers. The documentation now is spread across the community and some CA docs.

from Lawrence Atlas (CA Technologies) to everyone: @All, remember if you have a quick technical question, you do not need to open a case, we have live agent chat. You can learn more here: https://www.ca.com/us/services-support/ca-support/contact-support.html?intcmp=headernav

from Gene Howard to everyone: I know one client created a SQL agent job that checked the nas_alarms tables for updates; if one was not seen in 5 minutes, it would generate an alarm.

from Gene Howard to everyone: Actually an email from a script.
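
The staleness test at the heart of that kind of external watchdog can be sketched in a few lines of Python. The client's actual job is not shown in this session, so this is only an assumption-laden illustration: the timestamp of the most recent alarm update is presumed to come from a query against the NAS alarm tables (schema not given here), and `alarms_are_stale` and `build_notice` are hypothetical names.

```python
from datetime import datetime, timedelta


def alarms_are_stale(last_update: datetime, now: datetime,
                     max_age: timedelta = timedelta(minutes=5)) -> bool:
    """True if no alarm update has been seen within max_age."""
    return now - last_update > max_age


def build_notice(last_update: datetime, now: datetime) -> str:
    """Compose the text an external script could send by email."""
    minutes = int((now - last_update).total_seconds() // 60)
    return f"No alarm updates seen for {minutes} minutes (last at {last_update.isoformat()})."


if __name__ == "__main__":
    now = datetime(2018, 2, 22, 12, 0, 0)
    last = now - timedelta(minutes=7)  # stand-in for the database query result
    if alarms_are_stale(last, now):
        print(build_notice(last, now))
```

Running the check from a scheduler outside UIM matters because, as noted in this thread, a stalled nas cannot raise an alarm about itself.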

from Melissa Potvin (CA) to everyone: @David, another good one for the documentation team. i will make sure they see your comment. thank you for the feedback.

from Mike Arnone to everyone: FYI: When you release the MCS Probe with the new Garbage Collection feature, remember to give a Strong Warning that if you have been running MCS for a while, this will Bog Down your DB, unless you run the stored procedure to do an MCS DB Archive first.

from David van Lith to everyone: Some CA documents mention that the transactionlog.db shouldn't be greater than 300 MB. But in our environment this is just impossible: we have the history and the history summary set back to 1 day and the .db is always bigger. Is 300 MB still a best practice? I have no idea anymore how to get it below 300 MB. Normally it is around 1 GB.

from Melissa Potvin (CA) to everyone: final call for questions. 5 minutes left.

from Gene Howard to everyone: @David yes that is still the best practice

from Gene Howard to everyone: Usually under 1 GB should be fine.

from David van Lith to everyone: So the only way to get this smaller is to minimize alarms?

from Praveen to everyone: @David - Are there any future plans to extend its performance if it reaches more than 2 GB, or to add log rotation?

from Gene Howard to everyone: @David correct. When the move to EMS is complete this limitation will be removed.
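
Until that limitation is lifted, the two sizes mentioned in this exchange (300 MB as the documented best practice, roughly 1 GB as the practical ceiling) can be watched with a small script. The sketch below is only an illustration: the thresholds are taken from the discussion, the file location is passed on the command line rather than assumed, and `size_status` and `check_file` are hypothetical names.

```python
import os
import sys

WARN_BYTES = 300 * 1024 * 1024   # 300 MB best-practice figure from the discussion
CRIT_BYTES = 1024 ** 3           # about 1 GB, the "should be fine" ceiling


def size_status(size_bytes: int) -> str:
    """Map a file size to 'ok', 'warning', or 'critical'."""
    if size_bytes >= CRIT_BYTES:
        return "critical"
    if size_bytes >= WARN_BYTES:
        return "warning"
    return "ok"


def check_file(path: str) -> str:
    """Return the status for the file at path (e.g. the transactionlog.db)."""
    return size_status(os.path.getsize(path))


if __name__ == "__main__":
    # Usage: python check_size.py <path-to-transactionlog.db>
    if len(sys.argv) == 2:
        print(check_file(sys.argv[1]))
```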

from Naveen Kumar to everyone: In my case it's more than 1 GB.

from Mike Arnone to everyone: Speaking of EMS - does this probe have any access to the Alarms table?

from Praveen to everyone: if it reaches more than 500 MB ,alarm_manger is not working properly .

from Melissa Potvin (CA) to everyone: @Mike, Larry is going to take your final question...

from Lawrence Atlas (CA Technologies) to everyone: @mike, not at this time.

from Melissa Potvin (CA) to everyone: That is a wrap for today everyone!

from Mike Arnone to everyone: @Larry - thanks, I'll be sticking to NAS for now.

from Melissa Potvin (CA) to everyone: Thank you so much for participating, really great questions. will follow up on action items to prod mgmt and documentation team.

from Praveen to everyone: Thanks folks !!

from Melissa Potvin (CA) to everyone: Good Bye Everyone!

from Mike Arnone to everyone: Thanks

from Mike Arnone to everyone: @Melissa - please post the meeting

from Naveen Kumar to everyone: Thanks Everyone
