We have run into more problems on V9, including outages and nagging little stuff, but I have heard that we are unusual in that respect. Is V9 better, worse, or about the same as V8 from a reliability standpoint?
AM or OM V8? And are you talking about Automation Engine V9 or AM V9?
Sorry, I am not familiar with how the products are named and so on, but here is the explanation from somebody who is:
We were considered to have been running OM Operations Manager V8, and the upgrade path puts us at Automation Engine V9.00A with no Service Packs installed. There are minor variations within the initial release depending on when you downloaded the installation software; these show up in the build level, which is 9.00A312-371 to be exact. That is how UC4/Automic would see it. For the layman (you and me), we have always had a directory called AutomationEngine, which is where the main application and logs are stored. When we called the Support line for V8 issues we put in calls as OM V8.00A customers; now we put in calls as AM V9.00A customers. Hope this is clear as mud. Thanks
My organization upgraded from V8 to V9 perhaps 1.5-2 years back, and while I have not been at the forefront for all issues, I cannot remember the platform ever going hard down. Most of the problems we face are with application teams not understanding how to do something, or with their agents going down and their unfamiliarity with Service Manager keeping them from getting the agents running again.
Can you be more specific about the troubles you face?
Sure. First I need to explain my role: I am primarily a user of the system, and I use it mostly to schedule Windows PowerShell automation jobs (that is about 95% of what I use it for). I am about the third-string administrator, so I don't work in that capacity much.
I should also add that we did not use their professional services to perform the upgrade.
We upgraded in the August time frame. Since then we have had mostly nagging issues; I will list a sampling of them. We also had extended outages in prod (measured in hours) and dev (measured in days). This may have been mostly our fault, because we may have been pointing at the wrong database via an old tnsnames entry, but the way the problem manifested itself was that our licenses were no longer valid. I hate these kinds of problems. OK, the vendor doesn't trust us to manage our licenses at all, but the system should report, not break, especially in production. Completely unacceptable.
The first problem we ran into after the upgrade was that recurring jobs no longer worked. I had mostly moved away from using them but still had a few left, and eventually stumbled across the fact that they were not working. There were no error messages; they just no longer worked in any prod client, but they worked OK in dev. We wrapped those jobs inside events as a workaround. There is also a service pack that might fix it, but it is tough to take the system down long enough to get it installed. Live patching with rollbacks is something I would like to see in enterprise-class software. We did not see the problem in V8.
For notifications, I manage recipients via notification objects. I never had a problem doing so in version 8, and in version 9 it usually works, but about once a week I have to edit-save-check-reedit-check-save-reedit-check-save etc. because it will drop recipients that I have added, and it will restore recipients that I have deleted. I have tried to figure out why it sometimes works and sometimes doesn't, but I have concluded that it is just moody and that there is nothing I can do about it. Maybe a patch will fix it.
Due to policy we have to change passwords on service accounts periodically, and when we did so recently, one of the clients lost some significant functionality: it would not execute event objects. Other clients running on the same UC4 (prod) server worked fine; it was just the one client. After about a week of working on it, support had us cycle the agent (and maybe the service, I am not sure about that), and that fixed the problem. I really have no idea whether we will see this problem again when we cycle agents and, if we do, how often it will show up.
Hopefully that gives you some flavor for the types of issues we have been running into with V9. In one case I was told that no one else had complaints about V9, so I wanted to post to the community to see if we are the only customer having issues. We did hear this week that there will be a Service Pack that will address quality issues at some point in the future, so there must be enough problems that such a service pack is in the works, but it could be that we have run into more than most people and that they are addressing what for most people would be a one-off type of problem.
Much of my frustration comes from the fact that I had recommended that we buy UC4 version 8, and we found the vendor and product experience to be very good, probably one of the best that we have dealt with. Now things seem to be trending in the wrong direction, at least for us. I am happy to hear from Automic that we are an exception; I wouldn't wish this on anyone else, but I also wouldn't wish it on us.
We are currently migrating from OM V6 to AE V9. One of
your symptoms sounds familiar to me: "...it will restore recipients that I
have deleted."
I suspect that your recipients are being overridden by a parent object. By default, V9 assumes you wish to override
*EVERYTHING*. If you store a default value in a PROMPTSET, it will always
be overridden by the object that uses that PROMPTSET. If you store a value
under the VARIABLES tab of a JOB, it will be overridden by the WORKFLOW or
SCHEDULE that is the parent to that job. If you store a value in a
WORKFLOW under the variables tab, it will be overridden by the SCHEDULE it runs
in. The FTP settings in an RA FTP agent will also be overridden by its
We have had several embarrassing production incidents where we
thought we had properly modified our values, only to discover later that we also
needed to modify their overrides. I tried to find ways to turn off the default
override behavior, but there is no bulletproof way to do so. So every time
we are asked to modify a variable or a setting, we must be diligent in also
modifying every place that can override that variable.
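To make the precedence concrete, here is a rough Python sketch of the layering as I described it above (JOB values overridden by the WORKFLOW, which is overridden by the SCHEDULE). This is only a toy model for reasoning about which value wins; it is not an Automic API, and the function names, variable names, and values are all made up for illustration.

```python
# Toy model of the override chain described above. NOT an Automic API:
# resolve(), places_to_update(), and the sample values are hypothetical.

def resolve(name, layers):
    """Return (layer_label, value) for the value that actually takes effect.

    `layers` is a list of (label, dict) pairs ordered from lowest to
    highest precedence, so the last layer that defines `name` wins.
    """
    winner = None
    for label, values in layers:
        if name in values:
            winner = (label, values[name])
    if winner is None:
        raise KeyError(name)
    return winner


def places_to_update(name, layers):
    """List every layer that defines `name`, i.e. every place that must be
    edited for a change to take effect reliably."""
    return [label for label, values in layers if name in values]


# Example: the JOB was updated, but the parents still carry the old value.
layers = [
    ("JOB", {"TARGET_DIR": "/data/new", "RETRIES": "3"}),
    ("WORKFLOW", {"TARGET_DIR": "/data/old"}),
    ("SCHEDULE", {"TARGET_DIR": "/data/old"}),
]

print(resolve("TARGET_DIR", layers))           # the SCHEDULE value wins
print(places_to_update("TARGET_DIR", layers))  # all three layers need the edit
```

The point of `places_to_update` is the habit we had to build: before changing a value, list everything above it that also sets that value.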
This behavior was foreign to us because V6 doesn’t behave this way.