For point releases (like 10.2.2 to 10.2.3) we don't worry much at all. We run through all our primaries, then all our secondaries the next day, then the OneClicks (OC) the day after that, followed by the Spectrum Report Manager (SRM) OCs and CABI.
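As a rough illustration of that staging, here is a minimal Python sketch. The stage names and day offsets come from the description above; the function name, the data layout and the example start date are my own assumptions. No day is stated for the SRM OCs and CABI, so their date is left unset rather than guessed.

```python
from datetime import date, timedelta

# Stage order as described in the post. Offsets are days after the start.
# The last stage has no stated day, so its offset is None (not guessed).
POINT_RELEASE_STAGES = [
    ("primaries", 0),
    ("secondaries", 1),
    ("OneClicks", 2),
    ("SRM OneClicks and CABI", None),
]


def point_release_schedule(start):
    """Return (stage, date-or-None) pairs for a rollout starting on `start`."""
    schedule = []
    for stage, offset in POINT_RELEASE_STAGES:
        when = start + timedelta(days=offset) if offset is not None else None
        schedule.append((stage, when))
    return schedule


# Example with an arbitrary start date.
for stage, when in point_release_schedule(date(2024, 1, 8)):
    label = when.isoformat() if when else "after OneClicks"
    print(label, stage)
```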
Now for major releases (like 10.2.3 to 10.3) we are more thorough and watch more closely. First we check with CA to see if we can have a short window to run in parallel. There were a couple of releases where this was not recommended at all (v8-9).
If we can run in parallel for a short time then:
We follow steps very similar to what you have above, but with a couple of changes.
For step 2 (OneClick upgrades), we will upgrade half our OneClicks.
Then we swap steps 3 and 4.
We bring up our upgraded primaries before stopping the secondaries. Some of our primaries can take an hour or more to start, and we can't afford to be blind that long. Once all the primaries are up and look good, we shut down the secondaries and continue the upgrades.
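The "primaries up before secondaries down" rule can be sketched as a simple gate. This is only an illustration in Python: `is_up` and `stop` are hypothetical callables you would supply yourself (they are not Spectrum commands), and the two-hour default timeout is an arbitrary assumption based on primaries taking an hour or more to start.

```python
import time


def wait_until_all_up(hosts, is_up, timeout_s=7200, poll_s=30,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until every host reports up; return the hosts still down at timeout."""
    deadline = clock() + timeout_s
    pending = list(hosts)
    while True:
        # Keep only the hosts that are still not up.
        pending = [h for h in pending if not is_up(h)]
        if not pending or clock() >= deadline:
            return pending
        sleep(poll_s)


def cut_over(primaries, secondaries, is_up, stop, **wait_kwargs):
    """Stop the secondaries only once all upgraded primaries are up."""
    still_down = wait_until_all_up(primaries, is_up, **wait_kwargs)
    if still_down:
        # Leave the secondaries running so monitoring is never blind.
        return False, still_down
    for host in secondaries:
        stop(host)
    return True, []
```

If any primary is still down when the timeout expires, nothing is stopped and the list of lagging primaries is returned so someone can look at them first.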
If we are told not to run in parallel then we segment our environment:
- We stop the Spectrum Report Manager OCs (they will catch back up after the upgrade).
- We update the Location Server parameters on the secondaries and half the OneClicks so that they point to our secondary Main Location Server (MLS) as their primary MLS.
- We update the .hostrc files so that the primaries cannot talk to the secondaries and the secondaries cannot talk to the primaries (assigning half the OCs to the primaries and half to the secondaries).
- At this point, we should be able to bring down the primaries. So we stop them and verify that the secondaries are running correctly.
- Now we upgrade the primaries and half of the OneClicks that support the primaries.
- Bring up the primaries and verify that we have a complete environment.
- Shut down the secondaries and the OneClicks that support the secondaries.
- Upgrade the secondaries and the rest of the OneClicks.
- Start up the full environment (minus the SRMs, which will be upgraded last).
- Verify that it all works well.
- Upgrade the SRMs and bring them up.
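The point of the segmented sequence above is that one side (primaries or secondaries) is always running. A small Python model of the steps makes that easy to check before touching anything. The step wording follows the list above; the component names, the True/False "running" flags and the `simulate` function are my own illustrative assumptions, not Spectrum terminology.

```python
# Each step: (description, {component: running?}). Steps that do not start
# or stop anything carry an empty dict.
SEGMENTED_STEPS = [
    ("stop SRM OneClicks", {"srm": False}),
    ("point secondaries and half the OneClicks at the secondary MLS", {}),
    ("edit .hostrc to isolate primaries from secondaries", {}),
    ("stop primaries", {"primaries": False}),
    ("upgrade primaries and their OneClicks", {}),
    ("start primaries", {"primaries": True}),
    ("stop secondaries and their OneClicks", {"secondaries": False}),
    ("upgrade secondaries and remaining OneClicks", {}),
    ("start full environment except SRMs", {"secondaries": True}),
    ("verify", {}),
    ("upgrade and start SRMs", {"srm": True}),
]


def simulate(steps):
    """Apply steps in order; report any step that leaves no side running."""
    running = {"primaries": True, "secondaries": True, "srm": True}
    blind_steps = []
    for description, changes in steps:
        running.update(changes)
        if not (running["primaries"] or running["secondaries"]):
            blind_steps.append(description)
    return running, blind_steps


final_state, blind = simulate(SEGMENTED_STEPS)
print("blind steps:", blind)
```

Reordering the list (for example, stopping the secondaries before the primaries are back) shows up as a non-empty `blind_steps` result.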
Note: most upgrades (point releases, and releases where we can run things in parallel) have usually gone very well. Major upgrades where we have to segment the environment usually have catastrophic failures, so we make sure we have CA available to respond quickly.
We made one BIG change for our most recent upgrade. For our upgrade from 9 to 10, we used CA's consultants to upgrade our environment. Not only did they do it 4 times faster than we have ever done it before, they also did it without any issues. Our customers praised us so much for that upgrade that I think our management will approve using CA consultants for future major upgrades (many thanks Karen, Don, Rob, Joe and all).
Also, realize that our environment is massive and has a large number of integration points, so major upgrades can be quite complex (55 fully redundant landscapes, >200,000 devices, home-built management modules, multiple CA-supported integrations and many home-built integrations). Less complex environments may not have the same problems with major upgrades that we have.