Title: CA Clarity Tuesday Tip: SQL Server Bug# 50003815 Causes DB outage

Discussion created by Shawn_Moore Employee on Nov 2, 2011
Latest reply on Nov 2, 2011 by Chris_Hackett

CA Clarity Tuesday Tip by Shawn Moore, Sr. Principal Support Engineer for November 01, 2011.

Thanks to Stephen Riley, Solution Integration Architect, for the content of this tip!

Microsoft SQL Server Bug# 50003815 can make Clarity unusable because SQL Server "hangs". End users will see that Clarity is hung or not processing requests, and any Clarity jobs running at the time will fail. This bug is present in SQL Server 2005 SP2/SP3 and SQL Server 2008 (see the link below for the exact cumulative updates affected by this problem).
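
As a rough first check of whether an instance falls in the version range named above, something like the following Python sketch may help. It assumes you have already read the two values that SQL Server returns for SERVERPROPERTY('ProductVersion') and SERVERPROPERTY('ProductLevel'). The function name is hypothetical, and treating only 10.0.x builds (not 2008 R2) as "SQL Server 2008" is an assumption. It does not replace checking the exact cumulative updates in the Microsoft article.

```python
def possibly_affected(product_version: str, product_level: str) -> bool:
    """Coarse check against the versions named in this tip.

    product_version: e.g. the string from SERVERPROPERTY('ProductVersion')
    product_level:   e.g. the string from SERVERPROPERTY('ProductLevel')

    This only looks at major version and service pack level. It cannot
    tell whether a fixing cumulative update is already installed.
    """
    parts = product_version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    level = product_level.strip().upper()

    if major == 9:
        # Major version 9 is SQL Server 2005; the tip names SP2 and SP3.
        return level in ("SP2", "SP3")
    if major == 10 and minor == 0:
        # 10.0.x is SQL Server 2008. Assumption: 2008 R2 (10.50.x) is
        # not covered by this tip, so it is not flagged here.
        return True
    return False
```

A result of True only means the instance deserves a closer look against the cumulative update list; a result of False does not prove the instance is safe.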


Symptoms:

1) Database services may show restarts during the day (in the field, this has been observed up to 2-3 times a day).
2) There will be no evidence in the logs of a graceful shutdown, just a startup.
3) This problem will cause SQL Server to stop responding during database backups and possibly during other periods of high utilization.
4) The impact of a DB restart on Clarity is a service interruption, i.e., interruption to end users, interruption to running jobs, and instability.
5) The problem may be more likely on 64-bit SQL Server 2005 SP3 configurations.
6) On a Microsoft active/active cluster, this problem may cause the cluster to fail over as well. In the Windows event logs you will see cluster failure errors; in the bg and app logs you will see SQL errors. In a nutshell, if this bug hits, SQL Server becomes unresponsive. The cluster pings SQL Server every 5 seconds or so. If the cluster finds that SQL Server is non-responsive, it fails over from node A to node B or vice versa, which causes a SQL Server restart, and all databases are started on the secondary node.
7) Windows event logs will also show evidence of the cluster health check failures, communication link failures, and SQL Server restarts.
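
To compare what you see against the restart frequency in symptom 1, a small Python sketch like this can count startups per day. It assumes you have already collected the SQL Server startup timestamps from your logs by whatever means you prefer; the function names and the threshold of 2 are hypothetical choices for illustration, not values from Microsoft or CA.

```python
from collections import Counter
from datetime import datetime


def restarts_per_day(startup_times):
    """Count how many startups fall on each calendar day."""
    return Counter(ts.date() for ts in startup_times)


def days_with_repeated_restarts(startup_times, threshold=2):
    """Return the days (sorted) with at least `threshold` startups.

    threshold=2 is an arbitrary illustrative default.
    """
    counts = restarts_per_day(startup_times)
    return sorted(day for day, n in counts.items() if n >= threshold)


# Example with made-up timestamps, purely for illustration.
sample = [
    datetime(2011, 10, 31, 2, 15),
    datetime(2011, 10, 31, 14, 40),
    datetime(2011, 11, 1, 2, 10),
]
suspect_days = days_with_repeated_restarts(sample)
```

Days that show up in the result are worth cross-checking against backup windows and cluster failover events in the Windows event logs.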

Resolution:

a) If you are running Service Pack 3, you can upgrade to Service Pack 4.
b) See the link below for additional cumulative updates that address this issue.


Reference:

This defect has been documented on Microsoft's site:
http://support.microsoft.com/kb/960543

-shawn
