We are attempting to identify the cause of regular data_engine restarts. We had a big issue back in April where things backed up to the point that huge save files were created and the bulk inserts were always failing. Moving the save files out of the way restored normal operation, but we never got to the bottom of the issue. It was suggested that we look at the SQL logs for the reason the bulk inserts fail, but we did not find anything useful there.
The database is currently 53 GB and will auto-grow in 200 MB chunks.
We had another instance on the same system last week where things were starting to back up and save files were being created. We looked at the report_engine, data_engine, and variable_server logs to see if we could spot database-related issues. In the data_engine log I see a regular pattern of “SubscribeCallback - disconnected from hub” followed by an “Admin – done” message. The admin times are anywhere from 2 to 3 hours. Is this one reason the attach queue would back up while administration tasks are performed? When are "save files" created? We also see the data_engine issuing “WARNING: more than 301 seconds since last QoS message; the data_engine will restart” messages, and the probe is starting and stopping at 10-minute intervals. Is this normal operation when admin tasks are performed? I have attached the logs in an Excel file.
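For anyone who wants to check for the same pattern in their own data_engine log, here is a rough Python sketch of how the disconnect and admin-done messages could be paired up to measure each admin window. It is only an illustration: the "YYYY-MM-DD HH:MM:SS" timestamp prefix, the function name admin_windows, and the substring matching on the message text are my assumptions, so the format string will likely need adjusting to match the real log layout.

```python
from datetime import datetime

# Assumption: every log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp.
# Adjust TS_FORMAT and TS_LEN to match the actual data_engine log layout.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_LEN = 19

DISCONNECT_TEXT = "disconnected from hub"


def admin_windows(lines):
    """Pair each hub disconnect with the next admin-done message.

    Returns a list of (start, end, duration) tuples, where duration is a
    datetime.timedelta. The function name is hypothetical.
    """
    windows = []
    start = None
    for line in lines:
        try:
            ts = datetime.strptime(line[:TS_LEN], TS_FORMAT)
        except ValueError:
            continue  # skip lines without a parsable timestamp
        if DISCONNECT_TEXT in line and start is None:
            start = ts
        elif "Admin" in line and "done" in line and start is not None:
            windows.append((start, ts, ts - start))
            start = None
    return windows


if __name__ == "__main__":
    # Made-up sample lines, only to show the expected input shape.
    sample = [
        "2013-06-01 01:00:00 data_engine: SubscribeCallback - disconnected from hub",
        "2013-06-01 03:30:00 data_engine: Admin - done",
    ]
    for begin, end, duration in admin_windows(sample):
        print(begin, end, duration)
```

Comparing these windows against the times the restart warnings and save files appear should show whether the backlog lines up with the admin tasks.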
What are common reasons for bulk insert failures? We were speculating that the free space in the database was not large enough and SQL would not auto-grow as a result of a bulk insert attempt. We cannot seem to find a cause of this error condition.
I have seen other posts regarding data_engine performance issues and it may be possible that we need to tune this system based on some of the comments made.
Any help is appreciated,