9

My environment is the following: VMWare 5.5 vitalized server MS Windows Server 2008R2 Enterprise domain and SQL Server 2008 R2 Enterprise. Centralized storage with fibre-channel connection.

I have partitions in my SQL Server DB. I have 2 file groups: one with live data (FG1), second with historical data (HDG).

The second file group is read-only. Each month I make movement in partitions - I add new data (from previous month) to historical data. This process is automatic.

We moved our database to a new server. Initially, I had to do the process manually. During this operation my mirror breaks down (after operation 3 - see process flow bellow) with the following error:

ON PRINCIPAL SERVER:

ROW 0 in LOG:

Date        15.6.2015 20:54:11
Log     SQL Server (Current - 16.6.2015 07:55:00)

Source      spid84

Message
Setting database option MULTI_USER to ON for database MYDB.

ROW 1 in LOG:

Date        15.6.2015 20:54:11
Log     SQL Server (Current - 16.6.2015 07:55:00)

Source      spid18s

Message
Error: 1453, Severity: 16, State: 1.

ROW 2 in LOG:

Date        15.6.2015 20:54:11
Log     SQL Server (Current - 16.6.2015 07:55:00)

Source      spid18s

Message
'TCP://10.201.27.154:5022', the remote mirroring partner for database 'MYDB', encountered error 823, status 3, severity 24. Database mirroring has been suspended.  Resolve the error on the remote server and resume mirroring, or remove mirroring and re-establish the mirror server instance.

REMARK: I executed this operation on the old server many times automatically and I never experience such error.

ON MIRROR SERVER:

ROW 1 in LOG:

Date        15.6.2015 20:54:11
Log     SQL Server (Archive #3 - 15.6.2015 21:33:00)

Source      spid17s

Message
Error: 823, Severity: 24, State: 3.

ROW 2 in LOG:

Date        15.6.2015 20:54:11
Log     SQL Server (Archive #3 - 15.6.2015 21:33:00)

Source      spid17s

Message
The operating system returned error 5(Access is denied.) to SQL Server during a write at offset 0000000000000000 in file 'e:\Databases\MYDB_HISTRICAL.ndf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

MY PROCESS IS FOLLOWING:

1. I make several backups of the Database (Full, File Group and TLog backups).

2. I set DB to RESTRICTED_USER (to allow removing read-only of historical file group flag by script).

2a. I remove READ-ONLY flag of my Historical File Group.

3. I set DB to MULTI_USER to allow normal operation of our software.

4. I update partitions so the data is moved the the historical file group.

5. I repeat steps 2, 2a and 3 in so I can set historical file group READ ONLY again.

6. I make backups again.

Does anyone have idea why I receive that error?

EDIT: We receive the same issue during the different phase of the procedure. This is the only situation in which mirror breaks down so I suppose the problem is inside the procedure, but I can not figure out why!

6
  • Error: 823, Severity: 24 seems hardware problem. Check your DISKS to see if they have gone bad. Run checkdb on the databases to make sure they come clean.
    – Kin Shah
    Commented Jun 26, 2015 at 12:16
  • I am not sure @Kin. We have brand new optical attached specialized IBM storage. It operate from about 3 months. And this was the only time we receive such error. Actually there are about 10 rows with that error, but they all happened during that time period. We destroy mirror and create it again. We have issue to remove the mirror. So we remove it manually. Commented Jun 26, 2015 at 14:03
  • Error 823 with sev 24 is a hardware problem. Are you doing file level backups instead of native sql server backups or is any antivirus software running on the server ? You should put sql agent alerts to alert you when 823 error occurs - this script will help you. Also, 823 is a nasty error to get – it says that an I/O operation failed at the OS level and the I/O subsystem is causing corruption - sql server did not do page checsum
    – Kin Shah
    Commented Jun 26, 2015 at 14:13
  • We do both type of backups, @Kin. We also have VmWare replication to a remote host. The thing that I noticed until I wrote an answer to you is that we can not destroy mirror in normal way. The file was locked and we need to stop SQL service and to move the db files into another directory. From that moment everything is fine (I check logs using sys.xp_readerrorlog). Another thought is if a VmWare replication take place in that very moment, but I am not sure how this will affect process (I little know about VmWare). Commented Jun 26, 2015 at 14:54
  • We do both type of backups that might be a problem. VM snapshots should not be used as an alternative to native sql server backups.
    – Kin Shah
    Commented Jun 26, 2015 at 15:19

1 Answer 1

0

We found the issue. It is a bug in SQL Server. When we set READ_WRITE the command is not transfeered properly into the mirror DB. When script start change partitions on the mirror server an error occured. After that the syncronization is ruined and the DB on the mirror is locked (in suspended state).

We fix the issue by updating SQL Server to the latest version (our initial version was wiout SP).

Not the answer you're looking for? Browse other questions tagged or ask your own question.