Data in SQL databases will become corrupted, eventually. Thankfully there are several mechanisms in SQL Server to protect against it, and warn you that there is a problem. One of the most difficult things to learn about, to train for, is database corruption. Most of us don’t have a set of corrupted databases that we can practice repairing. Learning how to repair them in the heat of the moment is stressful, and often poor decisions can be made. With that in mind, I’ve created some tutorials along with corrupted databases for you to practice on. Download the Northwnd.mdf and Northwnd.ldf, attach them to a SQL Server, and run DBCC CHECKDB on the database. You will see an error message that looks like this.
DBCC results for ‘NORTHWND’.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 1056009275703296 (type Unknown), page (32993:-2015636120). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 133129 and -1.
Msg 8928, Level 16, State 1, Line 1
Object ID 2137058649, index ID 3, partition ID 984479205752832, alloc unit ID 984479205752832 (type In-row data): Page (1:295) could not be processed. See other errors for details.
Msg 8980, Level 16, State 1, Line 1
Table error: Object ID 2137058649, index ID 3, partition ID 984479205752832, alloc unit ID 984479205752832 (type In-row data). Index node page (0:0), slot 0 refers to child page (1:295) and previous child (0:0), but they were not encountered.
CHECKDB found 0 allocation errors and 2 consistency errors in table ‘Suppliers’ (object ID 2137058649).
CHECKDB found 0 allocation errors and 3 consistency errors in database ‘NORTHWND’.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (NORTHWND).
At first glance, this looks like a horrible error message. You might be tempted to try recovering the database from a backup, or running DBCC CHECKDB(NORTHWND, REPAIR_REBUILD). But it wouldn’t be necessary in this case. The key to understanding how to solve database corruption problems is to understand what corruption is, and what the error messages are telling you.
What is corruption?
Database corruption is simply a sequence of ones and zeros that got written to a datafile where they should not have been. Sometimes this happens when a hard disk fails to physically perform as it should, a device driver encounters a bug, or some part of the IO subsystem faults. It doesn’t happen often, but when it does, the result can be a huge headache.
The Error Messages:
Let’s look at each of the error messages, and figure out what they mean.
Msg 8939 & 8928 – The job of physically reading the data, is done by the buffer pool. When a page of data is read by the buffer pool, the page’s checksum is checked. If it doesn’t match the computed value of the data, an error is thrown. CHCEKDB is telling us there is corruption, and look at the other error messages to see how to deal with it.
Msg 8980 – This message is telling us that there is an error in object ID 2137058649. I can take a look at sys.indexes to see what that object is.
SELECT object_id, name, index_id, type, type_description FROM sys.indexes WHERE object_id = 2137958649
object_id name index_id type
———– —————————– ———– —-
2137058649 PK_Suppliers 1 1
2137058649 CompanyName 2 2
2137058649 PostalCode 3 2
(3 row(s) affected)
So now we know that the problem lies in the PostalCode index for the Suppliers table. Lets run a query to verify this.
SELECT PostalCode FROM Suppliers
Msg 824, Level 24, State 2, Line 19
SQL Server detected a logical consistency-based I/O error: torn page (expected signature: 0x00000000; actual signature: 0x55555554). It occurred during a read of page (1:295) in database ID 5 at offset 0x0000000024e000 in file ‘C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\NORTHWND.MDF’. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Note that the torn page error message refers to page (1:295). This matches the error we saw in DBCC CHECKDB. Lets see what happens if we try to query the Suppliers table without using the PostalCode index.
SELECT * FROM Suppliers
No error messages are returned. Indexes are really just metadata, so rather than taking a drastic step like recovering from backup, we can just recreate the index.
DROP INDEX [PostalCode] ON [dbo].[Suppliers]
CREATE NONCLUSTERED INDEX [PostalCode] ON [dbo].[Suppliers]
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
After it completes, we can try querying the Suppliers table using the index.
SELECT PostalCode FROM Suppliers
You should now see a list of the postal codes from the table. Now lets run DBCC CHECKDB to see how it looks.
CHECKDB found 0 allocation errors and 0 consistency errors in database ‘NORTHWND’.
The purpose of this exercise is to remind you that corrupted databases will occur, but the first step to solving the problem is to understand what is corrupted.