We've struggled with the RAID controller in our database server, a Lenovo ThinkServer RD120. It is a rebranded Adaptec that Lenovo / IBM dubs the ServeRAID 8k.
We have patched this ServeRAID 8k up to the very latest and greatest:
- RAID bios version
- RAID backplane bios version
- Windows Server 2008 driver
This RAID controller has had multiple critical BIOS updates even in the short 4 month time we've owned it, and the change history is just.. well, scary.
We've tried both write-back and write-through strategies on the logical RAID drives. We still get intermittent I/O errors under heavy disk activity. They are not common, but serious when they happen, as they cause SQL Server 2008 I/O timeouts and sometimes failure of SQL connection pools.
We were at the end of our rope troubleshooting this problem. Short of hardcore stuff like replacing the entire server, or replacing the RAID hardware, we were getting desperate.
When I first got the server, I had a problem where drive bay #6 wasn't recognized. Switching out hard drives to a different brand, strangely, fixed this -- and updating the RAID BIOS (for the first of many times) fixed it permanently, so I was able to use the original "incompatible" drive in bay 6. On a hunch, I began to assume that the Western Digital SATA hard drives I chose were somehow incompatible with the ServeRAID 8k controller.
Buying 6 new hard drives was one of the cheaper options on the table, so I went for 6 Hitachi (aka IBM, aka Lenovo) hard drives under the theory that an IBM/Lenovo RAID controller is more likely to work with the drives it's typically sold with.
Looks like that hunch paid off -- we've been through three of our heaviest load days (mon,tue,wed) without a single I/O error of any kind. Prior to this we regularly had at least one I/O "event" in this time frame. It sure looks like switching brands of hard drive has fixed our intermittent RAID I/O problems!
While I understand that IBM/Lenovo probably tests their RAID controller exclusively with their own brand of hard drives, I'm disturbed that a RAID controller would have such subtle I/O problems with particular brands of hard drives.
So my question is, is this sort of SATA drive incompatibility common with RAID controllers? Are there some brands of drives that work better than others, or are "validated" against particular RAID controller? I had sort of assumed that all commodity SATA hard drives were alike and would work reasonably well in any given RAID controller (of sufficient quality).