We've run into some really strange I/O errors (QLogic QLE2460, firmware 1.24) using LUNs on our DMX-3 SAN. One HBA was faulty, so we replaced it. However, after restoring and reinstalling the OS, more problems appeared. The new HBA would not boot at all from the existing disks, so we disabled it in the BIOS and booted from the other (original) HBA. Both HBAs have the same firmware and the same settings.
Upon booting, anything involving the disks (we boot from SAN and have data disks there as well) is extremely sluggish. Letting the server do its thing, I got a ton of I/O errors, first during disk discovery and then again while mounting the file systems.
ERROR: ddf1: reading /dev/sdb[Input/output error]
ERROR: hpt37x: reading /dev/sdb[Input/output error]
ERROR: pdc: reading /dev/sdb[Input/output error]
ERROR: pdc: reading /dev/sdb[Input/output error]
ERROR: pdc: reading /dev/sdb[Input/output error]
ERROR: pdc: reading /dev/sdb[Input/output error]
ERROR: pdc: reading /dev/sdb[Input/output error]
ERROR: sil: reading /dev/sdb[Input/output error]
ERROR: ddf1: reading /dev/sdc[Input/output error]
ERROR: hpt37x: reading /dev/sdc[Input/output error]
ERROR: pdc: reading /dev/sdc[Input/output error]
ERROR: pdc: reading /dev/sdc[Input/output error]
ERROR: pdc: reading /dev/sdc[Input/output error]
ERROR: pdc: reading /dev/sdc[Input/output error]
ERROR: pdc: reading /dev/sdc[Input/output error]
ERROR: sil: reading /dev/sdc[Input/output error]
ERROR: ddf1: reading /dev/sdd[Input/output error]
ERROR: hpt37x: reading /dev/sdd[Input/output error]
ERROR: pdc: reading /dev/sdd[Input/output error]
ERROR: pdc: reading /dev/sdd[Input/output error]
ERROR: pdc: reading /dev/sdd[Input/output error]
ERROR: pdc: reading /dev/sdd[Input/output error]
ERROR: pdc: reading /dev/sdd[Input/output error]
ERROR: sil: reading /dev/sdd[Input/output error]
...
and so on for all disks (LUNs) attached.
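To make the pattern easier to see, the messages can be tallied per device and per prefix (the word before "reading"). This is only a minimal sketch, assuming the boot messages have been captured as plain text lines in the format shown above; `tally_errors` and `sample` are names made up for illustration and are not part of any tool.

```python
import re
from collections import Counter

# Matches lines such as: ERROR: pdc: reading /dev/sdb[Input/output error]
LINE_RE = re.compile(r"^ERROR: (\w+): reading (/dev/\w+)\[(.+)\]$")


def tally_errors(lines):
    """Count read errors per (device, prefix) pair. Hypothetical helper."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            prefix, device, _message = match.groups()
            counts[(device, prefix)] += 1
    return counts


# A few lines copied from the output above, used as sample input.
sample = [
    "ERROR: ddf1: reading /dev/sdb[Input/output error]",
    "ERROR: hpt37x: reading /dev/sdb[Input/output error]",
    "ERROR: pdc: reading /dev/sdb[Input/output error]",
    "ERROR: pdc: reading /dev/sdb[Input/output error]",
    "ERROR: sil: reading /dev/sdc[Input/output error]",
]

for (device, prefix), count in sorted(tally_errors(sample).items()):
    print(f"{device} {prefix}: {count}")
```

Lines that do not match the pattern (such as the "..." line) are simply skipped, so the whole captured boot log can be fed in unchanged.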
Searching the web gave me a few hits but no solutions (see 1|2|3). However, all of those errors were related to local RAID setups using ATA/SATA disks. I am not using local RAID. We have Dell PowerEdge 2950 servers with two QLE2460 HBAs. The internal PERC 5/i is enabled because it provides the swap disk space, but it doesn't do anything else. Furthermore, sdb, sdc and so on are SAN disks. So why do I get RAID errors from them? Could this point to motherboard errors? PCI bus errors? Broken FC cables? A bad FC switch configuration, or simply damaged LUNs from the SAN?