Doing bus rescans to discover new LUNs

We discovered a serious issue when rescanning the SCSI bus on (Oracle/Red Hat) Linux to discover newly added LUNs and I thought I'd mention it here.

Our (old) Dell PowerEdge 2950 servers come with virtual media to allow mounting virtual floppy images (for driver disks, e.g.) and virtual CDs (great for ISOs). Dell adds these devices to the USB SCSI bus and therefore they show up as /dev/sda and /dev/sdb, i.e. before any regular (boot) media. While that makes sense, it creates a problem for us when we add new SAN LUNs to an existing Oracle database server because we need more ASM disks.

When doing the bus rescan
echo "- - -" > /sys/class/scsi_host/host3/scan
the virtual media get enumerated again as sda and sdb, which throws off the existing device mapping. Our boot LUN, which was sdb, is now sdd, a name that used to belong to an ASM disk. After initialization with ASM, my boot LUN gets wiped and things get ugly: no root device, read-only access, and upon reboot grub fails because the boot devices are lost. Enter panic mode!
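One way to catch a shift like this before anything gets initialized is to record which kernel name each persistent disk ID points to, both before and after the rescan, and compare the two. The following is only a sketch: snapshot_disk_names is a made-up helper name, and it assumes udev populates /dev/disk/by-id with symlinks, which later releases do but which I have not verified on OEL 4u5.

```shell
#!/bin/sh
# Hypothetical helper: print "<persistent-id> <kernel-name>" for every
# symlink in a by-id style directory, sorted so two runs can be diffed.
# Assumption: udev maintains /dev/disk/by-id on this system.
snapshot_disk_names() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (also covers an empty dir).
        [ -L "$link" ] || continue
        # The link target looks like ../../sdb; keep only the last part.
        printf '%s %s\n' "$(basename "$link")" "$(basename "$(readlink "$link")")"
    done | sort
}

# Intended use around a rescan (file names are arbitrary):
#   snapshot_disk_names > /tmp/disks.before
#   ... rescan ...
#   snapshot_disk_names > /tmp/disks.after
#   diff /tmp/disks.before /tmp/disks.after
```

Any line in the diff where a known ID now points at a different sdX name means the mapping moved, and nothing should be handed to ASM by its sdX name until that is sorted out.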

I can think of three possible work-arounds:
1) Rescan only the FC SCSI buses, i.e. echo "1 - -" or something
2) Use QLogic's HBA utilities to rescan and hopefully prevent the Dell virtual media from appearing
3) Use udev to prevent disk IDs from getting "lost", as udev IDs are unique, always, by design.
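For the first work-around, a rough sketch of what "rescan only the FC buses" could look like is below. It is untested on our servers and rests on an assumption: that the HBA driver registers its hosts under /sys/class/fc_host, which depends on the kernel and driver version. rescan_fc_hosts is a made-up helper name, and the optional argument is only there so the loop can be tried against a copy of the directory layout instead of the real /sys.

```shell
#!/bin/sh
# Hypothetical helper: write the wildcard scan request only to SCSI hosts
# that also appear as Fibre Channel hosts, leaving USB hosts alone.
# Assumption: FC HBAs are listed under <sysfs>/class/fc_host/hostN.
rescan_fc_hosts() {
    sysfs="${1:-/sys}"
    for fc in "$sysfs"/class/fc_host/host*; do
        # Skip the unexpanded glob when no FC hosts exist.
        [ -d "$fc" ] || continue
        host=$(basename "$fc")
        scan="$sysfs/class/scsi_host/$host/scan"
        if [ -w "$scan" ]; then
            # Same request as the manual rescan, but for this host only.
            echo "- - -" > "$scan"
            echo "rescanned $host"
        else
            echo "skipped $host (no writable scan file)" >&2
        fi
    done
}

# Intended use, as root:
#   rescan_fc_hosts
```

Listing the hosts first (ls /sys/class/fc_host) is a cheap way to confirm the assumption holds before trusting the loop.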

I'll let you know what happens... BTW, we are using OEL 4u5 x86_64 on our servers.
