Oracle: Unbreak my Linux Heart

Posts

Showing posts from August, 2008

SSH key-based attacks with rootkit

US-CERT is aware of active attacks against Linux-based computing infrastructures using compromised SSH keys. The attack appears to initially use stolen SSH keys to gain access to a system, and then uses local kernel exploits to gain root access. Once root access has been obtained, a rootkit known as "phalanx2" is installed. Read more at US-CERT.

Booting multipathed Linux using GRUB

As a side effect of digging into the Linux boot process, I suddenly realized that our boot process is not fault-tolerant! We have 100+ servers that boot from SAN using two Qlogic 2460 HBAs. We installed EMC PowerPath 5.0.0 on OEL 4u5 to get multipathing and automatic fail-over in case a path fails. Once it is up, the OS is pretty well off in case of hardware faults. However, the boot process is not! Not even close! In stage 1, the boot loader reads the MBR and loads stage 1.5 so it can read the /boot ext2 partition where the kernel and initrd image are located. In our case, this is /dev/sdb1. But what if my HBA dies and /dev/sdb1 doesn't exist? The boot loader may be smart enough to try a device using the other path, but probably not. Also, since the kernel hasn't loaded yet, PowerPath is not running and there is no multipath awareness to save the day... So once systems are installed and the number of disk partitions is stable, I can enhance GRUB with boot fallback systems, so it will try /dev/sdb1 and if that fails swi...
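The fallback idea can be sketched with GRUB legacy's "default" and "fallback" directives. The shell function below only prints a hypothetical grub.conf fragment. The device numbers (hd0,0) and (hd1,0), the titles, and the VERSION placeholders are assumptions for illustration, not values from our servers, and how the BIOS enumerates the boot LUN over the second HBA would have to be verified on real hardware.

```shell
#!/bin/sh
# Print a hypothetical GRUB legacy configuration with a fallback entry.
# Entries are numbered from 0: "default 0" is tried first, and
# "fallback 1" tells GRUB to try entry 1 if entry 0 fails to boot.
# Device numbers and VERSION placeholders are assumptions, not real values.
print_grub_fallback_conf() {
    cat <<'EOF'
default 0
fallback 1
timeout 5

title Linux (primary path)
    root (hd0,0)
    kernel /vmlinuz-VERSION ro root=LABEL=/
    initrd /initrd-VERSION.img

title Linux (secondary path)
    root (hd1,0)
    kernel /vmlinuz-VERSION ro root=LABEL=/
    initrd /initrd-VERSION.img
EOF
}
```

Calling print_grub_fallback_conf writes the fragment to standard output, so it can be reviewed before anything is merged into a real /boot/grub/grub.conf.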

Inside the Linux boot process

Just an interesting tidbit that I had been wondering about for a while: why is there a "boot stage 1.5" during Linux boot? I see a "stage 1" notification, then stage 1.5, but nothing else... How come? Well, IBM has the answer in "Inside the Linux boot process". Linux uses a two-stage boot process: stage 1 loads the boot loader, stage 2 loads the kernel. But to allow Linux to load the kernel from a native file system such as ext2 or ext3 (or Reiser, XFS, ZFS, etc.), GRUB introduces an additional stage 1.5 that understands those file systems. Stage 2 then still loads the kernel, only now the kernel can reside in a normal Linux partition instead of in raw disk sectors, as with LILO. Neat!

Oracle high profile at the Linuxworld

ZDnet reports that Oracle is going high profile at the Linuxworld conference, positioning itself as a world league player for Enterprise Linux... :yawn: But it made an interesting announcement: the introduction of the Oracle VM Template, pre-configured software stacks designed to reduce installation, configuration and maintenance costs by eliminating the need to install and test from the ground up. Always a big time saver, so I'm definitely going to look into this. I hope I can convert the Xen VMs to VMware format; I think that should be possible.

Viewing Linux partition headers of existing disks

After a boot disk crash, I wanted to make sure existing data disks were unaffected. We use Oracle ASM for all our bare metal database servers and in this case there wasn't a proper backup of the (development) database. ASM marks its disks as being part of an ASM disk group in the first block of the partition, so checking that block lets you verify that the data still exists. Upon reinstallation of the OS and ASM software, you can then reuse existing ASM disks and groups and restore the data. To verify the disks are still marked as ASM disk groups, check the first block of the partition: 'dd count=1 if=/dev/sdc1 | od -c', with sdc being our first data disk after the local disks (sda) and the boot LUN (sdb). Note the text string in capital letters on the third line of the output and the string " D G _ D A T A 1", naming the disk group. You can check the list of available partitions using 'cat /proc/partitions | less'.
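The manual check above can be wrapped in a small shell function for when there are many partitions to go through. This is a minimal sketch: the function name has_asm_marker is made up, and the default marker string ORCLDISK is an assumption about what the capital-letter string in the header looks like, so pass the string you actually see in your own od output if it differs.

```shell
#!/bin/sh
# has_asm_marker DEVICE [MARKER]
# Succeeds (exit status 0) if MARKER occurs in the first 512-byte block
# of DEVICE. The default marker "ORCLDISK" is an assumption; override it
# with whatever string your own header dump shows.
has_asm_marker() {
    dev=$1
    marker=${2:-ORCLDISK}
    # Read only the first block, drop NUL bytes, then search for the marker.
    dd if="$dev" bs=512 count=1 2>/dev/null \
        | tr -d '\000' \
        | LC_ALL=C grep -q "$marker"
}
```

For example, 'has_asm_marker /dev/sdc1 && echo "sdc1 still looks like an ASM disk"' (run as root) reports only on partitions whose first block contains the marker. The function reads the device and never writes to it.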

Troubleshooting VMware Tools on Linux

Had an issue with an attempted upgrade of VMware Tools using the supplied script. It failed miserably and left my VM without network and IP address. Going to the console, I tried to remove (rpm -e VMwareTools) or update (rpm -Uhv VMwareTools) the RPM, but the script had crippled the installation. HowToGeek had some helpful commands to manually remove most of the VMware Tools files from the file system, so a new clean installation could overwrite all existing traces.

up2date A socket error occurred: Timeout Exception

I think I finally managed to solve a really annoying error I get on occasion, although I still don't know exactly what causes it. When running yum or up2date to check for and/or update installed packages, I receive the error "up2date A socket error occurred: Timeout Exception", and /var/log/up2date shows a series of timeouts after which up2date just fails. Common attempts to fix it seem to be to check your system date (mine is perfectly synced using NTP), to update your profile on RHN/ULN, to clean the cache using 'yum clean all', or to clear the yum or up2date cache directories. None of these fixed it for me. What finally did fix it for me was to delete /var/spool/up2date ( sudo rm -rf /var/spool/up2date ) and to recreate that same directory again ( sudo mkdir /var/spool/up2date ).
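A slightly more cautious variation on that fix is to move the spool directory aside instead of deleting it, so the old contents can still be inspected afterwards. This is a sketch, not what I actually ran: the function name reset_spool_dir and the ".old" suffix are made up for illustration.

```shell
#!/bin/sh
# reset_spool_dir [DIR]
# Move DIR aside to DIR.old and recreate it empty.
# Defaults to /var/spool/up2date; refuses to run if DIR is missing
# or if a previous DIR.old is still in the way.
reset_spool_dir() {
    dir=${1:-/var/spool/up2date}
    backup="$dir.old"
    if [ ! -d "$dir" ]; then
        echo "reset_spool_dir: $dir is not a directory" >&2
        return 1
    fi
    if [ -e "$backup" ]; then
        echo "reset_spool_dir: $backup already exists" >&2
        return 1
    fi
    mv "$dir" "$backup" && mkdir "$dir"
}
```

Run it as root with no arguments to reset /var/spool/up2date, then remove /var/spool/up2date.old once up2date works again.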

MySQL and the Linux swap problem

Don MacAskill from Smugmug wrote an interesting article on a problem with Linux and single-app boxes, i.e. servers that run only one application, which consumes basically all available resources. It seems Linux will swap something out sooner or later, even when that means swapping the only application out to disk, so your performance will plummet. But you cannot disable the swap partition or go without one (according to their experiences, not mine). However, there is a simple, sweet "solution" or work-around: create a small swap disk on a RAM disk! Read more details and a howto in Don's article. Of course, the article is about MySQL. But Oracle RDBMS is also a single-use application, at least in our data center. And I have been getting reports of some Siebel apps showing weird performance problems. This may be it!
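Before blaming swap for those performance problems, it helps to check whether a box is actually using any. A minimal sketch that derives the amount of swap in use from the SwapTotal and SwapFree lines in /proc/meminfo follows; the function name swap_used_kb is made up, and the optional file argument exists only so the function can be tried against a saved copy of meminfo.

```shell
#!/bin/sh
# swap_used_kb [MEMINFO_FILE]
# Print the amount of swap in use, in kB, computed as SwapTotal - SwapFree.
# Reads /proc/meminfo unless another file in the same format is given.
swap_used_kb() {
    meminfo=${1:-/proc/meminfo}
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }' "$meminfo"
}
```

A value that keeps climbing while the database is the only thing running on the machine would be consistent with the behaviour Don describes, though it does not prove that swap is the cause.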