Feb 10

HyperVM login error: not_in_list_of_allowed_ip

Posted on Tuesday, February 10, 2009 in blog, hypervm, hypervm problem, kernel, linux, not_in_list_of_allowed_Ip, xen

We had a client who could not log in to his HyperVM control panel to make modifications to his virtual private server. The error he was getting was the following:

Alert: not_in_list_of_allowed_ip [xx.xx.xx.xx]

The “xx.xx.xx.xx” part is the IP of the client. This is easily fixed by clearing the allowed/blocked IP list with the command below, run on the main node. Replace the user.vm part with the client’s username at HyperVM (most of the time, something.vm).

/script/clearallowedblockedip --class=client --name=user.vm

It should return something like the following:

AllowedIp Sucessfully cleared for client:user.vm

Afterwards, the client (or you) should be able to log in with no problem at all.

Nov 24

AACRAID based controllers timing out / aborting / SCSI hang

Posted on Monday, November 24, 2008 in aacraid, adaptec, cpanel, grsecurity, kernel, linux, network, raid, scsi

Lately we’ve been using more Adaptec RAID controllers in place of 3ware RAID controllers. 3ware has been nothing but trouble for us: dropped hard drives, and even RAID5 arrays running slower than a regular hard drive with no RAID. Our latest issue was a server simply kernel panicking under high I/O. Our experience with 3ware RAID controllers and Linux has been terrible.

Adaptec, on the other hand, has been great. We’ve been using them for a while now and have seen no problems at all. There is just one small catch: the Linux SCSI subsystem usually has a timeout of 30 seconds, while the controller’s own timeout is 35 seconds, so Linux gives up on a command before the controller does. This usually brings a server to a halt for a couple of seconds (minutes in some cases) until the server recovers, and errors like these are thrown on the console:

aacraid: Host adapter abort request (0,1,3,0)
aacraid: Host adapter abort request (0,1,1,0)
aacraid: Host adapter abort request (0,1,2,0)
aacraid: Host adapter abort request (0,1,1,0)
aacraid: Host adapter abort request (0,1,2,0)
aacraid: Host adapter reset request. SCSI hang ?

What usually works best is to raise the Linux timeout above the controller’s 35 seconds, so that the Linux timeout does not occur before the RAID controller timeout. This is done per device / array:

echo '45' > /sys/block/sda/device/timeout
echo '45' > /sys/block/sdb/device/timeout
echo '45' > /sys/block/sdc/device/timeout
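
On a server with many arrays, typing one echo per device gets tedious. Here is a minimal sketch of a loop that does the same thing. The function name set_scsi_timeout and its optional second argument (an alternative sysfs root, handy for a dry run) are my own inventions for illustration; only the /sys/block/sdX/device/timeout path comes from the commands above.

```shell
# Sketch: write the same timeout to every sd* device found under sysfs.
# set_scsi_timeout and its second argument are hypothetical helpers.
set_scsi_timeout() {
    timeout="$1"
    sysroot="${2:-/sys/block}"
    for f in "$sysroot"/sd*/device/timeout; do
        # Skip an unexpanded glob or entries we cannot write to.
        [ -w "$f" ] || continue
        echo "$timeout" > "$f"
        echo "$f -> $(cat "$f")"
    done
}
```

Run as root, set_scsi_timeout 45 should have the same effect as the three echo lines. Values written to sysfs do not survive a reboot, so the call would need to be repeated at boot, for example from /etc/rc.local if your distribution uses it.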

This should be done for every device. 45 is a good number, but you can use whatever you’d like as long as it’s over 35. If you’re seeing load go sky-high for no apparent reason, this might very well be the cause. To check, you can run the following:

dmesg | grep aacraid
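
If the grep returns a lot of lines, a quick count can be easier to read. The following is a small sketch that tallies the two message types shown earlier; the function name aacraid_summary is made up for this example, and it only matches the exact message text quoted in this post.

```shell
# Sketch: read kernel log text on stdin and count aacraid aborts/resets.
# aacraid_summary is a hypothetical helper name.
aacraid_summary() {
    awk '
        /aacraid: Host adapter abort request/ { aborts++ }
        /aacraid: Host adapter reset request/ { resets++ }
        END { printf "aborts=%d resets=%d\n", aborts, resets }
    '
}
```

It would be used as dmesg | aacraid_summary. Anything other than zero for both counters suggests the timeout mismatch described above is worth looking at.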

If you see errors like the ones shown above, I suggest using that small workaround. If you’re still facing these problems even after applying it, here is the checklist that Adaptec suggests:

  • Check for any updated firmware for the motherboard, controller, targets and enclosure on the respective manufacturer’s web sites.
  • Check per-device queue depth in SYSFS to make sure it is reasonable.
  • Engage disk drive manufacturer’s technical support department to check through compatibility or drive class issues.
  • Engage enclosure manufacturer’s technical support department to check through compatibility issues.
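
For the queue depth item in the checklist, this is a rough sketch of how the per-device values could be listed next to the timeout. The function name show_queue_depth and its optional sysfs root argument are hypothetical. Adaptec’s checklist does not say what counts as “reasonable”, so the sketch only prints the values and leaves the judgment to you.

```shell
# Sketch: print queue depth and timeout for every sd* device.
# show_queue_depth and its argument are hypothetical helpers.
show_queue_depth() {
    sysroot="${1:-/sys/block}"
    for d in "$sysroot"/sd*; do
        # Skip devices that do not expose a queue_depth attribute.
        [ -r "$d/device/queue_depth" ] || continue
        depth=$(cat "$d/device/queue_depth")
        timeout=$(cat "$d/device/timeout" 2>/dev/null)
        echo "$(basename "$d") queue_depth=$depth timeout=$timeout"
    done
}
```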

Anyhow, as with every Linux issue, your mileage may vary, so if you know of any other fixes or have figured out another way to fix this, feel free to post it as a comment to help others.

Feb 18

vmsplice, belated.

Posted on Monday, February 18, 2008 in 0-day, grsecurity, kernel, linux, recompiling, vmsplice

I’m sure everyone who runs a server with any sort of remote user access has heard of this 0-day exploit (not so zero-day now, is it?). Sadly, I was one of those who heard of it only after a few servers kernel panicked. I had no clue what the cause was, but while going through the logs I noticed that LFD (props to ConfigServer for that useful software) had reported a suspicious process. I decided to look at what exactly it was and found an x.c file. A simple cat showed that it had been copied from milw0rm; I visited the related milw0rm page and discovered that I was screwed. Thankfully, all of our servers ran CentOS 5, so all that happened was a kernel panic instead of being rooted.

After reading a bit about it, I prepared for a kernel compile. I had read about grsecurity and how commonly it’s used, so I decided to add it to my kernel along with the patch for the vmsplice exploit. Sorry, 64-bit folks: I only did it for i386/i686, even though all of our hardware is 64-bit, but I won’t start ranting about the software we use and how terrible its 64-bit support is. I have uploaded the kernel with grsecurity added and the patch for vmsplice here: linux-2.6.22.9-grsec-i686.tar.gz

I must admit, all of our servers are pretty fast and powerful; our lowest-spec server has 2x Intel Dual-Core 2.33GHz Woodcrest CPUs, 4GB of RAM and 2TB of total storage. Even with these specifications, a kernel compile took around 15-20 minutes. That may be fast compared to other servers, and thankfully so, because I had to compile twice: when Linux says 4GB memory limit, it really means 2.5GB. I had to set the limit to 64GB so the server could see all of its memory.

Now, the instructions look complicated at the beginning, but by the end they are very straightforward. We’ll download the kernel first; you’ll then either use the .config file that I used or your current kernel’s .config file, followed by a few make commands. I’ll also show how you can set up grub to fall back to your stable kernel if the new one crashes or something doesn’t work out.

mkdir /usr/src/linux
cd /usr/src/linux
wget http://files.momotonic.com/linux-2.6.22.9-grsec-i686.tar.gz
tar -xvzf linux-2.6.22.9-grsec-i686.tar.gz

The extraction should take a good few minutes, as it takes a lot of files to make all of these 1’s and 0’s. Once the files are extracted, those who want to use my configuration file can just stick with it. If you have more than 2-3GB of RAM, though, I’d suggest typing make menuconfig (while in the linux-2.6.22.9 directory). Use your arrow keys to go down to “Processor type and features --->” and press Enter, then go to “High Memory Support (4GB) --->” and press Enter, select “( ) 64GB” and press Enter. Then use your right arrow key to move to Exit in the bottom menu; do that twice until it asks you to save the new kernel configuration, and pick Yes (duh!).

If you would rather keep using your current configuration, copy it with this command: cp /boot/config-`uname -r` .config (run that command in the linux-2.6.22.9 directory, of course; the file has to be named .config for the build to pick it up). You can modify a few other options, and then we go to the longest process, compiling!
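
Since picking the wrong memory limit cost me a second 15-20 minute compile, it may be worth confirming what the .config actually says before building. This is a minimal sketch, assuming the 32-bit x86 option names CONFIG_NOHIGHMEM, CONFIG_HIGHMEM4G and CONFIG_HIGHMEM64G; the function name highmem_mode is made up for the example.

```shell
# Sketch: report which High Memory Support mode a kernel .config selects.
# highmem_mode is a hypothetical helper; it defaults to ./.config.
highmem_mode() {
    cfg="${1:-.config}"
    if grep -q '^CONFIG_HIGHMEM64G=y' "$cfg"; then
        echo "64GB"
    elif grep -q '^CONFIG_HIGHMEM4G=y' "$cfg"; then
        echo "4GB"
    elif grep -q '^CONFIG_NOHIGHMEM=y' "$cfg"; then
        echo "off"
    else
        echo "unknown"
    fi
}
```

Running highmem_mode inside the linux-2.6.22.9 directory should print 64GB if the menuconfig change was saved. With that confirmed, the build itself is just the usual commands.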

make
make modules_install
make install

Now, I will cover how to do a fail-safe reboot, so that if the new kernel does not work out, the server is automatically switched back to the old kernel and everything will be like before. I have never worked with lilo and I’m not sure whether you can do this with it, so I will only cover instructions for grub. First, we’ll modify our /boot/grub/grub.conf file and change the existing options to the following ones:

default=saved
fallback=1

Then we’ll need to modify the entry for the new CentOS kernel and add the savedefault fallback line, so it looks something like the following:

title CentOS (2.6.22.9-grsec)
root (hd0,0)
kernel /boot/vmlinuz-2.6.22.9-grsec ro root=LABEL=/
initrd /boot/initrd-2.6.22.9-grsec.img
savedefault fallback
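
A typo in grub.conf is the last thing you want on a remote server, so here is a rough sketch of a sanity check for the three settings this setup relies on. The function name check_grub_fallback is hypothetical, and the check only looks for the lines shown above; it does not validate the rest of the file or confirm that your old kernel really is entry 1.

```shell
# Sketch: verify grub.conf contains the fallback-related lines.
# check_grub_fallback is a hypothetical helper; returns 0 if all are found.
check_grub_fallback() {
    conf="${1:-/boot/grub/grub.conf}"
    missing=0
    if ! grep -Eq '^[[:space:]]*default[[:space:]=]+saved[[:space:]]*$' "$conf"; then
        echo "missing: default=saved"; missing=1
    fi
    if ! grep -Eq '^[[:space:]]*fallback[[:space:]=]+[0-9]+' "$conf"; then
        echo "missing: fallback=N"; missing=1
    fi
    if ! grep -Eq '^[[:space:]]*savedefault[[:space:]]+fallback' "$conf"; then
        echo "missing: savedefault fallback"; missing=1
    fi
    if [ "$missing" -eq 0 ]; then
        echo "fallback settings found"
    fi
    return "$missing"
}
```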

Now, we’ll tell grub to boot the new kernel as the default entry for the next boot only. In this setup, if the system does not boot up in a reasonable time, you can call your data center (or use whatever other method you have) to reboot the server, and it’ll boot your previous kernel that works fine. You can then take a look at your logs (mostly /var/log/messages) for the reason why it didn’t boot. We will need to run this command before rebooting:

echo "savedefault --default=0 --once" | grub --batch

Now you can safely reboot your server. I won’t even mention how to do that, because if you do not know how, I don’t think you should really be messing with kernels! Once you have rebooted, you can try re-running the exploit from milw0rm, but I won’t explain how to do that, as evil minds would go and start using it.
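
Once the server is back, it is worth checking which kernel you actually landed on, since a silent fallback looks just like a successful boot from the outside. A small sketch follows; the function name booted_expected_kernel is made up, and I am assuming the release string is 2.6.22.9-grsec to match the grub title above, which may differ on your build.

```shell
# Sketch: compare the running kernel release with the one we expected.
# booted_expected_kernel is a hypothetical helper; the second argument
# overrides `uname -r` and exists only to make the check easy to try out.
booted_expected_kernel() {
    expected="$1"
    running="${2:-$(uname -r)}"
    if [ "$running" = "$expected" ]; then
        echo "running new kernel: $running"
    else
        echo "running $running, expected $expected (fallback may have kicked in)"
        return 1
    fi
}
```

It would be called as booted_expected_kernel 2.6.22.9-grsec after the reboot.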

Until then, I’m back to recompiling the kernels with the 64GB memory limit, because the 4GB limit is actually 2.5GB!