Friday, February 29, 2008

Really confused

Now I'm seeing a real interesting problem. The clock on the local system is drifting. In fact, it's running really slow (lost 6 hours in the past week or so). I'm kind of wondering if that's the issue here. Must investigate more.
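
To put a number on that drift, here's a quick back-of-the-envelope sketch. It assumes "a week or so" means exactly 7 days and that the loss was a flat 6 hours (both are my rough estimates from above), and the function name is just something I made up for illustration.

```python
def drift_ppm(lost_seconds, elapsed_seconds):
    """Return clock drift in parts per million (positive = clock runs slow)."""
    return lost_seconds / elapsed_seconds * 1_000_000

lost = 6 * 3600          # about 6 hours lost (estimate)
elapsed = 7 * 24 * 3600  # over about one week (estimate)

print(f"{drift_ppm(lost, elapsed):.0f} ppm, or {lost / elapsed:.1%} slow")
```

For comparison, ntpd will only slew a clock by up to 500 ppm, so a drift in this range is far outside what NTP could quietly correct.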

Tuesday, February 26, 2008

Maybe not

It's now looking like there's something more fundamentally wrong with the system. Just tried copying 2Gb of files from my old HP (with a 100Mbit connection) to the base system with a 1Gbit connection. The 1Gbit connection on Mindy took longer. That shouldn't be the case. Need to spend some time with the base system doing performance profiling to confirm what I'm thinking ... that I have a fundamental configuration problem with the system.

Friday, February 22, 2008

Closer ...

So, I'm getting closer, but still not there. I'm still seeing I/O performance problems. They tend to manifest themselves in alternating bursts of speed and slowness. I can transfer a 2Gb movie in about 8 minutes, which isn't bad, but it seems to move in jumps, so something still isn't right. Guess I need to make a couple of posts on the VMware community and 3ware sites.
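
One way to see the jumps instead of guessing at them is to time a copy chunk by chunk. This is only a rough sketch, not what I actually ran: the function names and the 32 MB chunk size are arbitrary choices of mine, and it assumes the destination is reachable as an ordinary mounted path.

```python
import time

def copy_with_timing(src, dst, chunk_size=32 * 1024 * 1024):
    """Copy src to dst in chunks; return a list of (bytes, seconds) per chunk."""
    samples = []
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            start = time.monotonic()
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            # flush() hands the data to the OS; it does not force it to disk,
            # so a slow chunk here means the write call itself stalled.
            fout.flush()
            samples.append((len(chunk), time.monotonic() - start))
    return samples

def report(samples):
    """Print MB/s for each chunk so the slow ones stand out."""
    for i, (nbytes, secs) in enumerate(samples):
        rate = nbytes / (1024 * 1024) / secs if secs > 0 else float("inf")
        print(f"chunk {i:4d}: {rate:8.1f} MB/s")
```

Calling `report(copy_with_timing(source_path, dest_path))` with real paths filled in prints one line per chunk; an even transfer should show roughly the same rate on every line.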

Monday, February 18, 2008

Promising so far ...

Ok, so I've only been playing with Mindy for about 24 hours, but this build is looking a bit more promising. Right now, I'm copying files to my file server, listening to music off my Roku (also from the same VM) and importing CD audio on a second VM and nothing seems to be bogging down like before.

I still don't seem to get the write performance that I would want from the network. It still seems to be writing in spurts (20-30 Mb straight and then a pause), but I think these can be fixed via filesystem fine tuning.

Once I have a good, working system, I'll post my complete settings.

Sunday, February 17, 2008

New (old) Band

Found a new favorite band (at least for now). Tremolo is a band featuring Justin Dillon (and others, though he's the one I know) from the old Dimestore Prophets from the late 90s. Well, the new album Love is the Greatest Revenge is available on iTunes and eMusic (where I got it). The album's growing on me and I'd tell everyone who knows DSP and / or likes good alternative-pop/rock/singersongwrit... well, just listen.

Yet Another Install Attempt

So, all my changes so far haven't resulted in a stable system. Well, stable yes, but performing ... absolutely not.

Now that I'm back from my business trip, it's time to try again. This time I had planned on using Ubuntu 6.06.2 LTS amd64 as it's supported (officially) by both the 3ware card and VMware Server. However, that seemed to not like something about my configuration. All I got was a hang after "Booting kernel ..." on the installer.

So, then I tried OpenSuSE 10.2 Community (the OS I was using before all this started). That booted, but it couldn't find the array.

Arrrgh!!!

So, I happened to have an Ubuntu Server 7.04 x86 32-bit CD and out of desperation, I tried that. An hour or so later, Mindy is back up and running. Now I just get to see if this solves my other problems. Sigh.

Saturday, February 09, 2008

Clock problems ....

No, not that kind of clock problem. I just finished rebooting Mindy with the "nohz=off" kernel setting and that appears to have solved at least 75% of my problem. I can now play music with my Roku, copy data to the server, download a podcast on iTunes and still hear basically unbroken audio notifications from my WinXP virtual machine over Remote Desktop. Granted, I have gigabit ethernet, but before, any 2 of those would cause problems. Will it stay? If so, I'll post a bit of information on what I did for those of you searching for the same problems I'm seeing.

Friday, February 08, 2008

More Mindy ...

So, I guess I was wrong. Turns out the issue wasn't what I thought it was. It appears that it may be unrelated to the 3ware card after all. Brought everything back up this morning and I was still seeing the problem (sigh). So, I dug into the VMware community forums (gotta love the whole interweb community thing) and found this thread: http://communities.vmware.com/thread/109891?tstart=30 Apparently the problem could be a slight change in the 2.6.22 kernel used in Ubuntu 7.10. Some people were reporting the same issue I'm seeing (intermittent freezing of the VMs). The suggestion was that it could be the new "tickless" kernel setting in Ubuntu 7.10. So, I've edited the Grub configuration and added "nohz=off" to the kernel line. Hopefully that'll help. If not, I may have to revert to Ubuntu 6.06 LTS. Sigh.
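
After a reboot it's worth confirming the option actually made it onto the running kernel's command line. Here's a minimal sketch of that check; the function names are mine, and it assumes a Linux system where /proc/cmdline is readable.

```python
def has_kernel_param(cmdline, param):
    """True if param (e.g. 'nohz=off') is one of the space-separated boot options."""
    return param in cmdline.split()

def running_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read().strip()

# Example use on the host itself:
#   print(has_kernel_param(running_cmdline(), "nohz=off"))
```

Splitting on whitespace rather than doing a plain substring search keeps a longer option that merely contains the same text from giving a false match.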

Mindy RAID!

So, I set about this weekend doing a long-needed upgrade to Mindy ... She was given the gift of
  • 3ware 9650SE 8-port SATA-II RAID card
  • 2 additional 500Gb Seagate Barracuda drives (well 1 for now, the other still contains my backup)
  • Replacing OpenSUSE 10.2 32bit with Ubuntu Server 7.10 64bit

Now, I picked the 3ware card because of the good reviews and good support for Linux. Also, I was looking for a reliable way to do RAID-5 on the system. A friend of mine scared me because he had a 50% failure rate on the same Western Digital drives I was using. (I'm still a WD fan, just not of these specific drives).

I assumed that when I launched into this project on Friday night it would be just a matter of bringing down the old system, backing up the VMs, installing the new hardware, reinstalling the software and off to the races.

Well, not quite ...

My first attempt seemed to be a complete failure. The I/O performance I was getting was terrible. It took 5 hours to copy 100Gb of virtual hard disks to the internal drive from a USB 2.0 external drive. In that same 5 hours before, I was able to copy all 500Gb of virtual hard disks. In addition, it appeared that the CPU utilization was tremendously high, so I knew something wasn't quite right.
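
To make the comparison concrete, here's the arithmetic as a small sketch. It assumes my "Gb" figures are gigabytes with 1024 MB each and that both runs really took a flat 5 hours; the helper name is just for illustration.

```python
def throughput_mb_s(gigabytes, hours):
    """Average throughput in MB/s for a copy of the given size and duration."""
    return gigabytes * 1024 / (hours * 3600)

new_setup = throughput_mb_s(100, 5)  # first attempt on the new array
old_setup = throughput_mb_s(500, 5)  # same 5 hours on the old system

print(f"new setup: {new_setup:.1f} MB/s, old setup: {old_setup:.1f} MB/s")
```

That works out to roughly 5.7 MB/s against 28.4 MB/s, so the new build was running at about a fifth of the old speed.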

It would appear that my first pass was a failure for the following reasons (though I'm not 100% sure on this):

  • I had to set my motherboard's video settings to use only the integrated video (my RAID card is PCI Express 4x)
  • Somehow in trying to enable write caching, the array had been left in a weird state (the 3ware BIOS Manager kept saying INIT ARRAY ... whatever that means)
  • I had used the wrong block size (64k instead of 256k)

So, after about 18 hours of copying data and mucking about, I decided it was time to cut my losses and try again, so I burned down the array and recreated it, this time enabling write caching and setting a 256k block size on the array.

This time, I seemed to have much better luck. I was able to copy the VMs at near correct speed (about an hour for the 100Gb of data, or something like that anyway) and I could do other things while the data was copying too. So it looked much more promising this time.

I finally got everything loaded and was ready to start copying my media (220Gb worth) from the virtual hard disk stored on my external drive to a newly minted 500Gb virtual drive on the array. Both drives were attached to Pesto, my Ubuntu 7.04 32-bit server.

This process took about 14 hours. Definitely not right, but hey, at least I could copy things, so it was progress. Unfortunately, the interactive performance of the system was still abysmal. After booting Bobby, my Windows XP VM, it was horrendously sluggish and it seemed that the drive array was in constant motion, something I hadn't seen in the old system.

After whining to 3ware and mucking about with system tuning, I realized 2 things:

  • I had VMware set to allow VMs to be swapped out
  • I had the "swappiness" of the Linux VM system too high ... or more specifically, I needed to instruct the kernel not to swap out until absolutely necessary by adding the following line to /etc/sysctl.conf: "vm.swappiness = 0"
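
To double check that the setting in the file and the value the kernel is actually using agree, something like the following sketch would do. The function names are mine, and it assumes a Linux host with the usual /proc/sys/vm/swappiness entry.

```python
def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict, skipping comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

def current_swappiness(path="/proc/sys/vm/swappiness"):
    """Read the live value from the running kernel (Linux only)."""
    with open(path) as f:
        return int(f.read())

# Example use on the host itself:
#   with open("/etc/sysctl.conf") as f:
#       wanted = parse_sysctl(f.read()).get("vm.swappiness")
#   print(wanted, current_swappiness())
```

If the two disagree, running "sysctl -p" reloads the file without needing a reboot.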

That seemed to help, but I was still seeing unacceptable file transfer times from my Linux server VM to any machine on the network. It got so bad at one point that it would take almost 4 hours to transfer a 2.2Gb ISO image from the VM to my local desktop.

More performance tuning ensued. This time, I took down all VMs and started with just the base system. Using that, the same 2.2Gb ISO file copied in 10 minutes ... much more like it. So it appeared that the problem wasn't with the hardware (thank goodness) and it wasn't with the base install, but was in the VM itself.

So, I took a stab and started rebuilding the VM. Though instead of doing a rebuild and reinstall, I just rebuilt the .vmx file. And that seemed to work. On restarting the VM, transfers were back up to speed and the I/O wait states seemed to be much smaller now. ("iostat -d 2 -x" is your friend).

Once I get a chance to figure out what's different between the old and new .vmx files, I'll post what I found. Or if I can't figure out what's changed I'll post the whole file to see if anyone can tell me what's the deal.

I really hope that this is the last issue I have. I'd like to add the 4th drive into the array and use the additional 400Gb of storage space to keep my lossless audio collection online.
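
For comparing the old and new .vmx files, a small script beats eyeballing them. This is only a sketch: it assumes the usual one-setting-per-line layout (key = "value", with # starting a comment), and the function names are made up for illustration.

```python
def parse_vmx(text):
    """Parse .vmx-style text (key = "value" per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip('"')
    return settings

def diff_vmx(old, new):
    """Return {key: (old_value, new_value)} for each differing key; None = missing."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}

# Example use with two files on disk:
#   with open("old.vmx") as a, open("new.vmx") as b:
#       for key, (was, now) in diff_vmx(parse_vmx(a.read()), parse_vmx(b.read())).items():
#           print(f"{key}: {was!r} -> {now!r}")
```

Because it compares by key rather than by line, settings that simply moved around in the file won't show up as differences.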

Thursday, February 07, 2008

Meet Mindy ...

So, meet Mindy. She's my VMware Server host that's become the cornerstone of my digital office / media / life. I started building her about a year ago after having a couple near hard drive failures on my older systems (morella, ultra60, firewall and plumb). Also, at about the same time, VMware started giving away their VMware Server product, so I started thinking seriously about server consolidation.

The end result was Mindy, an Intel Core2 Duo-based server running Linux and VMware. For you tech geeks, here are the "vitals"

  • Antec P180 case
  • Antec Earthwatts 500 watt power supply
  • Intel DG965RYCK motherboard
  • 4x Corsair 1Gb DDR2 RAM (originally 2x)
  • 2 Western Digital 500Gb SATA drives
  • OpenSuSE 10.2 32-bit Linux
  • VMware Server 1.0.4

Then running on top of it are a number of virtual machines, including (but not limited to):

  • Pesto: Ubuntu 7.04 server (file server, music server, Subversion server)
  • Bobby: Windows XP Professional (iTunes podcatcher, e-mail client, etc)
  • Morella: Slackware 8.1 (domain controller, imaged from the original machine)
  • DrPlotz: OpenBSD 4.0 SSH server

It took about a week to put together, but she's now an indispensable part of my life. And to make things even better, I took what was a 600 watt power budget (at least according to my UPS) down to 185 watts and took the wind tunnel sound of the old systems down to a much more manageable volume. She's still a bit noisy (need better fans and maybe a better power supply), but for now, it's way better.