News from “The Lab”… part 2

Ok, finally my DL380 G4 (I talked about it here) is up and running, with VMware ESXi 4.1 Update 3.

Strictly from a hardware point of view, this server is still a good machine: it has 8GB of RAM and six 36GB 15,000rpm SCSI hard drives, three of which I disconnected to lower power consumption, which now sits at around 290W.

From a software point of view, I had several options. My first idea was to install ESXi and then run one or more virtual machines depending on what I needed at the moment. As the current VMware ESXi release, 5.5, won’t run on this hardware (due to the lack of VT support), I had to use an earlier version. The latest officially supported version should be 3.5 Update 5, but several sites report that 4.1 Update 3, although stated as “not supported on this hardware”, runs without any problem, at least as long as you don’t run 64-bit guests. Another option was to install a bare Linux system and use VMware Player to run the different machines, but I discarded this solution because I wanted the opportunity to test ESXi.

Installing ESXi was a simple task, except for the HP ProLiant Support Pack for VMware (at the moment I’m not sure whether it’s fully up and running with the whole option pack).

Now I’m thinking of adding a virtual appliance based on TurnKey Linux 64, and I’ll soon find out whether the 64-bit guest limitation is real.

Update: I loaded TurnKey FileServer 64-bit edition in OVF format. Its deployment was successful, but the virtual machine refused to power on. A quick look at the event log showed that 64-bit guest support is disabled on this machine: the two provided Xeons are Nocona-core based and offer the EM64T instruction set extension, but not the VT-x virtualization extensions ESXi needs to run 64-bit guests. So in the end I had to switch back to a 32-bit edition of my appliance, but that shouldn’t be a problem.


Solve Windows XP svchost.exe 100% CPU load

Today I think I found the perfect solution to the Windows XP svchost.exe 100% CPU load problem.

First, a step back to explain what this problem is. I use some XP virtual machines (but this problem applies to real machines too) to develop and test software. A couple of them (one running in VMware Player, the other in Virtual PC) suddenly slowed down to an impractical speed. “What the … is happening now” – I thought….

Running Task Manager showed an svchost.exe process taking up 100% of the CPU time. That was really suspicious, so I had to investigate further. I used Process Explorer from Sysinternals (now part of Microsoft) to find out what that process was and what it was doing to my poor machine.

svchost.exe is a part of XP (and other Microsoft NT-like OSes) that takes care of running different processes as services, so it’s essentially a kind of srvany launcher (srvany being a commonly used program to start an arbitrary executable as an NT service). But why was it taking up so much CPU time? To cut the long debugging story short: I stopped, one at a time, the services hosted by the faulty copy of svchost (tasklist /svc from a command prompt shows which services each svchost instance hosts), and found that the Windows Update service was responsible for eating all of the CPU time.

Ok, once I had found where the problem was, I had to find a solution for it. After googling around a bit, I found that this is a widely known problem affecting XP with Internet Explorer 8, but the solution was not so obvious. Some sites suggested clearing the downloaded patch folder located in %WINDIR%\SoftwareDistribution, deleting it after the service has been stopped, but that didn’t fix the problem.

The final solution that worked for me in both my virtual machines was:

  • stop the Windows Update service from the management console or the command line (net stop wuauserv)
  • start Internet Explorer and connect to
  • search for KB2898785, which is the Cumulative Security Update for Internet Explorer 8 for Windows XP
  • download the executable
  • close Internet Explorer
  • launch the patch and wait for the installation to finish (you will be warned that the system has to restart)

After the restart, you will be able to connect to Microsoft Update or Windows Update to download the other updates, and svchost.exe should no longer take up the whole CPU time.

Running commands on multiple Linux machines

The gsh software was born to run a command in parallel on multiple Linux machines.

Honestly, I did not test it, but I’m posting this link on my blog as a future reference; it may become useful one day…
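Although I haven’t tried gsh itself, the basic idea can be sketched with a plain shell loop; the host names and the uptime command below are placeholders, and passwordless ssh to each host is assumed:

```shell
#!/bin/sh
# Run one command on several machines in parallel, one output file per host.
# HOSTS and RSH can be overridden from the environment.
HOSTS="${HOSTS:-web1 web2 db1}"   # placeholder host names
RSH="${RSH:-ssh}"                 # remote shell to use

for h in $HOSTS; do
    $RSH "$h" uptime > "out.$h" 2>&1 &
done
wait    # wait for all background jobs before reading the results
```

Presumably gsh adds host grouping and nicer output collection on top of something like this.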

Windows Server 2008 R2 backup (wbadmin)

It’s been a long time since I last wrote… Many things have happened and many more are coming…

This is just a small note to record a couple of interesting links about troubles you may have with the Windows Server 2008 R2 backup utility (wbadmin). This blog article is a good starting point to get an idea of the problem.

If you try to back up your data to a network location hosted on a Linux-based system (though I think this problem can also occur elsewhere), wbadmin could fail with the following error:

The requested operation could not be completed
due to a file system limitation.

This happens only if you try to back up some directories and not the entire disk(s). Why? It is related to how Linux manages sparse files, which are required for such a backup. This link explains the problem in detail and a way to solve it, assuming you can modify the Samba configuration on the target machine. If you cannot modify the configuration, you have to back up the entire disk instead of single directories.
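I have not tested it myself, but the workaround usually quoted for this error is to force Samba to allocate real disk blocks instead of sparse files, via the strict allocate option in smb.conf; the share name and path below are just placeholders:

```
; smb.conf fragment - share name and path are placeholders
[backup]
   path = /srv/backup
   read only = no
   ; allocate real blocks instead of sparse files, so wbadmin's
   ; sparse-file requests don't hit the file system limitation
   strict allocate = yes
```

Note that strict allocate makes writes slower on large files, since the space is really allocated up front.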

More information on sparse files can be found in the documentation of the fsutil command.

Actualizing A-GPS data when service is broken…

Since May 15th the Long Term Orbit (LTO) data provided by Global Locate has been missing some files, so my iPaq cannot update its A-GPS data using the Quick GPS Connection software.

I’ve just sent an email to Global Locate to report this problem, and I’m waiting for an answer. In the meantime I tried to fix the problem manually.

If you check, you can see that at some point the directory names changed: from hh:mm:ss.nnnnnn to hh:mm:ss. The directory contents changed too: the svstatus.txt file is missing in the new directories, and it is the first file the Quick GPS Connection software downloads.

So I manually downloaded the lto.dat file from directory, and via ActiveSync I copied it into \Application Data\Global Locate\Gpsct on the iPaq. Now Quick GPS Connection says the data is valid and will expire in 6 days and 20 hours.

I’ll wait for an answer from Global Locate to see whether the problem gets fixed on the server side.

June 16th UPDATE

It seems that the service has been restored. Today I connected my iPaq and it successfully downloaded the data on its own. Checking the directory contents showed that the missing svstatus.txt file has been restored in directory, and in all subsequent directories as well.

rsync to a non-standard port

It’s been a long time since I last wrote on my blog… so here is just a simple post to help me remember a non-standard rsync command syntax.

rsync is a powerful synchronization tool that can perform a fast and secure backup of files and folders to a remote server. Usually rsync uses ssh to connect to the remote server, so transfers are secure. But rsync relies on the standard ssh configuration to reach that server. What happens if you changed, for instance, the ssh daemon’s listening port?

rsync has a command line option to specify which remote shell to use to connect to the server; this way you can specify the non-standard port. Say you want to transfer all *.tar.gz files from the local directory to the remote server myremoteserver, over ssh on port 1234 as user user. The command line is the following:

rsync --progress -vrae 'ssh -p 1234' *.tar.gz user@myremoteserver:/path/to/remote/directory

The only pitfall of this command is that you’ll be asked for a password, so you cannot schedule it to run automatically. But I know there is a way to log on automatically, by setting up a key on the local system to access the remote one. I’ll write more on this in the next days.
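I haven’t set this up yet, but the usual approach is an ssh key pair instead of a password. A minimal sketch, where the backup_key file name is a placeholder and the port/user/host come from the example above:

```shell
# Generate a passphrase-less RSA key pair for the unattended backup job.
# (A passphrase plus ssh-agent would be the safer choice on a shared machine.)
ssh-keygen -t rsa -b 4096 -N "" -f ./backup_key -q

# Install the public key on the server (run once, interactively):
#   ssh-copy-id -p 1234 -i ./backup_key.pub user@myremoteserver
# Then tell rsync's ssh to use the key:
#   rsync --progress -vrae 'ssh -p 1234 -i ./backup_key' *.tar.gz user@myremoteserver:/path/to/remote/directory
```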

News from “The Lab”…

Ok, it’s time to update the blog…

I’ve spent the past few days enjoying my “new” HP ProLiant DL380 G4 (specs here). Well, “new” is not the correct word: I found this server on eBay, and I had to buy it… My idea was to use it as a VMware ESXi server, to test OS installation and configuration in virtual machines. Unfortunately it came with only 1GB of RAM, too little to run ESXi. I’m currently waiting for an 8GB RAM upgrade from an eBay auction.

In the meantime I installed Slackware Linux 13.37 on it. It was not too difficult, except for the LILO installation. I will provide details as soon as possible. Stay tuned!

Update: as promised, this is the first update about the Slackware Linux 13.37 installation on my HP ProLiant.

Basically, when you install LILO as the bootloader, the setup procedure fails to locate the correct MBR in which to write the loader itself. This is probably due to a desktop-oriented setup procedure that assumes you’re working with standard hardware. Before installing anything on the server, you have to create at least one logical volume using the RAID controller’s BIOS. This system has six 36GB U320 SCSI drives; I decided to create two RAID 5 volumes: the first spanning disks 0, 2 and 4 and the second spanning disks 1, 3 and 5 (as the rack holds two rows of three disks, each volume is thus located in one row).

When you begin the installation you have to partition your disk using fdisk or cfdisk, and here is your first issue: which device do you have to pass to the partitioning software? After navigating around in /dev I finally found that the RAID volumes appear as /dev/cciss/c0d0 (first volume) and /dev/cciss/c0d1 (second volume). So I partitioned /dev/cciss/c0d0 into three partitions:

  • /dev/cciss/c0d0p1 mounted as /
  • /dev/cciss/c0d0p2 mounted as /home
  • /dev/cciss/c0d0p3 as swap space
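The resulting /etc/fstab looks something like this (file system types and mount options are assumptions based on a stock Slackware 13.37 install; check yours before copying):

```
# /etc/fstab fragment - fs types and options are assumptions
/dev/cciss/c0d0p3   swap    swap   defaults   0 0
/dev/cciss/c0d0p1   /       ext4   defaults   1 1
/dev/cciss/c0d0p2   /home   ext4   defaults   1 2
```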

Finally, after installing all of the needed packages, the LILO installation step came up. It located my boot partition, /dev/cciss/c0d0p1, and I expected it would also identify /dev/cciss/c0d0 as my boot disk, but it specified /dev/hda instead!! It was too late to get anything else done, so I turned the whole system off.

A few days ago I decided to fix the LILO setup. I booted from the installation disk, and once logged into the setup shell, I did the following:

  • first I created a /mount mountpoint (mkdir /mount)
  • then I mounted my /dev/cciss/c0d0p1 partition in the mountpoint created above (mount /dev/cciss/c0d0p1 /mount)
  • I changed my root to /mount (chroot /mount)
  • I edited /etc/lilo.conf (vi /etc/lilo.conf); as I had changed my root to /mount, I was editing my system’s lilo.conf
  • in lilo.conf I replaced the line boot=/dev/hda with boot=/dev/cciss/c0d0
  • I saved the newly created lilo.conf
  • I issued the command lilo -v

The newly created configuration was written to the MBR of /dev/cciss/c0d0. I removed the installation disk from the drive and rebooted the system with CTRL+ALT+DEL. It worked!!
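For reference, the relevant part of the corrected lilo.conf was something like this (the kernel image path and label come from a stock Slackware install, so treat them as assumptions):

```
boot = /dev/cciss/c0d0        # write the loader to the RAID volume's MBR, not /dev/hda
image = /boot/vmlinuz         # stock Slackware kernel path (assumption)
  root = /dev/cciss/c0d0p1    # the / partition created earlier
  label = Linux
  read-only
```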

Resurrecting a Toshiba Satellite A60-217

About a month ago, my Toshiba Satellite A60-217 notebook developed a big problem. When I turned it on, some vertical gray lines with several flashing dots appeared on the screen. At first I thought it was an LCD problem, but when I tried to boot Windows and Linux my opinion changed: something weird had happened to its RAM.

First I removed the 512MB SODIMM memory expansion; I thought that if that module was faulty, the on-board memory might still be in good enough condition to keep the notebook working. But I was wrong: the gray lines were still present on the display, and the boot was unsuccessful. My last attempt was to remove all of the peripherals except the DVD-RW drive and boot from a Knoppix DVD running memtest86. It was not such a big surprise when memtest86 crashed. I was definitively convinced of an on-board memory failure.

Now the question: what to do with a notebook whose on-board memory is broken? By that time I had already replaced the power supply about a year earlier (because the previous one died), the internal 2.5″ IDE hard disk (because the original was too small) and, a couple of months earlier, the DVD-RW drive (the original also unexpectedly died). I didn’t want to spend money on a new notebook, as I had already spent a lot replacing all of that stuff (and surely I wouldn’t be able to reuse those parts in a new laptop).

So I googled around a little and found this web site. It clearly explained that removing the on-board memory and placing a SODIMM module in the expansion slot fixes the problem.

As I had nothing to lose in trying this solution, I found a guide to opening my laptop here.

Well, after some time spent opening the case, using a desoldering station at my workplace to remove the chips, and some more time reassembling the notebook, I now have a perfectly working Toshiba Satellite A60-217, with no more defective on-board RAM.

The only issue I found is this: the A60-217 model can have at most 1.2GB of RAM, that is, 1GB in the expansion slot plus 256MB fixed on-board. Available memory is reduced by 64MB, because that amount is reserved as video memory for the embedded graphics card. As I only had a 512MB expansion module, before the failure I had 768MB-64MB=704MB of memory available; now, with the on-board memory removed, I only have 512MB-64MB=448MB. I’m thinking about buying a 1GB SODIMM to increase the available memory, but I have to look for a DDR PC2700 module, as it is the only kind of expansion this notebook supports.

Making LightScribe work…

Today a friend of mine was trying to label a disc with LightScribe, but the drive did not show up in the LightScribe control panel drive list.

After googling around, I found a couple of articles that explain how to get LightScribe to work. You have to manually edit your registry; then the device will show up and you will be able to label your discs.

The key you have to modify is located under

HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon

and is named AllocateCDRoms. Set its value to 0 and your drive will start working.

Why do you have to do this? A technical description of the key is available on the Microsoft TechNet website. Basically, this value controls the allocation of CD-ROM drives between the currently logged-on user and the PC’s other administrative users. If the value is 1, CD-ROMs are allocated exclusively to the currently logged-on user and cannot be shared with other administrative users. As LightScribe runs as a service, logged on under another account with administrative privileges, it won’t be able to access your drive. Setting this value to 0 allows the LightScribe service to access the drive.
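Instead of editing by hand, the change can also be applied as a .reg file; the value name’s capitalization below follows the TechNet article (the registry itself is case-insensitive):

```
Windows Registry Editor Version 5.00

; "0": CD-ROM drives are not allocated exclusively to the logged-on user,
; so the LightScribe service can access the drive too
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon]
"AllocateCDRoms"="0"
```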

Remember: after changing the key you have to restart your PC (or perhaps just log off and log on again, without a full restart).

Hello world!

Ok, today is April 15th 2010. And I decided to open a blog.

I think there are basically two big reasons to open a blog:

  1. You have something important you want others to know
  2. You know something and you want to know others’ opinion about it

But I decided to open this blog for another reason. Recently my memory seems to be losing some neurons, so I often forget things I should remember. At first I decided to write things down on paper, but I discovered I could also forget where I put those pieces of paper. So I thought “maybe my notes won’t get lost on the Internet”, and I opened this blog.

So here you will find my notes on electronics, computers and whatever else I have to write down before I forget it. Obviously, you won’t find my credit card number: I forgot where I put it…