Keeping a process running

Have you ever had a process that dies on occasion? I hate that situation and prefer to fix the software rather than rely on a monitor that restarts the process when it dies. Lately, however, I've run into a case of a dying process that has defied my attempts at a fix. I think it may be a hardware-related issue, but I haven't tracked down the cause yet. Anyhow, I read an email on the Provo Linux User Group list in which the poster referred to PS-Watcher, so I thought I'd give it a try for kicks.

After installing the program and reading through the documentation, I found that PS-Watcher is really quite nice. In addition to monitoring the results of the ps command, you can add custom actions that run at the beginning or end of each monitoring cycle (the $PROLOG and $EPILOG sections). You can also trigger actions based on the number of processes, memory usage, and a few other useful metrics.
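
For a feel of what a rule looks like, here is a rough sketch of a PS-Watcher entry that restarts a process when no instance of it is running. I'm writing the directive names from memory, so treat the exact syntax as an assumption and check the ps-watcher documentation; the process and init script names are hypothetical.

# hypothetical /etc/ps-watcher.conf rule (verify directive names against the docs)
[myserver]
occurs = none
action = /etc/init.d/myserver start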

For most situations where you want to monitor a process and take action, I think PS-Watcher will probably do the job nicely. After all this, however, I decided that what I really wanted was a little script that restarts my particular web server when a test URL stops responding properly, run at a scheduled interval with cron. I've placed the script below for all to glean information from, or make fun of, as appropriate. Feel free to offer additional tips; I don't claim to be a "Bash Jedi Master". The script sends a request to the web server and checks the response for a string that tells us the server is working properly.

#!/bin/bash

user="<the user my process is running under>"
port="<the port>"
okresp="^OK$" # I configured a test URL that returns OK if the server is up and running right.

# make a simple HTTP request to send
req="GET /lbuptest HTPP/1.0

"
# send it using netcat
resp=$(echo "$req" | nc localhost $port)
# test for the ok string
ok=0
echo "$resp" | grep $okresp 2>&1 >> /dev/null && ok="1"

# you could really place whatever actions you want here.
if [[ $ok != "1" ]]; then
/etc/init.d/<my process init script> restart
fi
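
To run the check on a schedule, I just call the script from cron. A minimal crontab entry might look like the following, assuming the script were saved as /usr/local/bin/check_webserver.sh (a hypothetical path) and you wanted to check every five minutes:

# hypothetical crontab entry: run the health check every five minutes and log its output
*/5 * * * * /usr/local/bin/check_webserver.sh >> /var/log/check_webserver.log 2>&1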

The process I’m having trouble with is a TurboGears web application. I don’t think this is a Python problem, however. As I mentioned before, it only happens on this one server, so I think I’ve got a hardware problem. Either way, if you found this page while searching for TurboGears information, you might also be interested in my TurboGears Init Scripts.

Can Google’s Adsense bot understand gzipped html pages?

During my experiments with WP-Super-Cache, I noticed a strange thing happening to my Adsense ads. A short while after getting gzip compression to work properly, all my ad content contained foreign characters and strange, seemingly unrelated content.

Having changed nothing on my blog except for installing WP-Super-Cache, I decided to add an additional check to my .htaccess. Here is a modified snippet that prevents Google’s Adsense bot from receiving the gzipped page:

RewriteCond %{HTTP_COOKIE} !^.*comment_author_.*$
RewriteCond %{HTTP_COOKIE} !^.*wordpressuser.*$
RewriteCond %{HTTP_COOKIE} !^.*wp-postpass_.*$
RewriteCond %{HTTP_USER_AGENT} !Google*
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1index.html.gz -f
RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1index.html.gz [L]

Notice the new line that says the User-Agent can’t have Google in its description.

Sure enough, the ads are back to normal. I’m not sure exactly how Google’s crawlers handle gzip-compressed pages. They are sending an “Accept-Encoding” header that includes gzip, or the page wouldn’t be served to them in the first place. Judging from the change in my ads, however, I’d suspect that the bot isn’t decompressing the received file.
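
If you want to see what a given crawler actually receives, a quick check with curl works. This is just a sketch, with example.com standing in for your own domain; the Adsense crawler identifies itself with a User-Agent containing "Mediapartners-Google":

# request the page the way the Adsense bot might, advertising gzip support
curl -s -I -A "Mediapartners-Google" -H "Accept-Encoding: gzip" http://example.com/
# compare against a plain request with no Accept-Encoding header
curl -s -I http://example.com/
# check the Content-Encoding header in each response to see which version was served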

How to virtual host load balanced websites with ldirectord and Apache

I posted a while back on getting Heartbeat set up to add reliability to websites. After a few weeks of experience with the system, I thought I’d add a few additional tips on making the setup more reliable. There are already a few good guides on getting Heartbeat set up, and you could also read my original post on the subject if you don’t already have Heartbeat load balancing your site. This post, however, deals with the case where you are serving more than one site per physical server.

We host three different websites across three physical servers. Each physical server hosts two of the websites with Apache, and each website is hosted on two different physical servers. The sites are load balanced with ldirectord, which runs on two servers that manage the public IP addresses for our services with Heartbeat. If load increases on any of our services, we can add additional physical servers relatively easily.
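
As a rough illustration of the ldirectord side, a virtual service block in ldirectord.cf for one of the sites might look something like the following. The IP addresses and hostname are hypothetical, and the request/receive check mirrors the /lbuptest test URL from the script above; adapt all of it to your own setup:

# hypothetical ldirectord.cf entry for one load balanced site
virtual=192.168.1.100:80
        real=192.168.1.11:80 gate
        real=192.168.1.12:80 gate
        service=http
        request="/lbuptest"
        receive="OK"
        virtualhost="www.example.com"   # hostname sent with the health check, useful for name-based virtual hosts
        scheduler=wlc
        protocol=tcp
        checktype=negotiate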

Continue reading “How to virtual host load balanced websites with ldirectord and Apache”

Using piped svndumpfilter commands to separate an svn repository

According to the documentation for svndumpfilter, you can use a single subcommand (include or exclude) when filtering a dumped repository. Suppose you have a repository with a path “some/path” that you’d like to separate out into its own new repository. Following the documentation, you simply pipe the original dumped repository through the svndumpfilter command.

Example:
cat repos-dumpfile | svndumpfilter include some/path > new-dumpfile

The caveat is that if any paths were copied from other paths in your repository that the include argument does not cover, you’ll get an error. I got around this problem by piping the output from one svndumpfilter command to another, each with an exclude subcommand instead of include. The result leaves only the paths I want, while still keeping the alternate copy branch that I had used partway through the coding process.

Example:
cat repos-dumpfile | svndumpfilter exclude unwanted/one | svndumpfilter exclude unwanted/two | svndumpfilter exclude unwanted/three --drop-empty-revs --renumber-revs > new-dumpfile

Notice that on the last svndumpfilter command I added a couple of arguments to renumber the repository revisions and drop the empty revisions. While these are optional, in my opinion they make the new repository cleaner.

The Subversion book states that you can edit the Node-Path values in the dump file to have the contents imported at different paths in the new repository. I chose to simply issue an “svn mv” command once I had imported the repository.
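
For completeness, loading the filtered dump into a fresh repository and doing that post-import move looks roughly like this; the repository location and target path are placeholders for my own:

# create an empty repository and load the filtered dump into it
svnadmin create /var/svn/new-repos
svnadmin load /var/svn/new-repos < new-dumpfile
# move the imported path into place instead of editing Node-Path values in the dump
svn mv file:///var/svn/new-repos/some/path file:///var/svn/new-repos/trunk -m "Move imported path into place"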

Upgrading an OLD Gentoo Machine

I’m in the process of re-installing a pretty old machine with the latest Gentoo. I keep the Portage tree on a shared NFS directory, and all my machines use a shared packages directory, so after one machine builds something, another machine can simply install the built package.

Here is the relevant portion of the make.conf on each machine:
FEATURES="-distlocks buildpkg"
PKGDIR="/usr/portage/packages/x86"
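
With buildpkg enabled and PKGDIR shared over NFS, the other machines can pull in the prebuilt binaries by passing -k (--usepkg) to emerge, which is also what the upgrade command below does. For example:

# install a package, preferring a prebuilt binary from the shared PKGDIR if one exists
emerge -avk <package_name>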

Well, this particular machine was installed with Glibc 2.3.x. I typed the following to do the upgrade:
emerge -auDvk system
About 1/3 of the way through the upgrade, tar suddenly stopped working:
tar: /lib/libc.so.6: version `GLIBC_2.4' not found (required by tar)
I realized that tar had been upgraded, but glibc had not yet been. To fix the problem, I built a statically linked tar on a different system and copied it over:
USE="static" emerge -av tar # on another machine
scp <the other machine>:/bin/tar /bin/ # run from the broken machine.

Off I go, things are working.

One way to unemerge lots of unneeded packages on Gentoo Linux

As part of a recent project, I had installed a lot of packages on a separate machine to test my configuration. As is common with Gentoo, you want to run the following before you actually emerge anything:
emerge -p <package_name>
In this particular case, I noticed the dependency list was pretty long (50 packages, to be exact). Instead of going ahead with the emerge right away, I first recorded the package list to a file for later reference:
emerge -p <package_name> --nospinner > dep.list
Now that I’m done with the project, I can clean up the packages I no longer need like this:
emerge -aC `cat dep.list | grep 'ebuild N' | cut -d ' ' -f 8`
Notice the grep: it is there because a couple of the packages were merely upgraded, and I can’t be sure those aren’t still needed. After a quick scan of the resulting list to see what was going to be uninstalled, I let emerge do the rest of the work.

Voilà, 50 fewer packages on that machine.