Can Google’s Adsense bot understand gzipped HTML pages?

During my experiments with WP-Super-Cache, I noticed a strange thing happen to my Adsense ads. A short while after getting gzip compression to work properly, my ads started showing foreign characters and strange, seemingly unrelated content.

Having changed nothing on my blog except for installing WP-super-cache, I decided to add an additional check to my .htaccess. Here is a modified snippet that disallows Google’s Adsense bot from receiving the gzipped page:

RewriteCond %{HTTP_COOKIE} !^.*comment_author_.*$
RewriteCond %{HTTP_COOKIE} !^.*wordpressuser.*$
RewriteCond %{HTTP_COOKIE} !^.*wp-postpass_.*$
RewriteCond %{HTTP_USER_AGENT} !Google
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1index.html.gz -f
RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1index.html.gz [L]

Notice the new line that says the User Agent can’t have Google in its description.
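The User-Agent condition is an unanchored regular expression, so it applies to any client whose User-Agent contains the string anywhere. That covers the Adsense crawler (Mediapartners-Google) and also Googlebot. Here is a rough sketch that approximates the check with grep; the function name ua_gets_gzip and the sample User-Agent strings are only for illustration, and this is not how mod_rewrite evaluates the rule internally.

```shell
#!/bin/sh
# Sketch only: approximate the User-Agent RewriteCond with a substring match.
# Succeeds when the gzipped cache file would still be eligible, i.e. when
# the User-Agent does NOT contain "Google".
ua_gets_gzip() {
    if printf '%s\n' "$1" | grep -q 'Google'; then
        return 1
    fi
    return 0
}

# Sample User-Agent strings (illustrative).
for ua in 'Mediapartners-Google' \
          'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' \
          'Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0'; do
    if ua_gets_gzip "$ua"; then
        echo "gzip allowed: $ua"
    else
        echo "gzip skipped: $ua"
    fi
done
```

The side effect is that Google's search crawler also gets the uncompressed page, which costs some bandwidth but should be harmless.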

Sure enough, ads are back to normal. I’m not sure how exactly Google’s crawlers handle gzip compressed pages. They are sending an “Accept-Encoding” header that includes gzip or the page wouldn’t be served to them in the first place. Judging from the change in my ads, however, I suspect that the bot isn’t decompressing the received file.
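One way to see what a client that skips decompression ends up with is to look at the raw bytes of the response body. A gzip stream always starts with the two bytes 1f 8b, so a small check like the one below can tell whether a saved body is still compressed. This is just a sketch: is_gzip is a helper name I made up, and example.com in the comments is a placeholder for your own site.

```shell
#!/bin/sh
# is_gzip FILE: succeeds if FILE starts with the gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Local demonstration with a throwaway file (no network needed).
tmp=$(mktemp)
printf '<html><body>hello</body></html>\n' | gzip -c > "$tmp"
if is_gzip "$tmp"; then
    echo "body is still gzip-compressed"
fi
rm -f "$tmp"

# Against a live site, curl without --compressed saves the body exactly
# as it was sent (example.com is a placeholder):
#   curl -s -A 'Mediapartners-Google' -H 'Accept-Encoding: gzip' \
#        -o body.bin http://example.com/
#   is_gzip body.bin && echo "crawler would receive gzip bytes"
```

If a crawler parsed those bytes as text without decompressing them, it would see binary noise rather than the page, which would fit the odd ads I was getting.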

Making WP-Super-Cache gzip compression work

I was pretty excited to see an update to WP-Cache. The first thing I noticed is that when I enabled the new super cache compression option, I started getting a “Save As” file dialog instead of my pages. As of the current version of WP-Super-Cache, the readme.txt file states that if you get this, you need to disable the super cache compression option.

Not being satisfied with this answer, I’ve done a little digging and come up with the following solution.

WP Super Cache – The Ultimate WordPress Caching Plugin

I’ve upgraded my old WP-Cache plugin to this one that I found on Digg.com today.

From the Digg.com Post:

Tired of clicking a link off the Digg front page only to find a crashed or mortally lagged site on the other side? Finally, Donncha (one of the main WordPress developers) has solved the problem once and for all with a plugin that blows WP-Cache away.

I had a minor issue but was able to find the answer on the WordPress plugins wp-super-cache FAQ page. If you are upgrading from the old plugin, you need to correctly set up your cache files in the wp-content directory. I had old files based on the original WP-Cache and needed to remove those and add the new ones.

# from within the wp-content directory
rm wp-cache-config.php
cp plugins/wp-super-cache/wp-cache-config-sample.php wp-cache-config.php
ln -s plugins/wp-super-cache/wp-cache-phase1.php advanced-cache.php

After that, I was able to enable and use the plugin successfully.
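If the plugin still refuses to turn on, it may help to double-check the files from the steps above. The function below is a small sketch of such a check (check_wp_cache_setup is a name I made up, not something that ships with the plugin). It only looks at whether the config file exists and whether advanced-cache.php is a symlink that resolves.

```shell
#!/bin/sh
# check_wp_cache_setup DIR
# DIR is the wp-content directory (defaults to the current directory).
# Prints "ok" when the config file exists and advanced-cache.php is a
# symlink whose target exists; otherwise prints what is wrong and fails.
check_wp_cache_setup() {
    dir=${1:-.}
    [ -f "$dir/wp-cache-config.php" ] || {
        echo "missing wp-cache-config.php"; return 1; }
    [ -L "$dir/advanced-cache.php" ] || {
        echo "advanced-cache.php is not a symlink"; return 1; }
    [ -e "$dir/advanced-cache.php" ] || {
        echo "advanced-cache.php is a dangling symlink"; return 1; }
    echo "ok"
}

# Usage, from within the wp-content directory:
#   check_wp_cache_setup .
```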

In addition to enabling the plugin, I thought I’d try out the super cache functionality. To do this, you have to add a few more rewrite rules to your .htaccess file. I didn’t notice this in the documentation, but you have to add these before your other rewrite rules.

# new .htaccess file after enabling super cache
RewriteEngine On
# if these rules come after, you'll not get the super cache functionality
RewriteCond %{HTTP_COOKIE} !^.*comment_author_.*$
RewriteCond %{HTTP_COOKIE} !^.*wordpressuser.*$
RewriteCond %{HTTP_COOKIE} !^.*wp-postpass_.*$
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1index.html.gz -f
RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1index.html.gz [L]

# my original rules
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
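To see which file the -f condition is actually looking for, it helps to spell out the path for a given request. In an .htaccess file the leading slash is stripped before the RewriteRule pattern is matched, so $1 is the request path without it. The helper below is a sketch of that substitution; supercache_file, /var/www and example.com are placeholders of mine, so swap in your own document root and host.

```shell
#!/bin/sh
# supercache_file DOCROOT HOST URI
# Prints the path that the -f RewriteCond tests for a request.
supercache_file() {
    docroot=$1
    host=$2
    path=${3#/}   # drop the leading slash, as happens in .htaccess context
    printf '%s/wp-content/cache/supercache/%s/%sindex.html.gz\n' \
        "$docroot" "$host" "$path"
}

# Example with placeholder values:
supercache_file /var/www example.com /2007/11/hello-world/
```

A request without a trailing slash, such as /about, maps to a name ending in aboutindex.html.gz. That file won't exist, so the request falls through to the regular WordPress rules.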

Edit: I posted an update that deals with getting the super cache compression to work.


WordPress and Caching

I just installed the plugin WP-Cache. I’m not sure why more WordPress users don’t enable this. From the WP-Cache description:

WP-Cache is an extremely efficient WordPress page caching system to make you site much faster and responsive. It works by caching Worpress pages and storing them in a static file for serving future requests directly from the file rather than loading and compiling the whole PHP code and the building the page from the database. WP-Cache allows to serve hundred of times more pages per second, and to reduce the response time from several tenths of seconds to less than a millisecond.

I don’t know how many times I’ve gone to a link on Digg.com and found an unusable site with mysql database connect errors, or simply a crashed web server. The comments always say “Another WordPress Blog”.

The problem isn’t WordPress specifically. Any site with a database backend for storage could have the same issues. The problem is that WordPress doesn’t cache pages by default. Any site serving static content with Apache as a front end should be able to handle Digg traffic for a while, assuming it has enough memory and bandwidth and the Apache directive “MaxClients” is set high enough. WP-Cache turns your dynamic WordPress installation into static pages and only regenerates them when they change.
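As for what “high enough” means for MaxClients, a common rule of thumb (my assumption here, not something from the WP-Cache docs) is to divide the memory you can spare for Apache by the size of a typical Apache process. The numbers below are made up for illustration, and estimate_max_clients is just a name for the sketch.

```shell
#!/bin/sh
# estimate_max_clients AVAILABLE_MB PER_PROCESS_MB
# Rough upper bound: memory set aside for Apache divided by the
# resident size of one Apache process (integer division).
estimate_max_clients() {
    echo $(( $1 / $2 ))
}

# Hypothetical figures: 1024 MB for Apache, about 20 MB per process.
estimate_max_clients 1024 20
```

Setting the directive higher than this estimate risks pushing the server into swap under load, which tends to be worse than turning a few visitors away.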

We were marveling at the efficiency of this all when Scott’s Site was dugg twice on the same day.