scriptygoddess

28 Mar, 2003

Compressing Webpages for Fun and Profit

Posted by: Christine In: How to's

(Written by the Guest Goddess, Photo Matt. Please note: You need to have PHP on your server to do this. No PHP? Won't work.)

So your page is now totally pimped out. You have gads of content on your sidebar, you've used ScriptyGoddess know-how to have comments and extended entries pop out like magic, and you even have some entries to take up some space between all the gadgets. The problem? The code on your page is now weighing in at half a meg and you can actually hear people cry when they load your site with a modem. You start to think about what features you could take out, maybe cutting out entries on the front page, but what if I told you that you can easily cut your page down to about a third of its size with no work on your part whatsoever? It sounds like a pitch I might get in a lovely unsolicited email. The secret lies in the fact that every major browser of the past 5 years supports transparently decompressing content on the fly. There are three ways to do it (easy, right, and alternative) and we'll cover all three here. Before we even get started you should check whether your pages are already being compressed, because if they are, it's probably best to not fix what ain't broken.
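If you'd rather do that check by hand, a few lines of PHP will do. This is only a rough sketch: the helper name is_gzipped is made up for this example, http://example.com/ is a placeholder for your own URL, and the commented-out usage assumes your host allows PHP to open URLs.

```php
<?php
// Hypothetical helper: given raw response header lines, report whether
// the server says the body is gzip-encoded.
function is_gzipped(array $headerLines): bool
{
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, e.g. "Content-Encoding: gzip".
        if (preg_match('/^content-encoding:\s*(x-)?gzip\b/i', $line)) {
            return true;
        }
    }
    return false;
}

// Usage (placeholder URL): ask for gzip the way a browser would,
// then look at the response headers that come back.
// $context = stream_context_create(['http' => ['header' => "Accept-Encoding: gzip\r\n"]]);
// $headers = get_headers('http://example.com/', false, $context);
// echo is_gzipped($headers ?: []) ? "compressed\n" : "not compressed\n";
```

If it reports "compressed", your host is already doing the work for you.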

Easy

<?php ob_start("ob_gzhandler"); ?>

I hate to be anticlimactic, but that's it. Put that at the very top of the PHP-parsed page you want to compress and you're done. The only thing to watch for is that it really does have to be at the top, or the sky will fall. Actually, before you call me Chicken Little, you'll probably just get a cryptic "headers already sent" error, but you can never be too careful. Basically what this magical line of code does is start an output buffer which takes all your content, checks if the client can receive compressed content, and if it can, zips up the buffer and sends it on its merry way. This can be a great technique to curb your bandwidth usage too; I've seen it save gigabytes on content-heavy sites.
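To make that description a little more concrete, here is a rough stand-in for the decision the handler makes. It is not how ob_gzhandler is actually implemented; the function name maybe_gzip and the sample page are invented for illustration, and the real handler also deals with deflate and header details this sketch ignores.

```php
<?php
// Hypothetical stand-in for the decision ob_gzhandler makes:
// compress only when the client says it can take gzip.
function maybe_gzip(string $html, string $acceptEncoding): array
{
    // Simplification: a plain substring check, ignoring q-values.
    if (stripos($acceptEncoding, 'gzip') === false) {
        return ['headers' => [], 'body' => $html];
    }
    return [
        'headers' => ['Content-Encoding: gzip', 'Vary: Accept-Encoding'],
        'body'    => gzencode($html, 6),
    ];
}

// A made-up page with lots of repeated markup, like a busy sidebar.
$page = str_repeat("<p>Hello, sidebar gadget!</p>\n", 200);
$out  = maybe_gzip($page, 'gzip, deflate');
printf("%d bytes before, %d bytes after\n", strlen($page), strlen($out['body']));
```

A client that sends no Accept-Encoding header gets the page back untouched, which is why the one-liner is safe to drop in.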

Right

While the overhead associated with the above is minimal, if you'd like to see the benefits of compressed content on a larger scale, mod_gzip is the way to go. Mod_gzip is an Apache module which will compress files whether they are CGI scripts, processed by PHP, static HTML or text files, whatever it can. It is completely transparent to both the user and client, and it supports sophisticated configuration to allow it to be tweaked to your heart's content. However, if you don't have permissions on your box to compile modules and modify httpd.conf, this option is unavailable, but don't let that stop you from bugging your host to include it, as there is really no good reason to not include it. It's always faster to send a smaller file. If you're interested in writing your own Apache module, studying mod_gzip is a great way to learn as it has extremely informative debug code.

Alternative

There are certain circumstances where output buffering, which by definition has to wait for everything to process before it sends anything to the browser, can cause a perceivable delay in viewing scripts that take a while to run. With mod_gzip this isn't a problem because it streams content as it comes to it, and using PHP it doesn't have to be a problem either because it offers an alternative method of compressing and sending content, called zlib output compression. It's a little trickier to enable though, because there is no good way to enable or disable it with straight PHP code, so the way we're going to do it here is to use .htaccess to modify the php.ini configuration. Instead of waiting until everything is finished, zlib output compression can take the content as chunks and send them as it comes to it. Here's what you need to put in your .htaccess file:

<FilesMatch "\.(php|html?)$">
php_value zlib.output_compression 4096
</FilesMatch>

Basically what this code says is if the file ends in php, htm, or html turn zlib output compression on and stream it out every 4 kilobytes. It's common to see a 2K buffer suggested on the web but I've found the overhead with that is higher, and this is a nice balance. You should know that this is the slowest of the three methods, but by slow I mean it adds .003 seconds instead of .001, so it's not really that big of a deal.
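If you want to confirm the .htaccess setting actually took effect, one option is to drop a small test page on your server that reports the value PHP sees. A minimal sketch; the helper name zlib_status is invented for this example.

```php
<?php
// Hypothetical helper: turn the raw ini value into a readable status.
function zlib_status($value): string
{
    if ($value === false) {
        // ini_get() returns false when the option does not exist.
        return 'zlib extension not loaded';
    }
    $v = strtolower(trim((string) $value));
    if ($v === '' || $v === '0' || $v === 'off') {
        return 'off';
    }
    // "1"/"on" means the default buffer; a larger number is the buffer size in bytes.
    return ($v === '1' || $v === 'on') ? 'on (default buffer)' : "on ({$v}-byte buffer)";
}

echo zlib_status(ini_get('zlib.output_compression')), "\n";
```

With the .htaccess lines above in place, a page matched by the FilesMatch pattern should report a 4096-byte buffer.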

So now you have a faster site that's more fun to visit, and you're saving money on bandwidth. You can sit back now and wait for the love letters to pour in from your readers saying how much faster everything is loading. Enjoy!

Geek Notes

  • Like with so many other things, Netscape 4 really screws up gzip encoding in a lot of ways, but you can avoid 99% of its problems simply by making sure that you don't gzip any linked JS or CSS files and you should be alright. On a more technical level, early versions of Netscape 4 try to use the browser cache to store compressed content before decompressing it, which works unless you have your browser's cache turned off, and then it will do something crazy. Note that this behavior even varies from version to version of Netscape 4, so overall I wouldn't worry about it.
  • If you're doing things over SSL and you want to use mod_gzip as well, you have a little hacking to do.
  • PHP.net documentation on ob_gzhandler and zlib output compression (they recommend using zlib).
  • Things like images, zip files, and Florida ballots are already highly compressed so trying to compress them again might actually make them bigger. And then you have to recount.
  • Avoid compressing PDF files as well because sometimes Internet Explorer on Windows (the 900-pound gorilla) forgets to decompress them before the Acrobat plugin takes over.
  • According to the RFC, technically compressed content should be sent using transfer encoding rather than content encoding, since technically that's what is going on. One browser engine supports this, can you guess which one?
  • Internet Explorer on Mac doesn't support any sort of content compression like the methods described above, but that's okay because all of the above methods intelligently look for the HTTP header that signals the client can accept gzip encoding, and if it isn't there—like in IE Mac, handheld browsers, whatever—they just sit idly by.
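The note above about already-compressed files is easy to see for yourself. In this sketch random bytes stand in for an image or zip file (an assumption for the demo, since both have very little redundancy left), and the repeated list items stand in for a typical page.

```php
<?php
// Repetitive markup compresses very well...
$html = str_repeat("<li><a href=\"/archives/\">An entry</a></li>\n", 500);

// ...while random bytes stand in for already-compressed data
// (images, zip files), which has no redundancy left to squeeze out.
$noise = random_bytes(20000);

foreach (['html' => $html, 'noise' => $noise] as $label => $data) {
    printf("%-5s %6d -> %6d bytes\n", $label, strlen($data), strlen(gzencode($data)));
}
```

The markup shrinks dramatically, while the "noise" comes out slightly larger than it went in because gzip still has to add its own header and trailer.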

25 Mar, 2003

A Beginners Guide to TrackBack

Posted by: adam In: Bookmarks

Yesterday Ben and Mena published A Beginners Guide to TrackBack, which provides a non-technical primer on the how and why of TrackBack pings.

[via PhotoJunkie]

21 Mar, 2003

Comment Leaders and UPDATE statements

Posted by: kristine In: MT hacks

Brenna released a new version (0.3) of the CommentLeaders plugin and so while I was up last night, I did some tweaking to make my list work a bit better. This tip would also be helpful if you are using the version from this site: Scripts: Show recent comments WITH total comments from that comment author (which is based on Comment Leader Board with PHP and MySQL).

See, when you've been blogging for as long as me, your blogging friends tend to have changing email addresses over the years, and so it makes it hard to tally the top commenters! And besides, I have blogger and greymatter posts in there, and the GM posts didn't require an email address, so I have a lot of empty posts.

So I did some UPDATE statements in my MySQL database to make the newest addresses apply to all posts by that author. It took a little browsing through the database to see which ones needed changing, but I think I've got most of it. I didn't bother changing any other information about the authors, just the email address for grouping correctly in the plugin output.

Here are some examples in case you'd like to do some condensing too…
Read the rest of this entry »

21 Mar, 2003

O'Reilly's Developing MT Plugins

Posted by: Christine In: Bookmarks

The O'Reilly Network has put out an article on Developing Movable Type Plugins. It was written by Timothy Appnel and it looks like a great read for anyone that wants to build their own. (Link Via Anil Dash.)

20 Mar, 2003

Titles of other blogs on your site…

Posted by: Jennifer In: Bookmarks

I'm actually going to try and start focusing on sharpening my CSS skills, but wanted to post this here for when I was up to PHPing again. I was browsing through the php-princess site and noticed how they had the headlines from other blogs in their sidebar. And I can't see a toy and not want it so I went on an investigation to try and figure it out. I emailed them, but they didn't get back right away so I contacted the other authors here to see if they knew.

Christine passed the email on to Mike, who came up with this code (demo)

Kristine also weighed in saying it looked like what she does on the mt plugins syndication page. She uses the MT-RSS Feed plugin and the MT-List plugin together. More info here and an overview tutorial here.

Daynah was finally able to respond and said they actually got a script from here.

Heh. More proof of how there's more than one way to scape a site. ;0)

20 Mar, 2003

Read a CSV file…

Posted by: Jennifer In: Bookmarks

I will now profess my undying love for php. I have a project at work where I have to read a CSV file into an array. I thought it was going to take an endless amount of time to try and figure out how to do that. But PHP has done it all for me with this function, fgetcsv, and shows how it works with this snippet, which reads and prints the entire contents of a CSV file:

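<?php
// Read test.csv and print every field of every row.
$row = 1;
$handle = fopen("test.csv", "r");
if ($handle === false) {
    die("Could not open test.csv");
}
// fgetcsv() returns false at end of file, so compare strictly.
while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    $num = count($data);
    print "<p> $num fields in line $row: <br>\n";
    $row++;
    for ($c = 0; $c < $num; $c++) {
        print $data[$c] . "<br>\n";
    }
}
fclose($handle);
?>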
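Since the actual goal was getting the file into an array rather than printing it, the same loop can collect the rows instead. A minimal sketch: the function name csv_to_array and the temp-file demo data are made up for this example.

```php
<?php
// Hypothetical helper: read a whole CSV file into an array of rows.
function csv_to_array(string $path): array
{
    $rows = [];
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $rows; // could not open the file; return an empty list
    }
    // Length 0 means "no line length limit".
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $data;
    }
    fclose($handle);
    return $rows;
}

// Demo with a throwaway file so the example is self-contained.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, "name,site\nChristine,scriptygoddess\n");
$rows = csv_to_array($tmp);
unlink($tmp);
print_r($rows);
```

Each element of $rows is itself an array of that line's fields, so $rows[1][0] is the first field of the second line.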

*smooch*

18 Mar, 2003

SETI xml parser…

Posted by: Jennifer In: Bookmarks

Peter had contacted me with a script request to parse the xml file that SETI@HOME creates so he could publish his stats on his blog. Guess I took too long to get around to it, so he wrote one himself.

18 Mar, 2003

The RSS Trend

Posted by: Christine In: Bookmarks

To continue my latest trend of RSS posts, there is a tutorial up at 4GuysFromRolla.com – Syndicating Your Web Site's Content with RSS. Link from Dave Winer of Scripting News. (I read it first in my news reader, Newzcrawler.)

As for the "What does the acronym RSS stand for?" debate – that's been a debate for years. One of those "depends on who you ask" sort of things. Then there is the debate on who created it first. All of which we'll probably never solve here at ScriptyGoddess…

18 Mar, 2003

alt-php-faq.org

Posted by: Jennifer In: Bookmarks

After finding the info for the post below, I found a bunch of other things on alt-php-faq.org, like:

encrypting using pgp class and php here too
making thumbnails with php
using php to POST to another url without forms and hidden variables (this script will come in handy if/when I take a shot at fixing some of the annoying parts of the "subscribe to comments" script)

…there's just a BUNCH of stuff there…

18 Mar, 2003

Get users IP with PHP

Posted by: Jennifer In: Script snippet

Not sure why I haven't posted this before. (or maybe I did and I'm just drawing a blank). Get the IP of the user:

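// The proxy header is X-Forwarded-For, so the key is HTTP_X_FORWARDED_FOR.
// It is supplied by proxies and can be spoofed, so don't rely on it for security.
if (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
    $ip = $_SERVER['HTTP_X_FORWARDED_FOR'];
} else {
    $ip = $_SERVER['REMOTE_ADDR'];
}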

I've also seen it done this way (which only works when register_globals is turned on), but for some reason I was under the impression the method above was more reliable:

$ip = $REMOTE_ADDR;

If you want to resolve the domain name of the IP (as seen here):

$domain = gethostbyaddr($_SERVER['REMOTE_ADDR']);
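One wrinkle with the forwarded header is that it can hold a whole comma-separated chain of addresses when a request passes through several proxies. Here's a rough sketch of one way to handle that; the function name client_ip is made up, and the addresses are documentation placeholders.

```php
<?php
// Hypothetical helper: pick a client IP from a $_SERVER-style array.
// X-Forwarded-For is set by proxies and can be forged by the client,
// so treat the result as a hint, not as proof of identity.
function client_ip(array $server): string
{
    if (!empty($server['HTTP_X_FORWARDED_FOR'])) {
        // The header may hold a comma-separated chain; take the first entry.
        $first = trim(explode(',', $server['HTTP_X_FORWARDED_FOR'])[0]);
        if (filter_var($first, FILTER_VALIDATE_IP) !== false) {
            return $first;
        }
    }
    // Fall back to the address of the machine that actually connected.
    return $server['REMOTE_ADDR'] ?? '';
}

echo client_ip([
    'HTTP_X_FORWARDED_FOR' => '203.0.113.7, 10.0.0.1',
    'REMOTE_ADDR'          => '10.0.0.1',
]), "\n";
```

On a live page you would call client_ip($_SERVER) instead of passing a hand-built array.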
