performance: HTTP Compression

Owners: John Giannandrea (jg@netscape.com), Eric Bina (ebina@netscape.com)
Last Updated: 15-September-1998

Getting the Apache module source.

This project aims to improve real and perceived web browsing performance by having the server send compressed HTML files to the browser and having the browser uncompress them before display. Given the fast processors in most machines these days, the user should see the document sooner this way than if the server sent uncompressed HTML. Also, since a majority of network traffic these days is HTTP traffic, compressing all HTML sent via HTTP should recover a significant amount of wasted network bandwidth.

Stage 1 - Content-Encoding: gzip

Status: Complete

The current Mozilla source already sends Accept-encoding: gzip and can do a streaming decompression of HTML data received with Content-encoding: gzip. All that is needed is a server set up to serve this data to Mozilla while maintaining backwards compatibility with browsers that cannot handle the compressed data.
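For illustration, the negotiation looks roughly like this on the wire (headers abbreviated, hostname hypothetical):

    GET /foo.html HTTP/1.0
    Host: www.example.com
    Accept-encoding: gzip

    HTTP/1.0 200 OK
    Content-type: text/html
    Content-encoding: gzip

    <gzip-compressed HTML body>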

To this end a new Apache 1.3 server module has been written. It is activated on a per-directory basis with a directive in the access.conf file of the form:


CompressContent Yes

When activated, and only if an Accept-encoding: gzip header is received, all requests for files from that directory are redirected to requests for an equivalent compressed file from that directory, if one exists. In essence, if you ask for foo.html and both it and foo.html.gz exist, then requests with an appropriate Accept-encoding header get the compressed file, and all other requests get the uncompressed file.
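For example, to enable compression for a single document tree, access.conf might contain something like the following (the directory path is hypothetical):

    <Directory /usr/local/apache/htdocs/docs>
    CompressContent Yes
    </Directory>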

This neatly solves the backwards compatibility problem on the browser end, but creates a maintenance problem on the server end. One would need to run some sort of automated script to regularly maintain up-to-date compressed versions of files in the directories that need them. For a solution to this maintenance problem, see Stage 2 below.
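In the meantime, a cron job along the following lines could keep the compressed copies current (a minimal sketch; the document root path is hypothetical, and it assumes a test(1) that supports the -nt comparison):

    #!/bin/sh
    # Regenerate any foo.html.gz that is missing or older than its foo.html.
    find /usr/local/apache/htdocs -name '*.html' | while read f; do
        if [ ! -f "$f.gz" ] || [ "$f" -nt "$f.gz" ]; then
            gzip -9 -c "$f" > "$f.gz"
        fi
    done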

Results:

Here is an optimal case where all images are in the cache.
           Local                 ISDN 64 kbits/sec         28.8 kbits/sec
     No GZIP      GZIP         No GZIP      GZIP         No GZIP      GZIP
    56.9 sec  61.0 sec        105.1 sec  83.2 sec       327.9 sec  121.8 sec
       7% Slower                 21% Faster                 63% Faster
Notes:
  1. For the Local run both the client and server are running on the same machine, so we are seeing both the overhead of the client unzip and the slight extra overhead for the server to locate and send the gzipped content (an extra stat() call per file).

A more realistic workload was then generated, simulating a user starting with an empty cache and visiting the CNN site to read, in order: Main Page, World, U.S., U.S. Local, Weather, Sci-Tech, Entertainment, Travel, Health, Style, and In-Depth.

           Local              ISDN 128 kbits/sec        28.8 kbits/sec          14.4 kbits/sec
     No GZIP      GZIP       No GZIP      GZIP        No GZIP      GZIP        No GZIP      GZIP
    53.0 sec  53.2 sec       82.1 sec  77.6 sec      264.7 sec  184.4 sec     474.1 sec  307.7 sec
      0.4% Slower              5.5% Faster               30% Faster               35% Faster
Notes:
  1. A much more realistic set of data with a mix of image hits and misses after the first CNN page.
  2. Note that the gzip cost on the local system is basically lost in the noise.
  3. Also, all the image loads (which are not compressed) make the apparent gain at 28.8 much lower.
  4. It is curious that the 14.4 load doesn't show a greater speedup.

These results seem promising enough to warrant moving on to implementation of Stage 2.

Stage 2 - Transfer-Encoding: gzip

Status: Begun

Here we hope to use the new HTTP/1.1 TE: gzip header to request compressed versions of HTML files. The server would then need to do streaming compression to generate the response. To minimize the overhead on the server, it should keep a cache of the compressed files so it can quickly fill future requests for the same data.
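The exchange would look roughly like this (hedged against the evolving HTTP/1.1 specification; hostname hypothetical):

    GET /foo.html HTTP/1.1
    Host: www.example.com
    TE: gzip

    HTTP/1.1 200 OK
    Content-type: text/html
    Transfer-encoding: gzip, chunked

    <gzip-compressed, chunked HTML body>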

The current Mozilla source can already accept and decode Transfer-encoding: gzip data, but does not yet send the TE: header. Work has begun on implementing the streaming compression in the latest Netscape Enterprise Server. (This is a general call for volunteers to implement the same thing as a module for Apache 1.3.)
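Such a module's compressor could be built directly on zlib's deflate interface. The following is a minimal sketch, not the actual Enterprise Server code; it assumes a zlib whose deflateInit2() accepts a windowBits of MAX_WBITS + 16 to request a gzip (rather than zlib) wrapper, and the buffer handling is simplified for illustration:

    #include <string.h>
    #include <zlib.h>

    /*
     * Compress one buffer of HTML into a gzip stream.  A real module
     * would emit output chunks through a callback as they fill; here
     * we just fill one caller-supplied buffer.  Returns the number of
     * bytes written, or -1 on error.
     */
    long gzip_stream(const char *in, unsigned long in_len,
                     char *out, unsigned long out_len)
    {
        z_stream zs;
        memset(&zs, 0, sizeof(zs));

        /* windowBits = MAX_WBITS + 16 asks zlib for a gzip wrapper. */
        if (deflateInit2(&zs, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                         MAX_WBITS + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)
            return -1;

        zs.next_in   = (Bytef *) in;
        zs.avail_in  = in_len;
        zs.next_out  = (Bytef *) out;
        zs.avail_out = out_len;

        /* Z_FINISH because the whole input is in hand; a streaming
           module would loop with Z_NO_FLUSH as data arrives. */
        if (deflate(&zs, Z_FINISH) != Z_STREAM_END) {
            deflateEnd(&zs);
            return -1;
        }

        deflateEnd(&zs);
        return (long) zs.total_out;
    }

A production module would loop over deflate() with Z_NO_FLUSH as buffers arrive from the content handler, writing each output chunk to the network as it fills, and keep a cache of the results as described above.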

Stage 3 - Other compression types

The previous two stages dealt only with gzip as a form of compression. While gzip is a good general-purpose compression scheme, we probably want to negotiate the compression type based on the type of data requested. For example, if the client sent a TE: gzip header and the requested data turned out to be a JPEG image, the server should know better than to transfer-encode this already-compressed data with gzip.
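One plausible server-side policy, sketched below (both the type list and the function are hypothetical illustrations, not part of any current module), is to skip transfer-coding for media types that are already compressed:

    #include <string.h>

    /* Media types that are already compressed; gzipping them wastes
       CPU for little or no size reduction.  (Illustrative list only.) */
    static const char *precompressed[] = {
        "image/jpeg", "image/gif", "image/png",
        "application/zip", "application/x-gzip",
        0
    };

    /* Return 1 if a body of this media type is worth gzipping. */
    int worth_compressing(const char *content_type)
    {
        int i;
        for (i = 0; precompressed[i]; i++)
            if (strncmp(content_type, precompressed[i],
                        strlen(precompressed[i])) == 0)
                return 0;
        return 1;
    }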

Comments etc.

For any comments or questions, or to volunteer to do the TE-aware Apache module or other work, contact: Eric Bina.