.htaccess mod_rewrite performance

Tag: performance , .htaccess , mod-rewrite Author: xcc20080808 Date: 2011-10-30

I searched Stack Overflow a lot about .htaccess and mod_rewrite, and I want to know which of these is faster, performance-wise:

RewriteRule ^([a-z0-9]+)/?$ index.php?id=$1 [NC,L]
RewriteRule ^(.*)/?$ index.php?id=$1 [NC,L]
RewriteRule ^([^/]*)/?$ index.php?id=$1 [NC,L]

Since the first one only accepts letters and numbers, does that make it faster to execute?

+1 good question

Best Answer

When in doubt, test it out. I set up a test server running Ubuntu 11.10 with Apache2 and used the siege load-testing app to perform 3 tests. Each test ran for 1 minute (or until 5000 failures) with 50 concurrent users requesting '/index.html'.

Test #1 used the following rewrite rule configuration:

RewriteEngine on
RewriteRule ^([a-z0-9]+)/?$ /index.html?id=$1 [NC,L]

The results from siege:

Transactions:                 300970 hits
Availability:                  98.36 %
Elapsed time:                  57.25 secs
Data transferred:              20.38 MB
Response time:                  0.00 secs
Transaction rate:            5257.12 trans/sec
Throughput:                     0.36 MB/sec
Concurrency:                    9.04
Successful transactions:      300970
Failed transactions:            5009
Longest transaction:            0.02
Shortest transaction:           0.00

Test #2 with a rewrite rule configuration:

RewriteEngine on
RewriteRule ^(.*)/?$ /index.html?id=$1 [NC,L]

The results:

Transactions:                 225244 hits
Availability:                  97.82 %
Elapsed time:                  42.43 secs
Data transferred:              15.25 MB
Response time:                  0.00 secs
Transaction rate:            5308.60 trans/sec
Throughput:                     0.36 MB/sec
Concurrency:                    8.71
Successful transactions:      225244
Failed transactions:            5009
Longest transaction:            0.18
Shortest transaction:           0.00

Test #3 with the following rewrite rule:

RewriteEngine on
RewriteRule ^([^/]*)/?$ /index.html?id=$1 [NC,L]

The results:

Transactions:                 210469 hits
Availability:                  97.68 %
Elapsed time:                  39.39 secs
Data transferred:              14.25 MB
Response time:                  0.00 secs
Transaction rate:            5343.21 trans/sec
Throughput:                     0.36 MB/sec
Concurrency:                    8.60
Successful transactions:      210469
Failed transactions:            5009
Longest transaction:            0.02
Shortest transaction:           0.00

comments:

So clearly the [a-z0-9] is faster (300,970 hits at 5257 t/s) vs. the other 2 (which are pretty close), e.g. 225,244 hits at 5308 t/s. Very nice! Thanks.
+1 for using siege. Do you get the same results with ab?
Sorry to say, but -1 for your whole approach. See my counterpoint response.

Other Answer1

Sorry, but IMHO Jason's answer demonstrates that he doesn't understand some of the basics of benchmarking. The spread in transaction rate is under 2%, with one sample per rule. This comparison is statistically meaningless, as the variance cannot be estimated from a single sample. I would be just as interested in the timing for the first case repeated three times, say, and what spread came from that. It also focuses on the wrong issues.
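To make the statistical point concrete, here is a small Python sketch that recomputes the transaction rate of each run from the hits and elapsed times siege reported above. It suggests the hit counts differ mainly because the runs stopped at different elapsed times, not because one rule served requests faster. The dictionary keys are just labels chosen for this illustration.

```python
import statistics

# (hits, elapsed seconds) copied from the three siege reports in the accepted answer
runs = {
    "[a-z0-9]+": (300970, 57.25),
    ".*": (225244, 42.43),
    "[^/]*": (210469, 39.39),
}

# Transactions per second for each rule
rates = {name: hits / secs for name, (hits, secs) in runs.items()}
rate_values = list(rates.values())

# Spread between the fastest and slowest run, as a percentage of the mean rate
spread_pct = 100 * (max(rate_values) - min(rate_values)) / statistics.mean(rate_values)

for name, rate in rates.items():
    print(f"{name:10s} {rate:8.1f} trans/sec")
print(f"spread across the three rules: {spread_pct:.2f}% of the mean")

# A single run per rule gives no estimate of run-to-run variance at all:
# statistics.stdev() needs at least two data points.
try:
    statistics.stdev([rates["[a-z0-9]+"]])
except statistics.StatisticsError as exc:
    print("stdev of a single run:", exc)
```

Repeating each configuration several times and feeding the per-run rates into `statistics.stdev` would show whether differences of this size exceed the run-to-run noise.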

If you strace what is going on, as follows, you will get a better understanding of what is happening. (Limit the Apache child processes to 3 or so, otherwise you'll have a lot to track!)

strace  -u www-data -tt -ff -o /tmp/strace $(ps -o "-p %p" h -u www-data) &  

Somewhere between 99% and 99.9% of the overhead here is filesystem overhead: the lstat probes and opens of various files, e.g. all of the putative .htaccess files on the path to the SCRIPT_FILENAME (in the case of my shared service, there are 8 such probes), and the reading of any that exist. The lowest one in the hierarchy with RewriteEngine On is parsed by the mod_rewrite engine.
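As a rough illustration of where those probes come from, the sketch below lists the .htaccess files that could be looked up for a given script path, one per directory from a starting directory down to the script's own directory. The function name `htaccess_candidates`, the `top` parameter and the example path are hypothetical; which directories Apache really probes depends on the server's AllowOverride and Directory configuration.

```python
from pathlib import PurePosixPath

def htaccess_candidates(script_filename, top="/"):
    """List the .htaccess paths between `top` and the script's directory.

    `top` is an assumed stand-in for the highest directory in which
    overrides are allowed; it is not an Apache setting by that name.
    """
    script = PurePosixPath(script_filename)
    top_dir = PurePosixPath(top)
    # parents runs from the nearest directory up to the root; reverse it to walk downwards
    dirs = list(script.parents)[::-1]
    # keep only `top` itself and the directories below it
    dirs = [d for d in dirs if d == top_dir or top_dir in d.parents]
    return [str(d / ".htaccess") for d in dirs]

# Hypothetical script location on a shared host
for path in htaccess_candidates("/var/www/site/app/index.php"):
    print(path)
```

Each listed file costs at least one filesystem probe per request, whether or not it exists, which is why a deep directory hierarchy adds up.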

If you switch on one of the higher log levels, you can see that the per-statement execution time on a test VM (including the log overhead) is typically around 0.1 ms. The cost of PHP image activation on a suPHP-based shared service is ~100 ms. The cost of a single "-f" file probe, if the file isn't in the virtual file system cache, can be of the same order. Reading the script files (if the service hasn't got an opcode cache enabled), especially for an app such as MediaWiki or WordPress, can take a second or more, again depending on caching.
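Putting those figures side by side shows how small the rewrite step is. This is plain arithmetic on the numbers quoted in the paragraph above, not a new measurement; the variable names are only for illustration.

```python
# Costs quoted in the paragraph above, in milliseconds
rule_ms = 0.1               # one rewrite statement, including log overhead
other_costs_ms = {
    "PHP activation (suPHP)": 100.0,
    "uncached script read": 1000.0,   # "a second or more"
}

# Rewrite statement cost as a percentage of each of the larger costs
share_pct = {label: 100 * rule_ms / cost for label, cost in other_costs_ms.items()}

for label, pct in share_pct.items():
    print(f"one rewrite statement is about {pct:.2f}% of {label}")
```

On these numbers a single rewrite statement amounts to roughly a tenth of a percent of the PHP activation cost, and less again once uncached script reads are involved.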

So whether the actual calls to ap_regcomp and ap_regexec in httpd-2.x.y/server/util_pcre.c take 30 µs or 35 µs is irrelevant. Benchmarking is irrelevant to this choice, as any runtime differences are lost in the sampling noise. The point is that the three variants have different semantic meanings. Eric should be guided here by two principles:

  • He should pick the version which he knows does what he wants
  • When in doubt: Keep It Simple Stupid, because "clever" can end up biting you in the arse, and in this case there is just no point.
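The semantic differences between the three variants can be checked with a quick sketch. It uses Python's `re` module as a stand-in for Apache's PCRE engine, which should behave the same for patterns this simple, and `re.IGNORECASE` in place of the [NC] flag. The sample paths and the `captured_id` helper are made up for illustration; they are written without a leading slash because that is how a per-directory rule sees the path.

```python
import re

# The three patterns from the question
patterns = {
    "alnum": r"^([a-z0-9]+)/?$",
    "dotstar": r"^(.*)/?$",
    "nonslash": r"^([^/]*)/?$",
}

def captured_id(pattern, path):
    """Return what $1 would hold for this path, or None if the rule does not match."""
    m = re.match(pattern, path, re.IGNORECASE)
    return m.group(1) if m else None

samples = ["abc123", "abc123/", "foo-bar", "a/b", "index.php", ""]
for path in samples:
    row = {name: captured_id(p, path) for name, p in patterns.items()}
    print(repr(path), row)
```

Only the first pattern rejects `foo-bar`, `index.php` and the empty path. The `.*` variant also accepts nested paths such as `a/b`, and because `.*` is greedy it keeps the trailing slash of `abc123/` inside the captured id. Which of these behaviours is wanted is the real basis for choosing a rule.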