Summary
Those familiar with Apache will be used to the luxury of being able to specify redirects on the fly, without having to write programs to catch errors and ensure they return the correct HTTP status codes. Being new to Apache, I was amazed at just how easy it is. The following provides an overview of the Apache Redirect directive.
Author: Gez Lemon
As my new host uses Apache and PHP, I need to redirect requests for old documents to their new equivalents. One of the things I was concerned about was returning the correct HTTP response code, to let clients know that the page has been moved permanently. Being new to Apache, I had no idea how powerful the directives that may be added to the .htaccess file are, so I decided to look up the Redirect directive to see whether it had a way to specify the HTTP status code.
By default, the Redirect directive returns a status code of 302 (moved temporarily). Fortunately, the Redirect directive has an optional parameter that allows you to specify a different status code. The correct status code for moved permanently is 301. The following directive returns an HTTP response status indicating that the resource has moved permanently, asking the client to fetch the new URI.
Redirect 301 /cognitive-impairment.asp http://juicystudio.com/article/cognitive-impairment.php
A complete list of HTTP status codes can be found in RFC 2616. Status codes in the 3xx range must include the new URI parameter; for any other status code, the URI parameter must not be provided. Apache also supports the following tokens, which can be used in place of a literal HTTP status code:
- permanent: Returns a status code of 301, indicating that the resource has moved permanently.
- temp: Returns a status code of 302, indicating that the resource is temporarily under a different URI. This is the default value if no status code is provided.
- seeother: Returns a status code of 303, indicating that the resource has been replaced.
- gone: Returns a status code of 410, indicating that the resource has been permanently removed. As this status code is outside of the 3xx range, the URI parameter should be omitted.
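As a minimal sketch of the gone token, the following directive tells clients that a document has been removed for good. The path /retired-page.asp is a hypothetical example, not a real document on this site; note that no target URI is given.

```apache
# Hypothetical path, for illustration only.
# 410 is outside the 3xx range, so the URI parameter is omitted.
Redirect gone /retired-page.asp
```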
The Redirect directive above could be written using the more readable token for moved permanently.
Redirect permanent /cognitive-impairment.asp http://juicystudio.com/article/cognitive-impairment.php
As well as the Redirect directive, Apache also has a few other directives for redirecting clients, including a RedirectPermanent directive.
RedirectPermanent /cognitive-impairment.asp http://juicystudio.com/article/cognitive-impairment.php
Another directive that will be useful to me when I have moved more of the material to the new host is the RedirectMatch directive, which allows the match to be based on a regular expression.
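A rough sketch of how RedirectMatch might be used for this, assuming (purely for illustration) that every old .asp document in the root has a same-named .php equivalent under /article/. That mapping is an assumption, and the pattern would need adjusting to the real site structure.

```apache
# Assumed mapping, for illustration only: /name.asp -> /article/name.php
# [^/]+ captures the file name without its extension; $1 inserts it
# into the target URI.
RedirectMatch permanent ^/([^/]+)\.asp$ http://juicystudio.com/article/$1.php
```

With a rule along these lines, a request for /cognitive-impairment.asp would be answered with a 301 pointing at http://juicystudio.com/article/cognitive-impairment.php, the same result as the single Redirect directive shown earlier.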
Category: Apache.
[apache-http-status.php#comment1]
Welcome to Apache!
Are you on a shared host Gez? If you have access to the httpd.conf file you are much, much better off adding these directives there rather than falling into the .htaccess trap.
Not that they aren't useful, and many webmasters on shared hosts don't have the luxury of the better option.
The problem with .htaccess files is a performance one. If AllowOverride is enabled on a directory, then for every hit to a page in that directory, Apache will look for an .htaccess file there *and in every directory above it in the path*, regardless of whether they are there and used or not. ~d
Posted by Douglas Clifton on
[apache-http-status.php#comment2]
Hi Doug,
I am on a shared server, so my only option is .htaccess. I'm using httpd.conf at home, and was quite concerned when I found out that I wouldn't have access to it when I moved everything to the server. That's when I discovered .htaccess, although I am concerned about the performance. It's going to be a steep learning curve, but it's good to be learning all this new information.
Posted by Gez on
[apache-http-status.php#comment3]
One solution to the problem of lack of access to httpd.conf on a shared host is a virtual host configuration file. I use them anyway to keep things modular, even though I have a dedicated server [one I share with a mate, to use a British/Aussie expression].
Anyhoo! If the hosting company will allow this, all they need to do is reconfigure the httpd.conf file to point the virtual host entry at the external config file. However, they may not allow this, as any syntax errors in the configuration file(s) will cause Apache to complain and exit on start, or restart. Make sure you test changes on your home/development box first! Also:
is your friend.
An example, only slightly modified, from my host:
Once this is done, you have all the power of apache configuration at your fingertips, but limited to the scope of your virtual. One of the first things you should do is disable .htaccess files:
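A minimal sketch of what disabling .htaccess files could look like in a virtual host config, assuming a document root of /home/example/public_html (a hypothetical path; substitute your own):

```apache
# Hypothetical document root, for illustration only.
# AllowOverride None stops Apache from looking for .htaccess files
# in this directory tree.
<Directory "/home/example/public_html">
    AllowOverride None
</Directory>
```

Any directives that used to live in .htaccess would then need to move into the virtual host config itself.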
Just a few of the things you can do with your own virtual host config:
tweak mod_gzip, if available
tweak PHP sessions, if you use them
prevent hotlinking
auto redirect
custom error documents
set-up "clean" URIs via:
which, when used with PHP code like this:
is very powerful stuff.
I think I just wrote my first Juicy Studio article. lol...~d
Posted by Douglas Clifton on
[apache-http-status.php#comment4]
Thank you for the heads up, Doug. I'm a proper newbie to Apache, and most of that is above me. Presumably, the host takes care of the <VirtualHost [ip]:[port]> directive, and I just have to provide the virtual.conf file, which would be my version of httpd.conf? I'm not confident I could create a version of httpd.conf locally that would work. If there were a danger that anything I did would affect their other customers, it would probably be too dangerous to do; I just don't have enough knowledge, and tend to be particularly bad at configuring things. I do know that mod_gzip is only available through the dedicated hosting plan, which I couldn't afford.
With regards to using the ForceType directive, presumably I could use that in .htaccess? I'm currently using mod_rewrite for clean URIs; is there any advantage to using ForceType over mod_rewrite? The only obvious thing I can think of is that it would pass on the processing from Apache to PHP, but it's still going to need processing. Does Apache cause more of a performance hit on the server than PHP? It would seem strange to me if it did, but it's all alien to me.
Posted by Gez on
[apache-http-status.php#comment5]
Webmasters, once they learn about .htaccess and mod_rewrite, tend to get really excited about them and go overboard, especially when it comes to using regular expressions, when there are often much simpler ways of achieving the same result. Regular expressions in general are way, way over-used. I wish I had a dime for every line of PHP code I've seen out there used to match and replace a simple string of text using preg_replace() (often inside nested loops) when str_replace() would have worked just fine. Likewise, use explode() rather than split().
Okay, someone kick the soapbox out from under me. Use mod_rewrite when you need the power, and buy another CPU fan if you need it often.
Yes, you can use ForceType in the context of an .htaccess file. Again, watch out for assigning this globally, or to something silly like .html files (which I also see a lot of -- why would you run all your static files through the PHP engine?).
This box is running Apache 1.3.29 and PHP 4.3.10, which is good since I have yet to move onto 2.0/5.0 myself. Anything you want to do, say the word and I'll help you through it.
This is a good place to bookmark:
http://httpd.apache.org/docs/mod/directives.html
As an example, say you wanted tom, dick and harry to run as PHP scripts w/o extensions. Place the following directives in an .htaccess file in the containing directory:
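A minimal sketch of such directives, assuming PHP is installed as an Apache module that handles the application/x-httpd-php type (an assumption about the server setup):

```apache
# Treat the extensionless files tom, dick and harry as PHP scripts.
# Assumes mod_php is registered for application/x-httpd-php.
<FilesMatch "^(tom|dick|harry)$">
    ForceType application/x-httpd-php
</FilesMatch>
```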
But that's a regular expression they shout! Yes, Apache has support for basic pattern matching built-in. mod_rewrite is voodoo, I tend to avoid it, not to mention have you ever visited Ralf's website? What's with the hands? hahahahahaha...
Posted by Douglas Clifton on
[apache-http-status.php#comment6]
Hi Doug,
Without meaning to sound disrespectful, could you explain that without the emotion? I need things spelling out to be able to follow it.
Why is mod_rewrite voodoo? Apache's a compiled application, so surely mod_rewrite executes faster than the PHP interpreter? If I understand the use of ForceType correctly, it just passes the processing onto an interpreter that uses regular expressions to extract the same information. I can't understand how that could possibly be quicker, or is the benefit something other than speed?
I'm not challenging your assertion; just trying to understand the reasoning.
Cheers,
Posted by Gez on
[apache-http-status.php#comment7]
There are really several related topics going on here, sorry for any confusion that may result.
Let's see... Apache is a modular, extensible server. PHP (mod_php), mod_rewrite, mod_gzip, mod_* are mostly all written in C, and plug into the server API. mod_perl (my personal favorite) allows you to author your own Apache modules without having to code them in C. mod_rewrite is generally only involved in the request phase of the client-server interaction.
You would really have to show me examples of how you are using mod_rewrite in order for me to determine if it is worthwhile and if there is perhaps a better way. Email might be better for that though. The warning I was trying to make relates to this scenario:
1. high amounts of traffic
2. to a directory
3. that has an .htaccess in it
4. that has lots of subdirectories
5. that kicks on RewriteEngine
The point I was trying to make, and this really goes all the way back to your original post on this topic (HTTP 301) is that there are often much easier and faster ways of achieving the same thing that so many webmasters are relying on mod_rewrite for. The module is complex, it is difficult to use, and to understand. I only recommend delving into it if you need that sort of power. The same thing holds true for regular expressions in general: why use a sledge hammer to drive in a tack?
So the question really is, when would mod_rewrite be appropriate? Offhand, the best example I can think of is rewriting URIs in query string syntax to path_info style:
The old school has always been to associate media types with file extensions. The Web is moving away from that. In fact, it is a mistake to make any assumptions about a URI based on this sort of URI/path/file "metadata":
http://www.w3.org/2001/tag/doc/metaDataInURI-31#id2608413
By using ForceType on the new script, you're eliminating the extension. The other benefits are many:
1. it's shorter
2. it's easier for humans to read
3. search engines prefer them
4. you don't have to worry about encoding the ampersands
5. the underlying development language is no longer part of the URI
The last one is really pretty cool. Say you wanted to get into Python programming. What was myscript.asp, and is now myscript.php, is going to be myscript.py ... keep rewriting your URIs over and over?
The path_info style of script arguments is equally interesting. Consider drx for instance. I have a somewhat large, hierarchical representation of this list of resources. They certainly look like folders, the search engines gobble up the paths (full of keywords), yet do you think any of these directories actually exist on the host?
Nope.
Posted by Douglas Clifton on
[apache-http-status.php#comment8]
Hi Doug,
Thank you for the explanation. I think I follow your warning about the dangers of getting carried away with mod_rewrite. I'm not using a directory, and I only have one .htaccess file which resides in the root directory, and it only has the one RewriteRule. The rule is:
If I follow your reasoning correctly, this has the same benefit of ForceType, except it's processed in one place rather than passing on the regular expression parsing to the script. The script merely fetches the correct article, which is passed to it as a query string parameter. Would I benefit from changing that rule for a ForceType directive?
Posted by Gez on
[apache-http-status.php#comment9]
ForceType, simply put, allows you to alter the association that Apache makes between file extensions and media type. You can do anything you want really, have what appears to be an image really be a PHP script. But let's not go there. The only thing I use it for is to eliminate the suffix entirely.
Now article in the root becomes your loader script and you eliminate both the RewriteRule and the GET query string.
Make sense?
Posted by Douglas Clifton on
[apache-http-status.php#comment10]
To briefly follow-up on this, please keep in mind that this is only a technique. I'm not suggesting that you take this advice. Just throwing out ideas. For instance, I would imagine that article/ is in fact a directory. And you would like to use this for new articles w/o having to load all of them from a central script. ~d
Posted by Douglas Clifton on
[apache-http-status.php#comment11]
Hi Doug,
Thank you for persevering with this. It's getting clearer, but I can't work out how article in the root becomes my loader script, just by adding a ForceType directive. How would it know to run articles.php?
I need to understand it in order to select the best approach. I'm sure it must be really frustrating for you, but I really appreciate your help with this.
There isn't a directory called article.
Cheers,
Posted by Gez on
[apache-http-status.php#comment12]
For anyone interested, a few weeks back I found this printable mod_rewrite Cheat Sheet.
http://www.ilovejackdaniels.com/cheat-sheets/mod_rewrite-cheat-sheet/
Also more items in his Apache category: http://www.ilovejackdaniels.com/apache/
And other printable Cheat Sheets [MySQL, CSS, PHP, ...]:
http://www.ilovejackdaniels.com/cheat-sheets/
Posted by holly on
[apache-http-status.php#comment13]
Thank you, Holly. That's the kind of thing I was looking for.
Posted by Gez on
[apache-http-status.php#comment14]
Perhaps I'm only making this even more confusing for you. Sorry if that is the case. Part of the problem is I don't completely understand how your old site structure relates to the new one. Is it exactly the same, save for the change from .asp to .php? Or are you taking the opportunity to reorganize?
Regarding comment #9 above, in that scenario the script "article" doesn't call articles.php, it replaces it, along with the need for the rewrite rule. It would look and behave like a directory, but would actually be a script that loads other resources.
Or in the case of having no path_info argument, simply display a menu and some content as articles.php does now.
Posted by Douglas Clifton on
[apache-http-status.php#comment15]
The penny's just dropped! Thank you for hanging in and explaining that, Doug. That technique's brilliant. The part I missed was that "articles" would literally be the name of the file without the extension, and that ForceType would make references to "articles" execute the file "articles" as a PHP file.
I tested it with a file called nonexist, and added the following to .htaccess:
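A minimal sketch of directives that would produce this behaviour, assuming PHP runs as an Apache module handling the application/x-httpd-php type (the exact lines used on any given host may differ):

```apache
# Run the extensionless file "nonexist" through the PHP engine.
# Assumes mod_php is registered for application/x-httpd-php.
<Files nonexist>
    ForceType application/x-httpd-php
</Files>
```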
Specifying a path of /nonexist/junk.php executed the file "nonexist", and /junk.php was extracted with the server variable "PATH_INFO". That is very, very good.
Sorry I was so slow picking it up, but I'm glad you stuck with it as I really do like that technique.
Posted by Gez on
[apache-http-status.php#comment16]
Phew!
I was starting to get worried. lol ~d
Posted by Douglas Clifton on