Summary
I've come across an encoding problem with PHP, and am looking for help from anyone with experience of character encoding and PHP. When I save a file in UTF-8, it seems to automatically send headers to the client, resulting in an error when I try to use the header function.
Author: Gez Lemon
I've converted the colour contrast analyser from ASP to PHP. I've also converted the Spanish version of the colour contrast analyser, which works fine locally, but breaks when I upload it to the server.
The error I am getting is a warning that the headers have already been written when I try to set the character encoding with the header function. The headers shouldn't have been written at that point, and aren't on my machine at home. The only thing I can guess is that because the file is saved in UTF-8, somewhere along the line, headers are being written and sent to the client. I tried turning on buffering with ob_start, and flushing after I'd collected the headers with ob_end_flush, but that still resulted in the same error, possibly confirming that the headers are being written somewhere else, based purely on the character encoding of the source file.
The Spanish version of the colour contrast analyser is loaded on the server, but saved in ANSI to avoid the header problem. The characters no longer display correctly due to the incorrect character encoding. Does anyone have any ideas what could be causing this, and how to resolve it?
Category: Programming.
[php-character-encoding.php#comment1]
That's a new one on me. I've never done any language translations before. By "saving" you mean the PHP source code from a text editor? How do you serve the English vs. the Spanish version of the script? It almost sounds like Apache is getting in the way.
I'm sure you are well aware of this, but just in case:
Send as many additional HTTP headers as you need to, typically at the top of your script. Do not send any normal data (html, whatever) and attempt to use the header() function again afterwards or you'll get this error.
header
header
header
[PHP/Apache headers]
cr/lf
cr/lf
content...
Also, if you are using Firefox, grab a copy of the LiveHTTPHeaders extension which is a great tool for debugging this sort of thing and checking cookies and anything else related [cache, that pesky Accept thing I won't mention again...]
Give me a few to do a little research on this language translation thing. ~d
Posted by Douglas Clifton on
[php-character-encoding.php#comment2]
Thanks, Doug.
I suspect that saving files in UTF-8 is somehow inserting data before the headers, leading to the header modification errors.
The LiveHTTPHeaders extension sounds good. I'll download it when I get back from work this evening.
Cheers,
Posted by Gez on
[php-character-encoding.php#comment3]
Another tool for debugging these problems on the server-side is the PHP headers_sent() function:
http://www.php.net/manual/en/function.headers-sent.php
The Apache functions, in particular apache_request_headers() and apache_response_headers() are also very useful:
http://www.php.net/apache
I have been poking around on search engines and such looking for articles/tutorials on implementing international/multi-lingual versions of PHP scripts, and using Unicode in particular, to no avail. This really surprises me. I'll keep looking... ~d
Posted by Douglas Clifton on
[php-character-encoding.php#comment4]
Hi Doug,
Thank you for your help with this problem, it's very much appreciated. It turns out the problem was to do with a Byte-Order Mark (BOM) saved at the start of the UTF-8 file. PHP treats the BOM as ordinary output, so by the time the header function is called it is too late to write out headers. I've written an article which explains it in more detail.
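As a rough illustration of why the BOM matters: in a UTF-8 file it is the three bytes EF BB BF, and they sit in front of the opening PHP tag, so they count as page content. The following is a minimal sketch in Python rather than PHP, and it is not the code from the article mentioned above. The function name and the sample file contents are made up for illustration.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes begin with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical PHP source, first as an editor might save it "with BOM",
# then as saved without one.
php_source = b"<?php header('Content-Type: text/html; charset=utf-8'); ?>"
with_bom = codecs.BOM_UTF8 + php_source
without_bom = php_source

print(starts_with_utf8_bom(with_bom))     # True
print(starts_with_utf8_bom(without_bom))  # False

# The opening tag is no longer the first thing in the file:
# three bytes of "content" come before it.
print(with_bom.index(b"<?php"))           # 3
```

Reading the file in binary mode is the important part when checking real files, since many text-mode readers silently hide the BOM.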
The LiveHTTPHeaders extension you mentioned earlier sounds like it would be really useful, but it doesn't seem to be available any more. Do you know of a mirror site that has the extension?
Thanks again for your help. Cheers,
Posted by Gez on
[php-character-encoding.php#comment5]
LiveHTTPHeaders:
http://livehttpheaders.mozdev.org/
Extensions mirror:
http://www.extensionsmirror.nl/index.php?showtopic=60&hl=http+headers
I saw your other article. Funny, this was just after I had found a resource that discussed the BOM issue. If you've found a text editor that solves the problem, then that's great. If you plan on continuing to use a PHP script that removes the offending BOM bytes, please let me know because there are a lot more efficient ways of doing it!
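For anyone who wants to clean files outside the editor, one possible approach is to check the first three bytes and drop them if they are a UTF-8 BOM. This is only a sketch in Python, not the PHP script referred to above and not necessarily one of the more efficient ways hinted at. Both function names are hypothetical.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(path: str) -> bool:
    """Rewrite the file without its BOM. Return True if a BOM was removed."""
    target = Path(path)
    original = target.read_bytes()   # binary read, so the BOM is visible
    cleaned = strip_utf8_bom(original)
    if cleaned == original:
        return False                 # nothing to do, leave the file alone
    target.write_bytes(cleaned)
    return True
```

Only a BOM at the very start of the file is removed. The rest of the content, including any accented characters, is left byte-for-byte as it was.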
Posted by Douglas Clifton on
[php-character-encoding.php#comment6]
Cheers, Doug. For some reason, the mozdev address didn't work yesterday, but it's working fine now. I'll have a play with it.
I've been using BabelPad, recommended by Hans, which seems to be very good. From the emails I've been receiving, it looks like I'd have a good choice of editors if I were using a Mac. I seem to be about 5 years behind everyone else.
Posted by Gez on
[php-character-encoding.php#comment7]
Can you please indicate which headers you were setting to send the output as UTF-8?
I am having an issue with a 'Spanish' file saved on the server with a special character in the file name. When I save it to my PC, in plain old English, it prompts me to save it as something else.
Posted by Richard on
[php-character-encoding.php#comment8]
Hmm, I notice that even in my post the character has not been presented properly...
The character is meant to be an e with an accent on top.
Posted by Richard on
[php-character-encoding.php#comment9]
I have this in the head:
The Content-Type is changed to text/html for browsers that don't understand application/xhtml+xml. The files are saved in UTF-8, without the BOM.
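The choice between the two media types described above can be sketched roughly as follows. This is a Python illustration, not the code used on this site. The function name is hypothetical, the charset parameter is an assumption based on the files being saved in UTF-8, and the check is a plain substring match that ignores q-values in the Accept header.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a Content-Type header line based on the request's Accept header."""
    # Assumption: a simple substring test is enough for this sketch.
    # A fuller version would parse the media ranges and their q-values.
    if "application/xhtml+xml" in accept_header.lower():
        media_type = "application/xhtml+xml"
    else:
        media_type = "text/html"
    # Assumption: the page is served as UTF-8 to match the saved files.
    return f"Content-Type: {media_type}; charset=utf-8"


print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("text/html, image/gif, */*"))
```

Whichever value is chosen, it has to be sent before any output, which is why the files need to be saved without the BOM.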
The encoding error above is because my host is using MySQL 4.0, which has poor support for UTF-8.
Posted by Gez on