Juicy Studio: XHTML Registry Hack

Saturday, 8th January 2005

Summary

There appears to be some confusion about Internet Explorer's inability to handle XHTML documents delivered with the correct MIME type. Adding a registry entry to instruct Internet Explorer to handle content served as application/xhtml+xml as text/html is not a solution. Content developers who serve the correct MIME type are likely to be using the XML features of XHTML; telling Internet Explorer to handle that content as text/html is just plain wrong.

Author: Gez Lemon

Is IE just missing a Registry Entry?

I received the following email today (name of sender removed).

I have applied this trick on IE and it seems to be working.

http://www.peterprovost.org/archive/2004/10/22/2003.aspx

Despite several attempts, my reply keeps bouncing back to me. As it's quite relevant to recent topics on this website, I've decided to post the reply here.

Peter Provost's article offers a registry hack that stops Internet Explorer prompting to download files delivered as application/xhtml+xml, with the following explanation:

"Although a number of people seem to think that Internet Explorer doesn't support it, the real answer is that it is just missing an entry in the registry to tell it what to do with that MIME type."

Unfortunately, that is completely incorrect. Internet Explorer is not able to handle documents served as application/xhtml+xml; all the registry hack does is tell Internet Explorer to handle the document as text/html, by registering the CLSID for text/html against an entry for application/xhtml+xml. Considering there are already user agents capable of handling the correct MIME type, what is the point of adding a registry entry to trick Internet Explorer into handling the content as text/html? The hack doesn't affect the Accept header sent by Internet Explorer (thankfully), so any website using content negotiation will continue to send it the content as text/html.

From a web developer's perspective, it doesn't help at all. It's a ridiculous enough notion to inform users that your website is best viewed in a particular browser at a particular resolution. Most developers are now sensible enough to realise that a visitor is unlikely to change their browser and the resolution of their monitor just to view a website. This solution takes it one step further: "In order to view this website, you will need to add the following entry in your Windows registry."

Still Tag Soup

Even if everybody who runs Internet Explorer on a Windows operating system decided to add the registry entry, and there were websites that served content as application/xhtml+xml regardless of the capabilities of the browser, what exactly does the registry hack achieve? XML stylesheet declarations are completely ignored, as are other XML features such as xml:base. Content isn't checked for well-formedness. The markup will still have implicit elements, such as a tbody in tables if one hasn't been explicitly added in the markup. There is also a whole host of scripting problems: CDATA sections are ignored, while SGML comments, document.write, and innerHTML are all obeyed, along with countless other issues. If a developer has to use XHTML, and is concerned about Internet Explorer, they should either deliver the content as text/html following the HTML Compatibility Guidelines, or use content negotiation to deliver application/xhtml+xml to user agents that understand it, and text/html to other user agents.
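
As a rough illustration of the content negotiation option, the following Python sketch picks a MIME type from the Accept header a browser sends. The function names (parse_accept, choose_content_type) and the sample header values are made up for this example and are not taken from any particular server or framework. It deliberately serves application/xhtml+xml only when the user agent lists that type explicitly, because a wildcard such as */* says nothing about whether the browser can really process XHTML as XML.

```python
def parse_accept(header):
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        prefs[media_range] = q
    return prefs


def choose_content_type(accept_header):
    """Return the MIME type to serve an XHTML document as (hypothetical helper)."""
    prefs = parse_accept(accept_header or "")
    # Only an explicit mention counts for XHTML; wildcards are ignored here.
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    # For text/html, fall back to the wildcard ranges if it is not listed.
    html_q = prefs.get("text/html", prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    # Illustrative header values only, not captured from real browsers.
    print(choose_content_type("image/gif, image/jpeg, */*"))
    print(choose_content_type("application/xhtml+xml, text/html;q=0.9, */*;q=0.5"))
```

A browser that only advertises wildcards is sent text/html, which is why the registry hack changes nothing for sites that negotiate this way.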

All the registry hack does is make Internet Explorer handle content delivered as application/xhtml+xml as text/html. Most content delivered as application/xhtml+xml does not follow the HTML Compatibility Guidelines, meaning that Internet Explorer has to lean on its error handling even more heavily than it normally does just to render a page. A simpler solution would be to get a better browser.
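
The difference between the two parsing models can be sketched with any strict XML parser. The Python example below uses the standard library's xml.etree.ElementTree; the helper name is_well_formed and the sample markup are invented for illustration. An XML parser rejects markup that is not well-formed outright, whereas a text/html parser would quietly recover from the same input.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise (hypothetical helper)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Sample fragments, made up for this sketch.
xhtml_fragment = "<p>One<br />Two</p>"  # empty element is closed
tag_soup = "<p>One<br>Two</p>"          # unclosed br: fine as HTML, not as XML

if __name__ == "__main__":
    print(is_well_formed(xhtml_fragment))
    print(is_well_formed(tag_soup))
```

Content handled as text/html never goes through a check like this, so a document that would be fatal to an XML user agent can still render.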

The tone of the original email was far more polite than what I've written here. In re-writing it here, I almost slipped into a rant. :-)

Category: Web Standards.

Comments

Why not handle it as application/xml instead of text/html?
Posted by Robbert Broersma on Saturday, 8th January 2005 at 10:08:27

For generic XML documents, application/xml is fine. For XHTML, there's a chance that the browser may not recognise it as an HTML document, which is why application/xhtml+xml was registered as a MIME type. The registry hack is still required in order for Internet Explorer to know what to do with XHTML documents served as application/xml, so there's still no benefit.
Posted by Gez on Saturday, 8th January 2005 at 13:47:09

I'm a little confused.
1) Internet Explorer is certainly capable of acting as an XML UA. If you download foo.xhtml to your HD, and open it in IE, it will get treated as an X(HT)ML document (parsed by an XML parser, checked for well-formedness, etc.).
2) With the MathPlayer 2.0 plugin installed, Internet Explorer is capable of handling foo.xhtml directly from the 'net, without saving it to disk. And, yes, it really does use the aforementioned XML parser.
Perhaps this Registry Hack is wrong in the details, but it is certainly right in principle. IE *can* handle X(HT)ML documents. With an appropriate Registry entry, it ought to be able to handle them directly from the web.
Posted by Jacques Distler on Sunday, 9th January 2005 at 09:03:52

You're right, Jacques, but it would require more than a registry entry to make IE understand how to handle application/xhtml+xml. It would also require the appropriate program instructing IE how to handle the content. In theory, IE should be capable of handling XHTML, as its XML parser is excellent. I use it on the comment system of this site to ensure that the entry is well-formed, and adheres to the XHTML DOCTYPE, so it's definitely more than capable of parsing XML. The problem is with the way that IE detects MIME types, and then what it tries to do with the content.

Peter Provost's solution gets around MIME type detection by informing IE that the content is text/html, so it's handled as text/html. As it stands, IE does not support application/xhtml+xml. There's nothing stopping anyone registering a DLL that makes IE understand the correct MIME type for XHTML, and how to handle the content, which is essentially how MathPlayer works. MathPlayer does as much as it needs to ensure well-formedness, and how to render MathML, but that's about it. Even with MathPlayer, XML stylesheet declarations are ignored, there's an implicit tbody element, and other features such as xml:base are ignored. Obviously, there's nothing stopping anyone writing a plugin to make IE handle application/xhtml+xml according to the specification, but as far as I'm aware, no one has.
Posted by Gez on Sunday, 9th January 2005 at 16:42:56

I certainly agree that sending application/xhtml+xml content down the tag-soup parser code-path is consummately evil.

If that's what this hack does, then James was right: it is the death of XHTML on the web.

"Even with MathPlayer, XML stylesheet declarations are ignored, there's an implicit tbody element, and other features such as xml:base are ignored."

But the document is being sent down the XML parser code-path. That's what counts. IE's implementation of XHTML may be a bit broken. But it's no more broken (maybe rather less) than its implementation of CSS 2.
Posted by Jacques Distler on Monday, 10th January 2005 at 03:00:35

"But the document is being sent down the XML parser code-path. That's what counts. IE's implementation of XHTML may be a bit broken. But it's no more broken (maybe rather less) than its implementation of CSS 2."

Yes, I totally agree; MathPlayer is definitely a step (or leap) in the right direction.

Posted by Gez on Monday, 10th January 2005 at 08:15:16

In other words, it's a limited hack of little practical use other than for testing purposes, and I agree it has no real advantage when surfing the internet if it cannot understand well-formedness, XML stylesheet declarations, etc.
Posted by Robert Wellock on Monday, 10th January 2005 at 09:45:31

Jacques, it's not the death of XHTML. All the major browsers except Internet Explorer support it (same with CSS2 and 3, that's right, 3 is coming). When Internet Explorer 7 comes out (supposedly at the end of 2k5), I'm pretty sure it will support XHTML and CSS2 and all the other stuff it should. =) God Bless!
Posted by Nobody on Tuesday, 25th January 2005 at 07:14:16

I have to admit that this whole thread is quite amusing. You people are obviously far more knowledgeable about this than I am. And certainly a lot more religious about this than I am.
But as I said in my original post, all I wanted was to be able to view the XSLTUnit site, right then right there. I wasn't interested in the philosophical battle between XML, XHTML, HTML, IE, Mozilla, etc. I just needed to read the web site.
I will grant you that the IE team has a problem if they are not supporting application/xhtml+xml correctly, but the statement, "it's a limited hack of little practical use other than for testing purposes" is obviously wrong, because I am not really interested in testing web pages, I just needed to read some content.
Which also brings up a good point, in a landscape where we have a large number of users running a browser that doesn't support this MIME type correctly, shouldn't that site have used content negotiation to send the correct type?
As for the registry hack, if you don't like it don't use it. It solved my problem at the time and that was all I cared about.
Regards,
Peter
Posted by Peter Provost on Thursday, 3rd February 2005 at 04:19:16

There is a way to send the information the right way for different browsers with a server-side script. The site didn't do it right. The registry hack works, but not completely. You got the results you wanted, good. That it does, and for that purpose it works. And I'm sure you're happy you could read your site, and I'm happy for you too (if it was a good site, lol). About religion, I'm a Christian and I worship God, not web design, if that's what you're trying to say. Anyways, it worked, you're happy, the rest of us use Mozilla or something else, and everyone lived happily ever after, and no one is arguing (I hope). =) God Bless you all!
Posted by Nobody on Tuesday, 8th February 2005 at 00:08:09

I was totally wrong about MathPlayer 2.0. (Thanks to Rikkert Koppes for straightening me out.)

MathPlayer 2.0 also sends the document down the tag-soup code-path. The MathML fragments are parsed using the XML parser (and must be well-formed). But the rest of the document can be tag-soup.
Posted by Jacques Distler on Friday, 11th February 2005 at 19:53:48

Hi Peter

"But as I said in my original post, all I wanted was to be able to view the XSLTUnit site, right then right there. I wasn't interested in the philosophical battle between XML, XHTML, HTML, IE, Mozilla, etc. I just needed to read the web site."

Which of course is fair enough, and I'm quite impressed with the way you found a solution to that problem. This post was never meant to be an attack on you; it was intended to point out a flaw in your conclusion:

"Although a number of people seem to think that Internet Explorer doesn't support it, the real answer is that it is just missing an entry in the registry to tell it what to do with that MIME type."

That conclusion is clearly incorrect. If developers took a view that IE was able to handle application/xhtml+xml, and started marking up documents correctly, the content would not be displayed correctly in IE. Thankfully, it doesn't make any real difference, as IE would still correctly inform web servers using content negotiation that it is only able to handle content as text/html.

"Which also brings up a good point, in a landscape where we have a large number of users running a browser that doesn't support this MIME type correctly, shouldn't that site have used content negotiation to send the correct type?"

Ordinarily, delivering content as application/xhtml+xml without some type of content negotiation would be suicide as IE is still the most dominant browser. XSLTUnit obviously plan at some time in the future (if they don't already) to use XML features in a custom DTD that they've deemed to be suitable only to browsers that are able to handle the correct MIME type. Obviously, I can't speak on their behalf, but I'm sure their reasons are valid to them. I certainly don't regard it any worse than Microsoft building websites that reject non-IE browsers, then having the audacity to talk about building software that is interoperable by design, especially considering the mess we're in at the moment with IE being the most dominant web browser, and the least standards compliant.

Posted by Gez on Sunday, 13th February 2005 at 21:45:29

"MathPlayer 2.0 also sends the document down the tag-soup code-path. The MathML fragments are parsed using the XML parser (and must be well-formed). But the rest of the document can be tag-soup."

Wow, I'm really quite shocked. I also assumed that MathPlayer required the whole document to be well-formed.

Thank you for pointing it out, Jacques.
Posted by Gez on Sunday, 13th February 2005 at 21:52:10
