Validating (X)HTML With IE Using File Upload

Warning: The following describes how to modify the registry in order to trick Windows XP SP2 into allowing text/html to be sent with file uploads. This hack has known side effects which may affect other applications running on your system, some of which are discussed in the comments. As a result, I accept no responsibility for damage caused to your system as a result of applying this hack, and this solution is provided as-is, with no guarantee, warranty or support. If you do not understand the registry, or how to reverse any change, then do not apply these changes. Use them at your own risk.

Update: This technique is no longer required for HTML. Please see Validation by file upload and Internet Explorer on WinXP SP2

After downloading Windows XP Service Pack 2 recently, I was shocked to find that IE was now sending HTML documents with a .htm or .html extension as text/plain, thus causing the W3C Markup Validator to issue this warning message:

Sorry, I am unable to validate this document because its content type is text/plain, which is not currently supported by this service.

The Content-Type field is sent by your web server (or web browser if you use the file upload interface) and depends on its configuration. Commonly, web servers will have a mapping of filename extensions (such as “.html”) to MIME Content-Type values (such as text/html).

That you received this message can mean that your server is not configured correctly, that your file does not have the correct filename extension, or that you are attempting to validate a file type that we do not support yet. In the latter case you should let us know that you need us to support that content type (please include all relevant details, including the URL to the standards document defining the content type) using the instructions on the Feedback Page.

This essentially means that it was impossible to validate any local HTML document using IE. This is really annoying, especially for any unfortunate developers who are forced to develop using only IE at work. Although I do pity anyone in that situation, there is now some relief!

After spending about half an hour searching through the registry for any setting that could be causing .html files to be sent as text/plain, I realised that it would be easier to find the setting for another content type that does work, such as CSS. So, I found the setting for that, modified it, and tested. When the CSS Content Type value was set to anything but text/html, IE uploaded the file with that MIME type. Thus, I came to the conclusion that the setting was not incorrect; rather, something in Windows security was preventing any text/html content from being sent by changing it to text/plain on the way.
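For context, the MIME type in question travels inside the upload itself, as a Content-Type header on the file part of a multipart/form-data request body. The following is only a rough Python sketch of what such a body looks like; the function name, the field name uploaded_file and the boundary string are assumptions made for illustration, not values taken from IE or the validator.

```python
def build_upload_body(field, filename, content, content_type,
                      boundary="sketch-boundary"):
    """Return a multipart/form-data body containing a single file part.

    All names here are hypothetical; a real browser picks its own boundary.
    """
    lines = [
        f"--{boundary}",
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"',
        # This per-part header is what the validator reads for an upload.
        f"Content-Type: {content_type}",
        "",
        content,
        f"--{boundary}--",
        "",
    ]
    return "\r\n".join(lines)


# What the validator effectively received from IE on SP2 for an .html file.
body = build_upload_body("uploaded_file", "page.html",
                         "<!DOCTYPE html><title>t</title>", "text/plain")
print(body)
```

Swapping the last argument for text/css or text/html shows the only part of the request that differs between the working and failing cases described above.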

After that, I tried setting the value for .html files to another type that the validator may support, such as text/sgml or application/sgml, but sadly, without luck! But, just before giving up all hope, I realised that perhaps Windows security, being as insecure as ever, was only checking for an exact match on the content type being set by IE with file uploads. I was correct!

In a normal HTTP header, the Content-Type can also include a charset parameter. For example:

Content-Type: text/html; charset=UTF-8

So, I figured, what if I want IE to send a charset parameter as well? I set the Content Type value in the registry to the one above, and it worked perfectly: the file validated! However, the charset will not always be UTF-8, or any other single charset for that matter, so I removed the charset parameter and was left with the value text/html; (note the trailing semi-colon). That extra little semi-colon on the end is enough to bypass Windows security and validate any HTML file.
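To illustrate why the trailing semi-colon might make a difference, here is a small Python sketch. It assumes, as guessed above, that the filter compares the whole string against text/html, while a receiver splits off the parameters before looking at the media type. Both functions are hypothetical stand-ins written for this example; neither is taken from Windows or from the validator.

```python
def parse_content_type(value):
    """Split a Content-Type value into (media_type, params)."""
    media_type, _, rest = value.partition(";")
    params = {}
    for part in rest.split(";"):
        name, sep, val = part.partition("=")
        if sep:  # skip empty segments such as the one after a trailing ';'
            params[name.strip().lower()] = val.strip().strip('"')
    return media_type.strip().lower(), params


def naive_filter_matches(value):
    """Hypothetical exact-match check, as guessed at in the text above."""
    return value == "text/html"


for value in ("text/html", "text/html; charset=UTF-8", "text/html;"):
    print(repr(value), naive_filter_matches(value), parse_content_type(value))
```

Under these assumptions only the plain text/html string trips the exact-match check, while all three values parse to the same media type.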

Then, I remembered that IE cannot validate XHTML documents either. So, I went to the registry key for .xhtml files, added the application/xhtml+xml MIME type, tested and guess what? It worked!

I have exported the required settings from the registry and they are available here. IE6-SP2-Content-Type-text-html.reg will fix the value for text/html, and IE6-SP2-Content-Type-application-xhtml+xml.reg will add the MIME type for XHTML documents. Download them both, inspect their contents to ensure that they are safe, and apply them by launching them. You will be prompted by Windows to confirm that you want to apply the settings.
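The contents of those two files are not reproduced in this post. As a rough idea of what to expect when inspecting them, the Python sketch below generates .reg text of the same general shape. It relies on the key and value name quoted in comment 6 below; the function name is hypothetical, and the .xhtml key is an assumption made by analogy with the .html one. It also shows the reverse change, which restores the value without the semi-colon.

```python
def build_reg_file(extension, content_type):
    """Return .reg text that sets "Content Type" for one file extension.

    The key layout follows comment 6; treat it as an assumption for
    any extension other than .html.
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[HKEY_CLASSES_ROOT\\{extension}]",
        f'"Content Type"="{content_type}"',
        "",
    ]
    return "\r\n".join(lines)


apply_hack = build_reg_file(".html", "text/html;")    # the workaround
undo_hack = build_reg_file(".html", "text/html")      # reverses it
xhtml_fix = build_reg_file(".xhtml", "application/xhtml+xml")
print(apply_hack)
```

Keeping the undo text next to the workaround makes it easy to switch back if another application misbehaves.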

Update: For any users of ICQ: If you change the text/html value to text/html; then each time the ICQ advertisement rotates, you may be prompted to save the file, because it is an unknown file type. I don’t know why this happens, because IE still works the same as always (full of bugs!), but for some reason it affects ICQ. I recommend you only apply that workaround on computers that you do not use ICQ on, or else change it each time you need to validate with IE.

9 thoughts on “Validating (X)HTML With IE Using File Upload”

  1. Thank you for this post!!

    Several of us have been looking for a fix for this in the past couple of days to assist students we work with who still use IE.

    As there seems to be very little information available online regarding this issue, you may have just made our lives a LOT easier!

  2. A big thank you from me too! Wow, how frustrating it was to try to teach coding a valid web page to a classroom full of newbies, to find this silly problem popping up everywhere! Thank you for the fix!

  3. I just tried your solution to no effect. I rebooted and verified the registry change was accurate; it was. We just have to upload the files to the web in order to verify.

    (btw… thanks for allowing anonymous posting)

  4. That’s unfortunate, I thought it would work. It works for me, but that’s never a guarantee that it will work for everyone. There must be something different with your system setup that prevents it from working. However, I did say this may not be the best solution; it’s only an interim solution for some people until Microsoft fixes it. The best solution is to simply install a better browser, but if that’s not possible, then your only option is to upload to the web first, as you said.

  5. I found that .html files were being sent by IE as ‘text/plain’, while .htm files were being sent as ‘application/octet-stream’.

    But, more annoyingly, I also found that Mozilla sent .htm files as ‘application/octet-stream’ too! A quick search through Mozilla’s preferences didn’t unearth any way of changing this. I suspect this may also be the case for Firefox.

    So, I think this is a more general not-thought-about-before browser problem.

    Anyway, thanks for finding a way to resolve this for IE!

  6. Re [HKEY_CLASSES_ROOT\.html]
    "Content Type"="text/html;"

    Using Windows XP SP2. I added the semicolon as described and sometime later encountered two problems:

    (1) I could no longer print to any printer from Outlook Express (though other programs and the printer test page printed fine), and

    (2) Norton AntiVirus would no longer scan or even display its scan screen fully.

    I fiddled about trying everything for a couple of hours and then remembered this registry fix. I removed the semicolon and, bingo, everything immediately returned to normal.

    Anyone else similarly afflicted?

  7. Doug,
    Thanks for letting me, and anyone else reading this, know your experiences. I knew there would be problems, which is why I added the warning. I didn’t think they would be as serious as they have been. The change seems to interfere with any application that uses IE’s HTML rendering engine. That’s why I recommend getting a decent browser, and only using this as a last resort, if you have no other choice.

    As it turns out, this causes more problems than it solves, but it was just a quick and dirty hack to get around a problem that should never have even occurred. As I wrote in the article, I only spent a few hours working on it, and did not do extensive testing. I’ve got better things to do, and better browsers to use. I just wanted to help out those unfortunate people who have no other option.

  8. Hi everybody – and thank you for providing this insight as to why the validator stopped working.

    Anyway, I thought I’d share how I got around the problem. I use AIS’ web accessibility toolbar which, in addition to some really cool features if you are serious about accessibility, let alone XHTML conformance (no, I don’t own shares), provides a validate file feature. That feature really just uploads the file to the W3C, but somehow bypasses Win XP SP2’s alleged security feature.
