{"id":68,"date":"2005-04-10T08:41:01","date_gmt":"2005-04-10T08:41:01","guid":{"rendered":"http:\/\/lachy.id.au\/log\/2005\/04\/the-future-html-or-xhtml"},"modified":"2006-04-30T23:46:29","modified_gmt":"2006-04-30T23:46:29","slug":"xhtml-future","status":"publish","type":"post","link":"https:\/\/lachy.id.au\/log\/2005\/04\/xhtml-future","title":{"rendered":"The Future: HTML or XHTML"},"content":{"rendered":"<p>The discussion of <a href=\"http:\/\/www.jeroenmulder.com\/weblog\/2005\/02\/xhtml_vs_html_the_discussion_continues.php\" title=\"JeroenMulder.com: XHTML vs. HTML: And the discussion continues\">XHTML versus HTML<\/a>\r\n(<a href=\"http:\/\/stijlstek.nl\/greek2me\/2005\/04\/there-we-go-again-xhtml-hot-or-not\/\" title=\"Greek to me : There we go again: XHTML hot or not?\">via<\/a>)\r\n\thas popped up again, and until now I\u2019ve managed to resist the urge to throw\r\n\tin my 2\u00a2. Well, no longer will I sit on the sidelines while <a href=\"http:\/\/stijlstek.nl\/greek2me\/2005\/04\/there-we-go-again-xhtml-hot-or-not\/\" title=\"Greek to me : There we go again: XHTML hot or not?\">the\r\n\tsame arguments<\/a> (<a href=\"http:\/\/annevankesteren.nl\/archives\/href\/2005\/04#link-1486\" title=\"HREF April 2005 &lt;Anne's Weblog about Markup &amp; Style&gt;\">via<\/a>) get rehashed again and again, which will not get us\r\n\tanywhere. The question, which I will attempt to answer, is whether the future\r\n\tof the Internet lies with HTML or XHTML.<\/p>\r\n<p>Firstly, I\u2019m just going to set a few ground rules. 
This is not going to be\r\n\tanother version of <em>XHTML as <code>text\/html<\/code> is considered harmful<\/em> or <em>there are no\r\n\treal benefits to use XHTML<\/em> or an <em>XHTML isn\u2019t even supported<\/em> kind of article.\r\n\tI\u2019m going to get straight to the facts, so here goes&#8230;<\/p>\r\n<h3 id=\"html\">HTML<\/h3>\r\n<p><strong>HTML is all but dead.<\/strong> It\u2019s been getting beaten to death ever since the early\r\n\tversions of Netscape and <abbr title=\"Internet Explorer\">IE<\/abbr>. It\u2019s been on <em>life\r\n\tsupport<\/em> and holding on by a thread\r\n\t(albeit a particularly strong, <em>yet very much frayed,<\/em> thread) ever since <abbr title=\"Internet Explorer 5 for Macintosh\">IE5\/Mac<\/abbr>\r\n\tthrew it a lifeline called DOCTYPE sniffing. Yet no attempt to revive it has\r\n\tbeen, or will ever be, successful in prolonging its life more than a few years\r\n\tpast its use-by-date and it is almost time to let it rest in peace.<\/p>\r\n<p>I know what you\u2019re all thinking. 
I\u2019m either <strong>insane<\/strong> or just over <em>a\r\n\t\tweek late for April Fools.<\/em> How could, arguably, the most successful document format in\r\n\tthe history of the web, and computing in general, have been so irreparably damaged\r\n\tas to be this close to death?<\/p>\r\n<p>The answer and the reason for my temporary insanity, which has led to these\r\n\trather shocking and completely outrageous yet incredibly accurate claims, all\r\n\tcome down to the question of what HTML is <em>supposed<\/em>\r\n\tto be, compared with the\r\n\t<em>mind-numbingly deformed<\/em> representation we all know and love today, and how it\r\n\tcan <em>and cannot<\/em> be improved in the future.<\/p>\r\n<h4 id=\"html-supposed-to-be\">What is HTML Supposed to Be?<\/h4>\r\n<p>In its humble beginnings as a small, light-weight, non-proprietary, easy-to-use\r\n\tdocument format designed for the publication and distribution of scientific\r\n\tdocuments (created by the mastermind who is aptly titled the inventor of\r\n\tthe World Wide Web and whom we all know as <a href=\"http:\/\/www.w3.org\/People\/Berners-Lee\/\">Tim\r\n\tBerners-Lee<\/a>), HTML closely resembled\r\n\tthe international standard, ISO 8879 \u2013 Standard Generalised Markup Language\r\n\t(SGML).<\/p>\r\n<p>While HTML was not originally based on SGML, the similarities in syntax and\r\n\tthe lack of formal parsing rules for HTML led to the decision to resolve the\r\n\tdifferences and formalise HTML 2.0 as an application of SGML. This was eventually\r\n\tpublished by the <abbr title=\"Internet Engineering Task Force\">IETF<\/abbr> as <a href=\"http:\/\/www.ietf.org\/rfc\/rfc1866\"><abbr title=\"Request for Comments\">RFC<\/abbr> 1866<\/a> in November 1995. 
Martin Bryan provides a\r\n\t<a href=\"http:\/\/www.is-thought.co.uk\/book\/sgml-1.htm#HTML\" title=\"Web SGML and HTML 4.0 Explained - Chapter 1\">relatively short summary of how HTML began,<\/a> and the process to convert it into\r\n\tan application of SGML.<\/p>\r\n<h4 id=\"what-html\">What is HTML Now?<\/h4>\r\n<p>Sadly, by the time HTML was formalised as an application of SGML, the irreparable\r\n\tdamage to the language (which would eventually lead to the coining of the term\r\n\t<dfn><a href=\"http:\/\/www.is-thought.co.uk\/book\/sgml-1.htm#HTML\">tag soup<\/a><\/dfn> by <a href=\"http:\/\/www.w3.org\/People\/Connolly\/\">Dan\r\n\tConnolly<\/a>) had already been done. None of the HTML browsers that\r\n\twere implemented prior to HTML 2.0 contained conforming SGML parsers, few have\r\n\tever done so since, and no mainstream browser ever will.<\/p>\r\n<p>As a result, browsers don\u2019t read <abbr title=\"Document Type Definitions\">DTDs<\/abbr>. Instead they have all known elements,\r\n\tattributes and their content models essentially hard coded, and basically ignore\r\n\tany element they have never heard of. For this reason it is widely believed\r\n\tthat <a href=\"http:\/\/www.autisticcuckoo.net\/archive.php?id=2005\/04\/08\/doctype-declaration-and-content-type-headers\" title=\"Doctype Declarations and Content-Type Headers @ The Autistic Cuckoo\"><abbr title=\"Document Type Definitions\">DTDs<\/abbr> serve\r\n\tabsolutely no purpose<\/a> for anything other than a validator, and\r\n\tDOCTYPEs are for nothing but triggering standards mode in modern browsers.<\/p>\r\n<p>There are many intentionally broken features in existing HTML  parsers\r\n\tthat directly violate both the HTML recommendation and SGML standard that will\r\n\tnever be fixed. The reason is the simple fact that to do so would break millions\r\n\tof legacy documents, which would only end up affecting the user\u2019s ability to\r\n\taccess them. 
See <a href=\"http:\/\/www.w3.org\/TR\/html401\/appendix\/notes.html\">HTML\r\n\t4.01 Appendix B<\/a> for a brief, yet very incomplete, summary\r\n\tof unsupported SGML features.<\/p>\r\n<h4 id=\"html-improved\">How Can HTML Be Improved?<\/h4>\r\n<p>The simple answer is not much at all. The ability of HTML to\r\n\tprogress and improve is severely limited by the aforementioned non-conforming\r\n\tparsers and millions of legacy documents that would break if any serious\r\n\timprovements were to be made. <a href=\"http:\/\/listserver.dreamhost.com\/pipermail\/whatwg-whatwg.org\/2005-April\/003482.html\">As <cite>Hixie<\/cite> put\r\n\tit:<\/a> <q cite=\"http:\/\/listserver.dreamhost.com\/pipermail\/whatwg-whatwg.org\/2005-April\/003482.html\">we\r\n\tcan at best add new elements when it comes to the HTML parser.<\/q><\/p>\r\n<p>The element content models for many existing elements cannot be changed much.\r\n\t(e.g. The <code>p<\/code> element cannot\r\n\tbe updated to allow nested lists, tables or blockquotes, the <code>title<\/code> element\r\n\tcannot be updated to contain any semantic inline-markup, etc.) Much of the\r\n\tquirky non-conformant behaviour exhibited by existing browsers will have\r\n\tto be inherited by any future implementations. In fact, such behaviour is\r\n\tbeing retroactively standardised by Ian Hickson and the <abbr title=\"Web Hypertext Application Technology\">WHAT<\/abbr>\r\n\tWorking Group.<\/p>\r\n<p>There is even <a href=\"http:\/\/listserver.dreamhost.com\/pipermail\/whatwg-whatwg.org\/2005-April\/thread.html#3369\" title=\"Mailing List Thread: [whatwg] [html5] tags, elements and generated DOM\">speculation about whether or not HTML should retain the pretence\r\n\tof being an application of SGML<\/a>. Other than the benefits of validation with\r\n\tSGML <abbr title=\"Document Type Definitions\">DTDs,<\/abbr> and the triggering of standards mode with an SGML DOCTYPE, there\r\n\tis little reason to do so. 
However, the <abbr title=\"Web Hypertext Application Technology\">WHAT<\/abbr> Working Group drafts express\r\n\textensive conformance criteria that simply cannot be expressed within a DTD, which would\r\n\tmake DTD validation \u2013 as a quality assurance or conformance tool \u2013 limited, at best.<\/p>\r\n<p>Not only that, but any serious attempt at retaining backwards compatibility\r\n\twith existing browsers is expected to require an extensive library of hacks\r\n\t(like <a href=\"http:\/\/dean.edwards.name\/ie7\/\">Dean Edwards\u2019s <abbr title=\"Internet Explorer 7\">IE7<\/abbr><\/a>) to make existing browsers do anything useful with the\r\n\tnew extensions. Not even style sheets will have any effect on the new elements\r\n\twithout this library of hacks, as the new elements will be essentially ignored.<\/p>\r\n<p>The question is: <em>do we really want to hold onto a dying language any longer\r\n\tthan we need to,<\/em> with any and all progressions and enhancements being so\r\n\textremely limited; or should we really start pushing to move to a much more\r\n\tflexible and beneficial alternative?<\/p>\r\n<h3 id=\"xhtml\">XHTML<\/h3>\r\n<p>Despite all prior claims of <a href=\"http:\/\/www.spartanicus.utvinternet.ie\/no-xhtml.htm\" title=\"No to XHTML - Spartanicus' Web tips\">XHTML\r\n\t\thaving no benefit whatsoever,<\/a> when it comes\r\n\tto extending the language with new elements, attributes and content models,\r\n\tthe benefits far outweigh the negatives. In fact, all claims that XHTML\r\n\t\thas no benefits over HTML only apply to XHTML 1.0 because the semantics\r\n\t\tof both document formats are identical.<\/p>\r\n<h4 id=\"xhtml-supposed-to-be\">What is XHTML Supposed to Be?<\/h4>\r\n<p>XHTML is supposed\r\n\tto be an application of <abbr title=\"Extensible Markup Language\">XML<\/abbr> with\r\n\tvery strict parsing rules. Do I really need to continue? 
I will assume we\r\n\tall know what <abbr title=\"Extensible Markup Language\">XML<\/abbr> and XHTML are,\r\n\tso no need for me to reiterate it all. For anyone that doesn\u2019t, that\u2019s what\r\n\tsearch engines are for. \ud83d\ude42<\/p>\r\n<h4 id=\"what-xhtml\">What is XHTML Now?<\/h4>\r\n<p>Unfortunately, most XHTML on the web is nothing more than tag soup, or is\r\n\tat least not well-formed, served as <code>text\/html<\/code>. As previous surveys have shown,\r\n\ta majority of sites claiming to be XHTML don\u2019t even validate, and most would\r\n\tend up with browsers choking on them if the correct MIME type were used.<\/p>\r\n<p>Some of the other problems are: that XHTML is not implemented by <abbr title=\"Internet Explorer\">IE,<\/abbr> incremental\r\n\trendering for XHTML in Gecko doesn\u2019t yet work, scripts written for tag-soup\r\n\toften won\u2019t work in real <abbr title=\"Extensible Hypertext Markup Language\">XHTML,<\/abbr> style\r\n\tsheets need to be fixed, etc., etc\u2026 Most of this stuff is discussed in Ian\r\n\tHickson\u2019s document <a href=\"http:\/\/www.hixie.ch\/advocacy\/xhtml\">Sending XHTML\r\n\tas <code>text\/html<\/code> is\r\n\tConsidered Harmful<\/a> (which I\u2019m sure everyone has read by now) and elsewhere\r\n\ton the web.<\/p>\r\n<p>However, the major benefit of XHTML over HTML is that we do already have (mostly)\r\n\tvery strictly conforming <abbr title=\"Extensible Markup Language\">XML<\/abbr> parsers.\r\n\tWhile these do still have a few bugs, they can be fixed without any detrimental\r\n\teffect on legacy content. This fact alone allows much greater room for enhancement\r\n\tthan HTML ever will.<\/p>\r\n<h4 id=\"xhtml-improved\">How Can XHTML Be Improved?<\/h4>\r\n<p>With a proper understanding of how to use <abbr title=\"Extensible Markup Language\">XML<\/abbr> and <abbr title=\"Extensible Hypertext Markup Language\">XHTML,<\/abbr> there\r\n\tare really no limitations on how far XHTML can progress. 
We will not be held\r\n\tup by extreme browser bugs and limitations; there\u2019s no non-conformant behaviour\r\n\tthat will have to be replicated by future implementations, element content\r\n\tmodels can be changed for existing elements, and new elements can be added\r\n\tand supported very easily. And at least with full style sheet support they\r\n\twill not be rendered totally useless (as in HTML without a library of hacks)\r\n\tin existing XHTML <abbr title=\"User Agents\">UAs<\/abbr>.<\/p>\r\n<p>It is completely true that, if you are not using any of the <abbr title=\"Extensible Markup Language\">XML<\/abbr>-only\r\n\tfeatures such as mixed-namespace documents (e.g. XHTML+MathML), there are\r\n\talmost no benefits to be gained from using XHTML 1.0. However, there will be\r\n\tbenefits in using either XHTML 2.0 or the <abbr title=\"Web Hypertext Application Technology\">WHAT<\/abbr> Working Group&#8217;s <abbr title=\"(Extensible) Hypertext Markup Language\">(X)HTML<\/abbr> Applications,\r\n\tincluding Web Forms 2.0, Web Apps 1.0 and Web Controls 1.0, which I think should\r\n\tbe collectively known as HAppy 1.0 (for <abbr title=\"Hypertext Markup Language\"><strong>H<\/strong>TML<\/abbr> <strong>App<\/strong>lications),\r\n\tnot <abbr title=\"(Extensible) Hypertext Markup Language\">(X)HTML<\/abbr> 5.0.<\/p>\r\n<p>By using the XHTML variant of HAppy 1.0 (if that\u2019s what it gets called \u2013 with\r\n\tor without the uppercase A \u2013 let me know what you think ;-)) backwards compatibility\r\n\twith existing XHTML <abbr title=\"User Agents\">UAs<\/abbr> will be much easier, because at least style sheets will\r\n\twork and the new elements will simply behave like divs and spans. 
Backwards\r\n\tcompatibility with <abbr title=\"Internet Explorer\">IE<\/abbr> and other legacy <abbr title=\"User Agents\">UAs<\/abbr> will require a bit more work, though:\r\n\tyou will need to arrange for your XHTML document to be converted into <abbr title=\"Hypertext Markup Language\">HTML,<\/abbr>\r\n\tas serving this new version of XHTML as <code>text\/html<\/code> will be strictly forbidden.<\/p>","protected":false},"excerpt":{"rendered":"HTML is all but dead. It\u2019s been getting beaten to death ever since the early versions of Netscape and IE. It\u2019s been on life support and holding on by a thread (albeit a particularly strong, yet very much frayed, thread) ever since IE5\/Mac threw it a lifeline called DOCTYPE sniffing. Yet no attempt to revive it has been, or will ever be, successful in prolonging its life more than a few years past its use-by-date and it is almost time to let it rest in peace.","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[2,7],"tags":[],"_links":{"self":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/posts\/68"}],"collection":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/comments?post=68"}],"version-history":[{"count":0,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/posts\/68\/revisions"}],"wp:attachment":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/media?parent=68"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/categories?post=68"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/tags?post=68"}],"curies":[{"name":
"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}