A relatively new editor’s draft entitled HTML 5: The Markup Language has been proposed by Mike Smith. This draft is an attempt to define the vocabulary and syntax of HTML, without any implementation conformance criteria or associated DOM APIs. It’s being positioned as a replacement, normative definition of the language over the existing HTML 5 spec, and its proponents claim that it’s better for authors. But don’t be deceived; this draft isn’t what it claims to be, and is not really beneficial for the vast majority of web developers.
About the Draft
The document itself is largely generated from two primary sources, with some additional explanatory material included manually. It incorporates selected statements and conformance criteria from the spec itself, which is fine. This is a useful technique to help ensure that it and the spec stay relatively in sync with each other. But it also incorporates the RelaxNG schemas and regular expressions that are being developed for the HTML 5 Validator. This is part of the source code from one particular validator implementation, and it’s important to note that this code was not primarily written for human consumption, but rather machine processing.
Yet, despite this, it is being pushed as a suitable, human readable method for describing the conforming syntax and element content models of HTML. In a sense, it’s analogous to the DTDs used within the HTML 4.01 specification, except that it’s more difficult to read.
From past experience, we know that many web developers were not comfortable reading the DTD syntax, and preferred to check reference guides, tutorials, or ask others on mailing lists or forums to explain things. So the notion that such a document would be useful for the majority of web developers is, frankly, absurd.
But don’t just take my word for it. Let’s take a look at some examples of this notation and see for ourselves. This is the regular expression that describes the conforming DOCTYPE syntax:
doctype = <![dD][oO][cC][tT][yY][pP][eE]\s+[hH][tT][mM][lL]\s*>
If that’s not scary enough, how about this which defines the conforming values for the target attribute:
browsing-context-or-keyword = ()|([^_].*)|(_[bB][lL][aA][nN][kK])|(_[sS][eE][lL][fF])|(_[pP][aA][rR][eE][nN][tT])|(_[tT][oO][pP])
To be fair, it is accompanied by a plain text list of examples of the four predefined values, but looking at the examples alone doesn’t tell the reader anything about case insensitivity, nor does it indicate that other custom values are not allowed to begin with an underscore. The only way to deduce that is from the above RegExp.
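To see what those two expressions actually accept, here is a minimal sketch in Python. The patterns are transcribed from the notation quoted above; the function names are mine, and I'm assuming the patterns are meant to match the whole value rather than a substring, which is why the sketch uses fullmatch.

```python
import re

# Patterns transcribed from the draft's notation quoted above.
DOCTYPE = re.compile(
    r"<![dD][oO][cC][tT][yY][pP][eE]\s+[hH][tT][mM][lL]\s*>"
)
BROWSING_CONTEXT_OR_KEYWORD = re.compile(
    r"()|([^_].*)|(_[bB][lL][aA][nN][kK])|(_[sS][eE][lL][fF])"
    r"|(_[pP][aA][rR][eE][nN][tT])|(_[tT][oO][pP])"
)


def is_conforming_doctype(text: str) -> bool:
    # Assumption: the whole string must match, not just a prefix.
    return DOCTYPE.fullmatch(text) is not None


def is_conforming_target(value: str) -> bool:
    # Assumption: the whole attribute value must match.
    return BROWSING_CONTEXT_OR_KEYWORD.fullmatch(value) is not None


if __name__ == "__main__":
    for value in ["", "sidebar", "_blank", "_BLANK", "_Top", "_custom"]:
        print(repr(value), is_conforming_target(value))
```

Running it shows the two rules that the list of examples hides: the keywords match in any mix of upper and lower case, and any other value starting with an underscore is rejected.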
Finally, take a look at the definition of the a element, or any other, and see if you can understand what it means. Personally, I know how the a element is defined in the spec, but even I can’t easily figure out what the schemas are trying to say.
The a element’s content model is actually defined as Transparent in the spec, which you can think of as basically meaning that its content model is inherited from the parent element. (This is a slight oversimplification of its actual meaning, but we can ignore the subtleties for now.) That is, when it’s included as a child of an element that only permits phrasing content, that restriction applies to the a element too. But when its parent permits flow content, so does the a element. If you were able to decipher that on your own from the proposed draft, then well done. I couldn’t.
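For comparison, the simplified "inherit from the parent" reading of Transparent can be sketched in a few lines of Python. The content model table here is a hypothetical, heavily reduced stand-in for the real definitions, and the function name is mine; it illustrates the idea only, not the spec's actual rules.

```python
# Hypothetical, heavily reduced table of content models.
# Real elements have many more rules than a single label.
CONTENT_MODEL = {
    "p": "phrasing",
    "span": "phrasing",
    "div": "flow",
    "a": "transparent",
    "ins": "transparent",
}


def effective_content_model(ancestors: list[str]) -> str:
    """Return the content model that applies inside the last element.

    `ancestors` lists element names from the outermost ancestor down to
    the element of interest, e.g. ["div", "a"].
    """
    # Walk outwards until an element with a non-transparent model is found.
    for name in reversed(ancestors):
        model = CONTENT_MODEL[name]
        if model != "transparent":
            return model
    # A transparent element with no parent is treated as accepting flow content.
    return "flow"


if __name__ == "__main__":
    print(effective_content_model(["p", "a"]))
    print(effective_content_model(["div", "a"]))
```

With this table, an a element inside a p is limited to phrasing content, while the same element inside a div accepts flow content.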
By now, you may be asking, if this proposal isn’t really suitable for web developers, then who is it suitable for? It’s a question that has been asked several times on the mailing list, and yet one that has not yet been adequately answered. I’ll do my best to explain how I see it shortly. But first, there’s a little background to cover.
The Spec Splitters
Within the working group, as expected, many people have a very diverse range of opinions. In particular, a number of individuals share the opinion that the current HTML 5 spec is far too monolithic and that it should be split. There’s nothing inherently wrong with that position, per se. There are indeed sections of the spec that nearly everyone agrees should be, or have already been, separated out into their own specifications.
For instance, XMLHttpRequest was, at one time, part of HTML5. This was taken out a long time ago and moved to the WebApps working group, where it has thrived independently from HTML5 ever since. More recently, the web sockets protocol and API have also been split into their own specs, as have content sniffing, the HTTP Origin header, and more.
The issue is that a number of individuals want the spec split in ways that aren’t entirely sensible. This includes the idea of splitting the spec along the lines of a conforming, declarative language definition and separate implementation requirements. There are even those who would go so far as to say that only the former should be defined, effectively leaving the implementers to fend for themselves. But I’ll spare you from the horror of such extremes, as the group moved beyond that debate long ago, and merely deal with those who want to split the spec.
From a high-level perspective, the concept of splitting the spec along those lines looks reasonable. These two seemingly independent components intuitively feel like they could be defined separately. That is, until you start to appreciate just how intertwined these sections are, and where exactly they want to draw the line.
It is argued that the language spec should only describe the conforming syntax and content models of the HTML markup alone. This would omit any details about how such features are processed and provide limited information about what they do. It would also omit any and all details about the associated DOM APIs.
The semantics of elements and attributes are closely related to what functionality they provide, which is itself closely related to the implementation requirements. Consider, for example, the heading and sectioning elements. Their semantics are useful for providing hierarchical document structures, with varying levels of headings. This is very closely related to the processing requirements for creating an outline. Authors need to know how to mark up their heading structures, and implementers need to know how to interpret them.
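To make the link between heading markup and outline processing concrete, here is a deliberately simplified sketch in Python. It builds a nested outline from heading ranks alone and ignores sectioning elements entirely, so it is not the spec's outline algorithm; the function name and data shapes are my own assumptions.

```python
def build_outline(headings):
    """Build a nested outline from (rank, text) pairs in document order.

    `rank` is 1 for h1 through 6 for h6. This simplification ignores
    sectioning elements, which the real algorithm also takes into account.
    """
    root = {"rank": 0, "text": None, "children": []}
    stack = [root]
    for rank, text in headings:
        # Close every open section of equal or lower importance.
        while stack[-1]["rank"] >= rank:
            stack.pop()
        node = {"rank": rank, "text": text, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


if __name__ == "__main__":
    outline = build_outline([(1, "Intro"), (2, "History"), (2, "Goals"), (1, "Usage")])
    for section in outline:
        print(section["text"], [child["text"] for child in section["children"]])
```

Even in this reduced form, what an author is allowed to write (which heading goes where) and what an implementation computes from it are two views of the same rules.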
Consider also that many of the DOM APIs for many elements reflect the values of the content attributes. The processing requirements for getting and setting such properties are very dependent upon the processing requirements for the attributes themselves, which are in turn dependent upon the conforming values of those attributes.
There are many more examples of such interconnected dependencies, but I won’t try to list them all. Suffice it to say that splitting the spec makes it much harder to manage the integration points between these highly interconnected sections, and creates a greater risk of things not being defined well. Such a situation would inevitably lead to interoperability problems, which end up hurting not only implementers, but everyone involved, including authors and users.
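As a rough illustration of that dependency, the following Python sketch models a property that reflects a content attribute restricted to non-negative integers. The class, the property name, and the fallback value of -1 are assumptions chosen for illustration, loosely modelled on how such attributes behave; this is not spec text.

```python
class Element:
    """Hypothetical element with a reflected non-negative integer attribute."""

    def __init__(self):
        # Content attributes as written in the markup: name -> string value.
        self.attributes = {}

    @property
    def max_length(self) -> int:
        raw = self.attributes.get("maxlength")
        # The getter depends on the attribute's conforming syntax:
        # only a run of ASCII digits counts as a valid value here.
        if raw is not None and raw.isascii() and raw.isdigit():
            return int(raw)
        return -1  # assumption: fallback when missing or invalid

    @max_length.setter
    def max_length(self, value: int) -> None:
        # The setter enforces the same constraint from the other direction.
        if value < 0:
            raise ValueError("max_length must be non-negative")
        self.attributes["maxlength"] = str(value)


if __name__ == "__main__":
    el = Element()
    el.attributes["maxlength"] = "20"
    print(el.max_length)
    el.attributes["maxlength"] = "twenty"
    print(el.max_length)
```

The getter and setter cannot be specified without knowing which attribute values are conforming, which is the point: the API requirements and the language definition describe the same constraint.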
The Wedge Strategy
Despite the significant resistance to splitting out the language definition, there has still been a significant push for there to be a document that normatively defines it separately from the implementation requirements, and this draft has been put forth with the intention of doing just that.
However, since the spec has not been split in the way described above, and hopefully won’t be, we are left with a situation where we have two drafts, the HTML5 spec itself and this proposal, each claiming to normatively define the language.
But some people seem to be willing to use this to get their way, even if it means normatively defining the language twice, in two separate specs. This is of course absurd. Having two normative documents, each defining things in its own way, will inevitably lead to conflicts between the two specs, which then raises the question of which takes precedence.
While people claim that it’s possible to define things normatively in two separate specs and keep them in sync, there is no evidence to support that claim and plenty of evidence against it. Suffice it to say that it won’t work and will lead to one of two possible outcomes:
- The conforming language definition is split from the main spec, leaving it to be defined only in this proposal. This, as I explained above, would be bad.
- The proposal becomes non-normative, leaving the spec itself as the single authoritative normative source. This is what I have been pushing for, and will continue to push for.
The Audience of the Proposal
As I briefly explained above, given the content of the draft, it is not really suitable for the vast majority of web developers. In fact, its audience is, in practice, despite claims to the contrary, severely limited in scope to a small minority of people who are comfortable with reading complicated schemas and regular expressions, and who actually have some use for them.
Schemas are primarily designed for the purpose of conformance checking. Specifically, they are used by tools that read a document and compare it with the grammar described in the schema. This is effectively what validators do, although it should be noted that schemas are not the only means of achieving this goal.
So it is somewhat useful for people writing tools with conformance checking features, since they can, if they choose, incorporate the schemas from the spec into their own tools, or use them as a guide for creating their own. However, it doesn’t provide all the information necessary for such developers, as they will still need to turn to the main spec for many implementation requirements, particularly parsing.
What about Web Developers?
Web developers certainly haven’t been forgotten. Their needs are just as important to address as those of implementers. But I and many others recognise that such developers, many of whom aren’t comfortable with normative spec language, need something specifically targeted at them. For this, there are now two separate, non-normative drafts under development.
The first, currently entitled the HTML 5 Reference, is really a reference guide for web developers that will explain the elements, attributes and their semantics, the syntax and DOM APIs, and provide plenty of explanatory material and examples showing how and why to use each feature. This is a draft that I’m working on, and I have recently started to make some significant progress with it.
The second is a new proposal by Dan Connolly, for which there is currently no draft available. This document is intended to be more of a step-by-step, cookbook-style guide to writing pages using HTML5, with a big focus on the multimedia aspects. For example, it will provide things like:
- How to embed a video within a page and provide customised controls using the DOM API.
- How to indicate the completion status of a web application using a progress bar.
- How to mark up images with captions.