All posts by Lachlan Hunt

The Content-Language Pragma Directive

This rationale is written in defence of a technically sound and reasoned approach to dealing with the Content-Language pragma directive issue within the HTML Working Group. ISSUE-88 is a request for permitting multiple language tags to be used as the value of the Content-Language pragma directive. This article argues that this change proposal is unsupported by logic or reason, and resolving in its favour will have an overall negative effect for both authors and implementers.

Summary

This summary is presented as an overview of the arguments presented throughout this article. The supporting rationale in favour of these arguments is presented later.

  • The change proposal is based upon the false premise that the Content-Language HTTP header and pragma directive are equivalent.
  • The HTTP header is used to declare the languages of the intended audience; the only defined function of the pragma directive is to be used as a fallback language in the absence of the lang attribute.
  • The use of the pragma directive as part of server configuration is out of scope of HTML. Specific server side implementation choices need not affect the conformance definition.
  • The pragma directive only fulfils its purpose of providing a fallback language when one language tag is specified. Multiple language tags are, by definition of the implementation requirements, not useful or beneficial.
  • There are no reasons given for why it is beneficial to leave the pragma directive in the document when the lang attribute is present on the root element.
  • Failing to offer a warning about its presence in all cases would continue to mislead the author about its legitimacy.
  • The inconsistency of when warnings are issued would be confusing to authors. It is better to offer a consistent warning about the presence of a redundant feature.
  • The defined effect, per the implementation requirements, of declaring multiple language tags is identical to that of omitting the pragma directive entirely. No reasons are given to explain why declaring multiple language tags is useful.
  • The syntax of the Content-Language HTTP header field is not affected by the definition of the distinct Content-Language pragma directive in HTML, with which it only shares a common name and does not share significant functionality. It is reasonable for this distinct feature to use a distinct conforming syntax that is suitable for its purpose.
  • No reason is given explaining why only emitting the warning under specific circumstances, as opposed to the current specification requirement, would serve better in encouraging authors to use the lang attribute instead.
  • The proposed replacement specification text contains unjustified changes, inconsistencies, unimplementable requirements and is overall inappropriate for use in the specification.
  • The claimed positive effects are unsupported by evidence and, in several cases, blatantly incorrect.
  • In practice, very few authors use multiple language tags in the pragma directive, and doing so is not useful. Restricting the syntax to one language would not have a significant negative impact.

Difference Between Content-Language HTTP Header and Pragma Directive

The premise of the change proposal is that the Content-Language HTTP header field is functionally equivalent to the Content-Language pragma directive using the meta element. This premise is used to support the idea that both should share the same syntax and client side processing requirements. However, this premise is demonstrably wrong, and thus the change proposal is unsupported by evidence and must be rejected.

In order to demonstrate the differences between the HTTP header and the pragma directive, it is necessary to analyse the purpose and functionality of each and see how they compare.

Declaring the Language of the Intended Audience

The HTTP Content-Language header field is used by HTTP servers to announce the language of the intended audience for a given resource representation. This and other related information exchanged between the client and server can be used for content negotiation based on language. When the server does this, it is important for this information to be included in the HTTP header where it can be seen by both the client and other intermediary servers.

The information declared within the document using the pragma directive is unsuitable for this purpose, as it will not be parsed by intermediary servers that would otherwise utilise the information for caching purposes.

Server Configuration

It has been claimed that the information declared using a pragma directive within the document may be parsed by some server implementations, which subsequently process and echo the value in the Content-Language HTTP header field. Since this header field is allowed to contain multiple language values, it is claimed that this ability is limited by permitting only one language in the pragma directive. However, no evidence has been presented to demonstrate how widely used this feature is, nor why such a feature should even be defined within HTML.

This is a layering violation because information intended for server side processing, and specific implementation details thereof, should not unnecessarily affect the conformance definition of client side HTML. That is, it is out of scope for HTML, as a client side markup language, to define specific processing requirements or features to be used by servers for implementing HTTP features. There is also no inherent need for interoperability between different back end implementation details.

Defining the pragma directive in a way that is optimised for specific server implementation details would be analogous to, for example, defining an ASP specific feature within HTML for use on Microsoft IIS platforms. While server implementations are otherwise free to make any design decision, those design decisions need not affect HTML conformance requirements.

Default Document Language

In practice, Content-Language used within the meta element in the HTML serves as client side metadata. The functionality of Content-Language in this case is restricted entirely to the purpose of specifying a fallback language, to be used in the absence of the lang attribute. This purpose differs significantly from the purpose of declaring the languages of the intended audience.

Declaring multiple languages for the document’s intended audience makes sense in some cases. However, there can only be one default language. Thus, for this purpose, the functionality as defined requires that only a single language value be specified. While the HTTP Content-Language header field is also used for determining the fallback language in cases where it only has a single language value, that is not its primary purpose and is thus not a significant similarity between these two independent features.

Permitting multiple language values to be specified in the pragma directive is at odds with its implementation requirements. Thus, for the client-side metadata functionality of the pragma directive, it is not at all useful to have multiple languages specified, and so it does not make sense for multiple languages to be considered conforming.

These 3 aspects of the functionality — declaring the language of the intended audience, server side configuration and default document language — clearly illustrate that the premise of this change proposal — the shared functionality between the two features — is fundamentally flawed. The reality is that the in-document Content-Language pragma directive only shares its name with the HTTP header field, while its functionality is closer to that of the lang attribute. And since server side implementation details are out of scope of HTML, there is no need for the document conformance definition to permit multiple language values. The solution chosen for addressing this issue must take this into account, and thus reject this change proposal.

Arguments Against the Rationale

The rationale for this change proposal states:

[The current specification] offers no carrot for doing the right thing. while the fallback language effect stops as soon as the author adds lang on the root element, the spec requires conformance checker to continue whining until the http-equiv="Content-Language" meta element has been removed.

The rationale fails to explain the benefit gained by leaving the pragma directive in the document when a lang attribute has been specified on the root element. While leaving it in the document under those circumstances is mostly harmless, it is redundant metadata that the author does not need to include in their document. Failing to offer a warning would continue to mislead the author into thinking that the pragma directive is both acceptable and useful, which it is not.

That it prevents authors from legally using multiple values to replicate the language fallback effect of doing the same thing in a HTTP header — whether they want to replicate the effect of multiple tags or a single tag.

The language fallback effect from using multiple language tags within the value is that there is no default language. This is exactly the same effect as would be achieved by omitting the pragma directive, and so the given reason is blatantly wrong.

i.e. The effect of including a value with multiple languages, like the following:

<meta http-equiv="Content-Language" content="en, fr">

is identical to that of omitting this pragma directive entirely. This rationale also fails to provide a reason for wanting to replicate this effect by copying the same syntax.
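The equivalence can be illustrated with a minimal Python sketch of the fallback behaviour as this article describes it: a value containing a comma sets no default language, which is the same outcome as having no pragma directive. The function name pragma_set_default_language is hypothetical, and treating an empty value as "no default" is an assumption made for illustration rather than text taken from the specification.

```python
from typing import Optional

# Characters HTML treats as whitespace: space, tab, LF, FF, CR.
HTML_WHITESPACE = " \t\n\f\r"


def pragma_set_default_language(content: Optional[str]) -> Optional[str]:
    """Return the fallback language a Content-Language pragma would set.

    Hypothetical helper. None means "no default language", which is also
    the result when the pragma directive is omitted (content is None).
    """
    if content is None:
        # The pragma directive is absent from the document.
        return None
    if "," in content:
        # Multiple language tags: processing aborts, nothing is set.
        return None
    # Skip leading whitespace, then collect the first run of
    # non-whitespace characters as the language tag.
    token = ""
    for ch in content.lstrip(HTML_WHITESPACE):
        if ch in HTML_WHITESPACE:
            break
        token += ch
    # Assumption: an empty value is treated as setting no default.
    return token or None
```

Under these assumptions, pragma_set_default_language("en, fr") and pragma_set_default_language(None) both return None, while a single tag such as "en" is returned unchanged.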

That it underlines the confusion that may exist today, about the nature of lang versus Content-Language, by requiring:

  1. different syntax rules for features that are expected to be identical (HTTP and http-equiv)
  2. similar syntax rules for features that are different (http-equiv and lang)
  3. a warning message which asks authors to “use lang instead” – as if they were juxtaposable alternatives.

In actual fact, the confusion surrounding this issue is the idea that the HTTP header and pragma directive are equivalent, as clearly illustrated by this misguided change proposal. They are different. The HTTP header is used for declaring the languages of the intended audience; the pragma directive is used for specifying a default language.

The lang attribute, on the other hand, is an alternative to the pragma directive when a single language is specified. When multiple languages are specified, there is absolutely no defined effect, and so it serves no valid purpose at all. Therefore, the pragma directive is much closer in functionality to the lang attribute, than it is to the HTTP header, with which it shares its name.

Instead of the above, this change proposal propose:

  1. the Zero-edit proposal’s warning about using lang instead of Content-Language should be changed into a warning which informs that a fallback language measure has kicked in, and recommend that authors create a language declaration (via lang) rather than relying on the fallback feature. This warning should be shown regardless of whether the fallback comes from http-equiv or from the higher level (HTTP). Justification: Since it is a fallback feature, and with other semantics, there is no guarantee that the author has used it for the language effect.

From the author’s perspective, the inconsistency of issuing the warning about the use of the pragma directive only when the lang attribute is absent would be confusing. The better alternative is to issue a consistent warning (or error) that simply says to remove the pragma directive and use lang instead.

  1. to hold the syntax rules of HTTP (which permits multiple language tags) as the conforming ones (rather than those of lang, which forbids multiple languages), will have the effect of underlining that lang and Content-Language have different purposes. For instance, since the fallback algorithm doesn’t kick in whenever multiple languages are used in the pragma or on the server, there would not be any warning in these cases.

The syntax requirements for the HTTP Content-Language header are not affected by the HTML implementation requirements. Since the lang attribute on the root element and the Content-Language pragma directive with a single language value do have the same effect, which differs significantly from the purpose of the HTTP Content-Language header, and because it is misleading to pretend otherwise, the syntax of the former does not need to match the syntax of the latter.

  1. a carrot: what we want from authors is that they rely on lang (and xml:lang) for specifying the language — when the author does that, he/she should get immediate reward in the form of removal of conformance warning.

This rationale fails to explain why that same effect of encouraging authors to use the lang attribute would not be achieved by a more consistent warning that states to use the lang attribute and remove the pragma directive. There is no benefit gained by leaving the directive in; and merely silencing the validator by inserting a lang attribute does little to discourage the use of the redundant and totally unnecessary pragma directive.

Arguments Against the Proposal Details

The change proposal suggests replacing the terminology for “pragma-set default language” with “pragma-set locale language”. None of the given rationale explains the need for this change in terminology.

The proposed specification text states:

This pragma contains a Content-Language list, whose semantics and syntax is defined in the HTTP spec.

The semantics of the Content-Language header field as defined in RFC 2616 states:

The Content-Language entity-header field describes the natural language(s) of the intended audience for the enclosed entity. Note that this might not be equivalent to all the languages used within the entity-body.

This semantic definition does not match the actual purpose of the Content-Language pragma directive, for specifying a “pragma-set locale language”. Therefore, referring to RFC 2616 for this semantic definition is inappropriate. The syntax requirements from RFC 2616 are also inappropriate, as it defines the following ABNF, which is not directly compatible with the syntax of the meta element with http-equiv and content attributes.

Content-Language = "Content-Language" ":" 1#language-tag
language-tag = primary-tag *( "-" subtag )
primary-tag = 1*8ALPHA
subtag = 1*8ALPHA

For these syntax requirements to be applicable at all, the specification would have to state that the value of the content attribute must match the ABNF production for language-tag. However, see below regarding the syntax defined in BCP 47.
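As a rough illustration of what matching the content attribute against the language-tag production would involve, the ABNF quoted above can be transcribed into a regular expression. This is only a sketch: the names LANGUAGE_TAG and is_rfc2616_language_tag are hypothetical, and the pattern covers the language-tag production alone, not the comma separated list form of the header.

```python
import re

# language-tag = primary-tag *( "-" subtag )
# primary-tag  = 1*8ALPHA
# subtag       = 1*8ALPHA
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def is_rfc2616_language_tag(value: str) -> bool:
    """True if the whole value matches the RFC 2616 language-tag production."""
    return LANGUAGE_TAG.fullmatch(value) is not None
```

With this transcription, "en" and "en-US" match, while "en, fr" does not, because the comma belongs to the header's list syntax rather than to language-tag. A tag containing digits, such as "es-419", is also rejected by this ALPHA-only production even though BCP 47 permits it.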

An HTML5 parser processes this list into a known or unknown pragma-set locale language… The Content-Language list may also be defined in a HTTP header, and will then result in a known or unknown HTTP header-set locale language.

The proposed text fails to define what “known or unknown” means in that context. It is not clear how the implementation determines whether a value is known or unknown. The phrasing of the requirement seems to indicate that it would depend upon the result of parsing the value, rather than just the presence or absence of said value. But the parsing requirements do not use such terminology, and so there is no way to determine whether a given value qualifies as known or unknown.

The parsing requirements for the value of this pragma directive are not specified by the change proposal. However, the change proposal also does not state that the existing parsing requirements in the specification are to be removed, replaced or modified in any way. Thus, by adopting the details of this change proposal, the specification would be left in an inconsistent state which says that multiple language values are supported, but where the parsing requirements abort when more than one value is used.

The aforementioned parsing requirements only focus on parsing the value of the pragma directive, and as such, there is no implementation requirement that sets the “HTTP header-set locale language”.

When a document is lacking a language declaration in the form of the lang or xml:lang attribute on the root element, the document’s locale language (pragma-set or HTTP-set) is consulted by the user agent and used as fallback value for the primary document language.

Assuming the value of the “HTTP header-set locale language” comes from the HTTP Content-Language header, this proposed text fails to specify the order of precedence of the values specified in the pragma directive or the HTTP header.

The use of the term “locale language” in this context clashes with the existing use of the term in the specification to refer to the language set by the user in the user agent’s preferences. This term is used in the table within step 7 of the algorithm to determine the character encoding.

The proposed text then goes on to state:

The following info about the HTTP semantics and Content-Language usage, is informative:

However, in the non-normative list given following that statement, RFC 2119 terminology is incorrectly used to describe what appear to be authoring requirements. In particular:

… authors should not define the Content-Language list according to its parser effect, but according to it semantics.

This non-normative example text also incorrectly states that “en-US” would not be parsed into a useful value. However, this value complies with the syntax requirements specified in RFC 2616, BCP 47 and also with the existing parsing requirements in the HTML5 specification.

The proposal states that the following requirement is to be removed:

Conformance checkers will include a warning if this pragma is used. Authors are encouraged to use the lang attribute instead.

The rationale provided does not adequately justify the removal of this warning, nor does it adequately justify replacing it with a more limited warning to be issued only when the pragma directive is used in the absence of the lang attribute.

The proposal then states to amend this requirement as follows:

the content attribute must have a value consisting of a valid BCP 47 language tag, or a comma separated list of two or more BCP 47 language tags.

However, the proposal stated earlier that the syntax for the value was defined by RFC 2616. This requirement now conflicts with that by stating that the syntax of the content attribute’s value is defined by BCP 47. This inconsistency negatively affects the quality of the specification.

The proposal states that this note is to be removed:

This pragma is not exactly equivalent to the HTTP Content-Language header, for instance it only supports one language.

The removal of this note would be misleading, because the note itself is factually correct as-is with the current specification, and with the details of this proposal, which, as stated above, leave the parsing requirements unchanged. The proposal fails to include any implementation requirements that actually permit multiple language tags to be used.

It has now been clearly demonstrated that the proposed specification text provided by this change proposal is thoroughly inadequate for its intended purpose. If the specification were to be amended as required by this change proposal, the inconsistency and lack of clarity would negatively affect the ability to read, understand and implement this specification. As such, this proposal should also be rejected on the basis that its proposal details are inadequate. However, if this working group does make the wrong decision to permit multiple language tags, then I ask that the editor be given full editorial discretion to phrase the requirements in a way that more clearly expresses the requirements, rather than being asked to accept the details of this proposal as written.

Arguments Against the Claimed Positive and Negative Effects

More positive: authors can get rid of the warning by adding something — <html lang="*"> — this is better than a focus on removal of the (over all) harmless Content-Language meta element.

Likewise, authors can get rid of the warning as required by the current specification by removing the meta element. No rationale is provided to explain why the act of removing the pragma directive is significantly more difficult than adding the lang attribute to the root element. Depending on the authoring tool or CMS, both of these actions are likely to be just as easy or just as difficult to perform. This purported benefit is thus unsubstantiated and invalid.

More stable: same syntax as before continue to be permitted.

As documented by the null change proposal, observation of the use of this pragma directive shows that only a very small minority of authors use multiple language values. However, the claimed benefit of continuing to use this syntax is nullified by the fact that, due to the implementation requirements, multiple language values are not at all useful.

More permissive: authors, CMS-es and browsers can continue to take advantage of HTTP-EQUIV’s ability to reference what the HTTP header is/was supposed to be, including replicating its fallback effect.

No rationale is provided to explain why that ability is in any way beneficial.

More correct: the difference between lang and Content-Language is pointed out, while the link between http-equiv and HTTP is emphasized.

As has been demonstrated, this is blatantly wrong. The lang attribute and the Content-Language pragma directive have more in common, in terms of functionality, than the pragma directive and the Content-Language HTTP header field do.

More useful: a warning that a fallback feature has kicked in, is more useful than a warning which focuses on one of the places where the fallback language could potentially kick in from. Why tell the author to “please use lang instead” if the author has already made sure that the lang attribute is in place?

It seems more useful for authors to be informed about the presence of a redundant and useless feature, than to have them continue to mistakenly believe that the pragma directive is in any way useful. However, either way, both of these are highly subjective claims about what may or may not be useful to authors, which cannot be objectively evaluated without supporting data.

Has positive side effect: Encouragement to place a lang attribute on the starttag of the html element will lead authors to actually type in the html root element, instead of relying on the parser to generate it for them.

Relative to the status quo, the zero edit change proposal, or the proposal to make Content-Language non-conforming, the above is not a unique benefit. Both this and the other change proposals require validators to notify the author about the issue and encourage the use of the lang attribute.

More accurate because it does not conceal the problems by introducing an artificial technical and semantic difference between Content-Language from the HTTP header and Content-Language inside the http-equiv meta element.

This accuracy claim is undeniably wrong, given that the significant differences between the HTTP header and pragma directive have already been explained.

Conclusion

Based on the arguments presented in this article, it is clear that the change proposal arguing for multiple language tags to be permitted is misguided, and lacks any significant or valid supporting arguments. Accepting this change proposal would have a serious negative impact upon the quality of the specification. It is therefore my strongly reasoned opinion that the HTMLWG must reject this change proposal either in favour of the status quo, or in favour of making Content-Language entirely non-conforming.

Introducing WebM

Today, Google, in co-operation with Opera, Mozilla, CoreCodec (Matroska developers) and a range of other companies, have announced at Google I/O 2010 that WebM is the new royalty free video codec for the web.

Earlier this year, Google purchased On2, the company that developed a range of video codecs including VP3, VP6, VP7 and VP8. VP3 is a well known codec that formed the basis of Theora. VP6 is a codec supported by Adobe Flash, and VP7 is used by Skype for video conferencing. Their latest offering, VP8, now forms the basis of the new WebM video format. The code for the VP8 codec has been released royalty free under the BSD licence.

WebM, which stands for Web Media, is a format based on 3 technologies:

  1. Container: A variation of Matroska called WebM.
  2. Video codec: VP8.
  3. Audio codec: Vorbis.

The Container Format

Matroska is a widely supported container format, which is able to contain a wide range of codecs, including, among others, h.264, VC-1, Theora, AAC, AC3 and Vorbis. This is due to the high degree of flexibility inherent in the design of Matroska.

Matroska itself is based on a binary markup language called EBML, the design of which was inspired by XML. In short, EBML files contain a header that declares the DocType and version information, followed by a tree of elements and data, marked up using a special binary notation. The Matroska specification defines a range of elements, and their binary notation, that can be used for marking up the data in Matroska files.

The WebM format is a subset of Matroska, which has been optimised for streaming over HTTP.

WebM, which uses the DocType “webm”, can be distinguished from Matroska, which uses the DocType “matroska”. Technically speaking, a valid WebM version 1 file supports a subset of elements from Matroska version 1, and WebM version 2 supports those in addition to some of the additional elements from Matroska version 2.
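To make the DocType distinction concrete, here is a minimal Python sketch that reads the EBML header at the start of a file and returns the declared DocType. It assumes the standard EBML layout (header element ID 0x1A45DFA3, DocType element ID 0x4282, variable-length IDs and sizes); the function names are hypothetical, and error handling is deliberately thin.

```python
import io
from typing import BinaryIO, Optional

EBML_HEADER_ID = 0x1A45DFA3  # ID of the EBML header element
DOCTYPE_ID = 0x4282          # ID of the DocType element inside the header


def _vint_length(first_byte: int) -> int:
    """Length in bytes of a variable-length integer, given its first byte."""
    if first_byte == 0:
        raise ValueError("unsupported variable-length integer")
    length, mask = 1, 0x80
    # The position of the first set bit encodes the total length.
    while not first_byte & mask:
        length += 1
        mask >>= 1
    return length


def _read_id(stream: BinaryIO) -> int:
    """Read an element ID; the length marker bits stay part of the ID."""
    first = stream.read(1)
    if not first:
        raise EOFError
    rest = stream.read(_vint_length(first[0]) - 1)
    return int.from_bytes(first + rest, "big")


def _read_size(stream: BinaryIO) -> int:
    """Read an element size; the length marker bit is stripped."""
    first = stream.read(1)
    if not first:
        raise EOFError
    length = _vint_length(first[0])
    value = first[0] & (0xFF >> length)
    for byte in stream.read(length - 1):
        value = (value << 8) | byte
    return value


def read_doctype(stream: BinaryIO) -> Optional[str]:
    """Return the DocType declared in the EBML header, or None if absent."""
    if _read_id(stream) != EBML_HEADER_ID:
        raise ValueError("not an EBML file")
    header = io.BytesIO(stream.read(_read_size(stream)))
    while True:
        try:
            element_id = _read_id(header)
            size = _read_size(header)
        except EOFError:
            return None  # reached the end of the header without a DocType
        data = header.read(size)
        if element_id == DOCTYPE_ID:
            return data.rstrip(b"\x00").decode("ascii")
```

Called on a file opened in binary mode, read_doctype would be expected to return "webm" for a WebM file and "matroska" for a general Matroska file.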

To further optimise WebM for use on the web, some additional formatting guidelines are imposed upon WebM files, over and above the Matroska counterpart. These guidelines include placing the indexing information at the beginning of the file, and storing keyframes at the beginning of clusters.

The WebM container is only permitted to contain the codecs VP8 and Vorbis, and browsers will not support any other codecs within WebM – not even Theora or h.264. Although there are no technical limitations with WebM that inherently prevent such codecs from being used, this was an intentional decision to improve the usability of WebM.

The idea is that if you have a player that supports WebM, you can be more confident that the file will play without having to install additional codecs. Missing codecs are a problem that has plagued container formats like AVI for years: you can’t easily determine what a file contains until you start playing it. Some AVI files may contain DivX, Xvid, h.264 or a wide range of other codecs.

Benefits of Matroska

Matroska presented some nice benefits over competing container formats, such as MP4, commonly used with h.264, or even Ogg, which is supported by Opera, Firefox and Chrome for Theora and Vorbis. Like Ogg, Matroska is publicly specified and available to use freely, unlike, for example, MP4.

The main benefit of Matroska over Ogg is that the seeking information can be placed at the beginning, making it significantly easier to seek in a WebM file being transferred over HTTP. When the user tries to seek, if that part of the video hasn’t yet downloaded, then the browser needs to request that section from the server.

For Ogg, browsers have to do at least 2 separate requests when a video loads — one to get the beginning of the file and a range request to get the end — before the length of the video can be determined, and before seeking can occur, which then potentially results in additional requests.

For WebM, all the information is presented up front, meaning that if a user seeks the video, the browser knows exactly where in the video to go, or which part of the file to request from the server.
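The two kinds of request described here map onto the two forms of the HTTP Range header. The following Python sketch only builds the header values; the helper names and the byte counts used in the note after it are hypothetical, and how a real browser chooses offsets is not covered.

```python
def range_from(offset: int) -> str:
    """Range header value asking for everything from a byte offset onward.

    This is the kind of request a seek needs once the target position
    in the file is known.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return f"bytes={offset}-"


def range_last(count: int) -> str:
    """Range header value asking for the final `count` bytes of a resource.

    This is the kind of request needed to inspect the end of a file.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    return f"bytes=-{count}"
```

For example, range_last(65536) produces "bytes=-65536" for fetching the tail of an Ogg file, whereas a seek in a WebM file whose index is already loaded can go straight to something like range_from(1048576).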

This is not to say that Ogg itself is a bad format. Quite the contrary, it’s just optimised for different use cases. Ogg is very good to use as a streaming container format where seeking is not required, or for storing your Vorbis encoded music collection locally, where the player isn’t subject to the overhead of HTTP requests.

WebM, on the other hand, had to be specifically designed for use with the HTML video element served over HTTP, and as such, benefited from the design decisions of Matroska.

Audio and Video Codecs

The VP8 codec provides significant quality enhancements over its predecessors, most notably Theora. Comparisons between Theora and h.264 have shown that the quality of Theora is not up to scratch. Thanks to Google, VP8 has now been released freely.

There haven’t yet been any serious, independent comparisons between h.264 and VP8, so it’s difficult to say which is better. Although h.264 is certainly more mature than VP8, and has a lot more hardware support in existing devices, VP8 is likely to continually improve over the coming years.

The main limitation with VP8 at the moment is the lack of hardware acceleration. Firefox, Opera and Chrome all currently use software decoding of VP8, which means that it can increase CPU usage, particularly for high definition videos, and watching a lot of video will drain your battery more than hardware decoded h.264.

However, Google have announced that they are working with hardware partners, and it’s possible that we’ll see devices shipping with support within a year or two.

Vorbis, of course, has been supported by Firefox, Opera and Chrome for a while already, and so it was a natural choice to use in combination with VP8 in WebM.

YouTube

Over the past few weeks, YouTube has been working to convert many existing videos into WebM. To try this out using a browser that supports WebM, follow the instructions provided by the WebM Project. While not all videos have been re-encoded yet, thousands of videos are already available in WebM format, and will work in Opera, Firefox and Chrome.

Demo Time

Just so you can see for yourself what VP8 looks like, get yourself a copy of the preview releases of Opera, Firefox and Chrome, sit back, relax and watch Elephant’s Dream from the Orange Open Movie Project (website). I encoded this myself from the lossless source files using a special build of ffmpeg with libvpx_vp8 (the VP8 codec library).

Creating Your Own Videos

The absolute easiest way to create your own WebM video is to upload your source video to YouTube and wait for it to be encoded. Other services, including encoding.com and HD Cloud, also offer transcoding services for a small fee.

If you want to encode the videos yourself, you need to get your hands dirty with a tool like ffmpeg with libvpx_vp8, or a commercial alternative. Google have released the source code for libvpx_vp8, and builds of ffmpeg with it should be available shortly. More information is available on the WebM Project tools page.
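As a rough sketch of what such an encode might look like, the Python helper below assembles an ffmpeg command line without running it. It assumes a current ffmpeg build, where the VP8 encoder is exposed as libvpx rather than under the libvpx_vp8 name used above, and where libvorbis is available. The function name, file names and default bitrate are placeholders.

```python
from typing import List


def webm_encode_command(source: str, target: str,
                        video_bitrate: str = "1M") -> List[str]:
    """Build (but do not run) an ffmpeg command for a VP8/Vorbis WebM file.

    Assumption: the ffmpeg build includes the libvpx and libvorbis encoders.
    """
    return [
        "ffmpeg",
        "-i", source,            # input file
        "-c:v", "libvpx",        # VP8 video encoder
        "-b:v", video_bitrate,   # target video bitrate
        "-c:a", "libvorbis",     # Vorbis audio encoder
        target,                  # a .webm extension selects the WebM muxer
    ]
```

The resulting list can be passed to subprocess.run once ffmpeg is installed; quality and speed settings are left at their defaults here and would need tuning for real content.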

The Matroska developers have also been working on updating their Matroska muxing software to support the WebM profile. New tools called mkvalidator and mkclean will help you to validate your WebM files, and to clean and remux files that aren’t valid. mkclean will also remux MKV files containing VP8/Vorbis to WebM.

Browser Support

Preview builds have been released for Opera, Mozilla Firefox and, of course, Google Chrome.

More details are available on WebMProject.org.

HTML 5: The Markup Language

A relatively new editor’s draft entitled HTML 5: The Markup Language has been proposed by Mike Smith. This draft is an attempt to define the vocabulary and syntax of HTML, without any implementation conformance criteria or associated DOM APIs. It’s being positioned as a replacement, normative definition of the language over the existing HTML 5 spec, and its proponents claim that it’s better for authors. But don’t be deceived; this draft isn’t what it claims to be, and is not really beneficial for the vast majority of web developers.

About the Draft

The document itself is largely generated from two primary sources, with some additional explanatory material included manually. It incorporates selected statements and conformance criteria from the spec itself, which is fine. This is a useful technique to help ensure that it and the spec stay relatively in sync with each other. But it also incorporates the RelaxNG schemas and regular expressions that are being developed for the HTML 5 Validator. This is part of the source code from one particular validator implementation, and it’s important to note that this code was not primarily written for human consumption, but rather machine processing.

Yet, despite this, it is being pushed as a suitable, human readable method for describing the conforming syntax and element content models of HTML. In a sense, it’s analogous to the DTDs used within the HTML 4.01 specification, except that it’s more difficult to read.

From past experience, we know that many web developers were not comfortable reading the DTD syntax, and preferred to check reference guides, tutorials, or ask others on mailing lists or forums to explain things. So the notion that such a document would be useful for the majority of web developers is, frankly, absurd.

But don’t just take my word for it. Let’s take a look at some examples of this notation and see for ourselves. This is the regular expression that describes the conforming DOCTYPE syntax:

doctype = <![dD][oO][cC][tT][yY][pP][eE]\s+[hH][tT][mM][lL]\s*>

If that’s not scary enough, how about this, which defines the conforming values for the target attribute:

browsing-context-or-keyword = ()|([^_].*)|(_[bB][lL][aA][nN][kK])|(_[sS][eE][lL][fF])|(_[pP][aA][rR][eE][nN][tT])|(_[tT][oO][pP])

To be fair, it is accompanied by a plain text list of examples of the four predefined values, but looking at the examples alone doesn’t tell the reader anything about case insensitivity, nor does it indicate that custom values are not allowed to begin with an underscore. The only way to deduce that is from the above RegExp.
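To see what that expression actually accepts, it can be run against a few sample values. The following is a minimal Python sketch and not part of the draft: the pattern is copied from above, the helper name and sample values are made up for illustration, and re.fullmatch is used on the assumption that the schema pattern has to match the whole attribute value.

```python
import re

# Pattern copied verbatim from the draft's browsing-context-or-keyword definition.
TARGET = re.compile(
    r"()|([^_].*)|(_[bB][lL][aA][nN][kK])|(_[sS][eE][lL][fF])"
    r"|(_[pP][aA][rR][eE][nN][tT])|(_[tT][oO][pP])"
)


def is_conforming_target(value):
    """Return True if the whole value matches the pattern.

    Assumption: the schema anchors the pattern at both ends, hence fullmatch.
    """
    return TARGET.fullmatch(value) is not None


# Hypothetical sample values, chosen only for illustration.
for sample in ["_blank", "_BLANK", "sidebar", "_sidebar", ""]:
    print(repr(sample), is_conforming_target(sample))
```

Of these samples, only "_sidebar" is rejected. The keywords match in any letter case, an ordinary name is accepted, and so is the empty string, none of which is obvious from the list of examples alone.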

Finally, take a look at the definition of the a element, or any other, and see if you can understand what it means. Personally, I know how the a element is defined in the spec, but even I can’t easily figure out what the schemas are trying to say.

The a element’s content model is actually defined as Transparent in the spec, which you can think of as basically meaning that its content model is inherited from the parent element. (This is a slight oversimplification of its actual meaning, but we can ignore the subtleties for now.) That is, when it’s included as a child of an element that only permits phrasing content, that restriction applies to the a element too. But when its parent permits flow content, so does the a element. If you were able to decipher that on your own from the proposed draft, then well done. I couldn’t.
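The simplified reading of Transparent given above can be sketched in a few lines of Python. This is an illustration only: the table of content models is a small hand-picked subset, the function name is hypothetical, and the fallback to flow content for an element with no non-transparent ancestor is an assumption of this sketch.

```python
# Hand-picked, simplified content models; an assumption for illustration,
# not a table taken from the spec or the draft.
CONTENT_MODELS = {
    "p": "phrasing",
    "span": "phrasing",
    "div": "flow",
    "section": "flow",
    "a": "transparent",
    "ins": "transparent",
}


def effective_content_model(path):
    """Return the content model that applies inside the last element of path.

    path lists element names from the outermost ancestor to the element
    itself. Transparent elements defer to the nearest ancestor that is
    not transparent.
    """
    for name in reversed(path):
        model = CONTENT_MODELS[name]
        if model != "transparent":
            return model
    return "flow"  # assumption: nothing to inherit from, so allow flow content


print(effective_content_model(["p", "a"]))           # phrasing
print(effective_content_model(["div", "a"]))         # flow
print(effective_content_model(["div", "ins", "a"]))  # flow
```

The same a element ends up with a different set of permitted children depending on where it sits, which is exactly the information that is hard to extract from the schema notation.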

By now, you may be asking, if this proposal isn’t really suitable for web developers, then who is it suitable for? It’s a question that has been asked several times on the mailing list, and yet one that has not yet been adequately answered. I’ll do my best to explain how I see it shortly. But first, there’s a little background to cover.

The Spec Splitters

Within the working group, as expected, many people have a very diverse range of opinions. In particular, a number of individuals share the opinion that the current HTML 5 spec is far too monolithic and that it should be split. There’s nothing inherently wrong with that position, per se. There are indeed sections of the spec that nearly everyone agrees should be, or have already been, separated out into their own specifications.

For instance, XMLHttpRequest was, at one time, part of HTML5. This was taken out a long time ago and moved to the WebApps working group, where it has thrived independently from HTML5 ever since. More recently, the web sockets protocol and API have also been split into their own specs, as have content sniffing, the HTTP Origin header, and more.

The issue is that a number of individuals want the spec split in ways that aren’t entirely sensible. This includes the idea of splitting the spec along the lines of a conforming, declarative language definition and separate implementation requirements. There are even those who would go so far as to say that only the former should be defined, effectively leaving the implementers to fend for themselves. But I’ll spare you from the horror of such extremes, as the group moved beyond that debate long ago, and merely deal with those who want to split the spec.

From a high-level perspective, the concept of splitting the spec along those lines looks reasonable. These two seemingly independent components intuitively feel like they could be defined separately. That is, until you start to appreciate just how intertwined these sections are, and where exactly they want to draw the line.

It is argued that the language spec should only describe the conforming syntax and content models of the HTML markup alone. This would omit any details about how such features are processed and provide limited information about what they do. It would also omit any and all details about the associated DOM APIs.

The semantics of elements and attributes are closely related to what functionality they provide, which is itself closely related to the implementation requirements. Consider, for example, the heading and sectioning elements. Their semantics are useful for providing hierarchical document structures, with varying levels of headings. This is very closely related to the processing requirements for creating an outline. Authors need to know how to mark up their heading structures, and implementers need to know how to interpret them.
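To make that connection concrete, here is a deliberately simplified Python sketch of turning a flat sequence of headings into a nested outline. It is not the outline algorithm from the spec: it looks only at heading rank, ignores sectioning elements entirely, and the function name and sample headings are hypothetical.

```python
def build_outline(headings):
    """Nest (rank, text) pairs into a tree; rank 1 is the highest rank.

    Simplifying assumption: only heading rank is considered, so this is
    an illustration rather than the spec's outline algorithm.
    """
    root = {"rank": 0, "text": None, "children": []}
    stack = [root]
    for rank, text in headings:
        # Close every open section whose heading is of equal or lower
        # importance (an equal or larger rank number) than the new one.
        while stack[-1]["rank"] >= rank:
            stack.pop()
        node = {"rank": rank, "text": text, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


# Hypothetical document: an h1, two h2 sections, and an h3 inside the first h2.
outline = build_outline([(1, "Guide"), (2, "Setup"), (3, "Options"), (2, "Usage")])
print([child["text"] for child in outline[0]["children"]])  # ['Setup', 'Usage']
```

Even in this reduced form, what an author is allowed to write (which heading follows which) and what an implementation computes from it (the tree) are two views of the same rule.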

Consider also that the DOM APIs for many elements reflect the values of the content attributes. The processing requirements for getting and setting such properties are very dependent upon the processing requirements for the attributes themselves, which are in turn dependent upon the conforming values of those attributes.
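A rough Python model of such a reflecting property shows how tightly the two are coupled. The class and method names are hypothetical stand-ins for DOM interfaces, and returning an empty string for a missing attribute is a simplification assumed for this sketch.

```python
class Element:
    """Minimal stand-in for a DOM element holding content attributes."""

    def __init__(self):
        self.attributes = {}

    def get_attribute(self, name):
        return self.attributes.get(name)

    def set_attribute(self, name, value):
        self.attributes[name] = str(value)


class Anchor(Element):
    """Hypothetical model of an a element whose target property reflects
    the target content attribute."""

    @property
    def target(self):
        value = self.get_attribute("target")
        # Assumption: a missing attribute reads back as the empty string.
        return "" if value is None else value

    @target.setter
    def target(self, value):
        self.set_attribute("target", value)


link = Anchor()
link.target = "_blank"               # setting the property writes the attribute
print(link.get_attribute("target"))  # _blank
```

The property has no state of its own, so any rule about what the attribute may contain or how it is processed is automatically a rule about the API as well.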

There are many more examples of such interconnected dependencies, but I won’t try to list them all. Suffice it to say that splitting the spec makes it much harder to manage the integration points between these highly interconnected sections, and creates a greater risk of things not being defined well. Such a situation would inevitably lead to interoperability problems, which end up hurting not only implementers, but everyone involved, including authors and users.

The Wedge Strategy

Despite the significant resistance to splitting out the language definition, there has still been a significant push for there to be a document that normatively defines it separately from the implementation requirements, and this draft has been put forth with the intention of doing just that.

However, since the spec has not been split in the way described above, and hopefully won’t be, we are left with a situation where we have two drafts, the HTML5 spec itself and this proposal, each claiming to normatively define the language.

But some people seem to be willing to use this to get their way, even if it means normatively defining the language twice, in two separate specs. This is of course absurd. Having two normative documents, each defining things in its own way, will inevitably lead to conflicts between the two specs, which then raises the question of which takes precedence.

While people claim that it’s possible to define things normatively in two separate specs and keep them in sync, there is no evidence to support that claim and plenty of evidence against it. Suffice it to say that it won’t work and will lead to one of two possible outcomes:

  1. The conforming language definition is split from the main spec, leaving it to be defined only in this proposal. This, as I explained above, would be bad.
  2. The proposal becomes non-normative, leaving the spec itself as the single authoritative normative source. This is what I have been and will continue to push for.

The Audience of the Proposal

As I briefly explained above, given the content of the draft, it is not really suitable for the vast majority of web developers. In fact, its audience is, in practice, despite claims to the contrary, severely limited in scope to a small minority of people who are comfortable with reading complicated schemas and regular expressions, and who actually have some use for them.

Schemas are primarily designed for the purpose of conformance checking. Specifically, they are meant for tools that read the document and compare it with the grammar described in the schema. This is effectively what validators do, although it should be noted that schemas are not the only means of achieving this goal.

So it is somewhat useful for people writing tools with conformance checking features, since they can, if they choose, incorporate the schemas from the spec into their own tools, or use them as a guide for creating their own. However, it doesn’t provide all the information necessary for such developers, as they will still need to turn to the main spec for many implementation requirements, particularly parsing.

What about Web Developers?

Web developers certainly haven’t been forgotten. Their needs are just as important to address as those of implementers. But I and many others recognise that such developers, many of whom aren’t comfortable with normative spec language, need something specifically targeted at them. For this, there are now two separate, non-normative drafts under development.

The first, currently entitled the HTML 5 Reference, is really a reference guide for web developers that will explain the elements, attributes and their semantics, the syntax and DOM APIs, and provide plenty of explanatory material and examples showing how and why to use each feature. This is a draft that I’m working on, and I have recently started to make some significant progress with it.

The second is a new proposal by Dan Connolly, for which there is currently no draft available. This document is intended to be more of a step-by-step, cookbook-style guide to writing pages using HTML5, with a big focus on the multimedia aspects. For example, it will provide things like:

  • How to embed a video within a page and provide customised controls using the DOM API.
  • How to indicate the completion status of a web application using a progress bar.
  • How to mark up images with captions.
  • etc.