Category Archives: Standards

Standards, protocols, recommendations and guidelines.

CSS Guru Explains A<b>use

A while back, Eric Meyer wrote an article regarding heading levels in which he mentioned the use of a <b> element within his markup. He has finally written up his reasons for doing so.

In Eric’s current design, the <b> elements are not much more than a leftover from a previous design. He states:

The boldface element is actually a holdover from the previous designs of meyerweb. … The original idea was to provide an inline element as a hook on which I could hang some styles.

He then goes on to explain how he’s using it to apply some styles which would not work well had they been applied to the heading containing it. No-one has complained about the usefulness of having an extra element to play with for styling purposes; however, most of us would have used a <span> element instead. But Eric defends his decision by saying that:

… the element name is three letters shorter, so for every hook, I’m saving six characters. If there are, say, twenty such hooks on a page, that saves me 120 characters. It’s a small consideration, but by such incremental savings are document weights reduced.
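As a rough check of that arithmetic, here is a minimal Python sketch. The helper name tag_overhead is hypothetical (it does not come from Eric's post); it simply counts the characters in an opening and closing tag pair.

```python
def tag_overhead(name: str) -> int:
    """Characters used by one opening/closing tag pair, e.g. <b></b>."""
    return len(f"<{name}></{name}>")


# Difference between a <span> hook and a <b> hook
saving_per_hook = tag_overhead("span") - tag_overhead("b")

# The "twenty such hooks on a page" from the quote
hooks = 20
total_saving = saving_per_hook * hooks

print(saving_per_hook)  # 6
print(total_saving)     # 120
```

The same calculation gives 12 characters for a page with only two hooks.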

The document weight may be reduced, but there are other ways to do that without sacrificing semantic purity. To demonstrate, I did a little experiment. I saved the markup for the article (after 10 comments). Firstly, I converted the document to UTF-8, saved without the BOM, which required replacing all the character entities used in the file with the UTF-8 encoded characters themselves. Each entity changed was documented in this comparison chart, which records, for each entity, the Unicode character name, hexadecimal character reference, original markup entity used, original byte count, new markup, UTF-8 hexadecimal bytes, new byte count, saving per character, number of instances of the entity and, finally, the total savings for the document. In total, this saved 439 bytes. Afterwards, the <b> elements were converted to <span> elements. There were two in this document, which added 12 bytes back.

Note: There were originally 2 additional opening tags, with no closing tags, for <b> elements. However, these were in the comments and were meant to be encoded using &lt; and &gt;. Thus, these two markup errors were corrected before this experiment was started.
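To illustrate where the per-entity savings come from, here is a minimal sketch in Python. The entities listed are illustrative examples only, not necessarily the ones that appeared in the article, and entity_saving is a hypothetical helper name.

```python
import html


def entity_saving(entity: str) -> int:
    """Bytes saved by replacing an entity with its UTF-8 encoded character."""
    original_bytes = len(entity.encode("ascii"))
    # html.unescape resolves named and numeric character references
    character = html.unescape(entity)
    new_bytes = len(character.encode("utf-8"))
    return original_bytes - new_bytes


# Illustrative entities (an assumption, not the article's actual list)
examples = ["&rsquo;", "&mdash;", "&hellip;", "&eacute;", "&#8217;"]

for entity in examples:
    print(entity, entity_saving(entity))
```

Multiplying each saving by the number of times the entity occurs, and summing over all entities, gives the kind of document total reported in the chart.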

Following this, the markup errors resulting from the XML syntax used on empty elements were corrected. There were a total of 23 errors of this kind in the document, each comprising an unnecessary space (U+0020) and solidus (aka. slash – U+002F). This removed an additional 46 characters from the files.
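A correction of this kind can be sketched in a few lines of Python. This is only a rough illustration, assuming every such error is literally a space followed by a solidus immediately before the closing angle bracket; the function name and the sample markup are made up for the example, and a real document would need a proper parser to avoid touching attribute values or text.

```python
import re

# A space and a solidus directly before ">" (the XML empty-element syntax)
XML_EMPTY_SYNTAX = re.compile(r" />")


def strip_xml_empty_syntax(markup: str) -> tuple[str, int]:
    """Return the cleaned markup and the number of characters removed."""
    cleaned = XML_EMPTY_SYNTAX.sub(">", markup)
    return cleaned, len(markup) - len(cleaned)


# Made-up sample containing four empty elements
sample = '<p>One<br />Two<br />Three</p><hr /><img src="a.png" alt="" />'
cleaned, removed = strip_xml_empty_syntax(sample)

print(cleaned)
print(removed)  # 8, i.e. 2 characters for each of the 4 empty elements
```

With 23 occurrences, the same rule removes 23 × 2 = 46 characters.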

The sample files after these corrections were made are available. The markup errors were corrected in all files because the main focus of this file size comparison is the saving from character encoding and the use of <b> instead of <span>. The files are:

Example      Encoding                File Size (Bytes)
Example 1    ISO-8859-1              22,375
Example 2    UTF-8 using <b>         21,936
Example 3    UTF-8 using <span>      21,948

That is a saving of 427 bytes, even with <span> elements used in place of <b>. Additionally, by checking the HTTP response headers, it’s easy to see that the documents are being served compressed using gzip encoding. The small number of bytes saved by using <b> over <span> is minuscule compared with the saving achieved by gzip. For that article, the original file size was 24,576 bytes (as served from meyerweb, including markup errors), while the Content-Length indicated by the HTTP response headers is 7726 bytes. Thus, even the file size saved by converting to UTF-8 is small by comparison, but it is still much more than the difference from using <b> instead of <span>.
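The effect of gzip on this kind of saving can be tried out with a small Python sketch. The page built here is synthetic (a made-up document with twenty hooks, not the meyerweb markup), so the compressed sizes it prints are only indicative; the two constants at the end are the figures quoted above.

```python
import gzip


def build_page(tag: str, hooks: int = 20) -> bytes:
    """Build a made-up page with one styling hook per heading."""
    parts = ["<html><body>\n"]
    for i in range(hooks):
        parts.append(f"<h3><{tag}>Heading number {i}</{tag}></h3>\n")
        parts.append(f"<p>Some paragraph text for entry {i}.</p>\n")
    parts.append("</body></html>\n")
    return "".join(parts).encode("utf-8")


page_b = build_page("b")
page_span = build_page("span")

# Uncompressed difference: 6 bytes per hook
raw_diff = len(page_span) - len(page_b)

# Difference after gzip compression of each page
gz_b = gzip.compress(page_b)
gz_span = gzip.compress(page_span)
gz_diff = len(gz_span) - len(gz_b)

print("raw difference:", raw_diff)
print("gzip difference:", gz_diff)

# Figures quoted in the post for the article as served
served_raw = 24_576
served_gzip = 7_726
print("gzip saving on the article:", served_raw - served_gzip)  # 16850
```

Because the repeated tag names compress well, the difference between the two compressed pages is typically only a few bytes, while gzip itself removed 16,850 bytes from the article.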

Also, I should point out that many have commented that Eric has used a superfluous number of classes throughout his markup. I’m not going to judge his use of classes because it would take far too long to analyse how and why each class is used. I just wanted to point out that removing some could also reduce the file size.

“What about semantic purity?” you may ask. In my view, b and span have the same semantic value, which is to say basically none. They’re both purely presentational elements, with the difference that span doesn’t have any expected presentational effects in HTML.

The problem is not the fact that it has no semantics, but the fact that in visual user agents, it portrays semantics that it does not have. It relies on style sheets to remove that perception, so that the semantics of actually having no additional semantics is perceived (well, that is, not perceived) correctly by the user; in the absence of style sheets, the bold rendering remains. Thus, the style sheet in this case is, in a kind of backwards way, being used for semantic purposes. This, as everyone should know, breaks the rule of separating structure, content and presentation.

Finally, in an ideal, CSS3, non-IE world, he could use ::outside to provide the extra box and apply the styles directly to the containing element. But those days are still a long way off, so as an interim solution, I recommend the use of <span> for such purposes.

application/xhtml+xml+google = File Format: Unrecognized

Yesterday afternoon, while searching for my name, Lachlan Hunt, in Google to see where my site was ranked, I was surprised to see not only that my site was ranked 6th, beaten only by my Blogger profile, two pages on MSDN’s Channel9 Wiki that I’ve edited, and two Bobby Watchfire accessibility checks of my homepage (I don’t know why! Who’d be linking to those for Google to find?), but also that the description for my site turned out to be:

File Format: Unrecognized - View as HTML

This is because my homepage is currently only being served as application/xhtml+xml, and it was surprising for two reasons. Firstly, I thought that Google would at least have been designed to be able to parse XHTML, even if it were only treating it as tag soup like everything else it searches. Secondly, the View as HTML link was still included, even though Google had no idea what format it was, nor how to parse it. If you actually follow that link now, the page contains nothing except the Google branding and disclaimer, which it invalidly puts at the top of every cached and View as HTML page it generates, above any <html> element and/or <!DOCTYPE> in the file.

When will Google learn to start writing valid HTML for all their pages, and when will they support industry standards? I thought that Internet Exploder was the only user agent lagging behind with standards!

Hixie’s Ruling and Composite Attribute

Over the last week, the hot topic of discussion among several bloggers has been the uproar over Safari’s proposed HTML extensions, and Hixie has finally put forward his opinion, which pretty much sums up the reasons why none of the five proposals are any good.

  1. New DashboardML DOCTYPE
  2. Namespaces in HTML
  3. Namespaces in XHTML served as text/html
  4. Namespaces in XHTML served as an XML MIME Type
  5. Replace the XHTML namespace with an Apple namespace

In usual Hixie style, he points out and backs up all the reasons why none of those options are acceptable solutions for Apple’s extensions, and for the most part, I agree with everything he said. However, the main focus of this post is not to praise Hixie for being the very wise man that he is (maybe another time), though I definitely agree that a new DOCTYPE is not a good solution unless it were confined only to the Dashboard. As I commented on one of Eric Meyer’s posts, if Microsoft and Netscape had created their own DTDs, we’d have no interoperable websites on the net, and it would have further encouraged the use of proprietary, presentational elements.

One point that Hixie didn’t touch on much, except to say that it was presentational, is the composite attribute. Dave Hyatt discussed this again in response to several people’s comments, including my own, that it should be in CSS, and attempted to explain the reasons behind this unfortunate decision to pollute HTML with additional presentational markup instead.

He mentions his past experience of introducing proprietary extensions to CSS, such as -moz-image-region, which he invented and which, he claims, was largely derided by the CSS WG, and says that he didn’t want to make the same mistake again. Well, my response to this is: which is better?

  1. Adding a proprietary extension to a language which specifically allows this using a special syntax, such as -khtml-composite? Or
  2. Extending a language which does not, using the methods mentioned earlier, which, as Hixie explained, is not acceptable?

Dave also mentions:

“We didn’t want to introduce a new CSS property, however, because how to composite should be specifiable anywhere you use an image in CSS properties like content, background-image, or list-style-image. However, normal specification of the foreground URL of an img tag is done in the HTML itself. That meant there was no consistent solution that could be employed.”

I’m struggling to interpret that due to his exceptionally poor grammar; I guess he was in a rush when he wrote it. I think it means that if such a property were added to CSS, then it should apply equally and consistently to any image, whether it’s in the foreground (e.g. <img src="uri"/>) or in the background, such as with the CSS properties he mentions. Dave, if you’re reading this, I think some clarification would be useful.

Anyway, I don’t see why such a property would not work. Couldn’t the property be used to determine how any element’s content, including any text, images or other content, is rendered? Why can it only apply to foreground images, specifically the <img/> element?

I’m not exactly sure what visual effect the composite attribute is supposed to achieve – I couldn’t find much documentation on it, and none that I found was related directly to the Dashboard. All I found was about the new Core Image graphics processing system, but it just lists the value names, without actually describing what each does; so some clarification on those would be good too. Perhaps then I might be able to understand Dave’s explanation.

It would seem that none of these extensions have been accepted by the majority of the blogging community (at least the ones I read), and none of the proposed solutions are really any good. So, I would have to finish up by agreeing with Hixie: the best way to proceed is to move the discussion to an open discussion list and design some acceptable, interoperable solutions. So let’s get brainstorming, people: how could any of these extensions be made to work? I may come up with some more ideas once I more fully understand what exactly these extensions are supposed to do.