{"id":119,"date":"2006-09-22T16:51:41","date_gmt":"2006-09-22T16:51:41","guid":{"rendered":"http:\/\/lachy.id.au\/log\/2006\/09\/xml-prolog"},"modified":"2006-10-29T18:42:34","modified_gmt":"2006-10-29T18:42:34","slug":"xml-prolog","status":"publish","type":"post","link":"https:\/\/lachy.id.au\/log\/2006\/09\/xml-prolog","title":{"rendered":"XML Prologue"},"content":{"rendered":"<p>One thing I come across frequently is incorrect terminology.  I\u2019ve written\r\n\tabout this topic once before (see <a href=\"http:\/\/lachy.id.au\/log\/2004\/12\/html-tags\">HTML\r\n\tTags<\/a>) and others have discussed similar\r\n\ttopics as well, particularly relating to elements, attributes and tags.  But\r\n\ta more specific area that deserves a little more attention is the distinction\r\n\tbetween the <code>DOCTYPE<\/code>, the XML declaration and the XML prolog and other things\r\n\twithin it.<\/p>\r\n<p>The <dfn id=\"dfn-xml-prolog\">XML Prolog<\/dfn> is the section at the beginning of an XML document which includes\r\n\teverything that appears before the document\u2019s root element.  The XML declaration,\r\n\tthe <code>DOCTYPE<\/code> and any processing instructions or comments may all be a part of\r\n\tit.  
The following figure illustrates this concept.<\/p>\r\n<p><img decoding=\"async\" loading=\"lazy\" src=\"\/lib\/images\/2006\/XML-Prolog-20060922.png\" alt=\"The diagram highlights the XML Prolog at the beginning of a sample XHTML 1.0 document containing the XML declaration, a processing instruction, a comment and the DOCTYPE.\" width=\"500\" height=\"200\" \/><\/p>\r\n<p>In fact, the XML Prolog is present in every XML document, though it\r\n\tmay be empty because each of those items is optional in some circumstances.<\/p>\r\n\r\n<h3 id=\"prolog-xml-decl\">The XML Declaration<\/h3>\r\n<pre><code>&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;<\/code><\/pre>\r\n<p>The <dfn id=\"dfn-xml-decl\">XML declaration<\/dfn>, if present, must occur at the very beginning of the file. \r\n\tIt may not be preceded by anything except for a possible Byte Order Mark (depending\r\n\ton the character encoding).  It is mostly used to provide XML version information\r\n\tand to declare the character encoding of the document.  There is also the\r\n\tstandalone document declaration, but since it\u2019s rarely needed or\r\nused and its purpose is not easy to explain, just ignore it.<\/p>\r\n<p>Presently, only <a href=\"http:\/\/www.w3.org\/TR\/xml\/\">XML 1.0<\/a> and <a href=\"http:\/\/www.w3.org\/TR\/xml11\/\">XML\r\n\t\t1.1<\/a> are defined.  Either may be used, but\r\n\tthe decision should not be made lightly.  Do not just use <code>version=&quot;1.1&quot;<\/code> because\r\n\tit is a higher version number.  For most authors these days, <code>version=&quot;1.0&quot;<\/code> should\r\n\tbe used.  In fact, unless you have a specific reason that requires the use of\r\n\tXML 1.1 features, you should stick with 1.0.<\/p>\r\n<p>The encoding declaration, if present, must declare the encoding of the document. \r\n\tAuthors may use any encoding supported by user agents, but are encouraged to\r\n\tuse charsets registered with IANA (preferably UTF-8 or UTF-16).  
If the declaration\r\n\tis not present, the document must be encoded as UTF-8 or UTF-16 (unless the encoding is specified\r\n\tby a higher-level protocol, like HTTP).<\/p>\r\n\r\n<h3 id=\"prolog-pi\">Processing Instructions<\/h3>\r\n<pre><code>&lt;?xml-stylesheet type=\"text\/css\" href=\"\/style\/design\"?&gt;<\/code><\/pre>\r\n<p><dfn id=\"dfn-pi\">Processing Instructions<\/dfn> are used to provide instructions to applications processing\r\n\tthe document.  The example of the <code>xml-stylesheet<\/code> <abbr title=\"Processing Instruction\">PI<\/abbr> given in the above diagram\r\n\tis used to instruct an application to apply a stylesheet to the document.<\/p>\r\n<p><abbr title=\"Processing Instructions\">PIs<\/abbr> can be used almost anywhere within the document, though only those that\r\n\tappear prior to the root element are considered part of the prolog.<\/p>\r\n\r\n<h3 id=\"prolog-comments\">Comments<\/h3>\r\n<pre><code>&lt;!-- This is a comment --&gt;<\/code><\/pre>\r\n<p>Most people know what comments are, so there\u2019s not much I need to say about them. \r\n\tHowever, like <abbr title=\"Processing Instructions\">PIs<\/abbr>, they\u2019re only considered part of the prolog if they appear\r\n\tbefore the root element.<\/p>\r\n\r\n<h3 id=\"prolog-doctype\">The Document Type Declaration<\/h3>\r\n<pre><code>&lt;!DOCTYPE html PUBLIC \"-\/\/W3C\/\/DTD XHTML 1.0 Strict\/\/EN\"\r\n    \"http:\/\/www.w3.org\/TR\/xhtml1\/DTD\/xhtml1-strict.dtd\"&gt;<\/code><\/pre>\r\n<p>Many authors will have seen and used a <code>DOCTYPE<\/code> in their documents, although\r\n\tthere are still many who don\u2019t.  The <code>DOCTYPE<\/code> is used to reference a Document\r\n\tType Definition and is mostly used for validation purposes.<\/p>\r\n<p>Many people know that using specific <code>DOCTYPE<\/code>s will trigger standards mode\r\n\tin browsers, but this does not apply to XML documents.  <code>DOCTYPE<\/code> sniffing only\r\n\tapplies to HTML documents (i.e. 
any document served as <code>text\/html<\/code>).  Browsers\r\n\thave, thankfully, not introduced it into XML processing.  Henri Sivonen explains\r\n\tmore about this in <a href=\"http:\/\/hsivonen.iki.fi\/doctype\/\">Activating the\r\n\tRight Layout Mode Using the Doctype Declaration<\/a>.<\/p>\r\n","protected":false},"excerpt":{"rendered":"A simple explanation of the XML prolog, XML declaration, DOCTYPEs and related markup.","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[2,7],"tags":[],"_links":{"self":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/posts\/119"}],"collection":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/comments?post=119"}],"version-history":[{"count":0,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/posts\/119\/revisions"}],"wp:attachment":[{"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/media?parent=119"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/categories?post=119"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lachy.id.au\/log\/wp-json\/wp\/v2\/tags?post=119"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}