XHTML - What's that all about?
XHTML is a reformulation of HTML as XML. It aims to cleanly separate document structure from presentation. To write XHTML you have to follow these rules:
- The document must start with a doctype. There are 4 common doctypes:
- XHTML 1.0 Transitional
- XHTML 1.0 Strict
- XHTML 1.0 Frameset
- XHTML 1.1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- The html element must specify a default namespace:
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
- Script and style elements should wrap their content in CDATA sections. This is because JavaScript can contain characters like < and &, which are not allowed as literal text in XML. To stop older browsers from choking on the CDATA markers, a JavaScript comment, //, is put before each of them.
<script type="text/javascript">
//<![CDATA[
var i = 0;
while (++i < 10) {
  // ...
}
//]]>
</script>
- All elements and attributes must be in lower case. (XML is case sensitive.)
- Attribute values must be in quotes (single or double).
- Attributes cannot be minimized: every attribute must have a value. This leads to the rather bizarre syntax for checked, readonly, disabled and selected:
<input type="checkbox" checked="checked" />
<input readonly="readonly" />
<input disabled="disabled" />
<option selected="selected">...</option>
- Elements must be closed.
- In particular, for the empty elements line break and horizontal rule, use <br /> and <hr />. Using this compact form rather than the equally legal <br></br> gives better compatibility with older browsers. The space before the /> is also there for older browsers.
- Elements must nest correctly.
- The id attribute should be used instead of the name attribute to identify these elements: <a>, <form>, <img>, and <map>.
Most of these rules are basically saying that the document must be well-formed XML, but with the added twist of making it work with older browsers. You can check the validity of your XHTML against its doctype using the W3C validator.
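The XML side of these rules can be checked mechanically. The following is a minimal sketch in Python, assuming only the standard library's xml.etree.ElementTree parser; the function name is_well_formed and the sample fragments are made up for illustration. It only tests whether a fragment parses as XML and is no substitute for the W3C validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments, made up for this illustration.
GOOD = '<p>One<br />Two <input type="checkbox" checked="checked" /></p>'
BAD = {
    "unclosed element": "<p>One<br>Two</p>",
    "unquoted attribute value": "<p><input type=checkbox /></p>",
    "minimized attribute": '<p><input type="checkbox" checked /></p>',
    "bad nesting": "<p><em><strong>text</em></strong></p>",
    "mixed case tags": "<P>text</p>",
}

print("good fragment:", is_well_formed(GOOD))
for rule, fragment in BAD.items():
    print(rule + ":", is_well_formed(fragment))
```

Note that the mixed case fragment is rejected only because its start and end tags do not match. <P>text</P> would parse as XML, yet it is still not XHTML, because the doctype only defines lower-case names. Catching that kind of error needs a validating tool such as the W3C validator.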
Html, head, title and body are mandatory
In XHTML the html, head, title and body elements are mandatory. So a minimal XHTML document is:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Controversial Title</title>
</head>
<body>
<h1>Interesting Content</h1>
</body>
</html>
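As a rough illustration, the presence of the four mandatory elements can be tested with a few lines of Python. This sketch assumes the standard library parser, which reads the doctype line without fetching the DTD; the helper name missing_mandatory is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The minimal document from the text above.
MINIMAL = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Controversial Title</title>
</head>
<body>
<h1>Interesting Content</h1>
</body>
</html>"""


def missing_mandatory(markup):
    """Return the names of mandatory elements that are absent."""
    root = ET.fromstring(markup)
    ns = {"x": XHTML_NS}
    missing = []
    # The root element must be html in the XHTML namespace.
    if root.tag != "{%s}html" % XHTML_NS:
        missing.append("html")
    # head and body are children of html; title is a child of head.
    checks = [("x:head", "head"), ("x:head/x:title", "title"), ("x:body", "body")]
    for path, name in checks:
        if root.find(path, ns) is None:
            missing.append(name)
    return missing


print(missing_mandatory(MINIMAL))
```

The check is deliberately shallow: it confirms that the elements exist in the expected places, not that the document is valid against the Strict DTD.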
Difference between Strict and Transitional
The Strict doctype removes the presentational elements and attributes from the document, leaving presentation to CSS. In particular, the <center> and <font> elements may not be used, and neither may <iframe>.
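A quick way to spot such elements before switching a page from Transitional to Strict is to walk the parsed document. The sketch below is in Python with the standard library only; the set NOT_IN_STRICT holds just the three elements named above and is not the full list of differences, and strict_violations is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Only the three elements named in the text; the Strict DTD drops more
# elements and attributes than these.
NOT_IN_STRICT = {"center", "font", "iframe"}


def strict_violations(markup):
    """Return, in document order, the names of elements not allowed in Strict."""
    root = ET.fromstring(markup)
    found = []
    for element in root.iter():
        # Drop the "{namespace}" prefix that ElementTree puts on tag names.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in NOT_IN_STRICT:
            found.append(local_name)
    return found


# Hypothetical Transitional-style fragment, made up for this illustration.
SAMPLE = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<center><font color="red">Hi</font></center><p>ok</p>'
    "</div>"
)
print(strict_violations(SAMPLE))
```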
Special Rules
The general rule is that:
- Block-level elements may contain inline elements and block elements.
- Inline elements may contain only data and other inline elements.
- <p> and <h1>...<h6> are block-level elements, but may only contain inline elements. In other words, paragraphs and headings are the innermost block-level elements in the document.
- <blockquote>, <body>, and <form> elements may only contain block elements as direct children (Strict doctype).
- <form> must not contain other <form> elements at any level.
- Other more esoteric rules.
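Two of the rules above, no block elements inside <p> or a heading and no <form> inside another <form>, can be sketched as a small tree walk. This is an illustration in Python, not a validator: the BLOCK set is a partial list chosen for the example, and nesting_problems is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Partial list of block-level elements, assumed here for illustration.
BLOCK = {"p", "div", "ul", "ol", "table", "blockquote", "form", "pre", "hr",
         "h1", "h2", "h3", "h4", "h5", "h6"}
# Block-level elements that may only contain inline content.
INLINE_ONLY = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}


def local_name(element):
    """Tag name without the "{namespace}" prefix added by ElementTree."""
    return element.tag.rsplit("}", 1)[-1]


def nesting_problems(markup):
    """Return descriptions of the rule violations found, in document order."""
    problems = []

    def walk(element, inside_form):
        name = local_name(element)
        if name == "form" and inside_form:
            problems.append("form inside form")
        for child in element:
            if name in INLINE_ONLY and local_name(child) in BLOCK:
                problems.append("<%s> inside <%s>" % (local_name(child), name))
            walk(child, inside_form or name == "form")

    walk(ET.fromstring(markup), False)
    return problems


# Hypothetical fragment that breaks both rules.
SAMPLE = (
    "<body><p><div>x</div></p>"
    '<form action="a"><div><form action="b"><div></div></form></div></form>'
    "</body>"
)
print(nesting_problems(SAMPLE))
```

The nested form is reported even though a <div> sits between the two forms, which matches the "at any level" wording of the rule.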