XHTML, the eXtensible Hyper Text Markup Language
XHTML was designed to be extensible. One benefit of that extensibility is that you should be able to use your own elements and entities in your documents.
That is the theory.
In practice, the major browsers do not handle this extensibility consistently. Each one reacts differently, which makes XHTML very difficult to extend: when you get something working in one browser, it usually breaks in another. In the past, this made me give up on XHTML extensibility quite a few times.
Why would I extend XHTML?
There are various reasons to extend XHTML. Here is a non-exhaustive list:
- To clarify and simplify the website markup. Wouldn’t it be great if, instead of a <div id="menu">, you could use a <menu> element directly?
- To protect content in your document using custom entities. You may want to protect some links, or keep spam bots from scanning your page for email addresses.
- Because you can. Yes, that’s a good reason.
- [insert your own good reason to extend XHTML]
The traditional way
It is possible to use your own elements and entities in your XHTML document. For example, let’s see how to add your own entities to the XHTML DTD to obfuscate your email address. The original idea of email obfuscation using XML entities comes from Pablo (http://p4bl0.net/blog/post/Protection-d-email-contre-le-spam-en-XHTML.html). Have a look at example 1:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
  <!ENTITY mailto "mailto:">
  <!ENTITY username "gabriel">
  <!ENTITY arobase "@">
  <!ENTITY hostname "gabsoftware">
  <!ENTITY tld ".com">
  <!ENTITY email "&username;&arobase;&hostname;&tld;">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Extending XHTML - Example 1</title>
  </head>
  <body>
    <p>My email is &email;</p>
  </body>
</html>
This should result in the email being displayed in the browser. If not, then please read the next section.
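To make the effect of the entity trick concrete, here is a minimal sketch in Python (standard library only). It assumes a naive scraper that pattern-matches the raw source, and compares that with what an XML-aware client sees once the entities are expanded. The names EMAIL_PATTERN, emails_in_raw_source and emails_after_parsing are made up for this illustration, and the DOCTYPE is reduced to an internal subset so that the parser needs no external DTD.

```python
import re
import xml.etree.ElementTree as ET

# Same entity scheme as example 1, reduced to an internal DTD subset.
SOURCE = """<!DOCTYPE html [
  <!ENTITY username "gabriel">
  <!ENTITY arobase "@">
  <!ENTITY hostname "gabsoftware">
  <!ENTITY tld ".com">
  <!ENTITY email "&username;&arobase;&hostname;&tld;">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <body><p>My email is &email;</p></body>
</html>"""

# Deliberately simple pattern; real scrapers may use something else.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def emails_in_raw_source(source):
    """Addresses found by pattern-matching the unparsed source text."""
    return EMAIL_PATTERN.findall(source)


def emails_after_parsing(source):
    """Addresses found once an XML parser has expanded the entities."""
    root = ET.fromstring(source)
    return EMAIL_PATTERN.findall("".join(root.itertext()))


print(emails_in_raw_source(SOURCE))
print(emails_after_parsing(SOURCE))
```

The address never appears as a single string in the source, so the raw scan comes back empty, while anything that actually parses the XML recovers it. The protection therefore only holds against bots that do not parse XML.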
Let’s now see how to add your own elements to the XHTML DTD:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
  <!ELEMENT menu (#PCDATA)>
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Extending XHTML - Example 2</title>
  </head>
  <body>
    <menu>This is the menu</menu>
  </body>
</html>
This should result in the browser not complaining when it encounters the new <menu> element, and displaying its inner content.
Finally, let’s see how to add your own elements using XML namespaces:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:custom="http://your-namespace-url.com">
  <head>
    <title>Extending XHTML - Example 3</title>
  </head>
  <body>
    <custom:menu>This is the menu</custom:menu>
  </body>
</html>
This should also result in the browser not complaining when it encounters the <custom:menu> element and displaying its inner content.
Limitations of the traditional way
There are several limitations to the previous approaches. The first one you may have noticed is that Internet Explorer does not render .xhtml files; instead, it offers to download them. But that is not the only one. Here is a list of all the limitations I know of:
- The document must be served as application/xhtml+xml or application/xml
- The document extension cannot be .html or .htm, it must be .xhtml or .xml
- It is not cross-browser compatible
- You cannot easily add an anchor link inside a custom element, and if you manage to do it, the link will not be clickable in Internet Explorer and possibly WebKit-based browsers
- It is not convenient to add a namespace prefix to the custom elements
- It makes the document very difficult to render identically in the different browsers
If that is not enough to make you give up on the traditional ways of extending XHTML, it was for me. So, isn’t there a way to do this that is cross-browser compatible and does not need namespace prefixes? Fortunately, yes: we are not stuck with those limitations.
A different approach to XHTML extensibility: XSL transformations
XSL transformations are very useful. They can transform an input XML document into another. They are quite easy to understand and to use. For us, this is the perfect solution.
Known limitation:
In Chrome, and possibly Safari, XSL transformations cannot be used if the document is a local file (not served by a web server). However, they work flawlessly when the document is served by a web server. I know of no way to circumvent this limitation from within the document, but it is not really a limitation if you plan to serve your document from a web server.
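For local testing, a small static server that labels the files with XML media types can stand in for a real web server. The following is only a sketch based on Python's http.server module. The class name XhtmlRequestHandler, the serve helper, the port, and the choice of application/xml for .xsl files are all assumptions made for this illustration, not something the examples require.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class XhtmlRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that serves XHTML and XSL files as XML."""

    # Extend the default extension-to-media-type table.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".xhtml": "application/xhtml+xml",
        ".xml": "application/xml",
        ".xsl": "application/xml",  # assumption: a generic XML type
    }


def serve(directory=".", port=8000):
    """Serve `directory` on http://127.0.0.1:<port>/ until interrupted."""
    handler = partial(XhtmlRequestHandler, directory=directory)
    with ThreadingHTTPServer(("127.0.0.1", port), handler) as server:
        server.serve_forever()
```

Calling serve() from the folder that contains the example files makes them reachable over HTTP, which avoids the local-file restriction described above.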
Making it work
To do what we want using XSL transformations, we simply write our XHTML document the way we want it to be. It has to be a valid and well-formed XML document. Let’s see how to make example 1 work using XSLT in example 4:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="xslformating" type="text/xsl" href="example4.xsl"?>
<!DOCTYPE html [
  <!ENTITY mailto "mailto:">
  <!ENTITY username "gabriel">
  <!ENTITY arobase "@">
  <!ENTITY hostname "gabsoftware">
  <!ENTITY tld ".com">
  <!ENTITY http "http://">
  <!ENTITY www "www.">
  <!ENTITY email "&username;&arobase;&hostname;&tld;">
  <!ENTITY website "&http;&www;&hostname;&tld;">
]>
<html xml:lang="en">
  <head>
    <title>Extending XHTML - Example 4</title>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
  </head>
  <body>
    <p>This email is obfuscated : <a href="&mailto;&email;">&email;</a></p>
  </body>
</html>

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#default"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes" media-type="application/xhtml+xml"
      omit-xml-declaration="no" encoding="utf-8" version="1.0"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />

  <!-- All tags -->
  <xsl:template match="//*">
    <xsl:element name="{name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
Hurray! The obfuscated email now displays correctly in all browsers.
Let’s now see how to add our own elements in example 5. For instance, we will add a <bold> element:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="xslformating" type="text/xsl" href="example5.xsl"?>
<!DOCTYPE html>
<html xml:lang="en">
  <head>
    <link rel="stylesheet" href="example5.css" type="text/css" media="screen" />
    <title>Extending XHTML - Example 5</title>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
  </head>
  <body>
    <p>This version of XHTML has been extended. For example it can make text <bold>bold</bold>.</p>
  </body>
</html>

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#default"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes" media-type="application/xhtml+xml"
      omit-xml-declaration="no" encoding="utf-8" version="1.0"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />

  <!-- All tags -->
  <xsl:template match="//*" priority="10">
    <xsl:element name="{name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- BOLD tag -->
  <xsl:template match="//bold" priority="20">
    <xsl:element name="strong">
      <xsl:copy-of select="@*"/>
      <xsl:copy-of select="node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
strong { font-weight: bold; }
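Notice how we added priority attributes to the templates in the XSL stylesheet. They ensure that <bold> elements are handled by their own template instead of being copied by the first “all tags” rule. This is not always necessary, but it is a sensible precaution here: both patterns start with //, which gives them the same default priority (0.5) in XSLT 1.0, so without explicit priorities the processor would have to resolve the conflict itself, typically by picking the matching template that comes last in the stylesheet.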
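As a rough mental model only, template selection can be pictured as "among the rules that match, take the one with the highest priority". The Python sketch below mirrors the two templates of example 5. It is not an XSLT processor (a real one also takes import precedence and default priorities into account), and the names RULES and select_rule are made up for this illustration.

```python
# Each rule: (priority, predicate on the element name, what the rule does).
RULES = [
    (10, lambda name: True, "copy element as-is"),        # match="//*"
    (20, lambda name: name == "bold", "emit <strong>"),   # match="//bold"
]


def select_rule(name):
    """Return the action of the highest-priority rule matching `name`."""
    matching = [rule for rule in RULES if rule[1](name)]
    _, _, action = max(matching, key=lambda rule: rule[0])
    return action


print(select_rule("bold"))
print(select_rule("p"))
```

A <bold> element matches both rules, and the higher priority makes the dedicated rule win. Every other element only matches the catch-all rule.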
Another example: let’s obfuscate a link with a special new element <hiddenlink> in example 6:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="xslformating" type="text/xsl" href="example6.xsl"?>
<!DOCTYPE html [
  <!ENTITY mailto "mailto:">
  <!ENTITY username "gabriel">
  <!ENTITY arobase "@">
  <!ENTITY hostname "gabsoftware">
  <!ENTITY tld ".com">
  <!ENTITY http "http://">
  <!ENTITY www "www.">
  <!ENTITY email "&username;&arobase;&hostname;&tld;">
  <!ENTITY website "&http;&www;&hostname;&tld;">
]>
<html xml:lang="en">
  <head>
    <title>Extending XHTML - Example 6</title>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
  </head>
  <body>
    <p>
      This link is obfuscated :
      <hiddenlink destination="&website;" caption="Link title">&website;</hiddenlink>
    </p>
    <p>
      This obfuscated email uses an obfuscated link :
      <hiddenlink destination="&mailto;&email;" caption="Link caption">&email;</hiddenlink>
    </p>
  </body>
</html>

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#default"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes" media-type="application/xhtml+xml"
      omit-xml-declaration="no" encoding="utf-8" version="1.0"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />

  <!-- All tags -->
  <xsl:template match="//*" priority="10">
    <xsl:element name="{name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- hiddenlink tag -->
  <xsl:template match="//hiddenlink" priority="20">
    <xsl:element name="a">
      <xsl:if test="@destination">
        <xsl:attribute name="href">
          <xsl:value-of select="@destination"/>
        </xsl:attribute>
      </xsl:if>
      <xsl:if test="@caption">
        <xsl:attribute name="title">
          <xsl:value-of select="@caption"/>
        </xsl:attribute>
      </xsl:if>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
The “hidden link tag” rule is quite powerful. Its goal is to generate a standard <a> link from the <hiddenlink> tag. Spambots, and search engines too, are quite unlikely to react to the <hiddenlink> tag and its destination and caption attributes. Note the second occurrence of <hiddenlink>, which obfuscates the link to the obfuscated email.
And JavaScript?
All this is quite useful, but you will probably want to include some JavaScript, either in the XML source or in the XSL stylesheet. You will quickly find, after countless tries, that the perfect solution may not exist… None of the solutions I found on the web worked for me. As you know, the JavaScript code should be placed in a <script> tag and should appear like this in the generated code to maximize browser compatibility:
<script type="text/javascript">
//<![CDATA[
function isGreaterThan(a, b)
{
    alert(a > b ? "a > b" : "a < b");
}
//]]>
</script>
There are several problems with this JavaScript code when it is used in conjunction with XSLT:
- The CDATA section is likely to be interpreted by the XSLT processor and removed from the output
- The “<” and “>” characters are likely to be escaped to their entity equivalents &lt; and &gt;
- If they are not, the browser will complain about an invalid start tag, because of problem 1
- It is not easy to generate the CDATA section without it being escaped
- Each browser behaves differently, whatever you try
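The escaping problem is not specific to XSLT: any plain XML serializer does the same thing to script content. The sketch below uses Python's xml.etree.ElementTree as a stand-in for the XSLT output stage, which is an assumption made purely for illustration. The names JS_SOURCE and serialize_script are hypothetical.

```python
import xml.etree.ElementTree as ET

JS_SOURCE = 'alert(a > b ? "a > b" : "a < b");'


def serialize_script(js):
    """Serialize a <script> element the way a plain XML serializer would."""
    script = ET.Element("script", {"type": "text/javascript"})
    script.text = js
    return ET.tostring(script, encoding="unicode")


serialized = serialize_script(JS_SOURCE)
print(serialized)

# An XML parser reverses the escaping when it reads the result back.
round_tripped = ET.fromstring(serialized).text
print(round_tripped == JS_SOURCE)
```

An XML parser restores the original characters, so the escaped form is harmless when the result is parsed as XML. An HTML parser, however, treats the content of <script> as raw text and leaves the entities in place, which breaks the code. This is why the commented-out CDATA wrapper matters.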
Fortunately, after hours spent making this work across browsers, I am happy to share my solution. Please have a look at example 7:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="xslformating" type="text/xsl" href="example7.xsl"?>
<!DOCTYPE html>
<html xml:lang="en">
  <head>
    <title>Extending XHTML - Example 7</title>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
    <script type="text/javascript">
//<![CDATA[
function isGreaterThan(a, b)
{
    alert(a > b ? "a > b" : "a < b");
}
//]]>
    </script>
  </head>
  <body onload="isGreaterThan(10, 20);">
    <p>Example 7</p>
  </body>
</html>

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#default"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes" media-type="application/xhtml+xml"
      omit-xml-declaration="no" encoding="utf-8" version="1.0"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />

  <!-- All tags -->
  <xsl:template match="//*" priority="10">
    <xsl:element name="{name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- script tag -->
  <xsl:template match="//script" priority="20">
    <xsl:element name="script">
      <xsl:copy-of select="@*"/>
      <!-- Beginning of CDATA section -->
      <xsl:text disable-output-escaping="yes"><![CDATA[//<]]></xsl:text><xsl:text disable-output-escaping="yes">![CDATA[</xsl:text>
      <!-- original javascript -->
      <xsl:value-of select="." disable-output-escaping="yes"/>
      <!-- End of CDATA section -->
      <xsl:text>//]]</xsl:text><xsl:text disable-output-escaping="yes"><![CDATA[>]]></xsl:text>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
This should make the JavaScript included in your XML source file work without any major problem in all browsers, including Firefox, which sadly ignores the “disable-output-escaping” attribute. Note how I split the CDATA markers into several pieces, so that the XSLT processor does not interpret the “]]>” that must appear in the output. Note also the <xsl:value-of> element, which is the only option that makes this solution work as expected.
Adding JavaScript in the XSL stylesheet
You may want, for whatever reason, to add some JavaScript to the XSL stylesheet that is to be executed by the browser, not by the XSLT processor. How can you do that? Let’s see example 8, which extends example 7:
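To see why the commented-out CDATA wrapper is a safe target for the generated output, here is a small Python sketch. It builds the wrapper from the same fragments that the stylesheet emits, then checks what an XML parser makes of it. The helper names wrap_script and script_text_seen_by_xml_parser are invented for this illustration, and the explicit newlines stand in for the line breaks that come from the source document in the real examples.

```python
import xml.etree.ElementTree as ET

JS_SOURCE = 'if (a < b) { alert("a < b"); }'


def wrap_script(js):
    """Wrap JavaScript in a CDATA section hidden behind // comments."""
    # The markers are assembled from fragments, mirroring the separate
    # <xsl:text> nodes: "//<" + "![CDATA[" and "//]]" + ">".
    return "//<" + "![CDATA[" + "\n" + js + "\n" + "//]]" + ">"


def script_text_seen_by_xml_parser(js):
    """Return the script content after an XML parser removes the CDATA markers."""
    document = '<script type="text/javascript">' + wrap_script(js) + "</script>"
    return ET.fromstring(document).text


print(wrap_script(JS_SOURCE))
print(script_text_seen_by_xml_parser(JS_SOURCE))
```

After XML parsing, only two harmless // comment lines remain around the code, and the unescaped < inside it is accepted because it sits in a CDATA section. A parser that treats the script content as raw text sees the CDATA markers as part of those two comment lines instead.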
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="xslformating" type="text/xsl" href="example8.xsl"?>
<!DOCTYPE html>
<html xml:lang="en">
  <head>
    <title>Extending XHTML - Example 8</title>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
    <script type="text/javascript">
//<![CDATA[
function isGreaterThan(a, b)
{
    alert(a > b ? "a > b" : "a < b");
    test(a, b);
}
//]]>
    </script>
  </head>
  <body onload="isGreaterThan(10, 20);">
    <p>Example 8</p>
  </body>
</html>

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#default"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes" media-type="application/xhtml+xml"
      omit-xml-declaration="no" encoding="utf-8" version="1.0"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />

  <!-- All tags -->
  <xsl:template match="//*" priority="10">
    <xsl:element name="{name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- script tag -->
  <xsl:template match="//script" priority="20">
    <xsl:element name="script">
      <xsl:copy-of select="@*"/>
      <!-- Beginning of CDATA section -->
      <xsl:text disable-output-escaping="yes"><![CDATA[//<]]></xsl:text><xsl:text disable-output-escaping="yes">![CDATA[</xsl:text>
      <!-- original javascript -->
      <xsl:value-of select="." disable-output-escaping="yes"/>
      <!-- we can add some Javascript here -->
      <xsl:text disable-output-escaping="yes">
<![CDATA[
function test(a, b)
{
    alert(
        "This Javascript was embedded in the XSL stylesheet : " +
        (a > b ? "a > b" : "a < b")
    );
}
]]>
      </xsl:text>
      <!-- End of CDATA section -->
      <xsl:text>//]]</xsl:text><xsl:text disable-output-escaping="yes"><![CDATA[>]]></xsl:text>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
Hopefully, this should also work without any problems in any browser. So what does a bigger example look like?
Mixing the whole thing together
In this example, we are going to create a FAQ using a <faqlist> element, which can contain several <faq> elements. A <faq> element contains a <question> element and an <answer> element. We will also add the content of the previous examples to show how everything works together, and some “id” and “class” attributes on our custom elements to show how to style them. So let’s see example 9:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="xslformating" type="text/xsl" href="example9.xsl"?>
<!DOCTYPE html [
  <!ENTITY mailto "mailto:">
  <!ENTITY username "gabriel">
  <!ENTITY arobase "@">
  <!ENTITY hostname "gabsoftware">
  <!ENTITY tld ".com">
  <!ENTITY http "http://">
  <!ENTITY www "www.">
  <!ENTITY email "&username;&arobase;&hostname;&tld;">
  <!ENTITY website "&http;&www;&hostname;&tld;">
]>
<html xml:lang="en">
  <head>
    <link rel="stylesheet" href="example9.css" type="text/css" media="screen" />
    <title>Extending XHTML - Example 9</title>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
    <script type="text/javascript">
//<![CDATA[
function isGreaterThan(a, b)
{
    var a=10;
    var b=20;
    alert(
        "This javascript was embedded in the xml file : " +
        (a < b ? "a < b" : "a > b")
    );
    test(a, b);
}
//]]>
    </script>
  </head>
  <body onload="isGreaterThan(10, 20);">
    <div id="header">
      <h1>Extending XHTML - Example 9</h1>
    </div>
    <div id="page">
      <p>
        This version of <a href="http://www.w3.org/TR/xhtml10">XHTML</a> has been extended.
        For example it can make text <bold id="monid">bold</bold> and display <a href="#faqsection">FAQs</a>.
      </p>
      <p>
        This email is obfuscated : <a href="&mailto;&email;">&email;</a>
      </p>
      <p>
        This link is obfuscated :
        <hiddenlink destination="&website;" caption="Link title" id="linkemailid">&website;</hiddenlink>
      </p>
      <p>
        This obfuscated email uses an obfuscated link :
        <hiddenlink destination="&mailto;&email;" caption="Link title" id="linkid">&email;</hiddenlink>
      </p>
      <faqlist id="faqsection">
        <faq>
          <question>What does <a href="http://en.wikipedia.org/wiki/FAQ">FAQ</a> mean ?</question>
          <answer>FAQ stands for Frequently Asked Questions.</answer>
        </faq>
        <faq>
          <question>What can you find in a FAQ section ?</question>
          <answer>A list of questions people often ask, with their answers.</answer>
        </faq>
        <faq>
          <question>What makes this FAQ so special ?</question>
          <answer>Just check the source of this page :)</answer>
        </faq>
      </faqlist>
    </div>
  </body>
</html>
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#default"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes" media-type="application/xhtml+xml"
      omit-xml-declaration="no" encoding="utf-8" version="1.0"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />

  <!-- All tags -->
  <xsl:template match="//*" priority="50">
    <xsl:element name="{name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- script tag -->
  <xsl:template match="//script" priority="100">
    <xsl:element name="script">
      <xsl:copy-of select="@*"/>
      <!-- Beginning of CDATA section -->
      <xsl:text disable-output-escaping="yes"><![CDATA[//<]]></xsl:text><xsl:text disable-output-escaping="yes">![CDATA[</xsl:text>
      <!-- original javascript -->
      <xsl:value-of select="." disable-output-escaping="yes"/>
      <!-- we can add some javascript here -->
      <xsl:text disable-output-escaping="yes">
<![CDATA[
function test(a, b)
{
    alert(
        "This javascript was embedded in the xsl stylesheet : " +
        (a > b ? "a > b" : "a < b")
    );
}
]]>
      </xsl:text>
      <!-- End of CDATA section -->
      <xsl:text>//]]</xsl:text><xsl:text disable-output-escaping="yes"><![CDATA[>]]></xsl:text>
    </xsl:element>
  </xsl:template>

  <!-- BOLD tag -->
  <xsl:template match="//bold" priority="50">
    <xsl:element name="strong">
      <xsl:copy-of select="@*"/>
      <xsl:copy-of select="node()"/>
    </xsl:element>
  </xsl:template>

  <!-- HIDDENLINK tag -->
  <xsl:template match="//hiddenlink" priority="50">
    <xsl:element name="a">
      <xsl:if test="@destination">
        <xsl:attribute name="href">
          <xsl:value-of select="@destination"/>
        </xsl:attribute>
      </xsl:if>
      <xsl:if test="@caption">
        <xsl:attribute name="title">
          <xsl:value-of select="@caption"/>
        </xsl:attribute>
      </xsl:if>
      <xsl:if test="@anchorname">
        <xsl:attribute name="name">
          <xsl:value-of select="@anchorname"/>
        </xsl:attribute>
      </xsl:if>
      <xsl:if test="@id | @class">
        <xsl:copy-of select="@id | @class"/>
      </xsl:if>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- FAQLIST tag -->
  <xsl:template match="//faqlist" priority="50">
    <xsl:element name="div">
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="class">faqlist</xsl:attribute>
      <xsl:element name="h2">
        <xsl:element name="abbr">
          <xsl:attribute name="title">Frequently Asked Questions</xsl:attribute>
          <xsl:text>FAQ</xsl:text>
        </xsl:element>
      </xsl:element>
      <xsl:for-each select="faq">
        <xsl:element name="dl">
          <xsl:attribute name="class">faq</xsl:attribute>
          <xsl:element name="dt">
            <xsl:attribute name="class">question</xsl:attribute>
            <xsl:apply-templates select="question/node()"/>
          </xsl:element>
          <xsl:element name="dd">
            <xsl:attribute name="class">answer</xsl:attribute>
            <xsl:apply-templates select="answer/node()"/>
          </xsl:element>
        </xsl:element>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
body {
    width: 800px;
    border: 5px solid #CC9;
    font-family: 'Palatino Linotype', 'Book Antiqua', Palatino, Arial, "Lucida Console", Serif, Sans-Serif;
}
#header {
    padding: 10px;
    background-color: #FFC;
    border-bottom: 5px dotted #CC9;
}
#header h1 {
    text-align: center;
}
#page {
    background-color: #FFC;
    padding: 10px;
}
strong {
    font-weight: bold;
}
.faqlist {
    display: block;
    background-color: #CC9;
    padding: 10px;
    -moz-border-radius: 15px;
    border-radius: 15px;
}
.faqlist h2 {
    text-align: center;
    margin: 0px;
    background-color: #FFA;
    -moz-border-radius: 10px;
    border-radius: 10px;
}
.faq {
    display: block;
    background-color: #FFC;
    margin: 10px 0px 0px;
    -moz-border-radius: 10px;
    border-radius: 10px;
}
.question, .answer {
    margin: 0px;
    padding: 10px;
}
.question {
    font-weight: bold;
    color: #C00;
}
.question:before {
    content: 'Q: ';
}
.answer {
    color: #090;
}
.answer:before {
    content: 'A: ';
}
This should result in a quite nice document with a FAQ section and everything that the previous examples demonstrated.
Conclusion
You can see how easy it is to extend XHTML the XSLT way. We have come up with a nice cross-browser solution that even works with older versions of Internet Explorer (or at least it worked for us with version 7).
While we are doing the XSL transformation on the client side in our examples, you could of course do it on the server side. However, you will lose the ability to obfuscate your links, since the client receives the transformed document. Also make sure that the XSL processor does not expand the email entities and output them in clear text, but that should not happen.
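To illustrate what a server-side transformation hands to the client, here is a minimal sketch in Python. It does not use XSLT at all: it swaps in xml.etree.ElementTree and a hand-written recursive copy that mimics the <bold> and <hiddenlink> rules, so treat it as an approximation of the stylesheets rather than a replacement. The names transform and render are hypothetical. Unlike the <bold> template above, this sketch also transforms the children of <bold>.

```python
import xml.etree.ElementTree as ET

SOURCE = """<!DOCTYPE html [
  <!ENTITY http "http://">
  <!ENTITY www "www.">
  <!ENTITY hostname "gabsoftware">
  <!ENTITY tld ".com">
  <!ENTITY website "&http;&www;&hostname;&tld;">
]>
<html><body><p>Text can be <bold>bold</bold>. <hiddenlink destination="&website;" caption="Link title">&website;</hiddenlink></p></body></html>"""


def transform(element):
    """Return a transformed copy of `element` and all its descendants."""
    if element.tag == "bold":
        result = ET.Element("strong", dict(element.attrib))
    elif element.tag == "hiddenlink":
        attrs = {}
        if "destination" in element.attrib:
            attrs["href"] = element.attrib["destination"]
        if "caption" in element.attrib:
            attrs["title"] = element.attrib["caption"]
        result = ET.Element("a", attrs)
    else:
        # Equivalent of the "all tags" rule: copy name and attributes.
        result = ET.Element(element.tag, dict(element.attrib))
    result.text = element.text
    result.tail = element.tail
    for child in element:
        result.append(transform(child))
    return result


def render(source):
    """Parse `source`, transform it, and serialize the result."""
    return ET.tostring(transform(ET.fromstring(source)), encoding="unicode")


print(render(SOURCE))
```

The serialized result contains an ordinary <a> element with the full URL in clear text, which is exactly the loss of obfuscation described above.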
The end
You have reached the end of this document. We hope you enjoyed reading it as much as we enjoyed writing it, and of course that it will help you!
Find here an older pure JavaScript approach I implemented back in 2005
http://guti.bitacoras.com/index.php?entry=entry050124-221032
Hello,
Nice solution, although it requires the user to have a JavaScript-enabled browser, whereas mine only requires an XHTML-capable browser. Still, thank you for sharing!
nice piece, thanks!
Very good, I was looking for a way to extend the available tags.
Thanks, glad this could help you!
Thank you! Nice post, I was looking for this for quite long time.
I can give you a good reason to extend XHTML: I have to make a newsletter template for the marketing department that behaves perfectly in any browser, like HTML, while staying within a webmaster’s knowledge and, of course, remaining readable and formattable by a template parser :))
I’m glad this helped you!