O’Reilly XML Pocket Reference: don’t bother

Just bought this on Amazon and had it delivered.

Wish I hadn’t bothered. I am sure the copy is technically correct, but as it is a reference book that refers to numbered paragraphs throughout, and none of the paragraphs (or even chapters) have been printed with numbers, it is close to useless.

In fact it really is disgraceful that O’Reilly didn’t pulp it and start again, especially as all the signs are that it was their (as opposed to the authors’) mistake. After all, the authors have numbered their paragraph references, so it looks to me like some later editor deleted all the numbering.

XSLT conundrum solved

I had a moment of epiphany today about my XSLT problem: the answer is not to try to append stuff to the end of the document, but to get the template right in the first place, e.g.:

<xsl:template match="/dataroot">
<div title="Summaries">
<xsl:apply-templates select="Headlines" /></div>
<div title="Fulltext">
<xsl:apply-templates select="Headlines/Headline" /></div>
</xsl:template>
<xsl:template match="Headlines">
<h4><xsl:value-of select="Headline" /></h4>
<br /><xsl:value-of select="Summary" /><br />
<br /><a>
<xsl:attribute name="href">#<xsl:value-of select="Headline" /></xsl:attribute>
Full Text</a>
<br /><br />
</xsl:template>
<xsl:template match="Headline">
<br /><br /><a>
<xsl:attribute name="name"><xsl:value-of select="." /></xsl:attribute>
</a>
<xsl:value-of select="../FullText" />
</xsl:template>

How can I do this in XSLT?

I have a very simple (single relation) database of news stories that I export to XML, so I get something like this:

<top>This is a story </top>
<summary>Some summary of the story</summary>
<full_text>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed pharetra sagittis risus a ultrices. In a lectus eu nunc scelerisque gravida ac elementum felis. Phasellus. </full_text>

And so on….

What I want to do is use an XSL stylesheet to convert this into some HTML where the top line is printed, then the summary and then a link to the full text, which is at the bottom of the document.

Writing the XSL that will extract the top line and the summary is easy, but how can I get a link to what is, in effect, some appended text at the end of the document? Is it even possible?

I have a huge and authoritative tome on XSLT – XSLT: Mastering XML Transformations. For Beginners and Advanced Users – which I will now consult, but does anyone know before I delve in?

530 lines of JavaScript

I have just written that amount of code in what I persist in thinking of as a toy language (it was actually somewhat longer until I refactored the code to group some common functions together).

I had to do this for a coursework exercise (a lot of effort, to be honest, for what is at stake): processing a rather large XML file with some XSL. The JavaScript essentially manages the parameters.
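To give a rough idea of what "managing the parameters" can involve, here is a minimal sketch, assuming the parameters in question are stylesheet parameters (`<xsl:param>`) set through a Mozilla-style `XSLTProcessor`. The helper name `applyParams` and the parameter names in the usage line are invented for illustration and are not the coursework code; `clearParameters()` and `setParameter(namespaceURI, name, value)` are the standard `XSLTProcessor` methods.

```javascript
// Hypothetical helper: copy a plain object of values onto an XSLTProcessor
// as stylesheet parameters (picked up by <xsl:param name="..."/> in the XSL).
function applyParams(processor, params) {
  processor.clearParameters(); // drop values left over from an earlier run
  for (const [name, value] of Object.entries(params)) {
    // null namespace: the parameters are assumed to be un-namespaced
    processor.setParameter(null, name, String(value));
  }
  return processor;
}
```

In a browser this might be called as `applyParams(new XSLTProcessor(), { country: "France", limit: 10 })` before importing the stylesheet and running the transformation.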

At the end my conclusion is that I don’t really see why anybody would want to write that much client code if they could possibly help it. Of course it transfers the computational burden to the client – but at the cost of hundreds of lines of interpreted code which is essentially under the control of the people who write the engines in the Firefox and IE browsers. In the real world that points towards a support nightmare.

Having written a fair bit of Perl (and AJAX) stuff in the past, the whole thing felt unnatural: dozens of lines, many of them designed to handle the differences between the browser engines, that could have been handled simply on the server side.

One thing that I was convinced of was the potential power of XSLT, though I was not quite prepared for the revelation that it is Turing complete (i.e. it would be possible to write some XSL that would carry out any algorithm or task solvable through a finite number of mechanical steps). Though I shudder to think of how big a stylesheet would be required to handle anything but the smallest of tasks.

But the potential power of XSLT is not the same as thinking of many practical uses for it!

Dealing with 0x80600001 errors

Maybe you have just seen a message like this:

Error: uncaught exception: [Exception… "Component returned failure code: 0x80600001 [nsIXSLTProcessor.importStylesheet]"  nsresult: "0x80600001 (<unknown>)"  location: "JS frame :: file:///home/adrian/webtech/cia.html :: fulltable :: line 47"  data: no]

If you have, then chances are you are working on some XSL/XSLT (the above comes from a piece of coursework I am working on which manipulates an XML representation of data from the CIA World Fact Book).

The error indicates that your XSL is broken and non-compliant. The problem is that Firefox/Mozilla is much stricter about what counts as broken than your command line XSLT processor is likely to be: the piece of XSL which generated the above message seemed to fly through xsltproc on my Linux box.

The best way to fix this is to take out the lines, one by one, from your XSL and look for the one that breaks the transformation. To avoid being inadvertently tied up in some issue of plagiarism later on, I cannot post the XSL I was working on when this came up, but I had a line like this:

<xsl:apply-templates match="//item"/>

That is bad XSL: the match should be a select. But as xsltproc happily covered up for my mistake and generated the XHTML I was looking for, I could not understand why Firefox was flagging the error on this line of JavaScript: xsltproc.importStylesheet(xmlxsl).
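The line-by-line hunt described above can be sketched in JavaScript. This is only an illustration under stated assumptions: `tryImportStylesheet` and `findSuspectLines` are hypothetical names, `processor` is assumed to behave like a browser `XSLTProcessor` (whose `importStylesheet` method is the call that threw here), and `loads` is a check supplied by the caller.

```javascript
// Hypothetical wrapper: try to load a stylesheet and report failure instead
// of letting the exception go uncaught. `xslDoc` is the parsed stylesheet.
function tryImportStylesheet(processor, xslDoc) {
  try {
    processor.importStylesheet(xslDoc);
    return { ok: true, message: "" };
  } catch (err) {
    // Keep whatever text the exception carries (e.g. the failure code).
    return { ok: false, message: String(err && err.message ? err.message : err) };
  }
}

// Hypothetical helper for the "take lines out one by one" approach:
// returns the 1-based numbers of lines whose removal makes `loads` succeed.
// `loads(text)` is a caller-supplied check that returns true or false.
function findSuspectLines(xslText, loads) {
  const lines = xslText.split("\n");
  const suspects = [];
  lines.forEach((_, i) => {
    const without = lines.filter((_, j) => j !== i).join("\n");
    if (loads(without)) suspects.push(i + 1);
  });
  return suspects;
}
```

In a browser, `loads` could parse the text with `DOMParser` (`parseFromString(text, "application/xml")`) and pass the result to `tryImportStylesheet` with a fresh `XSLTProcessor`. Removing a line that opens or closes an element leaves the XML ill-formed, so this only helps with self-contained lines such as a single `<xsl:apply-templates ... />`.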

It also took me a while to find an online explanation, so I wrote this.

XML: any use?

I stumbled across the site XMLSucks.com just now when reading a comment on Slashdot about the idea that there was an FBI-mandated “backdoor” in OpenBSD.

Right now I am working on some coursework with XML and so the site has my sympathy. For sure, XML has its uses – SVG seems like a pretty good idea to me and I have used it recently to generate graphics to represent the processes running on a Linux box.

But freely mixing it with HTML on the web? I am inclined to (mostly) agree with the statement on the site:

XML is bloated. XML is fugly. XML is only “human-readable” if you’re willing to stretch the definition of “human-readable.” The same goes for the proposed bloatware of HTML5. Anyone looking at the spec must be shaking their heads. Sure, it’s better than the now-abandoned xhtml 2.0, but that’s not saying much. I