Over the last two days I ran into two unusual problems with XML parsing and manipulation.

Problem 1: Python + libxslt

In my previous post I described a time vs. space trade-off I faced when picking an install method for libxml and libxslt. It seems the XML-related libraries had a few more issues in store. Consider the following code:
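import libxml2
import libxslt

def applyxslt_params(input, output, xslt, params):
	# Parse the stylesheet file and compile it.
	styledoc = libxml2.parseFile(xslt)
	style = libxslt.parseStylesheetDoc(styledoc)
	# Parse the input document and run the transformation.
	doc = libxml2.parseFile(input)
	result = style.applyStylesheet(doc, params)
	# Write the result to disk (0 = no compression).
	style.saveResultToFilename(output, result, 0)
	# The bindings do not free these objects automatically.
	style.freeStylesheet()
	doc.freeDoc()
	result.freeDoc()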
The above code is a rather typical way to transform an XML file (input) with a given XSLT stylesheet (xslt) and a specific set of parameters (params), and finally save the result in output.

The problem lies in the parameter passing. You can call the function as:
applyxslt_params("foo.xml", "foo.output", "foo.xslt", { "foo-param" : "'foo-param-value'" })
This is the suggested way to perform a transformation with the library, but it hides two pitfalls:
  1. The value of a string parameter must be enclosed in single quotes ('value'), because parameter values are evaluated as XPath expressions.
  2. The library seems to dislike unicode characters.
In my case, I used something like:
k = "foo"
applyxslt_params(input, output, addon, { "addOnName" : "'%s'" % ( str(k), ) } )
Any call that ignored these two rules produced an empty result, without even reporting an error.
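
The quoting rule is easy to forget, so it can be wrapped in a small helper. The sketch below is only an illustration: the name xslt_string_params is hypothetical and not part of libxml2 or libxslt, and it assumes Python 3, where str() handles non-ASCII text. It covers only the quoting rule, not the unicode problem described above. XPath 1.0 string literals have no escape mechanism, so a value containing both quote characters is rejected rather than silently mangled.

```python
def xslt_string_params(values):
    """Wrap plain values as XPath string literals for use as XSLT parameters.

    Hypothetical helper for illustration; not part of libxml2/libxslt.
    """
    params = {}
    for name, value in values.items():
        text = str(value)
        if "'" not in text:
            # The common case: enclose the value in single quotes.
            literal = "'%s'" % text
        elif '"' not in text:
            # The value contains an apostrophe, so use double quotes instead.
            literal = '"%s"' % text
        else:
            # XPath 1.0 cannot express both quote kinds in one literal.
            raise ValueError("cannot quote %r as an XPath 1.0 literal" % text)
        params[str(name)] = literal
    return params
```

With such a helper, the call above could be written as applyxslt_params(input, output, addon, xslt_string_params({ "addOnName" : k })).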

Problem 2: Java + XML

My supervisor, Diomidis Spinellis, implemented an educational tool named jarpeb to help students learn Java. It gives them an environment with personalised exercises on various Java topics (regular expressions, XML, sockets etc.).

In exercise 10, each student has to create an XML document that conforms to an XML Schema specification. The exercise had worked perfectly for the past two years, but suddenly validation of the document started failing.

We investigated the issue thoroughly and finally solved it by enabling namespace awareness on the DocumentBuilderFactory (setNamespaceAware(true)).

Conclusions

This is not the first time I have run into problems like these. As I have stated before, I am a fan of XML, but sadly I must confess that all the software packages (libraries etc.) are really far from mature. The reason? With all the hype around XML over the last decade, I think we were busier writing specifications than implementing them. Sad, but true.