Category Archives: Micro XML

Micro XML and Namespaces

Micro XML is an attempt by James Clark, John Cowan and Uche Ogbuji to simplify XML and get rid of all that extra baggage that currently surrounds it. DOCTYPE and PIs are both removed, UTF-8 is mandatory, draconian error handling is no longer a must, and–perhaps most controversially–namespaces are gone, too.

Uche Ogbuji held a brilliant talk about Micro XML at the recent XML Prague 2013 conference, so rather than reiterating his arguments, I suggest you watch the presentation once it’s made available at the XML Prague website.

What I did want to comment about is this namespaces business. Of everything proposed in the Micro XML spec, the removal of namespaces is clearly the most controversial, as indicated by the many tweets following Uche’s talk. But should you be upset? I mean, really?

I’ve done some fair bit of XML stuff involving namespaces lately (yes, I know, there’s no way to avoid it, really). There’s a Relax NG compact schema that I wrote that uses several, including a default “”. There are conversions from external XSD-based XML to that Relax NG-based XML using XSLT 2.0, and there are conversions from the Relax NG schema to (an obviously not namespace-aware) DTD to satisfy the needs of an editor that does not know what Relax NG is. (And I can’t bring myself to write XSDs; they are the spawn of Satan.) And there are XProc-based pipelines that glue these things together, and they obviously need to be aware of the namespaces in addition to the ones they use themselves.

Lots of namespaces, in other words. And I’m not exaggerating when I tell you that a vast majority of the problems I had and the weirdness I encountered had to do with namespaces.

Nothing coming out from the transformation? A forgotten implied default namespace in the source XML. Namespace declarations in the target XML messing up validation? That same default namespace. The wrong prefix for the XLink namespace in the target XML? No explicit namespace declaration in the source. An unwanted and disallowed XLink namespace declaration being complained about in the root element of an XML document in the process of being checked out from a repository? A web service helpfully adding a seemingly missing namespace declaration to a root element into content in a SOAP envelope, resulting in a document that could not be opened but that did not show any problems in the repository itself, only on its way out…

These are just a few select examples from my plight, and while I may have some of the details slightly wrong here, you probably get the idea. The list goes on.

And why is this all happening? Because someone at some point thought that wouldn’t it be nice if you could share your XML with everyone on the globe with no risk of name collisions and clashing semantics? Wouldn’t it be cool if the conflicting schemas could all be identified using a URI? We could have a throwaway name prefix attached to that URI and implement processing that could hide the prefix for the end user, simplifying things further…

Of course, that someone’s idea of backwards compatibility was simply that to a DTD, the revolution would be hidden in an extra attribute and an element type name containing a colon.

The fact is that I have yet to be helped by namespaces when using XML from the other side of the globe. In fact, I have yet to encounter a situation where I need to process unknown XML where potential clashes in semantics can do harm without me spotting the problem well in advance and taking care of it. The fact is that I don’t often need to use XML from the other side of the globe, out of the blue. It tends to happen in a context, in a controlled manner.

But when I do process that XML, knowing full well the source semantics and how they can map to my needs, it is always the namespaces that cause me grief.

Namespaces are among the least understood features of modern-day XML and among the most abused. The tools range from helpful to disastrous to completely ignorant or just plain wrong, and there are as many reasons for this as there are XML parser implementations out there. You know right from the start that you will have problems, so you’d better resupply the medicine cabinet well in advance or get ready for that headache.

So, Micro XML? Yes, please. Now?

An Even-Simpler Markup Language?

in his blog, Norman Walsh writes about an even-simpler-than-Mixro-XML markup language, inspired in part by John Cowan’s XML Prague poster and by James Clark’s Micro XML ideas. His ideas are well worth a serious consideration–Norm’s ideas are always worth considering–but the purist in me cringes at the idea of allowing more than one root element. I have to say that I find the idea attractive but I’m not really big on change so maybe that is why I hesitate.

The pragmatist in me, on the other hand, also cringes at Norm’s not doing away with namespaces when he has the chance. in my experience they always create more problems than they solve, but on the other hand, my experience tends to be more about strictly controlled environments where the issues one usually wishes to solve using namespaces can be dealt with using other means.