Category Archives: XML Prague

XML Prague 2012

Speaking of XML conferences, XML Prague 2012 has been announced and will take place a month earlier than the last few times, on February 10-12. The venue is also new, a good thing since the last two events were sold out.

Looking forward to this one already.

An Even-Simpler Markup Language?

In his blog, Norman Walsh writes about an even-simpler-than-MicroXML markup language, inspired in part by John Cowan’s XML Prague poster and by James Clark’s MicroXML ideas. His ideas are well worth serious consideration (Norm’s ideas always are), but the purist in me cringes at the idea of allowing more than one root element. I have to say that I find the idea attractive but I’m not really big on change so maybe that is why I hesitate.

The pragmatist in me, on the other hand, also cringes at Norm’s not doing away with namespaces when he has the chance. In my experience they always create more problems than they solve, but then again, my experience tends to be more about strictly controlled environments where the issues one usually wishes to solve using namespaces can be dealt with using other means.

Until Next Year, XML Prague

This year’s XML Prague is over and I miss it already. For a markup geek, XML Prague is heaven. There is always so much to learn, so many great minds and cool new ideas, not to mention Czech beer and the friendly atmosphere of a smaller conference. This was my third consecutive year attending and I very much look forward to the fourth.

Some notes of interest:

  • XML Prague is a great success. The conference sold out before the sessions were announced so next year, it will move to a larger venue.
  • HTML5, last year’s hot topic, was pronounced dead more than once.
  • Michael Kay announced (and demo’d) Saxon Client Edition, which allows you to run XSLT 2.0 in the browser. Very cool. Saxon CE is in alpha but available for testing at www.saxonica.com.
  • JSON seems to be hot this year. I should probably spend some time learning it, especially since I am planning to use it in the CMS we develop at Condesign.
  • George Bina from SyncRO Soft Ltd, the company that makes Oxygen, presented some ideas regarding advanced XML development. While Oxygen is at the centre of many of these, his point was that there should be a standardised way to do it all. Dave Pawson suggested, via Twitter, expanding XML catalog files for the job, an idea I find plausible.
  • Murata Makoto, a personal hero of mine thanks to his work with Relax NG, presented EPUB3. What those of us who were there will remember, however, is his introduction, expressing his grief over the on-going catastrophe in Japan.

See www.xmlprague.cz for more.

Back from XML Prague

I’m back home from XML Prague. It’s been a fabulous weekend with many interesting talks and lots of good ideas, and I’m still trying to sort my impressions. So many things I want to try, so many technologies I want to learn. The feedback from my talk on Film Markup Language alone is enough to keep me busy for a few weeks.

More later, but for now, suffice it to say that I’m already thinking of a subject for a presentation next year.

It’s Quite Possible to Lose Your Way in Prague

I drove to Prague for XML Prague, yesterday. I left Göteborg on Wednesday evening, taking the ferry to Kiel, and then spent most of Thursday on the Autobahn. It all went without a hitch; not that I’m that good but my GPS is. I would probably have ended up in Poland without it because I often miss the road signs when on my own. Some of my business trips before the GPS era were truly memorable.

So today I took a walk around central Prague, shopping for gifts and seeing the sights. And a wonderful city it is, one of my favourite cities in Europe. All that history, all that architecture, the bridges… and small, narrow streets that are never straight. They are practically organic (and probably feed from the gift shops since they are everywhere), and it’s very difficult to find your way. It’s a labyrinth we are talking about.

Yes, I lost my way. The third time I came back to that innocent-looking Kodak shop (and there are a lot of shops with Kodak signs in central Prague, I might add), I knew I was in trouble. I was walking in circles, my feet aching while a particularly wet mixture of snow and rain poured down, and had no idea where I was. And I kept thinking about my GPS, safely tucked away back in my hotel room, remembering that I actually considered bringing it along for the walk but then shrugging, thinking “how hard can it be?”

I found a shelter in a mall I hadn’t seen before (well, I think I hadn’t seen it before) and considered my next move while high-heeled ladies tried lipsticks and wondered what the out-of-place stranger was doing in the cosmetics department. I could ask someone, I suppose, some friendly local…

Then I remembered: I have a GPS in my mobile. It took a few minutes for it to find the satellites it required but after that, I only had to walk for a few more minutes to find a familiar landmark. In a counter-intuitive direction, I might add.

The wisdom in this story? Thank goodness for GPS devices. Oh, and XML Prague starts tomorrow morning.

Automating Cinemas at XML Prague

I’ve been busy writing my presentation on Automating Cinemas Using XML, along with some example XML documents, for XML Prague in about a week and a half. I’m slightly biased, I know, but I think the presentation actually does make a good case for XML-based automation of cinemas. I know how primitive today’s automation is, in spite of the many technological advances, and I know where to improve it. The question I’m pondering right now is how to explain the key points to a bunch of XML people who’ve probably never seen a projection booth, and do it in twenty minutes.

The opposite holds true, of course, if I ever want to sell my ideas to theatre owners. They know enough about the technology (I hope) but how on earth will I be able to explain what XML is?

There’s still stuff to do (for one, it would be nice to finish the XSLT conversions required and be able to demonstrate those, live, at the conference) but the presentation itself is practically finished and the DTD and example documents are coming along nicely. I suppose I need to update the whitepaper accordingly and publish it here, when I’m done.

See you at XML Prague!

Was XLink A Mistake?

This morning, I read Robin Berjon’s little something on XML Bad Practices, originally a whitepaper he presented at XML Prague 2009. I was there, presenting right after he did, and I remember that I nervously listened to his presentation while preparing my own (not my finest hour but that’s a story for another blog entry), wanting to address some of his points. While a lot of what he said made good sense, some didn’t then and certainly don’t now.

In Reusing the Useless, Robin discusses XLink, a recommendation that remains my personal favourite among W3C’s plethora of recommendations. Apparently it’s no-one else’s, at least if Robin is to be believed. “Core XML specification produced by the W3C such as XSLT or XML Schema don’t use it even though they have linking elements,” he says, adding that very few have implemented anything but the rudimentary parts of it. But I get ahead of myself; let’s see what Robin says. He starts out with this:

That feeling (and a general sense that reuse is good) leads people to want to reuse as many parts of the XML stack as possible when creating a new language. That is a good feeling, and certainly one that should be listened to carefully — there are indeed many good and useful technologies to reuse.

This, of course, makes a lot of sense. We are in the standardisation business so we don’t want to reinvent the wheel every time. Me, I’ve done so time and again, and the one W3C recommendation I have used again and again is… XLink. It provides me with a neat way of defining link semantics without enforcing a processing model, from very simple point-to-point relations to multi-ended link abstractions. Yes, I have used both; Simple XLinks are present in most of my DTDs requiring cross-referencing, images or indeed any point-to-point semantics, and Extended XLinks were a useful and necessary addition to the aftermarket document structures of a major car manufacturer, among other things.
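As a minimal sketch of the simple case, a cross-reference and an image reference might look something like the following. The element names (doc, para, xref, graphic) and the file names are invented for illustration; only the XLink namespace and the xlink:type and xlink:href attributes come from the XLink recommendation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical document: doc, para, xref and graphic are made-up element names -->
<doc xmlns:xlink="http://www.w3.org/1999/xlink">
  <para>See
    <!-- A simple XLink used as a cross-reference -->
    <xref xlink:type="simple" xlink:href="chapter2.xml#brakes">the section on brakes</xref>.
  </para>
  <!-- The same mechanism points at an image; what happens with it is up to the application -->
  <graphic xlink:type="simple" xlink:href="images/brake-disc.png"/>
</doc>
```

Both elements carry the same link semantics, and nothing in the markup dictates whether a link is followed, embedded or merely listed; that decision stays with whatever processes the document.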

But again, I get ahead of myself. Here’s what raised my eyebrows for the first time, this morning:

But that only works if everyone plays, and furthermore the cost of using XLink has to be taken into account. First, a whole new namespace is needed.

This is interesting, to say the least. I thought this was one of the main points of introducing namespaces in the first place, to avoid name collisions.

The basic idea behind namespaces is extremely simple: you use one DTD (well, maybe it’s a schema since DTDs aren’t namespace-aware; there’s a lot I would like to say on that topic, too, so either this is going to be a very long post or I need to start writing down my ideas for blog posts) but in your instances you need to include content created using other schemas. One solution is to only use unique names, but this is a pipe dream and in reality, there are only so many names you can give, say, a paragraph (p, ptxt, para, …) or a cross-reference (ref, href, link, …), without resorting to silliness. Inevitably, your elements and attributes will have the same names as someone else’s, and that can be a huge pain. Namespaces are a neat way of getting around this problem, and as an added bonus you’ll eventually always get that question, “what does the namespace URL stand for?” from your audience when presenting your work.
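A small sketch of the collision problem and the namespace fix, under the assumption of two made-up vocabularies that both define a title element. Both namespace URIs and all element names here are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical vocabularies: both namespace URIs are invented for this example -->
<manual xmlns="http://example.com/ns/manual"
        xmlns:parts="http://example.com/ns/parts">
  <!-- "title" from the manual vocabulary (the default namespace) -->
  <title>Brake Service</title>
  <parts:part>
    <!-- "title" from the parts vocabulary: same local name, different namespace -->
    <parts:title>Brake disc, front</parts:title>
  </parts:part>
</manual>
```

A namespace-aware processor sees the two title elements as different names, so neither vocabulary has to rename anything to coexist with the other.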

My point, and the simple question I would like to ask here, is why is it suddenly a bad thing to introduce a namespace for XLink when practically every recommendation, suggestion, and badly written XML configuration file seems to use one these days? Yes, they all come at a cost, among them that if you actually want to validate that included content from that other namespace, you need to implement something doing the work, somehow. You need to validate it against the right schema and so you need all kinds of lookup mechanisms and stuff. But if you can implement one namespace, shouldn’t you be able to implement several, especially if your imported namespace provides you with a useful mechanism, say, a standardised linking mechanism?

Namespaces aren’t my favourite W3C recommendation but they are what we have. In his blog and whitepaper, Robin points out several bad practices when implementing namespaces and I fully agree with them (perhaps excepting some of the discussion on a “default” namespace for attributes without a prefix), but they are mostly outside the topic at hand because I fail to see why they’d make XLink an undesired recommendation while still encouraging various others.

Robin continues:

Second, the distinction between href and src requires a second attribute.

To be perfectly honest, I’m not sure what this means. First of all, what, exactly, is the distinction between href and src? According to the XLink recommendation, href “supplies the data that allows an XLink application to find a remote resource,” adding that when used, it must be a URI. In simple XLinks, hrefs are all you need; the source and a reference to it are (or rather, can be) the same thing. (Yes, there is some verbosity since you’ll need that namespace declaration and the XLink type, that sort of thing, but if you use XML Schema, you’ll be far more verbose than this anyway.)

When discussing extended XLinks, though, yes, there is a difference between a “source” and a “reference” to that source (provided I understand the objection correctly). It’s one of the really neat things with extended XLink because it allows us to leave out the linking information from the document instances. We can create complicated, multi-ended, linking structures between resources without the resources ever being aware of them being part of a link. The links can instead be described out-of-line, outside the resources, centrally in a linkbase.

To do this well, there needs to be a clear distinction between pointing out link ends and creating link arcs between them. Certainly, it requires more than one attribute, and in the XLink recommendation, it could easily require three (the pointer to the source, the source’s label, and the actual link arc).
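To make the three pieces concrete, here is a rough sketch of an out-of-line extended link, assuming a made-up linkbase vocabulary (linkbase, link, end and arc are hypothetical element names, as are the file names and labels). The xlink attributes themselves are the ones defined in the XLink recommendation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical linkbase: linkbase, link, end and arc are made-up element names -->
<linkbase xmlns:xlink="http://www.w3.org/1999/xlink">
  <link xlink:type="extended">
    <!-- Link ends: each locator points at a remote resource (href) and names it (label) -->
    <end xlink:type="locator" xlink:href="service/brakes.xml#replace-disc" xlink:label="procedure"/>
    <end xlink:type="locator" xlink:href="parts/catalogue.xml#disc-front" xlink:label="part"/>
    <!-- Link arc: a traversal from one labelled end to another -->
    <arc xlink:type="arc" xlink:from="procedure" xlink:to="part"/>
  </link>
</linkbase>
```

Neither brakes.xml nor catalogue.xml needs to contain any link markup; the relationship lives entirely in the linkbase, and further arcs between the same ends can be added without touching the resources.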

Is this the only way to do multi-ended links? No, certainly not, but it does provide us with a standardised way, one that a group of people put considerable thought into. It is possible to redo the work and maybe even do it better, but unless you have a lot of time on your hands, why should you? It’s a perfectly serviceable recommendation, with far fewer side effects than, say, namespaces on older XML specs, and it does most of the things you’ll ever need with links.

(Granted, XLink, just as any post-namespaces spec, will cause havoc for any system that includes badly implemented XML parsers wanting to interpret everything before and including the colon in an element or attribute name as throwaway strings, but that’s not an XLink problem; it’s a namespaces problem and above all an implementation problem. XML allows colons in QNames; don’t use a parser that tries to redefine what was meant, once upon a time.)

Not everyone agreed with the XLink principles and so left them out in specs that followed, but I have a feeling that what happened was at least partly political (the linking in XHTML comes to mind, with the, um, discussions that ensued), plus that the timing could have been better. At the time, implementing XLink could be something of a pain.

An aside: around the time the XLink recommendation came out, I was heavily involved in implementing large-scale extended XLinks in a CMS for a well-known car manufacturer. Extended XLink solved many of our key problems; being able to define multiple relationships between multiple resources in multiple contexts using a central linkbase made, for the first time, actual single-source publishing possible for the company, and they had been using SGML for years.

The system almost wasn’t, however, for a very simple reason. The XML editor of choice (not my choice, by the way; I was presented with it as a fact of life) and its accompanying publishing solution could not handle the processing of inline link ends or indeed any kind of inline link elements beyond ID/IDREF pairs for page references. The editor and the publishing solution chosen would simply not allow us to access and process them, no matter what we did. This was before XSL-FO was finished or in widespread use, mind, and before most editors (including this one) would offer complete APIs for processing the XML.

I won’t go into details but the solution was ugly and almost voided the use of extended XLinks. No alternative linking solution would have fared any better, however; the problem was that we were slightly ahead of what was then practical to implement and several of the tools available then just didn’t cut it.

Getting back to Robin’s blog entry, he also says:

And then there are issues with parts of XLink being useless for (or detrimental to) one’s needs, which entails specifying that parts of it should be used but not others, or that on such and such element when one XLink attribute isn’t present it defaults to something specific not in the XLink specification, etc.

It’s hard to address the specifics here since there are none. I don’t have a clue of what parts of XLink are useless or detrimental to Robin’s work and can only address his more general complaints.

Most “standards” are like this. There is a basic spec that you need to adapt to, with the bare essentials, and there are additions that you can leave out if you don’t need them. XLink makes it easy to implement a minimal linking mechanism while offering a standardised way to expand that mechanism to suit future needs. It also deliberately leaves out the processing model, allowing, for example, for a far more flexible way to define “include” links than XInclude, a linking mechanism that in my mind is inferior in almost every respect to the relevant parts of the XLink spec.

Central here is that with XLink, I can use one linking mechanism for all my linking needs, from cross-references to images to include links, and still be able to define a single processing model for all of them, one that fits my needs. I suspect it would have been very difficult to define anything sufficiently consistent (yet flexible) in the spec itself, so why force one into it?

To me, this is akin to the early criticism DTDs received for lacking data typing. XML Schema added this capacity, resulting in a huge specification with a data typing part that either remained unused or was used for all the wrong reasons. In a document-centric world, data typing is mostly unnecessary, which is a good reason why it wasn’t included in DTDs. (In the few cases where data typing was useful, it was easy enough to add an attribute for the element(s) in question, containing either a regular expression or some other suitable content definition, and add the necessary processing for the applications as needed. There was no need to write a novel for the data types no-one needed, anyway.)

As you might guess, my point is that not including the processing model in the spec is a strength, not a weakness, because a sufficiently complete, general-purpose, processing model for a complete linking mechanism is most likely too complex to do well. It would only serve to create conflicting needs and make the spec less useful. Why not leave it to implementation?

Which brings me to Robin’s next point:

Core XML specification produced by the W3C such as XSLT or XML Schema don’t use it even though they have linking elements.

I don’t pretend to know why this is; I have an idea of why XHTML didn’t, and in my mind it had very little to do with any technical merits or lack of same, and a lot to do with politics and differing factions in the W3C. Could it be the same with XML Schema and XSLT? It might; I know that XLink could have addressed the linking needs of both specs. Certainly, XML Schema is “costly” enough to not be bothered by an extra namespace among those already included. Maybe someone close to the working groups would like to share, but what’s the point now?

In Robin’s blog, the above statement leads to:

I don’t believe that anyone implements much in the way of generic link processing.

I’ve implemented a lot in this respect, starting from about the time XML became an official spec. XLink has proved to be very useful, allowing me to benefit from my earlier work while still being flexible enough to encourage some very differing link implementations.

Granted, most of my work has been document-centric, with my clients ranging from companies very small to the armed forces of my native country, but in all of these, XLink has proven to be sufficiently useful and flexible. A friend of mine, Henrik Mårtensson, now a business management guru, wrote a basic XLink implementation more than a decade ago (yes, long before XLink was a finished spec; we were both involved in implementing XLink in various places back then), with everything that was required to create useful links, be they cross-references, pointers to images, or something else. This implementation is still in use today, and while I and others have changed a lot of stuff surrounding it, the core and the basic model remain unchanged. My presentation at XML Prague 2009, right after Robin’s, touched on some of this work, and had my computer been healthier, he would have witnessed at least one XLink implementation.

Which (sort of) leads to Robin’s last point:

Reuse of other languages should be done where needed, and when the cost does not exceed that of reinvention.

I agree with the basic notion, obviously, but not with his conclusions. XLink, to me, is exactly the kind of semantics that is far easier to reuse than to reinvent. Yes, it is possible to simply write “href CDATA #IMPLIED” (or the schema equivalent) and be done with it, but anything more complex than that will benefit from standardisation, especially if you ever envision having to do it again. XLink is a terrific option when it comes to anything having to do with linking.
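For comparison, a sketch of the two options side by side in a DTD internal subset. The element names (doc, ref, xref) are invented for this example; fixing the namespace declaration and xlink:type in the attribute list is a pattern the XLink recommendation itself uses in its DTD examples, and it keeps the instance markup nearly as short as the home-grown version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical DTD: doc, ref and xref are made-up element names -->
<!DOCTYPE doc [
  <!ELEMENT doc (ref | xref)*>

  <!-- The home-grown option: a bare attribute with no shared semantics -->
  <!ELEMENT ref (#PCDATA)>
  <!ATTLIST ref
    href CDATA #IMPLIED>

  <!-- The XLink option: fixed values mean instances only need xlink:href -->
  <!ELEMENT xref (#PCDATA)>
  <!ATTLIST xref
    xmlns:xlink CDATA    #FIXED "http://www.w3.org/1999/xlink"
    xlink:type  (simple) #FIXED "simple"
    xlink:href  CDATA    #REQUIRED>
]>
<doc>
  <ref href="chapter2.xml">Chapter 2</ref>
  <xref xlink:href="chapter2.xml">Chapter 2</xref>
</doc>
```

The instance cost of the XLink version is a prefix on one attribute, while the declaration can later grow to cover extended links without inventing a second mechanism.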