XML, context and nuance:

Originally this was buried in a comment to an article on Uche's weblog, but he suggested I post it on my own weblog, so here goes. The history of this long-running discussion is here.

I think for PJE to speak about people detecting (or not) 'context and nuance' in what amounts to a rant is a bit confusing. I appreciated the rant quite a bit, but the exact thing that was missing from it was context and nuance; it wouldn't be a good rant otherwise. As he says in a comment to my article, his position is more nuanced than his tone was.

The one bit of context I got before he actually announced it was that he was probably looking at the Chandler source code...

I'd seen the XML bit of his "Python is not Java" post quoted by a few other Python programmers, and I was responding to people who picked up on that bit. The problem I have is more with the tone than the content. XML is certainly far from a panacea, but it cannot be ignored either. My problem with the rant, and the way the rant was being picked up, is that his tone is giving an excuse to other Python programmers to ignore XML with disdain. I don't believe that's an attitude that is useful.

Finally, to Uche: I hope the backlash to XML can be kept from being too vicious. To me, a vicious backlash connotes uninformed lashing out, while what is needed is constructive, though strong, criticism.


Comments:

Posted by Laurent Szyster at Thu Sep 01 2005 02:04

Here is my bit of "constructive, though strong, criticism" about XML, Python and Uche's views on both topics.

Uche has a strange affection for over-sophisticated software and has yet to produce any good Python code to process XML.

Am I too harsh?

Well, Greg Stein came up with qp_xml.py five years ago:

http://www.lyra.org/greg/python/qp_xml.py

His simple and general-purpose design has since been widely adopted and then optimized as cElementTree. It is an unorthodox DOM, but very Pythonic and provides "one obvious way to do it".

While Uche has written countless articles about XML and Python, you will not find him among the contributors to the libxml bindings for Python:

http://xmlsoft.org/python.html

and I don't see him in the list of expat developers either:

http://sourceforge.net/project/memberlist.php?group_id=10127

who did a wonderful job bringing expat into Python 2.0.

Python developers do not "ignore XML".

They ignore Uche and his useless 4Suite library.

Posted by Martijn Faassen at Thu Sep 01 2005 12:00

This I count as 'vicious criticism', and it isn't even about XML but about Uche.

Whether you may agree with design and implementation decisions or not (I certainly don't always), Uche has contributed in countless valuable ways to the Python/XML community and infrastructure.


Posted by Dominic Fox at Thu Sep 01 2005 15:11

The only thing that makes XML worth bothering with is that there exists a widely-implemented toolset for doing things with XML.

As a format, considered in isolation from its social context, XML has nothing in particular to recommend it over any of dozens of other formats you or I could dream up over the course of an afternoon.

As a "data type", the XML infoset is overkill for simple requirements (config files, serialization of resultsets from database queries) and woefully inadequate for complex ones (programming languages, databases). For all that it is touted as a solution for "semi-structured" data, it actually encodes a very narrow set of assumptions about how data is structured - the types of validity and integrity constraints that might apply, the types of relationships that might exist between data elements.

What keeps XML afloat, and will carry on doing so for years to come, is the power law attendant on widespread adoption of the standard XML toolset. Using XML will make you/your application popular with other XML users. If you need to be popular with other XML users, you need XML.

Otherwise, I agree with PJE: Python has better ways (for Python) of doing the simple things - lists and dictionaries for data, pickles for persistence - and, when it comes to doing the hard things, will get little help from XML.

Posted by Martijn Faassen at Thu Sep 01 2005 16:37

I agree that the social context, which includes toolset, community experience and knowledge, and available specifications, is what makes XML worth bothering with.

You could also say TCP/IP is 'worth bothering' because everybody uses it and builds on it. TCP/IP is not perfect either, after all. (though far less imperfect than XML)

By the way, you forget to mention a whole class of reasons to use XML -- documents. In the whole data versus document discussion on XML I tend to lean slightly towards documents myself, and feel applying XML makes sense there more often than for data.

I also think XML can help in the "hard things" space somewhat more often than you do -- transformation and query/addressing for instance, where Python has less of a story. I also still think that domain specific languages are another area where XML may be useful, though that does require lots of care, as does all language design.

Posted by Fredrik Corneliusson at Fri Sep 02 2005 10:42

I don’t really understand this XML bashing. Hating XML because you can dream up dozens of other formats in an afternoon is like refusing to use English just because you can use an imaginary language.

Posted by Dominic Fox at Fri Sep 02 2005 11:55

I agree that XML is a tolerable solution for documents; very much more so than for data, in fact. It can be helpful to think in terms of "tags" marking-up sections of content rather than a tree of data elements some of which just happen to contain text; XML's support for mixed content is useful there.

I also agree, now you mention it, that Python doesn't have much of a "story" for tree transformation. Not all data transformation can be most helpfully seen in terms of tree transformation, of course, but in cases where that approach has traction it's useful to have a DSL for expressing such transforms. Just a pity that DSL has to be XSLT, really.

Query/addressing is another matter. Again, XPath is fine if your data is tree-structured and you want to use a path-based addressing scheme on it. And Python by itself is not spectacularly great (compared to Haskell, say) at expressing more complex ways of taking data apart. I do worry that turning to XML to solve one's query/addressing needs will tend to mean shoe-horning one's data into a hierarchical model, though. Of course, in some cases that might be a perfect fit.

Posted by Fredrik Corneliusson at Fri Sep 02 2005 15:05

>They ignore Uche and his useless 4Suite library.
You mean the library that enables you to use DOM in Python for big documents that Minidom can't handle? In that case I would not say it's useless, at least not for me.

ElementTree is good for some stuff but I’ve found it awkward for some types of data manipulation with its tag tail data.
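
For readers who have not run into it, the "tail" model can be sketched roughly like this. It is only an illustration, assuming the standard-library xml.etree.ElementTree module (which follows the same text/tail model) and a made-up snippet of mixed content; flatten() is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# Made-up mixed-content snippet, purely for illustration.
doc = ET.fromstring("<p>Hello <b>big</b> world</p>")
bold = doc.find("b")

# Text before the first child is stored on the parent element...
print(repr(doc.text))   # 'Hello '
# ...while text following </b> is stored on the <b> element as its "tail".
print(repr(bold.text))  # 'big'
print(repr(bold.tail))  # ' world'


def flatten(elem):
    """Hypothetical helper: join all character data in document order."""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(flatten(child))
        parts.append(child.tail or "")
    return "".join(parts)


print(flatten(doc))     # Hello big world
```

Moving or deleting the b element takes " world" along with it unless the tail is copied elsewhere first, which is probably the sort of awkwardness meant here.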

Posted by N. Nis at Fri Sep 02 2005 19:41

You think the Python people are too harsh on XML? Why don't you ask the opinion of the relational database people, or the Lisp people, or the grep/sed/awk Unix people, or ...

XML is good for document interchange (e.g. OpenOffice) and hierarchical data with arbitrary depth. Otherwise there are better solutions.

"Flat is better than nested"

Posted by Martijn Faassen at Fri Sep 02 2005 20:06

Maybe they have a disdain for XML as well -- I don't know. I just know the Python community best. I think the Java community, for instance, has more respect for the technology, which has resulted in Java having kick-ass support for XML technologies (besides going overboard in using it, which was what PJE was complaining about).

I'll repeat again that XML has many flaws and is not useful in many cases.

However, the attitude that "XML is useful in some places but otherwise there are better solutions and it's only grudgingly that I use it" is what I'm objecting to.

As stated before, one of the great advantages of XML is that it is used and built on by others, so there's a network effect in play. Granted someone can think of an equal or superior data or document format, but in that case the advantages of the network effect are lost. The network effect is an important reason why using XML is frequently a good choice. HTTP isn't the perfect network transport protocol either, but everybody is using it and building on it, so that makes it attractive.

Perhaps the core of the argument is the following -- some people seem to give the impression the world would be better off, or just as well, without XML being there. I disagree and would request people to kindly give XML technologies the credit and respect that they deserve. And then they can go off and not use it where it doesn't fit, which it doesn't many many times. :)

Posted by Laurent Szyster at Sat Oct 01 2005 01:12

"Flat is better than nested"

Indeed.

XML is to be used for what it is: a simple markup language for text, a practical protocol that is often trivial to implement:

sys.stdout.write (
''
'>'
''
)
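
To give a rough idea of what "trivial to implement" can look like in practice, here is a minimal sketch (not the commenter's code). It assumes only the standard library; write_item() and the element and attribute names are invented for the example.

```python
import sys
from xml.sax.saxutils import escape, quoteattr


def write_item(out, name, value):
    """Hypothetical helper: write one element using plain string output."""
    # quoteattr() adds the surrounding quotes and escapes the attribute value;
    # escape() takes care of &, < and > in character data.
    out.write("<item name=%s>%s</item>\n" % (quoteattr(name), escape(value)))


sys.stdout.write('<?xml version="1.0"?>\n<items>\n')
write_item(sys.stdout, "greeting", "fish & chips <hot>")
sys.stdout.write("</items>\n")
```

The only part that really needs care is the escaping; everything else is ordinary string writing.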

As for parsing it, expat has more than enough for anybody to bother with yet another SAX or ElementTree API.
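
A small sketch of using expat directly, assuming the standard-library xml.parsers.expat module; collect_names() and the sample input are made up for the illustration.

```python
import xml.parsers.expat


def collect_names(data):
    """Hypothetical helper: return element names in the order they open."""
    names = []

    def start(name, attrs):
        # expat calls this for every start tag, passing the attributes as a dict.
        names.append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return names


print(collect_names("<a><b/><c x='1'/></a>"))  # ['a', 'b', 'c']
```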

In my not so humble opinion, what cannot be handled with expat, qp_xml.py or cElementTree is better handled by something other than Python, XSLT for instance.

It may be fun to rewrite an XML transformation language, a full DOM, or yet another XML templating language, but for each of those there is at least one stable open source C implementation with Python bindings.

XML is good. Python support for XML is first class.

Here is my own version of Greg Stein's XML standard:

http://svn.berlios.de/wsvn/allegra/allegra/xml_dom.py

with a practical serialization API for asynchronous I/O

http://svn.berlios.de/wsvn/allegra/allegra/xml_unicode.py
http://svn.berlios.de/wsvn/allegra/allegra/xml_utf8.py

just enough to write XML with less code and read Python instances from XML following one obvious way to do it.


Unless otherwise noted, all content licensed by Martijn Faassen
under a Creative Commons License.