Python Secret Weblog


XML


[Comments] (2) lxml and (c)ElementTree:

I saw a blog entry by Julien Anguenot praising the ElementTree (and cElementTree in particular) XML processing library and contrasting it with lxml, as in "why didn't I use lxml?". Since I created lxml, I thought I'd chip in and give my perspective on how it relates to ElementTree, and also give some context around Julien's statements about lxml in his blog entry.

[More]

[Comments] (10) Guido and XML:

I think Guido's post on XML is a good occasion to point again to my rant about the disdain for XML among Python programmers, posted almost exactly a year ago on this blog.

[More]

[Comments] (1) the why of lxml:

Today I read an article about libxslt on O'Reilly's xml.com. It demonstrates the power of libxslt; it's a cool library. It also demonstrates why I wrote lxml: writing Python code that correctly uses libxml2/libxslt's bindings directly is difficult.

[More]

lxml should now compile with gcc 4.0:

Recently I started getting reports that lxml does not compile with gcc 4.0. Investigating this, I quickly identified an issue with Pyrex: it generates C code that is in fact illegal. Older gcc versions accepted this code, but gcc 4.0 refuses to.

[More]

[Comments] (3) the Clarity Template Language:

The ClearSilver templating language does not have a very pleasant syntax for people familiar with the TAL notation of Zope Page Templates. That's not to say ClearSilver's syntax is awful; it's deliberately simple, and I'm sure one could get used to it pretty quickly. Still, I started wondering what ClearSilver syntax would look like if it were more like TAL. Let's call such a theoretical TAL-like ClearSilver "Clarity". Perhaps this is a bit confusing, as it's the same name as the ClearSilver integration package I talked about before, but it's a nice name. :)

[More]

[Comments] (5) lxml 0.5.1 released:

I've just released version 0.5.1 of lxml, the Pythonic binding to the libxml2 and libxslt libraries. This is because I got feedback on 0.5 that pointed to a critical bug in the way Unicode was handled. This kind of feedback is why I should've released lxml long ago!

[More]

[Comments] (6) lxml released at last:

I've finally found the time to release lxml. So here then is lxml, release 0.5!

[More]

Criteria for evaluating specifications:

As Andrew Tanenbaum said, "The nice thing about standards is that there are so many to choose from." Apparently he followed this up with: "And if you really don't like all the standards you just have to wait another year until the one arises you are looking for."

[More]

lxml RELAX NG tweaks:

The RELAX NG support seemed to be working for lxml, until I tried it with a complicated case: a modularized XHTML RELAX NG schema.

[More]

lxml performance progress:

Such progress a few days can bring. Just last week the lxml.etree performance figures on ElementTree operations like findall lost out badly to pure Python code. So badly, it was pretty embarrassing:

findall('//v') on ot.xml

ElementTree: 0.13 s
cElementTree: 0.11 s
lxml.etree: 1.9 s

[More]

lxml progress:

Since some people seem to be actually reading this and some progress has been made, I thought I'd give a report of what's been happening with lxml.

[More]

lxml findall and xpath performance:

Update: lxml has become quite a bit faster since this entry, see here.

[More]

lxml parser performance:

In a discussion with Fredrik Lundh about his (c)ElementTree parser performance benchmarks on the lxml.etree implementation.

[More]

lxml.etree is getting there:

The lxml.etree implementation of ElementTree, on top of libxml2, is getting there now. It features automatic memory management and quite a bit of ElementTree compatibility. Not all of the ElementTree API has been implemented yet, but enough for many use cases.

[More]
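
To give a rough idea of what "ElementTree compatibility" means in practice, here is a minimal sketch of code written against the ElementTree API. It assumes lxml.etree can be imported as a stand-in for ElementTree and falls back to the standard library module when lxml is not installed. The sample document and the variable names are made up for illustration and do not come from the original entry.

```python
# Prefer lxml.etree when it is installed; otherwise fall back to the
# standard-library ElementTree, which offers the same basic API.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<doc><v n='1'>alpha</v><group><v n='2'>beta</v></group></doc>"

# Parse the string into an element tree and get its root element.
root = etree.fromstring(SAMPLE)

# findall with a './/' path searches all descendants of the element.
matches = root.findall(".//v")
texts = [v.text for v in matches]
numbers = [v.get("n") for v in matches]

# Build a small tree with the ElementTree-style constructors.
new_root = etree.Element("result")
count = etree.SubElement(new_root, "count")
count.text = str(len(matches))

# Serialize the new tree; both implementations return bytes by default.
serialized = etree.tostring(new_root)
```

The same code should run unchanged with either import, which is the point of sharing the API.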



Unless otherwise noted, all content licensed by Martijn Faassen
under a Creative Commons License.