(2) Fri Feb 24 2006 10:23 lxml and (c)ElementTree:
I saw a blog entry by Julien Anguenot that praises the ElementTree (and cElementTree in particular) XML processing library and contrasts it with lxml, as in "why didn't I use lxml?". Since I created lxml, I thought I'd chip in and give my perspective on how it relates to ElementTree, and also give some context around Julien's statements about lxml in his blog entry.
(10) Wed Feb 01 2006 12:07 Guido and XML:
I think Guido's post on XML is a good occasion to point again to my rant about the disdain for XML among Python programmers, posted almost exactly a year ago on this blog.
(1) Fri Aug 05 2005 11:33 the why of lxml:
Today I read an article about libxslt on O'Reilly's xml.com. It
demonstrates the power of libxslt; it's a cool library. It also
demonstrates why I wrote lxml: writing Python code that correctly
uses libxml2/libxslt's bindings directly is difficult.
Fri Jun 17 2005 10:31 lxml should now compile with gcc 4.0:
Recently I started getting reports that lxml does not compile with gcc 4.0. Investigating this, I quickly identified an issue with Pyrex: it generates C code that is in fact illegal. Older gcc versions accepted this code, but gcc 4.0 refuses to.
(3) Sat Apr 16 2005 00:27 the Clarity Template Language:
The ClearSilver templating language does not have a very pleasant syntax for people familiar with the TAL notation of Zope Page Templates. That's not to say ClearSilver's syntax is awful; it's deliberately simple, and I'm sure one could get used to it pretty quickly. Still, I started wondering what ClearSilver syntax would look like if it were more like TAL. Let's call such a theoretical TAL-like ClearSilver "Clarity". Perhaps this is a bit confusing, as it's the same name as the ClearSilver integration package I talked about before, but it's a nice name. :)
(5) Sat Apr 09 2005 21:51 lxml 0.5.1 released:
I've just released version 0.5.1 of lxml, the Pythonic binding to the libxml2 and libxslt libraries. I did this because I got feedback on 0.5 that pointed to a critical bug in the way unicode was handled. This kind of feedback is why I should've released lxml long ago!
(6) Fri Apr 08 2005 20:56 lxml released at last:
I've finally found the time to release lxml. So here then is
lxml, release 0.5!
Tue Mar 01 2005 18:53 Criteria for evaluating specifications:
As Andrew Tanenbaum said, "The nice thing about standards is that
there are so many to choose from." Apparently he followed this up with:
"And if you really don't like all the standards you just have to wait
another year until the one arises you are looking for."
Tue Jan 25 2005 20:01 lxml relax NG tweaks:
The Relax NG support seemed to be working for lxml, until I tried it with a complicated case: a modularized XHTML Relax NG schema.
Mon Jan 17 2005 22:31 lxml performance progress:
What progress a few days can bring. Just last week the lxml.etree
performance figures on ElementTree operations like findall lost
out badly to pure Python code. So badly, it was pretty embarrassing:
findall('//v') on ot.xml
ElementTree: 0.13 s
cElementTree: 0.11 s
lxml.etree: 1.9 s
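A rough way to take this kind of measurement is sketched below. It is only an illustration: it uses the standard library's xml.etree.ElementTree and a small synthetic document in place of ot.xml, and the names build_doc and time_findall are made up for this sketch. It spells the path './/v', which is what current ElementTree versions expect when searching from an element.

```python
import timeit
import xml.etree.ElementTree as ET


def build_doc(n_verses):
    """Build a synthetic stand-in for ot.xml (the real file's layout is not assumed)."""
    root = ET.Element("bible")
    chapter = ET.SubElement(ET.SubElement(root, "b"), "c")
    for i in range(n_verses):
        verse = ET.SubElement(chapter, "v")
        verse.text = "verse %d" % i
    return root


def time_findall(root, path=".//v", repeat=5):
    """Return the best wall-clock time in seconds for a single findall() call."""
    timer = timeit.Timer(lambda: root.findall(path))
    return min(timer.repeat(repeat=repeat, number=1))


root = build_doc(1000)
verses = root.findall(".//v")
best = time_findall(root)
print("findall('.//v') found %d elements, best time %.6f s" % (len(verses), best))
```

Taking the minimum over several repeats is a common way to reduce noise from other activity on the machine; absolute numbers will of course differ from the figures quoted above.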
Fri Jan 14 2005 19:15 lxml progress:
Since some people seem to be actually reading this and some progress has been made, I thought I'd give a report of what's been happening with lxml.
Since last week, I've added a lot more of the ElementTree API, such
as the .find() function and friends, by directly using the code
from ElementTree.
I am now actually running the ElementTree and cElementTree test
suites. I still need to disable some tests, but a significant
fraction of them does pass.
I've improved the way libxml2's parser functionality gets used, in
order to implement ElementTree's top-level parse() function.
I've added XPath support to lxml.etree! An example of what you can
do:
>>> from lxml import etree
>>> tree = etree.parse('ot.xml')
>>> tree.xpath('(//v)[5]/text()')
[u'And God called the light Day, and the darkness he called Night.
And the evening and the morning were the first day.\n']
or, say, this, modifying the elements returned:
>>> result = tree.xpath('(//v)[5]')
>>> result[0].text = 'The day and night verse.'
>>> tree.xpath('(//v)[5]/text()')
[u'The day and night verse.']
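For comparison, roughly the same lookup can be written with the find() family alone. The sketch below uses the standard library's xml.etree.ElementTree rather than lxml, and a tiny inline document invented for the example in place of ot.xml. ElementTree's path language does not support a parenthesised expression like (//v)[5], so the fifth verse is taken by indexing the findall() result.

```python
import xml.etree.ElementTree as ET

# Tiny made-up document standing in for ot.xml.
SAMPLE = """<book>
  <chapter>
    <v>one</v><v>two</v><v>three</v>
  </chapter>
  <chapter>
    <v>four</v><v>five</v><v>six</v>
  </chapter>
</book>"""

root = ET.fromstring(SAMPLE)

# findall() returns matches in document order, so index 4 is the fifth <v>,
# the same element that the XPath expression (//v)[5] selects.
verses = root.findall(".//v")
fifth = verses[4]
print(fifth.text)  # prints: five

# The list holds the tree's own elements, so assigning to .text changes the tree.
fifth.text = "The day and night verse."
print(root.findall(".//v")[4].text)  # prints: The day and night verse.

# find() returns only the first match (or None if there is none).
print(root.find(".//v").text)  # prints: one
```

The point of the XPath support is that expressions like (//v)[5]/text() do this selection in a single call; the find() family covers the simpler cases.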
I've added the start of XSLT support to lxml.etree. An example:
test.xslt
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="*" />
<xsl:template match="/">
<day><xsl:value-of select="(//v)[5]" /></day>
</xsl:template>
</xsl:stylesheet>
>>> from lxml import etree
>>> style_xml = etree.parse('test.xslt')
>>> style = etree.XSLT(style_xml)
>>> ot = etree.parse('ot.xml')
>>> result = style.apply(ot)
>>> style.tostring(result)
u'<?xml version="1.0"?>\n<day>And God called the light Day, and the
darkness he called Night. And the evening and the morning were the
first day.\n</day>\n'
>>> result.getroot().tag
u'day'
Thu Jan 13 2005 20:33 lxml findall and xpath performance:
Update: lxml has become quite a bit faster since this entry; see the later entries on lxml performance.
Wed Jan 12 2005 20:07 lxml parser performance:
Notes from a discussion with Fredrik Lundh about his (c)ElementTree parser
performance benchmarks and how the lxml.etree implementation fares on them.
Sat Jan 08 2005 11:12 lxml.etree is getting there:
The lxml.etree implementation of ElementTree, on top of libxml2, is
getting there now. It features automatic memory management and quite a bit of ElementTree compatibility. Not all of the ElementTree API has been implemented yet, but enough for many use cases.
Unless otherwise noted, all content licensed by Martijn Faassen under a Creative Commons License.