The ZODB is a powerful object database for Python objects. It's very mature - it's been around for more than a decade. It is transactional, has advanced features like clustering (ZEO) and blob support, and yes, it can be used independently from Zope. Zope 2, Zope 3 and Grok all use the ZODB as their default data storage, and it's seen a lot of battle testing.

As a result of various discussions in the past, I realized that some smart, informed people seem to think the ZODB doesn't do what it actually does.

The ZODB is really an object database. It really does get references between objects right. It's not an object store where references have to be indirect (a string, for instance). Somehow this misconception about the ZODB is widespread.

What do I mean when I say the ZODB "gets references right"? Let me give you an example with a lot of a, b and c. If you have object a that points to c, and object b that also points to c, updating c will really matter to both a and b. You will reach the updated version of c through the reference in both a and b.

That kind of example sounds rather abstract, so here is some code that demonstrates it: the Source and Target class definitions in the first listing below, followed by an interpreter session that uses them.

So what's special here? There's nothing special! All this is the way you'd expect it from Python. The ZODB's mission is to take normal Python objects and persist them. This means that when you restart the application, all your objects and the references between them will still be there. There are a few extra requirements to make sure objects get persisted, which I'll go into below, but in essence, the first example is complete.

Quite a few smart people seem to be under the impression the ZODB does far less than this. They believe references like this won't work properly in the ZODB. They believe, perhaps, that c will be persisted twice, once for a, once for b. This may be the case if c doesn't inherit from the special Persistent superclass, but if it does, there really will be only that one instance.
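To see why a plain, non-Persistent c could end up stored twice, it helps to recall that the ZODB serializes objects with pickle: each Persistent object gets its own database record, while a plain object is pickled inside the record of whatever refers to it. The following is only an analogy in standard-library Python 3, not ZODB code. The dictionaries a, b and c are stand-ins for non-Persistent objects, and two separate pickle.dumps calls play the role of two separate records.

```python
import pickle

# Plain dictionaries stand in for objects that do NOT inherit from Persistent.
c = {"message": "First message"}
a = {"ref": c}
b = {"ref": c}

# One pickle for both: pickle's memo keeps c as a single shared object.
a1, b1 = pickle.loads(pickle.dumps((a, b)))
shared_in_one_pickle = a1["ref"] is b1["ref"]

# Two pickles (think: two separate records): each carries its own copy of c.
a2 = pickle.loads(pickle.dumps(a))
b2 = pickle.loads(pickle.dumps(b))
shared_across_pickles = a2["ref"] is b2["ref"]

# Updating "c" through a2 no longer shows up through b2.
a2["ref"]["message"] = "Second message"

print("shared in one pickle:", shared_in_one_pickle)
print("shared across pickles:", shared_across_pickles)
print("message through b2:", b2["ref"]["message"])
```

Making Target inherit from Persistent, as the article's examples do, is what gives c a record of its own, so that a and b can both point at the same stored object.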
The ZODB offers transparent object persistence. It's almost exactly like a pool of normal Python objects. They can reference each other just fine. The only requirements I know of are:

The misapprehension that the ZODB somehow does less than it really does seems to be an easy one for people to develop. One reason is that in real-world Zope or Grok-based applications hard references like this are relatively rare. People don't use hard references all the time in an application because sometimes you want back references, and sometimes you want looser coupling between objects. That's when things are referenced by a string or using some other form of lookup. It's no different in Python programs, though. For the same reason, you sometimes put Python objects in a dictionary and look them up with a key. The wide application of such soft references seems to give people the impression that normal Python references somehow don't work.

Let's look at a complete example now (the second listing below). The only thing you need installed to make this work is ZODB3, which you can retrieve from the Python Package Index. It demonstrates some of the details, such as how to set up a database and how to use the root dictionary.

How could we do something about such misapprehensions? It would be good if the ZODB had a single, up-to-date web site that people could go to in order to learn more about it. The ZODB is one of the coolest, most powerful libraries in the Python world, but it's less well known than it should be. I believe a good ZODB site, with some examples like the ones in this article, would also help grow the ZODB community. The ZODB community is currently in a healthy enough state, with new developments always in progress, but it's a shame more people aren't aware of it.

Unfortunately the ZODB developers themselves seem to be too busy to put up this web site. It wouldn't be much work, as it's mostly a matter of collecting existing information and editing it.
So, I hope that they will actually do it soon, so that I have some good hyperlinks to put at the end of this article. This wiki page seems inadequate, but it's what Google thinks is the most relevant when I search for "ZODB". The ZODB PDF file is very useful, though I wish I knew of a better way to link to it than to the Subversion repository. A recent good introduction was created by Brandon Rhodes and Noah Gift for IBM developerWorks.
(5) Fri Jun 20 2008 15:12 A misconception about the ZODB:
from persistent import Persistent

class Source(Persistent):
    def __init__(self, ref):
        self.ref = ref

class Target(Persistent):
    def __init__(self, message):
        self.message = message

>>> c = Target("First message")
>>> c.message
'First message'
>>> a = Source(c)
>>> b = Source(c)
>>> c.message = "Second message"
>>> a.ref.message
'Second message'
>>> b.ref.message
'Second message'
from ZODB import FileStorage, DB
from persistent import Persistent
import transaction

class Source(Persistent):
    def __init__(self, ref):
        self.ref = ref

class Target(Persistent):
    def __init__(self, message):
        self.message = message

def getroot():
    # open the database
    storage = FileStorage.FileStorage('/tmp/mystorage.fs')
    db = DB(storage)
    conn = db.open()
    dbroot = conn.root()
    return dbroot

def main():
    dbroot = getroot()
    if 'a' not in dbroot:
        print "Filling database"
        fill_database(dbroot)
    else:
        print "Reusing existing database"
        # reset to first message
        dbroot['c'].message = 'First message'
    a = dbroot['a']
    b = dbroot['b']
    c = dbroot['c']
    print "message through a:", a.ref.message
    print "message through b:", b.ref.message
    print "ref is the same:", a.ref is b.ref
    print "ref is indeed c:", a.ref is c
    print "changing message c to: Second message"
    c.message = 'Second message'
    print "message through a:", a.ref.message
    print "message through b:", b.ref.message
    # commit any changes to the database
    transaction.commit()

def fill_database(dbroot):
    dbroot['c'] = c = Target('First message')
    dbroot['a'] = a = Source(c)
    dbroot['b'] = b = Source(c)

if __name__ == '__main__':
    main()
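The article contrasts hard references with looser, key-based lookups. That difference can be sketched in plain Python 3, without the ZODB. The names HardSource, SoftSource and registry are hypothetical and exist only for this illustration; a real application might use a container, a catalog or a relation library for the lookup instead.

```python
class Target:
    def __init__(self, message):
        self.message = message


class HardSource:
    """Keeps a direct reference to the target object itself."""

    def __init__(self, ref):
        self.ref = ref


class SoftSource:
    """Keeps only a key and looks the target up when asked."""

    def __init__(self, registry, key):
        self.registry = registry
        self.key = key

    @property
    def ref(self):
        # Returns None if nothing is registered under the key any more.
        return self.registry.get(self.key)


registry = {"c": Target("First message")}
hard = HardSource(registry["c"])
soft = SoftSource(registry, "c")

# While the entry exists, both kinds of reference reach the same object.
same_object = hard.ref is soft.ref

# Removing the entry only affects the key-based lookup.
del registry["c"]
hard_message = hard.ref.message
soft_target = soft.ref

print("same object before delete:", same_object)
print("hard reference after delete:", hard_message)
print("soft reference after delete:", soft_target)
```

The hard reference keeps the target reachable after the registry entry is gone, which is how ordinary Python references behave. The key-based version decouples the two objects at the cost of an extra lookup and a possible missing target.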
- Comments:
Posted by Fernando Correa at Fri Jun 20 2008 16:22
Hey Martijn,
Very nice article!
Extremely simple and gets to the point right away. It'd be nice to have it contributed in the zodb section of the upcoming zodb.org.
For most smart guys, when you say something about databases in general, references are one of the first things that they argue about.
By showing how easy it is on python/ZODB, people may start thinking differently.
Posted by Duncan McGreggor at Fri Jun 20 2008 19:58
Another great post, Martijn!
Posted by Kevin Teague at Sat Jun 21 2008 02:13
Well, I knew that. But I only realized that fact very recently! :P
Part of this misconception might come from people using Plone's Archetypes, which has a special Reference field for managing references. This field is to enable descriptions of references (constraints, widgets, backrefs, etc.) but when using it you can easily make the assumption, "I guess this field is here because the ZODB doesn't handle references properly".
Of course, there are lots of good reasons for wanting to use a higher level abstraction for references. In the case of the app I was working on, I wanted a persistent cache object to have an attribute that refers to a model object, e.g.:
    model_obj = app['mymodel']
    app['cache'].last_modified_obj = model_obj
However, with transparent object references, this works the same as you would expect in normal Python code. The wrinkle being that if you delete the model_obj:
    del app['mymodel']
There is still a reference to it in the cache object. So you can still pull the object from the database with:
    model_obj = app['cache'].last_modified_obj
The solution is to either ensure that you delete the cache object as well, or use a referencing engine. I've started experimenting with the lovely.relation package in my Grok app which seems to be working well so far - it's a bit overkill for a simple use-case like the one above, but I will be wanting to also query for back references later on.
Posted by Daniel Nouri at Sat Jun 21 2008 02:40
Some of this misconception about references between objects in the ZODB might come from the fact that they break Acquisition wrapping and ITraversable in Zope 2. You generally need to traverse to objects implementing ITraversable in their original location. Otherwise you'll break methods like absolute_url() and getPhysicalPath().
There's a similar effect when you use Python properties of Acquisition wrapped classes; they'll lose their wrapping.
Kevin: Take a look at persistent.wref -- this gives you weak references as found in the Python weakref module.
Posted by Noah Gift at Sat Jun 21 2008 19:04
Martijn,
I agree nice article. I think there is a common misperception that ZODB is embedded deep inside of Zope, that it is complicated, etc. You have cleared this up very succinctly.
I would love to see people using ZODB to solve problems where it is a better fit than an object relational database. ZODB is cool stuff, and there needs to be more written about it. I also like the idea of mixing ZODB and SQLAlchemy in web application projects, because there are some things that an object database might make easier, like recursing through a highly nested data structure.
