Overview
RelStorage is a storage implementation for ZODB that stores pickles in a relational database. PostgreSQL 8.1 and above (via psycopg2), MySQL 5.0.x (via MySQLdb), and Oracle 10g (via cx_Oracle) are currently supported.
RelStorage replaces the PGStorage project.
Features
- It is a drop-in replacement for FileStorage and ZEO.
- There is a simple way to convert FileStorage to RelStorage and back again. You can also convert a RelStorage instance to a different relational database.
- Designed for high volume sites: multiple ZODB instances can share the same database. This is similar to ZEO, but RelStorage does not require ZEO.
- According to some tests, RelStorage handles high concurrency better than the standard combination of ZEO and FileStorage.
- Whereas FileStorage takes longer to start as the database grows due to an in-memory index of all objects, RelStorage starts quickly regardless of database size.
- Supports undo and packing.
- Free, open source (ZPL 2.1)
Installation in Zope
You can install RelStorage using easy_install:
easy_install RelStorageIf you are not using easy_install (part of the setuptools package), you can get the latest release at PyPI? (http://pypi.python.org/pypi/RelStorage), then place the relstorage package in the lib/python directory of either the SOFTWARE_HOME or the INSTANCE_HOME. You can do this with the following command:
python2.4 setup.py install --install-lib=${INSTANCE_HOME}/lib/pythonBefore you can use RelStorage, ZODB must have the invalidation polling patch applied. Get it from Subversion (http://svn.zope.org/relstorage/trunk/). There are two versions of the patch: one for ZODB 3.7.1 (which is part of Zope 2.10.5) and one for ZODB 3.8.0 (which is part of Zope 2.11). The patch has no effect on ZODB except when using RelStorage. Hopefully, a future release of ZODB will include the feature.
You need the Python database adapter that corresponds with your database. Install psycopg2, MySQLdb 1.2.2+, or cx_Oracle 4.3+. Note that Debian Etch ships MySQLdb 1.2.1, but that version has a bug in BLOB handling that manifests itself only with certain character set configurations. MySQLdb 1.2.2 fixes the bug. Also, MySQL 5.1.23 has major bugs that lead to loss of data when packing, so MySQL 5.1 is not recommended at this time.
Finally, modify etc/zope.conf of your Zope instance. Remove the main mount point and add one of the following blocks. For PostgreSQL:
%import relstorage <zodb_db main> mount-point / <relstorage> <postgresql> # The dsn is optional, as are each of the parameters in the dsn. dsn dbname='zodb' user='username' host='localhost' password='pass' </postgresql> </relstorage> </zodb_db>For MySQL:
%import relstorage <zodb_db main> mount-point / <relstorage> <mysql> # Most of the options provided by MySQLdb are available. # See component.xml. db zodb </mysql> </relstorage> </zodb_db>For Oracle (10g XE in this example):
%import relstorage <zodb_db main> mount-point / <relstorage> <oracle> user username password pass dsn XE </oracle> </relstorage> </zodb_db>
Migrating from FileStorage
You can convert a FileStorage instance to RelStorage and back using a utility called ZODBConvert.
Migrating from PGStorage
The following script migrates your database from PGStorage to RelStorage 1.0 beta:
After you do this, you still need to migrate from 1.0 beta to the latest release.
Migrating to a new version of RelStorage
Sometimes RelStorage needs a schema modification along with a software upgrade. Hopefully, this will not often be necessary.
To migrate from version 1.0 beta to version 1.0c1 through 1.0.1, see:
To migrate from version 1.0.1 to version 1.1, see:
Optional Features
Specify these options in zope.conf.
poll-interval
Defer polling the database for the specified maximum time interval. Set to 0 (the default) to always poll. Fractional seconds are allowed.
Use this to lighten the database load on servers with high read volume and low write volume. A setting of 1-5 seconds is sufficient for most systems.
While this setting should not affect database integrity, it increases the probability of basing transactions on stale data, leading to conflicts. Thus a nonzero setting can hurt the performance of servers with high write volume.
pack-gc
If pack-gc is false, pack operations do not perform garbage collection. Garbage collection is enabled by default.
If garbage collection is disabled, pack operations keep at least one revision of every object. With garbage collection disabled, the pack code does not need to follow object references, making packing conceivably much faster. However, some of that benefit may be lost due to an ever increasing number of unused objects.
Disabling garbage collection is also a hack that ensures inter-database references never break.
pack-batch-timeout
Packing occurs in batches of transactions; this specifies the timeout in seconds for each batch. Note that some database configurations have unpredictable I/O performance and might stall much longer than the timeout. The default timeout is 5.0 seconds.pack-duty-cycle
After each batch, the pack code pauses for a time to allow concurrent transactions to commit. The pack-duty-cycle specifies what fraction of time should be spent on packing. For example, if the duty cycle is 0.75, then 75% of the time will be spent packing: a 6 second pack batch will be followed by a 2 second delay. The duty cycle should be greater than 0.0 and less than or equal to 1.0. Specify 1.0 for no delay between batches.
The default is 0.5. Raise it to finish packing faster; lower it to reduce the effect of packing on transaction commit performance.
pack-max-delay
This specifies a maximum delay between pack batches. Sometimes the database takes an extra long time to finish a pack batch; at those times it is useful to cap the delay imposed by the pack-duty-cycle. The default is 20 seconds.cache-servers
Specifies a list of memcache servers. Enabling memcache integration is useful if the connection to the relational database has high latency and the connection to memcache has significantly lower latency. On the other hand, if the connection to the relational database already has low latency, memcache integration may actually hurt overall performance.
Provide a list of host:port pairs, separated by whitespace. "127.0.0.1:11211" is a common setting. The default is to disable memcache integration.
cache-module-name
Specifies which Python memcache module to use. The default is "memcache", a pure Python module. An alternative module is "cmemcache". This setting has no effect unless cache-servers is set.
Development
You can checkout from Subversion using the following command:
svn co svn://svn.zope.org/repos/main/relstorage/trunk RelStorageYou can also browse the code:
http://svn.zope.org/relstorage/trunk/The best place to discuss development of RelStorage is on the zodb-dev mailing list.
Probable FAQs
Q: How can I help?
A: The best way to help is to test and to provide database-specific expertise. Ask questions about RelStorage on the zodb-dev mailing list.Q: Can I perform SQL queries on the data in the database?
A: No. Like FileStorage and DirectoryStorage, RelStorage stores the data as pickles, making it hard for anything but ZODB to interpret the data. An earlier project called Ape attempted to store data in a truly relational way, but it turned out that Ape worked too much against ZODB principles and therefore could not be made reliable enough for production use. RelStorage, on the other hand, is much closer to an ordinary ZODB storage, and is therefore much safer for production use.Q: How does RelStorage performance compare with FileStorage?
A: According to benchmarks, RelStorage with PostgreSQL is often faster than FileStorage, especially under high concurrency.Q: Why should I choose RelStorage?
A: Because RelStorage is a fairly small layer that builds on world-class databases. These databases have proven reliability and scalability, along with numerous support options.Q: Can RelStorage replace ZRS (Zope Replication Services)?
A: In theory, yes. With RelStorage, you can use the replication features native to your database. However, this capability has not yet been tested.
PGStorage migration? --ksmith, Wed, 30 Jan 2008 18:23:37 +0000 reply
I have a large database in PGStorage, is there a migration path to RelStorage, or should it "just work"?
PGStorage migration --Shane Hathaway, Tue, 05 Feb 2008 08:49:16 +0000 reply
A schema change will be required, but it should be simple. Tell me which version of PGStorage you're migrating from and I'll tell you what to do.
PGStorage migration --ksmith, Tue, 05 Feb 2008 18:07:27 +0000 reply
Thanks for your reply. I'm not quite sure, I grabbed it from CVS, but frankly my CVS skills are now non-existant since moving to svn and bzr. pgstorage.py shows working revision 1.6, scripts.py 1.3 and pgconnection.py 1.3, hope that's enough to go on.
Data migration from Data.fs to Oracle --patty123, Thu, 07 Feb 2008 13:27:37 +0000 reply
I am done with cx_oracle and ZCXOracle? installation. Now I want to migrate my existing data in Data.fs to Oracle DB. Can anyone help me out in doing that.
PostgreSQL user defined "python pickle data type" --JerryS?, Thu, 28 Feb 2008 16:11:53 +0000 reply
Just wanted to point this out, in case you hadn't seen it:
http://blog.melhase.net/articles/2006/06/06/a-tale-of-postgresql-types-and-python
Migrating from Data.fs --Shane Hathaway, Fri, 29 Feb 2008 07:03:32 +0000 reply
patty123, use the ZODBConvert utility, now included with RelStorage.
pickle data type --Shane Hathaway, Fri, 29 Feb 2008 07:06:45 +0000 reply
JerryS?, yeah it's kind of cool that you can define a pickle type, but I haven't yet seen a reason why I would want to. :-)
Re: pickle data type --JerryS?, Tue, 04 Mar 2008 22:50:27 +0000 reply
In response to "kind of cool that you can define a pickle type, but I haven't yet seen a reason why I would want to."
See 2nd "Q" under "Probable FAQs?", above.
Re. Re: pickle data type --JerryS?, Fri, 07 Mar 2008 16:26:45 +0000 reply
I'm guilty of committing my own worst "pet peeve", posting a cryptic remark, as "an exercise for the reader". The point of a pickle data type is to treat code as data. If documents are data, code certainly is. Python has been endowed with very high level capabilities, approaching the realm of lambda calculus. The ultimate in code reuse may be revealed when we arrive at an effective means of reasoning about the logic represented in code, and recognizing patterns of usage. Storing code in a relational DB sounds like a great place to start.
Last "pickle data type", I promise --JerryS?, Fri, 07 Mar 2008 17:21:46 +0000 reply
In the article, "Functional Programming for the Rest of Us", after showing in lucid examples how functional programs can readily be proved to be correct, validated via unit tests, and updated / optimized on the fly, the author presents us with this gem:
"An interesting property of functional languages is that they can be reasoned about mathematically. Since a functional language is simply an implementation of a formal system, all mathematical operations that could be done on paper still apply to the programs written in that language. The compiler could, for example, convert pieces of code into equivalent but more efficient pieces with a mathematical proof that two pieces of code are equivalent. Relational databases have been performing these optimizations for years. There is no reason the same techniques can't apply to regular software."
Now, imagine rendering Python code as a formal system, in functional terms, in order to achieve bullet proof logic validation, testing, etc., with the ability to swap in highly optimized forms where a code pattern is recognized as a suitable match.
See: http://www.defmacro.org/ramblings/fp.html
Cool stuff, JerryS? --Shane Hathaway, Wed, 12 Mar 2008 03:18:23 +0000 reply
You should get involved in the PyPy? project. :-)
BLOBS --Laurence Rowe, Thu, 13 Mar 2008 11:10:39 +0000 reply
Have you thought of adding BLOB support to RelStorage? Does the ZODB 3.8 BLOB support work with it? Having file data in a separate table would presumably allow the connection cache to be freed up more quickly, stop large files from flushing the connection cache and do away with the need for nfs mounting the blob filesystem. Would limit things to 1GB in postgres though.
BLOBs --Shane Hathaway, Tue, 29 Apr 2008 05:35:28 +0000 reply
After pondering BLOB support for a while, I've come to the tentative conclusion that NFS or some other shared filesystem is a better choice than putting BLOBs in the database. The main reason is that BLOB handles cross threads and remain open long after ZODB connections die; database connections don't like to behave that way. NFS really isn't bad if you manage the network well. Shared filesystems on a SAN are even better. I could also see MogileFS? as a good way to store BLOBs.
how do I back out an easy install? --dbearex, Fri, 09 May 2008 19:06:00 +0000 reply
running easy_install RelStorage put all the modules in my python home -- How do I reverse this in get them all out?
more on BLOBs --Rob Miller, Wed, 21 May 2008 15:53:32 -0400 reply
I understand your reasoning on keeping BLOBs out of the database, but I wonder if you could still let us know whether or not RelStorage would work w/ ZODB 3.8's BLOB support? That already has an option for telling ZEO clients that there's a shared filesystem path to get to the BLOB files, instead of using the database connection; it'd be nice if this mechanism could continue to work w/ RelStorage.
Extensible backend --Charlie_X, Tue, 27 May 2008 08:45:16 -0400 reply
Hi Shane, this looks like an exciting product that should be extensible for other backends: MS SQL, DB2 especially if this largely involves just the SQL statements. What do you think of moving the SQL into separate text files with loading controlled through some form of utility?
Replies --Shane Hathaway, Tue, 27 May 2008 13:54:24 -0400 reply
dbearex: Unlike apt/rpm/portage/etc., easy_install has no uninstall capability; that is not its responsibility. You'll have to do something like 'rm -rf /usr/lib/python2.4/site-packages/relstorage'.
Rob Miller: Blob support over NFS probably works already. Let me know.
Charlie: I have found that it's better to generate most of the SQL code rather than put it in text files. That makes it easier to make changes to all supported databases. SQLAlchemy? or similar might be good, but I haven't tried that yet.
SQL generation --Charlie_X, Wed, 28 May 2008 05:30:08 -0400 reply
Shane: we would like to be able to add support for mxODBC which would require separation from the driver and the SQL - most of the SQL generators won't let you do this which can cause problems.
Unless you're expecting to change the SQL a lot, it's not an unmanageable set of statements and knowing exactly what the SQL is oftens solves a lot of problems.
Main database is Big --Alexandre Silva, Wed, 16 Jul 2008 13:14:40 -0400 reply
Why main database with relstorage and postgres is 41MB and with FileStorage is 8MB. With the pack with relstorage the database remain with the same size. The FileStorage with pack was with 4MB. The pack with Relstorage work in undo history, but not with the database size. why? Sorry for my english.
Main database is Big --Laurence Rowe, Sun, 20 Jul 2008 14:17:39 -0400 reply
Have you tried VACUUM in postgres? See http://www.postgresql.org/docs/8.3/static/routine-vacuuming.html
Main database is Big --Alexandre Silva, Mon, 21 Jul 2008 07:35:33 -0400 reply
Yes, i tried. But i have a datafs of 40MB, if datafs grow up 10MB because of a new object(50MB), and i delete the object and use PACK and VACUUM, datafs return only to 1MB.(the datafs has 49MB). With this it's continues to grow :( .
Main database is Big --Laurence Rowe, Mon, 21 Jul 2008 07:54:06 -0400 reply
You must use VACUUM FULL to return space to the filesystem.
Main database is Big --Alexandre Silva, Mon, 21 Jul 2008 07:58:05 -0400 reply
Yes, i used VACUUM FULL, without this does not return any space, only with the pack and vacuum until the datafs increases
Database packing --Shane Hathaway, Thu, 24 Jul 2008 12:49:52 -0400 reply
Alexandre, packing in RelStorage followed by "vacuum full" should indeed reduce the space consumed by PostgreSQL. If it doesn't, then there's a bug somewhere.
**zope 2.9.5 ** --luca.tebaldi, Fri, 05 Sep 2008 11:03:04 -0400 reply
relstorage work's on zope 2.9.5? it's necessary the invalidation polling patch?