FileSystemSynchronizationProposal
File-System Synchronization
Status: IsProposal (much is now implemented; see status below)
Author
Problem
Zope 2 has two different development models: through-the-web (TTW), and file-system (FS). Each has its advantages and disadvantages.
TTW development is convenient. Code can be developed from anywhere with just a web browser or an FTP or WebDAV client. It isn't necessary to restart Zope to see the effect of changes. Software changes are automatically replicated transactionally using ZEO. All TTW code is untrusted, which is a blessing and a curse. It's difficult to ignore security, but security checks slow down execution and development. A major problem with TTW development is that it's difficult or impossible to use third-party development tools that rely on file-system storage of source code.
FS development makes it easy to leverage traditional development tools. Because file-system code is trusted, it's easier to ignore security and FS code runs faster without many security checks. For Python programmers, FS development provides a more familiar, module-based model.
It is difficult to choose which model to use. Making matters worse, it is extremely difficult to switch between TTW and FS development.
Proposed solution
A key aspect of TTW development is ZODB management of the software. ZODB management of the software and TTW development can be decoupled. It isn't necessary for ZODB-managed software to be untrusted if it isn't edited TTW. The level of trust of TTW code might vary depending on the nature of the web connection. The level of trust should be based on a flexible policy.
The main idea of this proposal is to provide a unified development model based on software that executes from the ZODB but that can be easily moved between the ZODB and the file-system through a synchronization process. This development model includes some key features:
- Lossless synchronization between the ZODB and the file system
- Flexible policies for TTW editability and execution trust. Note that Zope 3's component architecture already provides the former. Security proxies, for better or worse, make the difference between trusted and untrusted execution less significant.
- Persistent modules are Python modules that can be stored in the ZODB. These will provide a familiar development model. They will provide transactional update semantics, making it unnecessary to restart Zope when module source is updated.
This proposal focusses on file-system synchronization. Persistent modules are an ongoing project at PythonLabs.
Actor, Goals, and Use Cases
Actor: Developer (this is an abstraction of Site Developer and Component Developer)
Goals and use cases
- Use familiar development tools
These tools may include version control systems, editors, and file-analysis utilities (e.g. grep). Some of these tools, especially the version control system, may be mandated by the developer's organization.
- Copy Zope objects to the file system
Use case: Check out one or more Zope objects to a directory
If the objects had been checked out before, merge changes from the object database into the checked out files.
Provide a simple report that shows affected files.
- Copy files from the file system to Zope objects
Use case: Check in one or more Zope objects from files in a directory
If any of the objects in the database have changed since the last checkout, then report conflicts and abort the checkin.
An option can be used to force data on the file system to override the database.
- Identify and resolve conflicting updates.
Use case: Find and resolve conflicts
A tool can be used to identify files that contain unresolvable conflicts.
- Find out what's changed in Zope or on the file system since the last synchronization.
Use case: Generate summary and detailed status reports
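The check-in and status use cases above amount to a three-way comparison between a copy saved at the last synchronization, the current file, and the current database object. The following is a minimal sketch of that comparison, assuming each object can be compared by its serialized bytes. The function names `classify` and `can_check_in` and the status strings are hypothetical and are not part of the tool.

```python
def classify(original: bytes, on_disk: bytes, in_db: bytes) -> str:
    """Compare both current copies against the copy saved at the last sync."""
    fs_changed = on_disk != original
    db_changed = in_db != original
    if fs_changed and db_changed:
        return "conflict"
    if fs_changed:
        return "modified-on-fs"
    if db_changed:
        return "modified-in-db"
    return "unchanged"


def can_check_in(original: bytes, in_db: bytes, force: bool = False) -> bool:
    """Refuse check-in if the database changed since checkout, unless forced."""
    if force:
        # The force option lets file-system data override the database.
        return True
    return in_db == original
```

A status report could call `classify` for every entry, and a check-in could abort as soon as `can_check_in` returns False for any object.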
Approach
A tool will be written that uses IObjectFile and
IObjectDirectory interfaces to serialize objects as files:
from zope.interface import Interface


class IObjectEntry(Interface):
    """File-system object representation
    """

    def extra():
        """Return extra data for the entry.

        The data are returned as a mapping object that allows *both*
        data retrieval and setting. The mapping is from names to
        objects that will be serialized to or from the file system.
        """

    def typeIdentifier():
        """Return a dotted name that identifies the object type.

        This is typically a dotted class name.

        This is used when synchronizing from the file system to the
        database to decide whether the existing object and the new
        object are of the same type.
        """

    def factory():
        """Return the dotted name of a factory to recreate an empty entry.

        The factory will be called with no arguments. It is usually
        the dotted name of the object class.
        """


class IObjectFile(IObjectEntry):
    """File-system object representation for file-like objects
    """

    def getBody():
        """Return the file body"""

    def setBody(body):
        """Change the file body"""


class IObjectDirectory(IObjectEntry):
    """File-system object representation for directory-like objects
    """

    def contents():
        """Return the contents

        A sequence of (name, value) tuples is returned.
        The value in each pair will be synchronized.
        """
Normally, one IObjectFile or IObjectDirectory is provided via an
adapter.
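To make the adapter idea concrete, here is a minimal sketch of what an IObjectFile-style adapter for a file-like content object might look like. It uses plain Python classes instead of zope.interface so that it runs standalone. The `File`, `AttrMapping`, and `FileAdapter` classes are hypothetical illustrations, not the classes shipped with the tool.

```python
class File:
    """Hypothetical content object: a body plus a content type."""

    def __init__(self, data=b"", contentType="application/octet-stream"):
        self.data = data
        self.contentType = contentType


class AttrMapping:
    """Mapping over selected attributes that allows retrieval and setting."""

    def __init__(self, context, names):
        self._context = context
        self._names = tuple(names)

    def __getitem__(self, name):
        if name not in self._names:
            raise KeyError(name)
        return getattr(self._context, name)

    def __setitem__(self, name, value):
        if name not in self._names:
            raise KeyError(name)
        setattr(self._context, name, value)

    def keys(self):
        return list(self._names)


class FileAdapter:
    """Plays the role of an IObjectFile adapter for File objects."""

    def __init__(self, context):
        self.context = context

    def extra(self):
        # The content type has no natural place in the file body.
        return AttrMapping(self.context, ["contentType"])

    def typeIdentifier(self):
        cls = type(self.context)
        return f"{cls.__module__}.{cls.__qualname__}"

    def factory(self):
        # Usually the dotted name of the object class.
        return self.typeIdentifier()

    def getBody(self):
        return self.context.data

    def setBody(self, body):
        self.context.data = body
```

Writing through the mapping returned by `extra()` updates the underlying object, which is what lets the same adapter serve both directions of synchronization.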
The synchronization mechanism uses an administration directory,
named @@Zope, in much the same way that CVS uses a directory named CVS and
Subversion uses a directory named .svn.
As currently envisioned, the @@Zope directory will have:
- An Entries.xml file, akin to CVS's Entries file and Subversion's entries file. It contains information about the files or directories stored in the directory containing the @@Zope directory.
- An Annotations directory that contains annotation data. Annotations are meta-data, such as security assertions or Dublin Core meta-data, that are associated with an object. Each object that has annotations will have a subdirectory in Annotations containing the object's annotation data.
  The subdirectories in Annotations contain synchronized data, so they have @@Zope directories too.
  The synchronization tool uses the IAnnotations interface to get annotation data for objects.
- An Extra directory that contains content data that doesn't fit in a useful file-system representation. As with annotation data, there will be a directory in Extra for each object that has extra data.
  The subdirectories in Extra contain synchronized data, so they have @@Zope directories too.
  The extra method allows data to be stored that doesn't fit well into the file-system representation. The data returned by the extra method will be saved as files in the @@Zope administration directory.
- An Original directory that contains copies of all files (not directories) as of the last synchronization time. The originals are used to figure out if anything has changed in the object database or on the file system since the last synchronization and to show and merge differences.
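The layout described above can be summarized as a mapping from an entry's path to the administrative paths that belong to it. This is a small sketch using only the directory names given in this proposal. The helper name `admin_paths` and the dictionary keys are hypothetical.

```python
from pathlib import PurePosixPath

ADMIN = "@@Zope"  # administration directory name used in this proposal


def admin_paths(entry: str) -> dict:
    """Return the administrative paths associated with one entry."""
    path = PurePosixPath(entry)
    admin = path.parent / ADMIN
    return {
        "entries": admin / "Entries.xml",
        "annotations": admin / "Annotations" / path.name,
        "extra": admin / "Extra" / path.name,
        # Only files, not directories, get a copy under Original.
        "original": admin / "Original" / path.name,
    }
```

For an entry `tmp/folder/data.xml`, this yields `tmp/folder/@@Zope/Entries.xml`, `tmp/folder/@@Zope/Annotations/data.xml`, `tmp/folder/@@Zope/Extra/data.xml`, and `tmp/folder/@@Zope/Original/data.xml`.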
To see how all of this fits together, consider the following
example. A Zope folder, folder, contains two objects, a file,
data.xml, and a page template, index.html. The folder and the
file have security assertions. Folders and page templates don't
have extra data, but files do. Files have to store a content type
in addition to their body data.
After synchronizing the folder object to the directory tmp, the
tmp directory will contain the following files and
subdirectories (shown using a variant on Unix's ls format):
tmp:
---------------------------------------
drwxrwxr-x 4096 folder
drwxrwxr-x 4096 @@Zope
tmp/@@Zope:
---------------------------------------
drwxrwxr-x 4096 Annotations
-rw-rw-r-- 406 Entries.xml
tmp/@@Zope/Annotations:
---------------------------------------
drwxrwxr-x 4096 folder
tmp/@@Zope/Annotations/folder:
---------------------------------------
drwxrwxr-x 4096 @@Zope
-rw-rw-r-- 20 zope.app.security.AnnotationPrincipalRoleManager
-rw-rw-r-- 57 zope.app.security.AnnotationRolePermissionManager
tmp/@@Zope/Annotations/folder/@@Zope:
---------------------------------------
-rw-rw-r-- 1009 Entries.xml
drwxrwxr-x 4096 Original
tmp/@@Zope/Annotations/folder/@@Zope/Original:
---------------------------------------
-rw-rw-r-- 20 zope.app.security.AnnotationPrincipalRoleManager
-rw-rw-r-- 57 zope.app.security.AnnotationRolePermissionManager
tmp/folder:
---------------------------------------
-rw-rw-r-- 30 data.xml
-rw-rw-r-- 86 index.html
drwxrwxr-x 4096 @@Zope
tmp/folder/@@Zope:
---------------------------------------
drwxrwxr-x 4096 Annotations
-rw-rw-r-- 780 Entries.xml
drwxrwxr-x 4096 Extra
drwxrwxr-x 4096 Original
tmp/folder/@@Zope/Annotations:
---------------------------------------
drwxrwxr-x 4096 data.xml
tmp/folder/@@Zope/Annotations/data.xml:
---------------------------------------
drwxrwxr-x 4096 @@Zope
-rw-rw-r-- 106 zope.app.security.AnnotationRolePermissionManager
tmp/folder/@@Zope/Annotations/data.xml/@@Zope:
---------------------------------------
-rw-rw-r-- 548 Entries.xml
drwxrwxr-x 4096 Original
tmp/folder/@@Zope/Annotations/data.xml/@@Zope/Original:
---------------------------------------
-rw-rw-r-- 106 zope.app.security.AnnotationRolePermissionManager
tmp/folder/@@Zope/Extra:
---------------------------------------
drwxrwxr-x 4096 data.xml
tmp/folder/@@Zope/Extra/data.xml:
---------------------------------------
-rw-rw-r-- 87 contentType
drwxrwxr-x 4096 @@Zope
tmp/folder/@@Zope/Extra/data.xml/@@Zope:
---------------------------------------
-rw-rw-r-- 338 Entries.xml
drwxrwxr-x 4096 Original
tmp/folder/@@Zope/Extra/data.xml/@@Zope/Original:
---------------------------------------
-rw-rw-r-- 87 contentType
tmp/folder/@@Zope/Original:
---------------------------------------
-rw-rw-r-- 30 data.xml
-rw-rw-r-- 86 index.html
Notes on the example:
- The security assertions show up as files in Annotations subdirectories. For example, role-permission assignments made on data.xml show up in the file zope.app.security.AnnotationRolePermissionManager, in the directory tmp/folder/@@Zope/Annotations/data.xml.
- The data.xml content type was stored in the file contentType in the extra directory for data.xml, tmp/folder/@@Zope/Extra/data.xml.
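As a rough illustration of how a checkout could produce part of the tree shown in the example, the sketch below writes a file-like object together with its Original copy and its Extra data. It is deliberately simplified: it writes no Entries.xml, no Annotations, and no nested @@Zope directories, and it stores extra values as plain text rather than as XML pickles. The function name `write_file_entry` is hypothetical.

```python
import tempfile
from pathlib import Path

ADMIN = "@@Zope"  # administration directory name used in this proposal


def write_file_entry(directory: Path, name: str, body: bytes, extra: dict) -> None:
    """Write one file-like object plus its Original copy and Extra data."""
    admin = directory / ADMIN
    (admin / "Original").mkdir(parents=True, exist_ok=True)

    # The working copy and the pristine copy start out identical.
    (directory / name).write_bytes(body)
    (admin / "Original" / name).write_bytes(body)

    # Extra data gets one file per key under Extra/<name>/.
    if extra:
        extra_dir = admin / "Extra" / name
        extra_dir.mkdir(parents=True, exist_ok=True)
        for key, value in extra.items():
            (extra_dir / key).write_text(value)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "folder"
    folder.mkdir()
    write_file_entry(folder, "data.xml", b"<data/>", {"contentType": "text/xml"})
    write_file_entry(folder, "index.html", b"<html/>", {})
    created = sorted(
        p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*") if p.is_file()
    )
```

After the run, `created` lists the working files in `folder`, their copies under `folder/@@Zope/Original`, and the `contentType` file under `folder/@@Zope/Extra/data.xml`.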
The initial tool will be a command-line tool modeled loosely on CVS and Subversion and using the ZODB to access objects for synchronization. Later efforts might provide graphical user interfaces or utilize network protocols (other than ZEO) to access data.
A default adapter is provided that saves data via an XML pickle. (Note that a much-improved XML pickler, relative to the XML pickler in Zope 2, will be provided that is fairly friendly to tools like diff and CVS.)
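To show what "friendly to tools like diff and CVS" could mean in practice, here is a toy serializer that writes one element per line in a stable, sorted order, so that a change to one attribute produces a one-line diff. This is only an assumption-laden sketch: the element names and the use of `repr` are invented for illustration, it handles only simple literal values, and it is not the format of Zope's XML pickler.

```python
import ast
import xml.etree.ElementTree as ET


def dump_state(state: dict) -> str:
    """Serialize a dict of simple literals, one attribute element per line."""
    root = ET.Element("pickle")
    root.text = "\n"
    for name in sorted(state):  # stable order keeps diffs small
        item = ET.SubElement(root, "attribute", name=name)
        item.text = repr(state[name])
        item.tail = "\n"
    return ET.tostring(root, encoding="unicode")


def load_state(text: str) -> dict:
    """Inverse of dump_state for values that are Python literals."""
    root = ET.fromstring(text)
    return {
        item.get("name"): ast.literal_eval(item.text)
        for item in root.findall("attribute")
    }
```

Because keys are sorted before writing, two dictionaries with the same contents serialize to identical text regardless of insertion order.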
Risks
- Lossy serialization due to incomplete adapters
For the most part, the solution relies on specialized adapters provided for specific content types. A poorly-written adapter could cause data loss or even improperly constructed content that violates its invariants. The risk is mitigated in part by the fact that the annotation framework frees the content-type developer from dealing with meta-data such as security assertions.
- Hierarchical model
File systems are hierarchical, and, for the most part, so is the Zope object system. Cyclical or other non-hierarchical references can cause problems without special care. For example, when we fall back to XML pickles for objects that don't have specialized adapters, and those instances have persistent classes, we'll need to be careful not to include the classes in the serialization.
- Interface adapters are too brittle
Because of this, a separate service was implemented that maps classes to adapters that implement IObjectEntry (and one of its subclasses) and does not automatically map subclasses.
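The difference between class-based lookup and ordinary adapter lookup can be sketched as a registry keyed on the exact class of an object. The `AdapterRegistry` class below is a hypothetical illustration of that idea and is not the service that was implemented.

```python
class AdapterRegistry:
    """Map classes to adapter factories without inheriting to subclasses."""

    def __init__(self):
        self._by_class = {}

    def register(self, cls, adapter_factory):
        self._by_class[cls] = adapter_factory

    def lookup(self, obj, default=None):
        # type(obj) rather than isinstance(): a subclass does not pick up
        # the adapter registered for its base class.
        return self._by_class.get(type(obj), default)
```

With this design, a subclass that adds state falls through to the default (for example, a generic XML pickle adapter) instead of silently being serialized by an adapter that only knows about the base class.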
Status
- A new implementation exists; the command line tools are checked in as .../src/zope/fssync/, and a README.txt file there explains the project status in more detail. For usage documentation, run the main.py script with a -h option. I don't plan to keep this Wiki up to date. (Guido van Rossum)
- andym (Sep 30, 2002 1:25 am; Comment #1)
- I followed this with a proposal: CommonFileSystemAPI
- deb_h (Sep 30, 2002 7:26 am; Comment #2)
- file-system synchronization model diagram
