ZIP Import Support for Zope 3
Status
Author
Fred Drake
Problem
Zope application code typically uses many non-code files stored as data within a package. These files commonly include configuration data (ZCML), page templates, and resources.
Making this work for packages in ZIP archives requires cooperation between the Zope application server and packages in the ZIP archive. The ZIP archive alone is not able to take care of the issues while still following the Zope development patterns which have become common practice.
Goals and Assumptions
- It is desirable to load Zope code from within ZIP archives; this includes Python Eggs.
- It is preferable to do so without forcing ZIP archives to be unpacked on the filesystem.
Supporting Zope code in ZIP archives
Many operations in Zope 3 access files provided by packages using direct filesystem access. For Python code installed directly on the filesystem, that works and is easy for any Python developer to work with. The paths used are normally computed using the __file__ or __path__ from files in the package providing the data. Unfortunately, accessing files stored in ZIP archives using this method will not work even after the archives are placed on sys.path; the paths computed simply do not exist on the filesystem.
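The failure described above can be reproduced with the standard library alone. The following is a minimal sketch: the package name demo_zip_pkg and the file name configure.zcml are made up for illustration and do not come from Zope.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway ZIP archive holding a package with one data file.
# "demo_zip_pkg" and "configure.zcml" are illustrative names only.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_zip_pkg/__init__.py", "")
    zf.writestr("demo_zip_pkg/configure.zcml", "<configure/>")

# Placing the archive on sys.path makes the package importable.
sys.path.insert(0, archive)
importlib.invalidate_caches()
import demo_zip_pkg

# The conventional idiom: compute a path next to the package's __file__.
data_path = os.path.join(
    os.path.dirname(demo_zip_pkg.__file__), "configure.zcml")

# The import succeeded, but the computed path points inside the archive,
# so the filesystem has no such file.
path_exists = os.path.exists(data_path)
print(data_path)
print(path_exists)
```

The import works, yet the path computed from __file__ names a location inside demo.zip, which the filesystem functions cannot see.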
There are a couple of different approaches to dealing with this. One is to unpack ZIP archives used as import sources and to replace the sys.path entry with the corresponding filesystem location. This offers the advantage that entirely conventional packages can be used directly; no changes are necessary. It has the drawback that some location that contains the unpacked versions needs to be managed to some degree. It also has a dependency on import order: ZIP files that exist on sys.path before the storage manager is initialized may already have provided modules; importing additional modules from the provided packages could cause them to come from a different source. While it isn't clear that this is necessarily a problem, it invites fragility.
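The unpacking approach might be sketched as follows. The helper explode_zip_entries() is hypothetical and is not part of Zope; it omits the cleanup and cache management that a real storage manager would need, and it has to run before anything is imported from the archives, for the import-order reason given above.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def explode_zip_entries(path_entries, cache_dir):
    """Replace ZIP archives in path_entries with unpacked directories.

    Hypothetical helper: mutates path_entries in place and returns a
    dict mapping each archive to the directory it was unpacked into.
    """
    replaced = {}
    for index, entry in enumerate(path_entries):
        if not (os.path.isfile(entry) and zipfile.is_zipfile(entry)):
            continue
        # The index keeps archives with the same base name apart.
        target = os.path.join(
            cache_dir, "%d-%s" % (index, os.path.basename(entry)))
        with zipfile.ZipFile(entry) as zf:
            zf.extractall(target)
        path_entries[index] = target
        replaced[entry] = target
    return replaced


# Demonstration with a throwaway archive; all names are illustrative.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "exploded_demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_exploded_pkg/__init__.py", "")
    zf.writestr("demo_exploded_pkg/page.pt", "<html/>")
sys.path.insert(0, archive)

replaced = explode_zip_entries(sys.path, os.path.join(workdir, "cache"))
importlib.invalidate_caches()
import demo_exploded_pkg

# Conventional path computation now works, since the package is on disk.
template = os.path.join(
    os.path.dirname(demo_exploded_pkg.__file__), "page.pt")
template_exists = os.path.exists(template)
print(template_exists)
```

The cache directory is exactly the managed location that this approach requires and that the alternative below avoids.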
An alternative is to load the data files directly from the ZIP archives. This will generally require that the packages understand that they may be used as ZIP archives, and that they use some alternative to conventional path computation and the built-in filesystem access functions. Bob Ippolito's pkg_resources module is one popular support module for this usage.
Fred Drake spent most of his time at the sprint adding support for packages contained in ZIP archives. This document describes the status of that work at the end of the sprint. The implementation is currently on the zipimport-support branch of Zope 3; the goal is to make it easier to use ZIP-packaged code in both Zope 3 and Zope 2 with Five. A hoped-for side benefit is to make it easier to use Python Eggs with Zope.
The approach taken is that resources are loaded directly from the ZIP archives rather than exploding the archives and modifying sys.path. This approach was chosen to avoid having to support management of a temporary directory storing exploded copies of ZIP files, though that offers advantages as well.
To support that approach, ZConfig and zope.configuration were modified to work directly with references to files contained in packages.
ZConfig's <import> element in schema and %import statement both directly support any configuration component that can be loaded using a loader's get_data() method (see PEP 302). Loaders from the zipimport module support this method; other loaders that support this method and add themselves to the package's namespace as __loader__ should be supported without further changes.
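The get_data() protocol that ZConfig relies on can be exercised directly. This sketch does not use ZConfig itself; it only shows the PEP 302 loader call against a throwaway archive, with illustrative package and file names. Current Python versions expose the loader as __spec__.loader, which is used here in place of the older __loader__ attribute named above.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile

# Throwaway archive; "demo_loader_pkg" and "component.xml" are made up.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "loader_demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_loader_pkg/__init__.py", "")
    zf.writestr("demo_loader_pkg/component.xml", "<component/>")
sys.path.insert(0, archive)
importlib.invalidate_caches()
import demo_loader_pkg

# The zipimport loader implements get_data(); it accepts the same path
# that conventional __file__-based computation produces.
loader = demo_loader_pkg.__spec__.loader
resource = os.path.join(
    os.path.dirname(demo_loader_pkg.__file__), "component.xml")
raw = loader.get_data(resource)

# pkgutil.get_data() wraps the same protocol for any package whose
# loader supports it.
same = pkgutil.get_data("demo_loader_pkg", "component.xml")

print(raw)
print(raw == same)
```

Any loader offering get_data() can serve resources this way, which is why ZIP-based packages need no special casing beyond the loader lookup.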
The changes for zope.configuration are somewhat different, because referenced resources are frequently loaded by directive handlers (often by calling other APIs) rather than by the zope.configuration package itself. To support the use of resources by code outside zope.configuration, much of which is not itself part of the configuration system, a new package has been added for working with file references instead of simple paths.
zope.filereference
The new package, zope.filereference, provides an API based substantially on that of the os.path module, with the addition of these new functions:
- new(path, package=None, basepath=None)
- Return a new path reference, encapsulating information about the source package. If package is specified and non-None, it must be a package module which allows access to the __path__ and __loader__ attributes (where applicable). Reference objects are str subclasses that provide additional methods. The value of the reference will be the absolute file name as best as it can be determined given the information provided. When the path refers to a file on the filesystem, the reference can be passed directly to the built-in open() function for code that does not use the zope.filereference package itself.
- open(ref, mode="r")
- Return an opened file object based on the given path reference. If ref is an IFileReference, the result of its open() method is returned; otherwise ref is interpreted as a simple string path, and the result of calling the built-in open() function with ref as the argument is returned. The mode argument must specify a read-only mode. This function is not suitable as a replacement for the built-in open() function.
- packageReference(path, package=None)
- Return a package-relative reference. If package is None, this uses the context of the caller to generate a package-relative reference to path, which should be a relative path name. If package is not None, it may be either a package name (as a string) or a package module. That package will be used instead of the caller's package context.
The functions exists(), getmtime(), isdir(), and isfile() all mirror their os.path counterparts.
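To make the idea of a reference object concrete, here is a rough sketch of a str subclass that remembers its source package and can open itself through the package's loader. It is not the zope.filereference implementation: the names PackageFileReference and open_reference are hypothetical, and the UTF-8 decoding of text-mode reads is an assumption of this sketch.

```python
import builtins
import importlib
import io
import os
import sys
import tempfile
import zipfile


class PackageFileReference(str):
    """Hypothetical reference: a str that remembers its source package."""

    def __new__(cls, path, package=None):
        self = super().__new__(cls, path)
        self.package = package
        return self

    def open(self, mode="r"):
        if any(flag in mode for flag in "wax+"):
            raise ValueError("file references are read-only")
        spec = getattr(self.package, "__spec__", None)
        loader = getattr(spec, "loader", None)
        if os.path.exists(self) or not hasattr(loader, "get_data"):
            # Ordinary file on disk (or no usable loader): defer to
            # the built-in open().
            return builtins.open(str(self), mode)
        # Otherwise ask the package's PEP 302 loader for the bytes.
        stream = io.BytesIO(loader.get_data(str(self)))
        if "b" in mode:
            return stream
        # Assumption of this sketch: text resources are UTF-8 encoded.
        return io.TextIOWrapper(stream, encoding="utf-8")


def open_reference(ref, mode="r"):
    """Open a reference, or a plain string path, for reading."""
    if isinstance(ref, PackageFileReference):
        return ref.open(mode)
    return builtins.open(ref, mode)


# Demonstration with a throwaway archive; all names are illustrative.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "ref_demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_ref_pkg/__init__.py", "")
    zf.writestr("demo_ref_pkg/configure.zcml", "<configure/>")
sys.path.insert(0, archive)
importlib.invalidate_caches()
import demo_ref_pkg

ref = PackageFileReference(
    os.path.join(os.path.dirname(demo_ref_pkg.__file__), "configure.zcml"),
    package=demo_ref_pkg,
)
with open_reference(ref) as stream:
    zcml_text = stream.read()
print(zcml_text)
```

Because the reference is still a str, code that knows nothing about references can keep treating it as a path, while aware code calls the reference-aware open instead of the built-in one.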
Compatibility issues
The result is that passing file references to unaware code still allows everything that worked before to continue working; code can become "aware" by using functions from zope.filereference instead of the built-in open() and the os.path functions exists(), getmtime(), isdir(), and isfile(). It is important to avoid unnecessary path-manipulation calls, since those functions treat path references as plain strings, which results in the loss of package-relative information (and complete failure to open files contained in ZIP archives).
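The loss of information under path manipulation follows from how str subclasses behave. A small illustration, using a hypothetical TaggedPath class as a stand-in for a reference object:

```python
import os


class TaggedPath(str):
    """Stand-in for a path reference: a str subclass with extra state."""

    def __new__(cls, path, package_name):
        self = super().__new__(cls, path)
        self.package_name = package_name
        return self


ref = TaggedPath("/archives/app.zip/mypkg/configure.zcml", "mypkg")

# os.path functions accept the reference, since it is a str, but they
# build their results as plain str objects.
parent = os.path.dirname(ref)
joined = os.path.join(ref, "extra")

print(type(ref).__name__)
print(type(parent).__name__)
print(hasattr(joined, "package_name"))
```

The results of dirname() and join() are ordinary strings without the package_name attribute, so nothing downstream can tell which package they came from.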
Remaining tasks
Several packages have been updated to have at least limited support for resources provided by packages. Some things remain to be done:
- Directory resources from zope.app.publisher and zope.app.onlinehelp are not supported. This will require that IFileReferenceAPI grow a directory-listing API mirroring os.listdir().
- Need to locate other places that need updates. Any place that accepts paths from ZCML needs to be adjusted to avoid unnecessary path manipulation and use the zope.filereference APIs for existence checks and actually opening files.
- zope.filereference needs to grow APIs parallel to os.walk() and os.listdir().
- Additional thought is needed to determine how to deal with eggs and the pkg_resources.requires() calls. That has not yet been considered, but needs to be arranged so that it happens before the zope.conf file is processed, since it substantially affects when modules can be imported at configuration time. Chris McDonough's Basket product demonstrates one approach in the context of Zope 2.
- The right approach to dealing with ZCML "slugs" in ZIP-based packages needs to be determined.
- The updated ZConfig needs to be tagged and knit into the zipimport-support branch.
Historical Note
This proposal and the initial implementation steps were made at the Goldegg Packaging Sprint in San Jose, California. Thanks to Goldegg for supporting this work!
