It is very common for Zope production servers to be run behind the Apache webserver.
- Why should I run Zope behind Apache?
- How should I select which way to go?
- Introduction to RewriteRules and VirtualHostMonster
- Prerequisites for Apache
- Getting your rules
- Dissecting a rule
- What the VHM does with URLs
- Typical use cases
- Examples
- Apache configuration extras
- Debugging, Common Pitfalls, Problems
- End Notes
Why should I run Zope behind Apache?
- You may need both Zope based content and non-Zope based content in the same site. Or even Zope based sites and non-Zope based sites on one server.
- You should not really trust that Zope's ZServer can stand the heat when people send you weird and invalid HTTP requests in hacking and denial-of-service attacks (ZServer is pretty vulnerable to DoS attacks).
- You want to use Apache's SSL. (There is also ZopeSSL, but Apache's SSL is well tested and full featured.)
- You may want to have a cache of the finished pages to speed up access.
How should I select which way to go?
Apache is a very flexible beast, so it offers you several ways of running Zope behind it. There are basically three ways:
- RewriteRules - this is what people use
- ProxyPass - much less flexible, no real advantage, so rarely used nowadays
- PCGI/FastCGI - deprecated, don't use
CGI is a slow and clumsy way of doing it, so it is not covered in this document. It has been officially deprecated by the Zope project (because nobody volunteered to support it in the code). If you are absolutely sure that you want to use this outdated, complicated and slow way, instructions are found at: http://www.devshed.com/c/a/Zope/Using-Zope-With-Apache/
Introduction to RewriteRules and VirtualHostMonster
Modern versions of Zope come pre-configured with a Virtual Host Monster object in the root of the ZODB. That is all it takes on the Zope side: the VHM (as the Virtual Host Monster is sometimes called) does not need configuration except in rare and special cases. Instead you use RewriteRules in your Apache config file to instruct Apache to pass information to the VHM inside Zope.
With RewriteRules you tell Apache to "rewrite" an incoming URL. A request for http://www.example.org/zopesite might get turned into http://127.0.0.1:8080/VirtualHostBase/http/%{SERVER_NAME}:80/VirtualHostRoot/_vh_zopesite/ when it gets from Apache to Zope. This new URL may be less good looking, but it is hidden from the user and gives extra information to Zope. In fact this cryptic URL helps to hide the behind-the-scenes technicalities of your site from the user.
RewriteRules are flexible in doing this translation because they use pattern matching instead of straight prefix replacement (as ProxyPass does). You can also opt not to use the proxy functionality.
Prerequisites for Apache
To use RewriteRules you need to configure Apache so that both mod_rewrite and mod_proxy are installed. This can be done by compiling them in statically. You can see which static modules there are by running "httpd -l".
You can also compile the mod_proxy and mod_rewrite modules as dynamic libraries and load them with the following httpd.conf directives (this is Apache 1.3 syntax; Apache 2 has no AddModule directive, and it also needs mod_proxy_http for proxying HTTP requests):
LoadModule rewrite_module /usr/lib/apache/1.3/mod_rewrite.so
LoadModule proxy_module /usr/lib/apache/1.3/libproxy.so
AddModule mod_rewrite.c
AddModule mod_proxy.c
You also need a "RewriteEngine on" statement in the configuration file to actually switch the rewriting of URLs on.
The complete reference for RewriteRules is available from http://httpd.apache.org/docs/mod/mod_rewrite.html. Thankfully you don't have to understand the complete syntax: with Zope only a small subset is needed, that subset can be explained with a few examples, and someone even wrote an automated tool to supply you with the proper rule.
Getting your rules
For the most common cases of RewriteRules you can turn to the RewriteRule Witch to generate the proper rules for you. We will also show some of the most common cases in the examples below.
Dissecting a rule
For further insight we will look at a typical request and how it gets rewritten by a commonly seen rule. Let's look at a rule that you may use to send visitors in one part of your site to Zope:
RewriteRule ^/zopesite/(.*) \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/VirtualHostRoot/_vh_zopesite/$1 [L,P]
The first part of this (RewriteRule) starts the command. After that come three distinct parts:
- the expression that tries to match on the original request URL: ^/zopesite/(.*)
- the expression that rewrites this URL when a match exists: http://127.0.0.1:8080/VirtualHostBase/http/%{SERVER_NAME}:80/VirtualHostRoot/_vh_zopesite/$1
- the flags that alter some of the behaviour: [L,P] (L makes this the last rule applied to the request, P hands the rewritten URL to mod_proxy)
The first of these expressions is a so-called "regular expression" (aka regex), commonly used in many programming languages and programmers' editors. ^/zopesite/(.*) tries to "grab" onto the start of the URL, which at this point in its life inside Apache has shed all of the domain-specific parts (e.g. http://www.example.org is gone). So if the user requested http://www.example.org/zopesite/railway, Apache sees /zopesite/railway. Our little regex matches this, as ^ is the start of the string and /zopesite/ comes right after. Then we have the funny part in parentheses, (.*), which will match any number of characters and "keep them around" for further use.
The second expression uses this "stored" part. Basically it is just a long static string. Two parts of it are dynamically altered: At %{SERVER_NAME} apache will insert the proper name of your server, and at $1 our "stored" bit is inserted, which is railway in our example. We will end up with a string like this:
http://127.0.0.1:8080/VirtualHostBase/http/www.example.org:80/VirtualHostRoot/_vh_zopesite/railway
which is then handed over to Zope (and in turn to the Virtual Host Monster).
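The substitution described above can be replayed outside Apache. The following is only a sketch in Python (the language Zope is written in), not anything Apache or Zope ships: the function name rewrite is made up for this illustration, and %{SERVER_NAME} is filled in by hand.

```python
import re

# Pattern copied from the RewriteRule above.
PATTERN = re.compile(r"^/zopesite/(.*)")

# Substitution string from the rule; Apache's %{SERVER_NAME} and $1 are
# replaced by named placeholders so that Python can fill them in.
TEMPLATE = ("http://127.0.0.1:8080/VirtualHostBase/"
            "http/{server_name}:80/VirtualHostRoot/_vh_zopesite/{rest}")


def rewrite(path, server_name):
    """Return the URL handed to Zope, or None if the rule does not match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    # group(1) is the part "kept around" by the parentheses, i.e. Apache's $1.
    return TEMPLATE.format(server_name=server_name, rest=match.group(1))


print(rewrite("/zopesite/railway", "www.example.org"))
print(rewrite("/somewhere/else", "www.example.org"))
```

The first call prints the long URL shown above; the second prints None, because a request outside /zopesite/ is not touched by this rule.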
What the VHM does with URLs
The Virtual Host Monster extracts the parts of this string by looking for things like VirtualHostBase, VirtualHostRoot, and parts beginning with _vh_.
It uses these to set the base of a page rendered from Zope, so that links pointing back to the same site look and work properly from the source of the page that is sent to the browser.
Hint: A very common symptom of a slightly broken RewriteRule is that your site loads, but does not find its style sheet or deeper linked pages. This happens because the "pointing back" that the VHM does for you is confused. The solution is usually to get a proper RewriteRule (recommendation: simply get it from the witch).
If you want to know more about what the VirtualHostBase and VirtualHostRoot parts do, then read the explanation available in the "Help" tab of the Virtual Host Monster object.
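To make the role of the individual path elements more concrete, here is a rough Python sketch that takes such a path apart. It is not the Virtual Host Monster's actual code, and the function name split_vhm_path is invented for this page. It assumes the layout used in the rules here: VirtualHostBase, protocol, host:port, an optional folder path inside Zope, VirtualHostRoot, optional _vh_ elements, and then the rest of the request. The real VHM may differ in details such as how it treats default ports.

```python
def split_vhm_path(path):
    """Split a VHM-style path into (public URL, path looked up inside Zope)."""
    segments = path.strip("/").split("/")
    if len(segments) < 3 or segments[0] != "VirtualHostBase":
        raise ValueError("path does not start with /VirtualHostBase/")
    if "VirtualHostRoot" not in segments:
        raise ValueError("path contains no VirtualHostRoot element")

    protocol, host = segments[1], segments[2]
    root_index = segments.index("VirtualHostRoot")

    # Everything between host:port and VirtualHostRoot is the Zope folder
    # that acts as the root of the published site.
    zope_folder = segments[3:root_index]
    rest = segments[root_index + 1:]

    # Leading _vh_ elements exist only in the public URL, not inside Zope.
    public_prefix = []
    while rest and rest[0].startswith("_vh_"):
        public_prefix.append(rest.pop(0)[len("_vh_"):])

    public_url = "/".join(["%s://%s" % (protocol, host)] + public_prefix + rest)
    zope_path = "/" + "/".join(zope_folder + rest)
    return public_url, zope_path


print(split_vhm_path(
    "/VirtualHostBase/http/www.example.org:80/VirtualHostRoot/_vh_zopesite/railway"))
```

For the example request this yields the public URL http://www.example.org:80/zopesite/railway and the Zope path /railway, which is why links generated by Zope can point back to /zopesite/ even though no such folder exists in the ZODB.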
Typical use cases
There are a few common ways in which Zope is run behind Apache.
- Virtual hosting with several sites in one Zope instance.
- Virtual hosting with one site per Zope instance.
And mixed in with these:
- All of the site content is inside Zope.
- A lot of site content is in apache, one folder is served by Zope.
- A lot of site content is in Zope, some parts are served directly by apache.
Examples
Virtual hosting with several sites in one Zope instance
This is the most common setup. You run Apache and Zope on the same server, Apache on port 80 and Zope on port 8080. The folder http://localhost:8080/www_example_com/ should be accessible as http://www.example.com/.
Set up the VirtualHost block for www.example.com in the Apache httpd.conf like this. (We have split the ultra long RewriteRule line by escaping the newline. When using this in Apache config files, keep in mind that there must be no whitespace at the start of the line that follows the escaped line break.):
<VirtualHost *>
ServerName www.example.com
RewriteEngine On
RewriteRule ^(.*) \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/www_example_com/VirtualHostRoot$1 [L,P]
ErrorLog /where/you/store/your/weblogs/www.example.com-error_log
TransferLog /where/you/store/your/weblogs/www.example.com-access_log
ProxyVia on
</VirtualHost>
Or use this rewrite rule instead, so that you don't have to add a new virtual host to the Apache configuration for each site (you may need to add ServerAlias host2 host3 ... depending on your domain setup). It expects each site to live in a Zope folder named exactly like its host name:
RewriteRule ^/(.*) \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{HTTP_HOST}:80/%{HTTP_HOST}/VirtualHostRoot/$1 [L,P]
Virtual hosting with one site per Zope instance
You are running several virtual hosts in your Apache, and you want one of these to be the Zope server on port 8080. The "base" of the published site is the same as the root of the ZODB of your Zope instance. That is often not the best choice, but YMMV. This Zope server only serves one site:
<VirtualHost *>
ServerName www.example.com
RewriteEngine On
RewriteRule ^(.*) \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/VirtualHostRoot$1 [L,P]
ErrorLog /where/you/store/your/weblogs/www.example.com-error_log
TransferLog /where/you/store/your/weblogs/www.example.com-access_log
ProxyVia on
</VirtualHost>
All of the site content is inside Zope
The basic VirtualHost configuration for these cases is covered in the examples above. As far as URL rewriting is concerned, they differ only in the part that specifies where the site has its "base" in the ZODB.
A lot of site content is in apache, one folder is served by Zope
In this setup we can specify a "folder" that is to be served by Zope, while all of the rest of the site is served directly by apache. This is often called "inside-out hosting". A typical set of RewriteRules for this looks like this:
RewriteRule ^/dynamic$ \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/www_example_com/VirtualHostRoot/_vh_dynamic/ [L,P]
RewriteRule ^/dynamic/(.*) \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/www_example_com/VirtualHostRoot/_vh_dynamic/$1 [L,P]
...or, if we want to have the base of the public site coincide with the Zope root:
RewriteRule ^/dynamic$ \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/VirtualHostRoot/_vh_dynamic/ [L,P]
RewriteRule ^/dynamic/(.*) \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/VirtualHostRoot/_vh_dynamic/$1 [L,P]
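Why two rules are needed becomes clearer when they are tried against a few paths. The sketch below replays the first pair of rules (the one with the www_example_com folder) in Python. The names RULES and route are made up for this illustration, and the server name is passed in by hand instead of coming from %{SERVER_NAME}.

```python
import re

# Common start of both substitution strings in the rules above.
BASE = ("http://127.0.0.1:8080/VirtualHostBase/"
        "http/{server}:80/www_example_com/VirtualHostRoot/_vh_dynamic/")

# The two patterns, in the order in which they appear in the config.
RULES = [
    re.compile(r"^/dynamic$"),      # the folder itself, without trailing slash
    re.compile(r"^/dynamic/(.*)"),  # anything below the folder
]


def route(path, server="www.example.com"):
    """Return the Zope URL for path, or None if Apache serves it itself."""
    for pattern in RULES:
        match = pattern.match(path)
        if match:
            # Only the second pattern captures something (Apache's $1).
            rest = match.group(1) if match.groups() else ""
            return BASE.format(server=server) + rest
    return None


for path in ("/dynamic", "/dynamic/news/today", "/dynamics", "/index.html"):
    print(path, "->", route(path))
```

Only the first two paths are sent to Zope. /dynamics and /index.html match neither pattern and stay with Apache, which is exactly the "inside-out" behaviour described above.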
A lot of site content is in Zope, some parts are served directly by apache
For this, we use the RewriteRule where everything is inside Zope, but slap a RewriteCond in front of it:
RewriteCond %{REQUEST_URI} !^/(stats|manual|static_images)
RewriteRule ^(.*) \
http://127.0.0.1:8080/VirtualHostBase/\
http/%{SERVER_NAME}:80/www_example_com/VirtualHostRoot$1 [L,P]
This routes everything to Zope, except for requests to resources that live inside one of the folders /stats, /manual, or /static_images. See Rewriting All... but a few for some details.
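The effect of the negated condition can be checked with a few sample paths. This is a small Python sketch of the RewriteCond above only; the function name goes_to_zope is invented for this illustration.

```python
import re

# Pattern from the RewriteCond above. The leading "!" in the Apache config
# negates it: the rule fires only when this pattern does NOT match.
EXCLUDED = re.compile(r"^/(stats|manual|static_images)")


def goes_to_zope(request_uri):
    """True if the request would be proxied to Zope, False if Apache keeps it."""
    return EXCLUDED.match(request_uri) is None


for uri in ("/news/today", "/stats/index.html", "/manually-edited"):
    print(uri, "->", "Zope" if goes_to_zope(uri) else "Apache")
```

Note the last path: the pattern is a plain prefix match, so /manually-edited is kept away from Zope as well. If that is not wanted, the pattern could be tightened, for example to ^/(stats|manual|static_images)(/|$).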
Apache configuration extras
None of this has much to do with Zope itself; it is general Apache virtual hosting. I'll take it up here anyway, since it may be of interest to you.
Separate Logs for each virtual host
Zope will log everything in the same log, regardless of what virtual host was accessed. If you want separate logs for each virtual host you can let Apache create them instead. Add directives like this in each VirtualHost block:
ErrorLog /where/you/store/your/weblogs/www.domain.com-error_log
TransferLog /where/you/store/your/weblogs/www.domain.com-access_log
Setting REMOTE_ADDR to the client IP
When using Apache as a proxy in front of Zope the REMOTE_ADDR in Zope is the IP address of the Apache server. This IP address is also the one stored in the Zope access logs. If you need the real IP address instead, you can use the ProxyVia directive:
ProxyVia on
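This only works in Apache 1.3.2 and later. For earlier Apache versions there is a patch available from http://www.zope.org/Members/unfo/apache_zserver_ssl
Be aware that ProxyVia on its own only adds a Via header to the proxied request. As the comments at the end of this page point out, Zope keeps reporting the address of the Apache server unless that address is also listed as a trusted-proxy in etc/zope.conf.
In most cases you have absolutely no need for this. If you want access logs with the correct IP you can instead use Apache's logs (see above).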
Redirecting access to /manage to an SSL virtual host
Logging in via a non-SSL page isn't really advisable. With RewriteRules we can redirect log-ins to /manage to an SSL-enabled virtual host:
RewriteRule ^/(.*)/manage(.*) \
https://www.example.org/$1/manage$2 [NC,R=301,L]
Of course for this to work the URL of the virtual host with path to Zope needs to be right.
Another example: when you run a ZWiki with restricted access, you might also want to redirect access to /editform. Again, the paths have to be adjusted to your setup:
RewriteRule ^/(.*)/editform$ \
https://www.example.org/zope/wikifolder/$1/editform [NC,R=301,L]
Don't forget that these rules go into your non-SSL VirtualHost setup!
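It is worth checking which paths the /manage rule actually catches. The following Python sketch replays the pattern and substitution of that rule; the function name redirect_target is made up for this illustration, and re.IGNORECASE stands in for the NC flag.

```python
import re

# Pattern from the /manage rule above; NC means "no case", i.e. the match
# is case-insensitive.
MANAGE = re.compile(r"^/(.*)/manage(.*)", re.IGNORECASE)


def redirect_target(path):
    """Return the https URL the browser is redirected to, or None."""
    match = MANAGE.match(path)
    if match is None:
        return None
    # $1 and $2 of the Apache rule are groups 1 and 2 here.
    return "https://www.example.org/%s/manage%s" % (match.group(1), match.group(2))


for path in ("/site/manage", "/site/manage_main", "/manage", "/site/index_html"):
    print(path, "->", redirect_target(path))
```

The first two paths are redirected. A bare /manage at the top level is not, because the pattern requires at least one path element in front of /manage. If the management screen of the site root should be covered as well, the pattern has to be adjusted accordingly.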
Debugging, Common Pitfalls, Problems
Even when having proper RewriteRules, things can go wrong, and sometimes you don't know if your rules are all that good. Here are some pointers for where to start looking. First of all you have to find out what's going on:
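Make sure you have RewriteEngine On in your VirtualHost setup.
Set up logging of URL rewriting to a temporary file and check what is happening there. Putting these statements before your RewriteRules will do that (RewriteLog and RewriteLogLevel exist up to Apache 2.2; Apache 2.4 replaced them with the LogLevel directive, e.g. LogLevel alert rewrite:trace3):
RewriteLog "logs/rewrite.log"
RewriteLogLevel 9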
When copy/pasting rules from this page or from the witch, and when the rules you copy use line escaping with \, you have to make sure to have no whitespace at the start of the new line after the escaped line break. If this happens you will usually get Syntax errors in relation to the RewriteRules when you check the config file with apachectl configtest (as you should do). Sometimes whitespace at the start of a continued line is allowed (when there was supposed to be whitespace at the continuation point anyway), so this may confuse you.
As mentioned above, a usual symptom of RewriteRules that are "almost, but not really" right is style sheets not loading and files linked from your page not being found, or the frameset of the ZMI loading but the framed pages themselves not. In that case get a proper set of RewriteRules from the witch or debug your own.
Often problems are not directly in your RewriteRules, but in the Apache setup, especially the VirtualHost block. Some common problems there:
- The subtle differences between VirtualHost *:80, VirtualHost *, and VirtualHost 123.4.5.6 (where 123.4.5.6 would be your server's IP). We're not going into the details here (consult the Apache docs), but we want to point out that whatever your VirtualHost block uses usually has to match your NameVirtualHost directive.
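The whitespace pitfall with escaped line breaks mentioned above is easy to overlook by eye. As a rough aid, the following Python sketch lists the lines that follow a trailing backslash and start with whitespace. The function name find_indented_continuations is made up for this page. The sketch only flags candidates, since whitespace at that point is sometimes legitimate, and apachectl configtest remains the real check.

```python
def find_indented_continuations(config_text):
    """Return 1-based numbers of lines that continue an escaped line break
    and begin with a space or tab."""
    suspicious = []
    lines = config_text.splitlines()
    for index in range(1, len(lines)):
        previous, current = lines[index - 1], lines[index]
        if previous.endswith("\\") and current[:1] in (" ", "\t"):
            suspicious.append(index + 1)
    return suspicious


# A rule pasted with accidental indentation on its third line.
SAMPLE = "\n".join([
    "RewriteRule ^/zopesite/(.*) \\",
    "http://127.0.0.1:8080/VirtualHostBase/\\",
    "    http/%{SERVER_NAME}:80/VirtualHostRoot/_vh_zopesite/$1 [L,P]",
])

print(find_indented_continuations(SAMPLE))  # prints [3]
```

To check a real file, read it with open(...).read() and pass the text to the function.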
- Yes, mod_proxy and mod_rewrite really have to be enabled, see "Prerequisites for Apache" above.
- Even if all your content is inside Zope, you might need a Location block in apache's httpd.conf to allow access to visitors.
- Sometimes you have to explicitly enable proxying, but make very sure that you do not end up as an open proxy. Consult your apache documentation, and double check that your host does not proxy for all of the Internet.
End Notes
- This document is not valid if you are using EasyPublisher.
- If you want instructions on how to install Zope and SSL you can find them here:
- Zope and Apache's SSL: http://www.zope.org/Members/unfo/apache_zserver_ssl
- Zope and SSL without apache: http://www.zope.org/Members/Ioan/ZopeSSL
yes, this stuff is hard --simon, Fri, 22 Jul 2005 18:08:58 -0700
http://people.apache.org/~rbowen/presentations/apacheconEU2005/hate_apache.pdf
apache has caching problems --simon, Tue, 02 Aug 2005 15:11:17 -0700
Both Apache 1 and 2's mod_proxy have had bugs which prevent proper caching and make your Zope sites slower than they should be. Here are some starting points for this topic.
- http://issues.apache.org/bugzilla/show_bug.cgi?id=32950
- http://issues.apache.org/bugzilla/show_bug.cgi?id=33512
- http://plone.org/events/regional/nola05/collateral/Geoff%20DAvis-Caching.pdf
- plone-dev list
how to strip www. to get a canonical hostname --simon, Mon, 06 Feb 2006 11:56:22 -0800
Before you proxy to zope, selecting the folder based on HTTP_HOST, you can strip the www. with this (tested in apache 1.3):
RewriteCond %{HTTP_HOST} ^(www\.)(.*) [NC]
RewriteRule ^/(.*) http://%2/$1 [R]
Log File Effect --beren, Fri, 22 Dec 2006 03:53:26 -0800
After enabling this on my Zope/Apache server, my Z2 log file no longer has the username of the authenticated user in it. This screws up my web usage statistics and stuff. Has anyone else had this issue, or does anyone know how to resolve it?
beren
Log File Effect --beren, Fri, 22 Dec 2006 04:13:27 -0800
Here's a bit more information. I've configured Apache rewrite rules using the guide above. I have two local Zope (v. 2.9.2) instances that I'm serving up with Apache 2. I have one folder in each Zope server: TestSite1 and TestSite2. Everything is running locally on my test box. Here's my Apache config:
<VirtualHost *>
ServerName www.testsite1.com
RewriteEngine On
RewriteLog "c:temprewrite_log1"
RewriteLogLevel 1
RewriteRule ^/(.*) http://127.0.0.1:9080/VirtualHostBase/http/www.testsite1.com:80/TestSite1/VirtualHostRoot/$1 [L,P]
ProxyVia on
</VirtualHost>
<VirtualHost *>
ServerName www.testsite2.com
RewriteEngine On
RewriteLog "c:temprewrite_log2"
RewriteLogLevel 1
RewriteRule ^/(.*) http://127.0.0.1:9081/VirtualHostBase/http/www.testsite2.com:80/TestSite2/VirtualHostRoot/$1 [L,P]
ProxyVia On
</VirtualHost>
With this I get the correct rewrite action happening, however when I look at the Z2.log I get a record like this:
127.0.0.1 - Anonymous [22/Dec/2006:07:02:14 -0400] "GET /VirtualHostBase/http/www.testsite2.com:80/TestSite2/VirtualHostRoot/ HTTP/1.1" 200 19661 "" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
What's with the "Anonymous" thing and the URL too? I'm logged into the site so how come it does not write my userid correctly? Same question on the URL. Is there a way to get it to write a URL to the log that matches what the user sees? I'd like all this for doing logfile analysis on hits and stuff.
Any advice would be great.
beren
proxyvia on doesn't work --yurj, Wed, 23 May 2007 13:58:17 +0000
I still get the Apache IP as client IP, not the client IP directly :(
proxyvia on doesn't work --kowb, Fri, 09 Nov 2007 08:29:24 +0000
http://www.plope.com/Books/2_7Edition/VirtualHosting.stx#2-6
After Four Days I saw the Light --kowb, Fri, 09 Nov 2007 08:31:32 +0000
While the above article was very helpful in assisting with my understanding, I finally got understandable results after reading:
http://www.plope.com/Books/2_7Edition/VirtualHosting.stx#2-6
It's concise, well-written and leaves out the least amount of information that I have seen on this subject. :)
After Four Days I saw the Light --betabug, Fri, 09 Nov 2007 08:43:38 +0000
Of course we assume that you already know the Zope book by heart. :-)
The solution regarding the ProxyVia --peterbe, Thu, 12 Jun 2008 06:36:59 -0400
As the comments above point out, setting ProxyVia to On doesn't mean you get the remote_addr passed through to Zope. Setting ProxyVia On just means an extra header is passed through, helpful only to the curious.
The solution is to set trusted-proxy in etc/zope.conf to the same IP address that you're proxying through. So, if you set your virtual host to be:
ProxyPass / http://127.0.0.1:8080/VirtualHostBase/http/example.com:80/VirtualHostRoot/
then you need to have this in zope.conf:
trusted-proxy 127.0.0.1
trusted-proxy example.com
Problems with exUserFolder logins --Toni, Wed, 02 Jul 2008 20:23:47 -0400
I am migrating my web application to a new server and I am trying to configure Zope + Apache as described in this wiki. I previously used FastCGI to connect Apache with Zope and it worked just fine. I use exUserFolder to store my users in a PostgreSQL database. Everything seems to work fine except when I try to log in to a restricted area through port 80. After debugging a bit (unfortunately I'm no Zope guru) I've noticed that the username and password variables which are posted from the login page are located in the HTTPRequest.stdin property when the page is accessed through port 8080, but they are not present when the page is accessed through Apache.
Anyone has any clue about what might be going on? Thanks a lot.