Commit 3b2b0b74 authored by William A. Rowe Jr's avatar William A. Rowe Jr
Browse files

  By popular demand, the beginnings of an explanation of how the request
  phase runs in 2.0 (compared with 1.3.)


git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@91111 13f79535-47bb-0310-9956-ffa450edef68
parent 3800a516
Loading
Loading
Loading
Loading
+3 −0
Original line number Diff line number Diff line
@@ -27,6 +27,9 @@
   <dt><a href="API.html">Apache 2.0 API Notes</a></dt>
      <dd>Overview of Apache's Application Programming Interface.</dd>
   <dt><a href="hooks.html">Apache Hook Functions</a></dt>
      <dl>
      <dt><a href="request.html">Request Processing in Apache 2.0</a></dt>
      </dl>
   <dt><a href="modules.html">Porting Apache 1.3 Modules</a></dt>
   <dt><a href="debugging.html">Debugging Memory Allocation</a></dt>
   <dt><a href="documenting.html">Documenting Apache 2.0</a></dt>
+217 −0
Original line number Diff line number Diff line
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>Request Processing in Apache 2.0</TITLE>
</HEAD>

<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<BODY
 BGCOLOR="#FFFFFF"
 TEXT="#000000"
 LINK="#0000FF"
 VLINK="#000080"
 ALINK="#FF0000"
>

<!--#include virtual="header.html" -->

<h1>Request Processing in Apache 2.0</h1>

<p>Warning - this is a first (fast) draft that needs further revision!</p>

<p>Several changes in Apache 2.0 affect the internal request processing
   mechanics.  Module authors need to be aware of these changes so they
   may take advantage of the optimizations and security enhancements.</p>

<p>The first major change is to the subrequest and redirect mechanisms.
   There were a number of different code paths in Apache 1.3 to attempt
   to optimize subrequest or redirect behavior.  As patches were introduced
   to 2.0, these optimizations (and the server behavior) were quickly broken
   due to this duplication of code.  All duplicate code has been folded
   back into <code>ap_process_internal_request()</code> to prevent the
   code from falling out of sync again.</p>

<p>This means that much of the existing code was 'unoptimized'.  It is 
   the Apache HTTP Project's first goal to create a robust and correct
   implementation of the HTTP server RFC.  Additional goals include
   security, scalability and optimization.  New methods were sought to
   optimize the server (beyond the performance of Apache 1.3) without
   introducing fragile or insecure code.</p>

<h2>The Request Processing Cycle</h2>

<p>All requests pass through <code>ap_process_request_internal()</code>
   in request.c, including subrequests and redirects.  If a module doesn't 
   pass generated requests through this code, the author is cautioned that
   the module may be broken by future changes to request processing.</p>

<p>To streamline requests, the module author can take advantage of the
   hooks offered to drop out of the request cycle early, or to bypass
   core Apache hooks which are irrelevant (and costly in terms of CPU.)</p>

<h2>The Request Parsing Phase</h3>

<h3>Unescapes the URL</h3>

<p>The request's parsed_uri path is unescaped, once and only once, at the
   beginning of internal request processing.</p>

<p>This step is bypassed if the proxyreq flag is set, or the parsed_uri.path
   element is unset.  The module has no further control of this one-time 
   unescape operation, either failing to unescape or multiply unescaping
   the URL leads to security reprecussions.</p>

<h3>Strips Parent and This Elements from the URI</h3>

<p>All <code>/../</code> and <code>/./</code> elements are removed by
   <code>ap_getparents()</code>.  This helps to ensure the path is (nearly) 
   absolute before the request processing continues.</p>

<p>This step cannot be bypassed.</p>

<h3>Initial URI Location Walk</h3>

<p>Every request is subject to an <code>ap_location_walk()</code> call.
   This ensures that &lt;Location &gt; sections are consistently enforced for 
   all requests.  If the request is an internal redirect or a sub-request, 
   it may borrow some or all of the processing from the previous or parent 
   request's ap_location_walk, so this step is generally very efficient 
   after processing the main request.</p>

<h3>Hook: translate_name</h3>

<p>Modules can determine the file name, or alter the given URI in this step.
   For example, mod_vhost_alias will translate the URI's path into the
   configured virtual host, mod_alias will translate the path to an alias
   path, and if the request falls back on the core, the DocumentRoot is
   prepended to the request resource.

<p>If all modules DECLINE this phase, an error 500 is returned to the browser,
   and a "couldn't translate name" error is logged automatically.</p>

<h3>Hook: map_to_storage</h3>

<p>After the file or correct URI was determined, the appropriate per-dir
   configurations are merged together.  For example, mod_proxy compares 
   and merges the appropriate &lt;Proxy &gt; sections.  If the URI is nothing 
   more than a local (non-proxy) TRACE request, the core handles the
   request and returns DONE.  If no module answers this hook with OK or
   DONE, the core will run the request filename against the &lt;Directory &gt;
   and &lt;Files &gt; sections.  If the request 'filename' isn't an absolute,
   legal filename, a note is set for later termination.</p>

<h3>Initial URI Location Walk</h3>

<p>Every request is hardened by a second <code>ap_location_walk()</code> 
  call.  This reassures that a translated request is still subjected to 
  the configured &lt;Location &gt; sections.  The request again borrows 
  some or all of the processing from it's previous location_walk above, 
  so this step is almost always very efficient unless the translated URI 
  mapped to a substantially different path or Virtual Host.</p>

<h3>Hook: header_parser</h3>

<p>The main request then parses the client's headers.  This prepares 
the remaining request processing steps to better serve the client's 
request.</p>

<h2>The Security Phase</h3>

<p>Needs Documentation.  Code is;</p>
<pre>
    switch (ap_satisfies(r)) {
    case SATISFY_ALL:
    case SATISFY_NOSPEC:
        if ((access_status = ap_run_access_checker(r)) != 0) {
            return decl_die(access_status, "check access", r);
        }
        if (ap_some_auth_required(r)) {
            if (((access_status = ap_run_check_user_id(r)) != 0) || !ap_auth_type(r)) {
                return decl_die(access_status, ap_auth_type(r)
		            ? "check user.  No user file?"
		            : "perform authentication. AuthType not set!", r);
            }
            if (((access_status = ap_run_auth_checker(r)) != 0) || !ap_auth_type(r)) {
                return decl_die(access_status, ap_auth_type(r)
		            ? "check access.  No groups file?"
		            : "perform authentication. AuthType not set!", r);
            }
        }
        break;
    case SATISFY_ANY:
        if (((access_status = ap_run_access_checker(r)) != 0) || !ap_auth_type(r)) {
            if (!ap_some_auth_required(r)) {
                return decl_die(access_status, ap_auth_type(r)
		            ? "check access"
		            : "perform authentication. AuthType not set!", r);
            }
            if (((access_status = ap_run_check_user_id(r)) != 0) || !ap_auth_type(r)) {
                return decl_die(access_status, ap_auth_type(r)
		            ? "check user.  No user file?"
		            : "perform authentication. AuthType not set!", r);
            }
            if (((access_status = ap_run_auth_checker(r)) != 0) || !ap_auth_type(r)) {
                return decl_die(access_status, ap_auth_type(r)
		            ? "check access.  No groups file?"
		            : "perform authentication. AuthType not set!", r);
            }
	        }
        break;
    }
</pre>
<h2>The Preparation Phase</h2>

<h3>Hook: type_checker</h3>

<p>The modules have an opportunity to test the URI or filename against
   the target resource, and set mime information for the request.  Both
   mod_mime and mod_mime_magic use this phase to compare the file name
   or contents against the administrator's configuration and set the
   content type, language, character set and request handler.  Some
   modules may set up their filters or other request handling parameters
   at this time.</p>

<p>If all modules DECLINE this phase, an error 500 is returned to the browser,
   and a "couldn't find types" error is logged automatically.</p>

<h3>Hook: fixups</h3>

<p>Many modules are 'trounced' by some phase above.  The fixups phase is
   used by modules to 'reassert' their ownership or force the request's
   fields to their appropriate values.  It isn't always the cleanest
   mechanism, but occasionally it's the only option.</p>

<h3>Hook: insert_filter</h3>

<p>Modules that transform the content in some way can insert their values
   and override existing filters, such that if the user configured a more
   advanced filter out-of-order, then the module can move it's order as
   need be.

<h2>The Handler Phase</h2>

<p>This phase is <strong><em>not</em></strong> part of the processing in
   <code>ap_process_request_internal()</code>.  Many modules prepare one
   or more subrequests prior to creating any content at all.  After the
   core, or a module calls <code>ap_process_request_internal()</code> it 
   then calls <code>ap_invoke_handler()</code> to generate the request.</p>

<h3>Hook: handler</h3>

<p>The module finally has a chance to serve the request in it's handler
   hook.  Note that not every prepared request is sent to the handler
   hook.  Many modules, such as mod_autoindex, will create subrequests
   for a given URI, and then never serve the subrequest, but simply
   lists it for the user.  Remember not to put required teardown from
   the hooks above into this module, but register pool cleanups against
   the request pool to free resources as required.</p>

<!--#include virtual="footer.html" -->

</body>
</html>