Newer Older
powelld's avatar
powelld committed
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229
Last modified at [$Date: 2010-10-30 17:56:13 +0000 (Sat, 30 Oct 2010) $]


    * Source code should follow style guidelines.
      OK, we all agree pretty code is good.  Probably best to clean this
      up by hand immediately upon branching a 2.1 tree.
      Status: Justin volunteers to hand-edit the entire source tree ;)

      Justin says:
        Recall when the release plan for 2.0 was written:
            Absolute Enforcement of an "Apache Style" for code.
        Watch this slip into 3.0.

      David says:
        The style guide needs to be reviewed before this can be done.
        The current file is dated April 20th 1998!

        OtherBill offers:
          It's survived since '98 because it's welldone :-)  Suggest we
          simply follow whatever is documented in styleguide.html as we
          branch the next tree.  Really sort of straightforward, if you
          dislike a bit within that doc, bring it up on the dev@httpd
          list prior to the next branch.

      So Bill sums up ... let's get the code cleaned up in CVS head.
      Remember, it just takes cvs diff -b (that is, --ignore-space-change)
      to see the code changes and ignore that cruft.  Get editing Justin :)

    * Replace stat [deferred open] with open/fstat in directory_walk.
      Justin, Ian, OtherBill all interested in this.  Implies setting up
      the apr_file_t member in request_rec, and having all modules use
      that file, and allow the cleanup to close it [if it isn't a shared,
      cached file handle.]

    * The Async Apache Server implemented in terms of APR.
      [Bill Stoddard's pet project.]
      Message-ID: <008301c17d42$9b446970$01000100@sashimi> (dev@apr)

        OtherBill notes that this can proceed in two parts...

           Async accept, setup, and tear-down of the request 
           e.g. dealing with the incoming request headers, prior to
           dispatching the request to a thread for processing.
           This doesn't need to wait for a 2.x/3.0 bump.

           Async delegation of the entire request processing chain
           Too many handlers use stack storage and presume it is 
           available for the life of the request, so a complete 
           async implementation would need to happen 3.0 release.

        Brian notes that async writes will provide a bigger
        scalability win than async reads for most servers.
        We may want to try a hybrid sync-read/async-write MPM
        as a next step.  This should be relatively easy to
        build: start with the current worker or leader/followers
        model, but hand off each response brigade to a "completion
        thread" that multiplexes writes on many connections, so
        that the worker thread doesn't have to wait around for
        the sendfile to complete.

(or: remove knowledge of the filesystem)

[ 2002/10/01: discussion in progress on items below; this isn't
  planned yet ]

    * dav_resource concept for an HTTP resource ("ap_resource")

    * r->filename, r->canonical_filename, r->finfo need to
      disappear. All users need to use new APIs on the ap_resource
      (backwards compat: today, when this occurs with mod_dav and a
       custom backend, the above items refer to the topmost directory
       mapped by a location; e.g. docroot)

      Need to preserve a 'filename'-like string for mime-by-name
      sorts of operations.  But this only needs to be the name itself
      and not a full path.

      Justin: Can we leverage the path info, or do we not trust the

      gstein: well, it isn't the "path info", but the actual URI of
              the resource. And of course we trust the user... that is
              the resource they requested.
              dav_resource->uri is the field you want. path_info might
              still exist, but that portion might be related to the
              CGI concept of "path translated" or some other further
              To continue, I would suggest that "path translated" and
              having *any* path info is Badness. It means that you did
              not fully resolve a resource for the given URI. The
              "abs_path" in a URI identifies a resource, and that
              should get fully resolved. None of this "resolve to
              <here> and then we have a magical second resolution
              (inside the CGI script)" or somesuch.
      Justin: Well, let's consider mod_mbox for a second.  It is sort of
              a virtual filesystem in its own right - as it introduces
              it's own notion of a URI space, but it is intrinsically
              tied to the filesystem to do the lookups.  But, for the
              portion that isn't resolved on the file system, it has
              its own addressing scheme.  Do we need the ability to
              layer resolution?

    * The translate_name hook goes away

      Wrowe altogether disagrees.  translate_name today even operates
      on URIs ... this mechansim needs to be preserved.
    * The doc for map_to_storage is totally opaque to me. It has
      something to do with filesystems, but it also talks about
      security and per_dir_config and other stuff. I presume something
      needs to happen there -- at least better doc.

      Wrowe agrees and will write it up.

    * The directory_walk concept disappears. All configuration is
      tagged to Locations. The "mod_filesystem" module might have some
      internal concept of the same config appearing in multiple
      places, but that is handled internally rather than by Apache

      Wrowe suggests this is wrong, instead it's private to filesystem
      requests, and is already invoked from map_to_storage, not the core
      handler.  <Directory > and <Files > blocks are preserved as-is,
      but <Directory > sections become specific to the filesystem handler
      alone.  Because alternate filesystem schemes could be loaded, this
      should be exposed, from the core, for other file-based stores to 
      share. Consider an archive store where the layers become 
      <Directory path> -> <Archive store> -> <File name>
      Justin: How do we map Directory entries to Locations?
    * The "Location tree" is an in-memory representation of the URL
      namespace. Nodes of the tree have configuration specific to that
      location in the namespace.
      Something like:
      typedef struct {
          const char *name;  /* name of this node relative to parent */

          struct ap_conf_vector_t *locn_config;

          apr_hash_t *children; /* NULL if no child configs */
      } ap_locn_node;

      The following config:
      <Location /server-status>
          SetHandler server-status
          Order deny,allow
          Deny from all
          Allow from
      Creates a node with name=="server_status", and the node is a
      child of the "/" node. (hmm. node->name is redundant with the
      hash key; maybe drop node->name)
      In the config vector, mod_access has stored its Order, Deny, and
      Allow configs. mod_core has stored the SetHandler.
      During the Location walk, we merge the config vectors normally.
      Note that an Alias simply associates a filesystem path (in
      mod_filesystem) with that Location in the tree. Merging
      continues with child locations, but a merge is never done
      through filesystem locations. Config on a specific subdir needs
      to be mapped back into the corresponding point in the Location
      tree for proper merging.

    * Config is parsed into a tree, as we did for the 2.0 timeframe,
      but that tree is just a representation of the config (for
      multiple runs and for in-memory manipulation and usage). It is
      unrelated to the "Location tree".

    * Calls to apr_file_io functions generally need to be replaced
      with operations against the ap_resource. For example, rather
      than calling apr_dir_open/read/close(), a caller uses
      resource->repos->get_children() or somesuch.

      Note that things like mod_dir, mod_autoindex, and mod_negotiation
      need to be converted to use these mechanisms so that their
      functions will work on logical repositories rather than just

    * How do we handle CGI scripts?  Especially when the resource may
      not be backed by a file?  Ideally, we should be able to come up
      with some mechanism to allow CGIs to work in a
      repository-independent manner.

      - Writing the virtual data as a file and then executing it?
      - Can a shell be executed in a streamy manner?  (Portably?)
      - Have an 'execute_resource' hook/func that allows the
        repository to choose its manner - be it exec() or whatever.
        - Won't this approach lead to duplication of code?  Helper fns?

      gstein: PHP, Perl, and Python scripts are nominally executed by
              a filter inserted by mod_php/perl/python. I'd suggest
              that shell/batch scripts are similar.

              But to ask further: what if it is an executable
              *program* rather than just a script? Do we yank that out
              of the repository, drop it onto the filesystem, and run
              it? eeewwwww...
              I'll vote -0.9 for CGIs as a filter. Keep 'em handlers.

      Justin: So, do we give up executing CGIs from virtual repositories?
              That seems like a sad tradeoff to make.  I'd like to have
              my CGI scripts under DAV (SVN) control.

    * How do we handle overlaying of Location and Directory entries?
      Right now, we have a problem when /cgi-bin/ is ScriptAlias'd and
      mod_dav has control over /.  Some people believe that /cgi-bin/
      shouldn't be under DAV control, while others do believe it
      should be.  What's the right strategy?