Apache HTTP Server Version 2.4
Description: | Multi-protocol proxy/gateway server |
---|---|
Status: | Extension |
Module Identifier: | proxy_module |
Source File: | mod_proxy.c |
Do not enable proxying with ProxyRequests
until you have secured your server. Open proxy servers are dangerous both to your
network and to the Internet at large.
mod_proxy
and related modules implement a
proxy/gateway for Apache HTTP Server, supporting a number of popular
protocols as well as several different load balancing algorithms.
Third-party modules can add support for additional protocols and
load balancing algorithms.
A set of modules must be loaded into the server to provide the
necessary features. These modules can be included statically at
build time or dynamically via the
LoadModule directive.
The set must include:

- mod_proxy, which provides basic proxy capabilities
- mod_proxy_balancer and one or more balancer modules if load balancing is required. (See mod_proxy_balancer for more information.)
- one or more proxy scheme, or protocol, modules:

Protocol | Module |
---|---|
AJP13 (Apache JServe Protocol version 1.3) | mod_proxy_ajp |
CONNECT (for SSL) | mod_proxy_connect |
FastCGI | mod_proxy_fcgi |
ftp | mod_proxy_ftp |
HTTP/0.9, HTTP/1.0, and HTTP/1.1 | mod_proxy_http |
SCGI | mod_proxy_scgi |
WS and WSS (Web-sockets) | mod_proxy_wstunnel |
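
As a rough illustration of the dynamic case, the fragment below loads mod_proxy, one protocol module, and the balancer modules with LoadModule. The modules/ path and the choice of mod_proxy_http and mod_lbmethod_byrequests are assumptions made for this sketch; file locations and the modules you need vary by installation.

```apache
# Core proxy support (always required)
LoadModule proxy_module modules/mod_proxy.so

# One protocol module per scheme you proxy; HTTP is assumed here
LoadModule proxy_http_module modules/mod_proxy_http.so

# Only needed if load balancing is required
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
# Shared-memory provider that the balancer typically depends on in 2.4
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
```
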
In addition, extended features are provided by other modules.
Caching is provided by mod_cache and related
modules. The ability to contact remote servers using the SSL/TLS
protocol is provided by the SSLProxy* directives of
mod_ssl. These additional modules will need
to be loaded and configured to take advantage of these features.
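
A minimal sketch of contacting a back end over SSL/TLS, assuming mod_ssl and mod_proxy_http are loaded. The /secure path and the host backend.example.com are placeholders chosen for illustration, and a real deployment would usually also configure certificate verification through the other SSLProxy* directives.

```apache
# Allow mod_proxy to speak SSL/TLS to the origin server
SSLProxyEngine On

# Hypothetical mapping: requests for /secure go to an https back end
ProxyPass "/secure" "https://backend.example.com/"
ProxyPassReverse "/secure" "https://backend.example.com/"
```
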
Apache HTTP Server can be configured in both a forward and reverse proxy (also known as gateway) mode.
An ordinary forward proxy is an intermediate server that sits between the client and the origin server. In order to get content from the origin server, the client sends a request to the proxy naming the origin server as the target. The proxy then requests the content from the origin server and returns it to the client. The client must be specially configured to use the forward proxy to access other sites.
A typical usage of a forward proxy is to provide Internet
access to internal clients that are otherwise restricted by a
firewall. The forward proxy can also use caching (as provided
by mod_cache) to reduce network usage.
The forward proxy is activated using the ProxyRequests
directive. Because
forward proxies allow clients to access arbitrary sites through
your server and to hide their true origin, it is essential that
you secure your server so that only
authorized clients can access the proxy before activating a
forward proxy.
A reverse proxy (or gateway), by contrast, appears to the client just like an ordinary web server. No special configuration on the client is necessary. The client makes ordinary requests for content in the namespace of the reverse proxy. The reverse proxy then decides where to send those requests and returns the content as if it were itself the origin.
A typical usage of a reverse proxy is to provide Internet users access to a server that is behind a firewall. Reverse proxies can also be used to balance load among several back-end servers or to provide caching for a slower back-end server. In addition, reverse proxies can be used simply to bring several servers into the same URL space.
A reverse proxy is activated using the ProxyPass
directive or the
[P]
flag to the RewriteRule
directive. It is
not necessary to turn ProxyRequests
on in order to
configure a reverse proxy.
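
For the RewriteRule route, a sketch along these lines may help. It assumes mod_rewrite is loaded in addition to mod_proxy and mod_proxy_http; the /app/ prefix and the host backend.example.com are placeholders, not values from this document.

```apache
RewriteEngine On

# The [P] flag hands the rewritten URL to mod_proxy as a reverse-proxy request
RewriteRule "^/app/(.*)$" "http://backend.example.com/$1" [P]

# Rewrite Location headers in back-end redirects to the proxy's URL space
ProxyPassReverse "/app/" "http://backend.example.com/"
```

Note that ProxyRequests stays off in this setup; only the explicitly rewritten URLs are proxied.
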
The examples below are only a very basic idea to help you get started. Please read the documentation on the individual directives.
In addition, if you wish to have caching enabled, consult
the documentation from mod_cache.

Reverse proxy:

ProxyPass "/foo" "http://foo.example.com/bar"
ProxyPassReverse "/foo" "http://foo.example.com/bar"

Forward proxy:

ProxyRequests On
ProxyVia On

<Proxy "*">
    Require host internal.example.com
</Proxy>

You can also force a request to be handled as a reverse-proxy request, by creating a suitable Handler pass-through. The example configuration below will pass all requests for PHP scripts to the specified FastCGI server using reverse proxy:

<FilesMatch "\.php$">
    # Unix sockets require 2.4.7 or later
    SetHandler "proxy:unix:/path/to/app.sock|fcgi://localhost/"
</FilesMatch>

This feature is available in Apache HTTP Server 2.4.10 and later.
The proxy manages the configuration of origin servers and their communication parameters in objects called workers. There are two built-in workers: the default forward proxy worker and the default reverse proxy worker. Additional workers can be configured explicitly.
The two default workers have a fixed configuration and will be used if no other worker matches the request. They do not use HTTP Keep-Alive or connection reuse. The TCP connections to the origin server will instead be opened and closed for each request.
Explicitly configured workers are identified by their URL.
They are usually created and configured using
ProxyPass
or
ProxyPassMatch
when used
for a reverse proxy:
ProxyPass "/example" "http://backend.example.com" connectiontimeout=5 timeout=30
This will create a worker associated with the origin server URL
http://backend.example.com
that will use the given timeout
values. When used in a forward proxy, workers are usually defined
via the ProxySet
directive:
ProxySet "http://backend.example.com" connectiontimeout=5 timeout=30
or alternatively using Proxy and ProxySet:

<Proxy "http://backend.example.com">
    ProxySet connectiontimeout=5 timeout=30
</Proxy>

Using explicitly configured workers in the forward mode is
not very common, because forward proxies usually communicate with many
different origin servers. Creating explicit workers for some of the
origin servers can still be useful if they are used very often.
Explicitly configured workers have no concept of forward or reverse
proxying by themselves. They encapsulate a common concept of
communication with origin servers. A worker created by
ProxyPass
for use in a
reverse proxy will also be used for forward proxy requests whenever
the URL to the origin server matches the worker URL, and vice versa.
The URL identifying a direct worker is the URL of its origin server including any path components given:

ProxyPass "/examples" "http://backend.example.com/examples"
ProxyPass "/docs" "http://backend.example.com/docs"

This example defines two different workers, each using a separate connection pool and configuration.
Worker sharing happens if the worker URLs overlap, which occurs when the URL of some worker is a leading substring of the URL of another worker defined later in the configuration file. In the following example

ProxyPass "/apps" "http://backend.example.com/" timeout=60
ProxyPass "/examples" "http://backend.example.com/examples" timeout=10

the second worker isn't actually created. Instead, the first
worker is used. The benefit is that there is only one connection pool,
so connections are reused more often. Note that all configuration attributes
given explicitly for the later worker will be ignored. This will be logged
as a warning. In the above example, the resulting timeout value
for the URL /examples will be 60 instead of 10!
If you want to avoid worker sharing, sort your worker definitions
by URL length, starting with the longest worker URLs. If you want to maximize
worker sharing, use the reverse sort order. See also the related warning about
ordering ProxyPass
directives.
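
As a sketch of the "longest worker URL first" ordering, the earlier pair of directives can be reversed. The hosts and timeout values are the ones used in the example above; nothing else is assumed.

```apache
# Longest worker URL first: this worker is created with timeout=10
ProxyPass "/examples" "http://backend.example.com/examples" timeout=10

# The shorter URL is defined later, so it is not a leading substring of an
# earlier worker URL and gets its own worker with timeout=60
ProxyPass "/apps" "http://backend.example.com/" timeout=60
```

With this ordering each mapping keeps its own connection pool and its own timeout, at the cost of fewer reused connections.
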
Explicitly configured workers come in two flavors:
direct workers and (load) balancer workers.
They support many important configuration attributes which are
described below in the ProxyPass
directive. The same attributes can also be set using
ProxySet.
The set of options available for a direct worker
depends on the protocol which is specified in the origin server URL.
Available protocols include ajp, fcgi, ftp, http and scgi.
Balancer workers are virtual workers that use direct workers known as their members to actually handle the requests. Each balancer can have multiple members. When it handles a request, it chooses a member based on the configured load balancing algorithm.
A balancer worker is created if its worker URL uses
balancer
as the protocol scheme.
The balancer URL uniquely identifies the balancer worker.
Members are added to a balancer using
BalancerMember.
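
A minimal sketch of a balancer worker with two members, assuming mod_proxy_balancer, mod_lbmethod_byrequests and mod_proxy_http are loaded. The balancer name mycluster, the member hosts and the /app path are hypothetical.

```apache
<Proxy "balancer://mycluster">
    # Direct workers that actually handle the requests
    BalancerMember "http://app1.example.com:8080"
    BalancerMember "http://app2.example.com:8080"
    # Load balancing algorithm used to choose a member
    ProxySet lbmethod=byrequests
</Proxy>

# Route a URL space through the balancer worker
ProxyPass "/app" "balancer://mycluster"
ProxyPassReverse "/app" "balancer://mycluster"
```
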
DNS resolution happens when the socket to
the origin domain is created for the first time.
When connection reuse is enabled, each backend domain is resolved
only once per child process, and cached for all further connections
until the child is recycled. This information should be considered
while planning DNS maintenance tasks involving backend domains.
Please also check ProxyPass
parameters for more details about connection reuse.
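
One possible way to act on this, shown as a sketch only: the disablereuse parameter of ProxyPass makes mod_proxy close the back-end connection right after each use, so the connection-reuse condition described above no longer applies to that worker. The path and host are placeholders, and whether the lost connection reuse is an acceptable trade-off depends on your traffic.

```apache
# Hypothetical mapping; disablereuse=On turns off persistent back-end
# connections (and pooling) for this worker
ProxyPass "/app" "http://backend.example.com/" disablereuse=On
```
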
You can control who can access your proxy via the <Proxy>
control block as in
the following example:

<Proxy "*">
    Require ip 192.168.0
</Proxy>

For more information on access control directives, see
mod_authz_host.
Strictly limiting access is essential if you are using a
forward proxy (using the ProxyRequests
directive).
Otherwise, your server can be used by any client to access
arbitrary hosts while hiding his or her true identity. This is
dangerous both for your network and for the Internet at large.
When using a reverse proxy (using the ProxyPass
directive with
ProxyRequests Off
), access control is less
critical because clients can only contact the hosts that you
have specifically configured.
See also the Proxy-Chain-Auth environment variable.
If you're using the ProxyBlock
directive, hostnames' IP addresses are looked up
and cached during startup for later match tests. This may take a few
seconds (or more) depending on the speed with which the hostname lookups
occur.
An Apache httpd proxy server situated in an intranet needs to forward
external requests through the company's firewall (for this, configure
the ProxyRemote
directive
to forward the respective scheme to the firewall proxy).
However, when it has to
access resources within the intranet, it can bypass the firewall when
accessing hosts. The NoProxy
directive is useful for specifying which hosts belong to the intranet and
should be accessed directly.
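
The intranet setup described above might look roughly like this. The firewall proxy address, the domain and the subnet are assumptions made for illustration, and access control for the forward proxy is omitted for brevity.

```apache
ProxyRequests On

# Forward requests for every scheme to the (hypothetical) firewall proxy
ProxyRemote "*" "http://firewall.example.com:8080"

# Hosts in the intranet domain or subnet are contacted directly
NoProxy ".example.com" "192.168.112.0/21"
```
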
Users within an intranet tend to omit the local domain name from their
WWW requests, thus requesting "http://somehost/" instead of
"http://somehost.example.com/". Some commercial proxy servers
let them get away with this and simply serve the request, implying a
configured local domain. When the ProxyDomain
directive is used and the server is configured for proxy service, Apache httpd can return
a redirect response and send the client to the correct, fully qualified,
server address. This is the preferred method since the user's bookmark
files will then contain fully qualified hosts.
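
A short sketch of that redirect behavior, assuming the local domain is .example.com (a placeholder) and that the server is already configured as a forward proxy:

```apache
# A request for http://somehost/ is answered with a redirect to
# http://somehost.example.com/
ProxyDomain ".example.com"
```
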
For circumstances where mod_proxy
is sending
requests to an origin server that doesn't properly implement
keepalives or HTTP/1.1, there are two environment variables that can force the
request to use HTTP/1.0 with no keepalive: force-proxy-request-1.0
and proxy-nokeepalive. They are set via the SetEnv directive.

<Location "/buggyappserver/">
    ProxyPass "http://buggyappserver:7001/foo/"
    SetEnv force-proxy-request-1.0 1
    SetEnv proxy-nokeepalive 1
</Location>

In 2.4.26 and later, the "no-proxy" environment variable can be set to disable
mod_proxy
processing the current request.
This variable should be set with SetEnvIf, as SetEnv
is not evaluated early enough.
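
As an illustration, the following sketch excludes one sub-path from a proxied location. It assumes mod_setenvif is loaded; the paths and the back-end host are placeholders.

```apache
<Location "/mirror/foo/">
    ProxyPass "http://backend.example.com/"
    # Requests under /mirror/foo/local/ are not proxied (2.4.26 and later)
    SetEnvIf Request_URI "^/mirror/foo/local/" no-proxy
</Location>
```
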
Some request methods such as POST include a request body.
The HTTP protocol requires that requests which include a body
either use chunked transfer encoding or send a
Content-Length
request header. When passing these
requests on to the origin server, mod_proxy_http
will always attempt to send the Content-Length. But
if the body is large and the original request used chunked
encoding, then chunked encoding may also be used in the upstream
request. You can control this selection using environment variables. Setting
proxy-sendcl ensures maximum compatibility with
upstream servers by always sending the
Content-Length, while setting
proxy-sendchunked minimizes resource usage by using
chunked encoding.
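
A sketch of selecting the behavior for one mapping, assuming an upload endpoint whose back end cannot handle chunked request bodies. The path and host are hypothetical.

```apache
<Location "/upload/">
    ProxyPass "http://backend.example.com/upload/"
    # Always send Content-Length upstream; large chunked bodies may be
    # spooled to disk first
    SetEnv proxy-sendcl 1
</Location>
```

Setting proxy-sendchunked in the same place would select the opposite trade-off.
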
Under some circumstances, the server must spool request bodies to disk to satisfy the requested handling of request bodies. For example, this spooling will occur if the original body was sent with chunked encoding (and is large), but the administrator has asked for backend requests to be sent with Content-Length or as HTTP/1.0. This spooling can also occur if the request body already has a Content-Length header, but the server is configured to filter incoming request bodies.
LimitRequestBody only applies to
request bodies that the server will spool to disk.
When acting in a reverse-proxy mode (using the ProxyPass
directive, for example),
mod_proxy_http
adds several request headers in
order to pass information to the origin server. These headers
are:
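
- X-Forwarded-For: The IP address of the client.
- X-Forwarded-Host: The original host requested by the client in the Host HTTP request header.
- X-Forwarded-Server: The hostname of the proxy server.
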
Be careful when using these headers on the origin server, since
they will contain more than one (comma-separated) value if the
original request already contained one of these headers. For
example, you can use %{X-Forwarded-For}i in the log
format string of the origin server to log the original client's IP
address, but you may get more than one address if the request
passes through several proxies.
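
On an origin server that is itself running Apache httpd, the logging idea above could be sketched as follows. The format nickname proxied and the log file path are assumptions for this example.

```apache
# Log the X-Forwarded-For value in place of the connecting address (%h);
# the field may hold several comma-separated addresses
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" proxied
CustomLog "logs/access_log" proxied
```
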
See also the ProxyPreserveHost
and ProxyVia
directives, which control
other request headers.
Note: If you need to specify custom request headers to be
added to the forwarded request, use the
RequestHeader