Difference between revisions of "User:TerryE/Traffic Server Configuration"

From Apache OpenOffice Wiki

Revision as of 02:11, 22 August 2011

Why Traffic Server?

Apache Traffic Server is a lightweight yet high-performance web proxy cache that improves network efficiency and performance[1]. Like Squid and Varnish, Traffic Server can be configured as a reverse proxy[2]. In this mode it acts as a full surrogate for the back-end wiki: port 80 on the wiki's advertised hostname resolves to Traffic Server. This allows the processing of web requests to be offloaded from the PHP- and database-intensive MediaWiki application.

Traffic Server can be configured to store frequently requested cached content in memory, and where content is flushed to disk, access will still involve significantly less physical I/O than the MediaWiki application would need. This permits a significantly higher throughput for a given CPU and I/O resource constraint. MediaWiki has been designed to integrate closely with such web cache packages and will notify Traffic Server when a page should be purged from the cache in order to be regenerated. From MediaWiki's point of view, a correctly configured Traffic Server installation is interchangeable with Squid or Varnish.

The architecture

An example setup of Traffic Server, Apache and MediaWiki on a single server is outlined below. A more complex caching strategy may use multiple web servers behind the same Traffic Server caches (all of which can be made to appear to be a single host) or use independent servers to deliver wiki or image content.

Outside world  <--->  [ Server:  Traffic Server accelerator (w.x.y.z:80)  <--->  Apache webserver (127.0.0.1:80) ]

To the outside world, Traffic Server appears to act as the web server. In reality it passes requests on to the Apache web server, but only when necessary. Apache, running on the same server, listens only for requests from localhost (127.0.0.1), while Traffic Server listens only on the server's external IP address. Both services run on port 80 without conflict because each is bound to a different IP address.
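
The reason the two listeners can share port 80 can be modelled in a few lines. The Python sketch below is a simplified illustration, not part of the actual setup: it treats two listen addresses as conflicting only when the ports match and either the IPs are identical or one of them is the wildcard address. The function name is made up for this example, 203.0.113.10 is a placeholder standing in for w.x.y.z, and socket options such as SO_REUSEPORT are ignored.

```python
import ipaddress


def bindings_conflict(a, b):
    """Return True if two (ip, port) listen addresses cannot coexist.

    Simplified model: ignores SO_REUSEPORT and IPv6 dual-stack behaviour.
    """
    ip_a, port_a = ipaddress.ip_address(a[0]), a[1]
    ip_b, port_b = ipaddress.ip_address(b[0]), b[1]
    if port_a != port_b:
        return False
    # A wildcard address (e.g. 0.0.0.0) claims the port on every interface.
    if ip_a.is_unspecified or ip_b.is_unspecified:
        return True
    return ip_a == ip_b


# Placeholder external address standing in for w.x.y.z
traffic_server = ("203.0.113.10", 80)
apache = ("127.0.0.1", 80)
print(bindings_conflict(traffic_server, apache))
```

In this model, an Apache left on its common wildcard form (Listen 80) would conflict with Traffic Server, which is why an explicit bind to 127.0.0.1 is needed.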

Configuring Traffic Server 3.0.1

The Traffic Server Administrator's Guide discusses two simple methods of defining the configuration[3]. However, the package also provides Perl modules to facilitate configuration for admins familiar with Perl, so I have used these. The baseline configuration is as follows:

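use strict;
use warnings;
use Apache::TS::Config::Records;
use File::Copy;

# All relative paths below are resolved against the server root
chdir("/home/server") or die "Cannot chdir to /home/server: $!";

my $recedit = Apache::TS::Config::Records->new( file => "etc/trafficserver/records.config" );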
 
$recedit->set( conf => "proxy.config.exec_thread.autoconfig",                   val => "0"    );
$recedit->set( conf => "proxy.config.exec_thread.limit",                        val => "2"    );
$recedit->set( conf => "proxy.config.http.server_port",                         val => "80"   );
$recedit->set( conf => "proxy.config.cache.ram_cache.size",                     val => "64M"  );
$recedit->set( conf => "proxy.config.cache.ram_cache_cutoff",                   val => "512K" );
$recedit->set( conf => "proxy.config.url_remap.remap_required",                 val => "1"    );
$recedit->set( conf => "proxy.config.url_remap.pristine_host_hdr",              val => "0"    );
$recedit->set( conf => "proxy.config.http.insert_response_via_str",             val => "1"    );
$recedit->set( conf => "proxy.config.http.accept_no_activity_timeout",          val => "30"   );
$recedit->set( conf => "proxy.config.http.keep_alive_no_activity_timeout_out",  val => "5"    );
$recedit->set( conf => "proxy.config.http.negative_caching_enabled",            val => "1"    );
$recedit->set( conf => "proxy.config.http.negative_caching_lifetime",           val => "120"  );
$recedit->set( conf => "proxy.config.http.cache.ignore_client_cc_max_age",      val => "1"    );
$recedit->set( conf => "proxy.config.http.normalize_ae_gzip",                   val => "1"    );
$recedit->set( conf => "proxy.config.dns.search_default_domains",               val => "0"    );
$recedit->set( conf => "proxy.config.hostdb.size",                              val => "1000" );
$recedit->set( conf => "proxy.config.hostdb.storage_size",                      val => "1M"   );
$recedit->set( conf => "proxy.config.ssl.enabled",                              val => "0"    );
$recedit->set( conf => "proxy.config.ssl.number.threads",                       val => "0"    );
$recedit->set( conf => "proxy.config.cache.threads_per_disk",                   val => "4"    );
 
$recedit->append( line => "" );
$recedit->append( line => "# My local stuff" );
$recedit->set(    conf => "proxy.config.http.server_max_connections", val => "100" );
$recedit->set(    conf => "proxy.config.http_ui_enabled",             val => "3"  );
#$recedit->append( line => "#CONFIG proxy.config.http.enable_http_info INT 1" );
#$recedit->set( conf => "proxy.config.mlock_enabled", val => "2" );
 
$recedit->write( file => "etc/trafficserver/records.config" );
 
# Copy the remaining hand-maintained configuration files into place
my $src = "/home/server/ATS";

for my $file (qw( remap.config storage.config plugin.config )) {
    copy( "$src/$file", "etc/trafficserver" ) or die "Cannot copy $file: $!";
}
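
For readers who do not use Perl, the effect of the set calls above on records.config can be approximated as follows. This is a sketch under assumptions: it relies on the "CONFIG name TYPE value" line layout visible in the commented-out line of the script, the sample input values are illustrative rather than taken from a real installation, and the real Perl module may behave differently (for example, for records that are not yet present in the file).

```python
import re

# One record per line: CONFIG <name> <TYPE> <value>
RECORD = re.compile(r"^(CONFIG\s+)(\S+)(\s+)(\S+)(\s+)(.*)$")


def set_record(lines, name, value):
    """Return a copy of lines with the value of record `name` replaced.

    The record type (INT, STRING, ...) and the spacing are kept as found.
    Lines that do not match, such as comments, pass through unchanged.
    """
    out = []
    for line in lines:
        m = RECORD.match(line)
        if m and m.group(2) == name:
            line = f"{m.group(1)}{name}{m.group(3)}{m.group(4)}{m.group(5)}{value}"
        out.append(line)
    return out


# Illustrative input, not copied from a real records.config
sample = [
    "# Traffic Server records",
    "CONFIG proxy.config.http.server_port INT 8080",
    "CONFIG proxy.config.cache.ram_cache.size INT -1",
]
updated = set_record(sample, "proxy.config.http.server_port", "80")
updated = set_record(updated, "proxy.config.cache.ram_cache.size", "64M")
print("\n".join(updated))
```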

Configuring MediaWiki

Since Traffic Server captures the end-user browser requests and forwards those which require processing by Apache through the localhost loopback interface, Apache will always receive "127.0.0.1" as the direct remote address. However, as Traffic Server forwards requests to Apache, it is configured to add the "X-Forwarded-For" header so that the remote address from the outside world is preserved. MediaWiki must be configured to use the "X-Forwarded-For" header in order to correctly display user addresses in Special:RecentChanges.
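
The idea of recovering the real client address from "X-Forwarded-For" can be sketched in Python as below. This illustrates the general technique only and is not MediaWiki's actual code: the function name and the example addresses are invented here, and the header is trusted only when the direct peer is a known proxy.

```python
def client_address(remote_addr, forwarded_for, trusted_proxies):
    """Pick the address to attribute a request to.

    remote_addr     -- address of the direct TCP peer (127.0.0.1 behind the cache)
    forwarded_for   -- raw X-Forwarded-For header value, or None
    trusted_proxies -- set of proxy addresses whose header we believe
    """
    # Never believe the header when the request did not come via our cache.
    if remote_addr not in trusted_proxies or not forwarded_for:
        return remote_addr
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    # Walk from the nearest hop outwards, skipping proxies we trust.
    for hop in reversed(hops):
        if hop not in trusted_proxies:
            return hop
    return remote_addr


# 198.51.100.7 is a placeholder client address
print(client_address("127.0.0.1", "198.51.100.7", {"127.0.0.1"}))
```

The check on the direct peer matters: without it, any client could forge the header and have its edits attributed to an arbitrary address.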

The required configuration for Traffic Server is essentially the same as for Squid, with the following config assignments in LocalSettings.php:

$wgUseSquid = true;
$wgSquidServers = array('127.0.0.1');
// $wgInternalServer = '';           // Internal server name as known to Squid. NOT SET.
// $wgMaxSquidPurgeTitles = 0;       // Maximum no of pages to purge in one client operation. NOT SET.
// $wgSquidMaxage                    // Cache timeout for the Squid cache. NOT SET.
$wgUseXVO = true;                    // Send X-Vary-Options header for better caching.
$wgDisableCounters = true;           // Disable collection of Page counters
$wgShowIPinHeader = false;           // Disable display of IP for guests as this frustrates caching

These settings serve two main purposes:

  • If a request is received from the Traffic Server cache, the MediaWiki logs need to display the IP address of the user, not that of Traffic Server. A Special:RecentChanges in which every edit is reported as '127.0.0.1' isn't meaningful. Listing this address in $wgSquidServers lets the application know that the user IP address should be obtained from the 'X-Forwarded-For' header.
  • Whenever a page or file is modified on the wiki, MediaWiki must send a purge notification to every cache which serves its content. $wgSquidServers contains the list of such servers. (The name is misleading: Squid was the first cache supported by MediaWiki.)
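
To make the purge notification concrete, the Python sketch below builds the kind of HTTP request a cache could be sent for one page. It is an assumption-laden illustration: the exact request line and headers that MediaWiki emits may differ, the hostname wiki.example.org is a placeholder, the URL is assumed to be already percent-encoded, and Traffic Server must separately be configured to accept the PURGE method from the sender.

```python
from urllib.parse import urlsplit


def build_purge_request(url):
    """Build a raw HTTP PURGE request for one cached URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return (
        f"PURGE {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


# Placeholder hostname; one request would be sent to each listed cache server
request = build_purge_request("http://wiki.example.org/wiki/Main_Page")
print(request.decode("ascii"))
```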

Note that the configuration is already tuned to support PHP APC acceleration for both MediaWiki code and metadata caching.

Outstanding issues

  • Logging and page stats. Most inbound requests will be handled by the Traffic Server cache, so the internal stats collected by MediaWiki will only reflect cache misses. We need to think about how we handle logfile analysis and stats in general. I have turned off page counters, as these would only reflect cache misses in future.
  • Decision to retain a MediaWiki 1.15 baseline. For MediaWiki v1.16.x and later, internationalisation can add a material D/B load. For this reason, and because of other schema changes, we've decided to stick with the last stable MW 1.15.x version (1.15.6) as the S/W baseline.

Apache configuration

The Apache server is configured to listen on the standard port at the localhost IP, and accepts all requests from Traffic Server:

Listen 127.0.0.1:80

The default Apache logging format would only list 127.0.0.1 as the connecting address. Hence an extra "cached" logging format is defined[4], which captures the originating browser's address from the "X-Forwarded-For" header passed by Traffic Server.

LogFormat "%{X-Forwarded-for}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" cached
CustomLog /var/log/apache2/access.log cached
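
Since logfile analysis now depends on this format, a line written with the "cached" LogFormat could be split into fields along the following lines. This is a sketch only: the function name and the sample line are invented for illustration, and request or header values containing escaped double quotes are not handled.

```python
import re

# Field order follows the "cached" LogFormat:
# X-Forwarded-For, ident, user, [time], "request", status, bytes, "referer", "agent"
CACHED_LINE = re.compile(
    r'^(?P<client>.+?) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_cached_line(line):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    m = CACHED_LINE.match(line.rstrip("\n"))
    return m.groupdict() if m else None


# Invented sample line with a placeholder client address
sample = (
    '198.51.100.7 - - [22/Aug/2011:02:11:00 +0000] '
    '"GET /wiki/Main_Page HTTP/1.1" 200 5120 "-" "Mozilla/5.0"'
)
fields = parse_cached_line(sample)
print(fields["client"], fields["status"], fields["request"])
```

The client field is matched lazily so that a multi-hop header value such as "a, b" is kept together. Requests that Traffic Server answers from its cache never reach this log, so counts derived from it cover cache misses only.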

See also

  • Cache (MediaWiki manual)
  • Squid caching (MediaWiki manual)

References

  1. http://trafficserver.apache.org/
  2. http://trafficserver.apache.org/docs/v2/admin/reverse.htm
  3. http://trafficserver.apache.org/docs/v2/admin/configure.htm
  4. http://httpd.apache.org/docs-2.2/mod/mod_log_config.html

Copyright © 2011 The Apache Software Foundation. Licensed under the Apache License, Version 2.0.
Apache Traffic Server, Apache, the Apache Traffic Server logo, and the Apache feather logo are trademarks of The Apache Software Foundation.