User:TerryE/phpBB3.0.7 Upgrade/Issues and Actions

From Apache OpenOffice Wiki
Revision as of 23:14, 28 July 2010 by TerryE (Talk | contribs)

I have included below the list of open issues and activities that I intend to carry out as part of the phpBB V3.0.7 upgrade.

The Core Build

This work is now complete and all ten live NL databases have been migrated to 3.0.7. See Closed Issues and Actions for details on the process.

Round-off and archive

See Closed Issues and Actions for details on the following completed activities:

  • Automated administrator email responder
  • Automated unactivated users prune
  • Oracle branding
  • Retiring custom fields

I still need to clean out any temporary archive files, and archive the core set of sh and patch files so that I can use them as the basis of a future 3.0.7 -> 3.x upgrade.

Automated Backup

I had a set of scripts that ran as a cron job on the old server with the PostgreSQL D/B, but I turned these off when I moved to MySQL and we stopped using the OUCV server as a second site. I have a script which I run manually about once per week (on top of the standard Solaris backups), but I would be more comfortable with a nightly incremental dump, though there are some issues that influence my decision:

  • Mysqlhotcopy vs mysqldump. One of the issues here is the lock impact on the forums for users. Mysqlhotcopy takes a binary snapshot and has about the lowest read-lock impact on the database, at about 14 secs for the biggest forum (EN). Mysqldump generates a SQL load file, but hangs the EN forum for a couple of minutes. Splitting the largest table (posts) into a separate "select into outfile" and dumping the rest helps a lot and brings the lock time down to something similar to the hot copy, but this complicates both backup and recovery. Either way it makes sense to do the backup in the early hours CET, as the loads on the heavy forums (ER / FR) are minimal then.
  • Incremental vs full. It is difficult to do a true incremental backup of the database because the schema / phpBB architecture isn't really designed with this in mind, though an incremental to optimise network transfer can be implemented by file delta of the dump files. Given that the bz2-compressed copy of the current mysqlhotcopy set is ~70 MB, there isn't any material benefit in doing file deltas to save host space.
  • Off-site copy or not. The server is already reasonably protected by the common data-centre backup. However, I have had enough good and bad experiences of using such backups after server / application failure to know that it is prudent to keep closed file-based backup sets on the server to enable recovery, and to maintain routine off-site copies in case of disaster. For the latter a weekly snapshot is probably good enough, as losing the last week's posts following a disaster is acceptable; losing the entire forum content is not.
  • Binary logs. I currently have binary logging enabled, and flush then purge the logs after any database backup. If I fully automate the backup process, this becomes a non-issue as long as I include the flush / purge in the backup script.
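As a rough sketch of how the hot copy and the binary log flush / purge could sit together in one script (not the deployed script): the function names, the DRY_RUN switch and the mirror path are hypothetical, and MySQL credentials are assumed to come from the invoking user's option file.

```shell
#!/bin/sh
# Sketch only: hot-copy one database, then rotate and purge the binary logs.
# Setting DRY_RUN=1 prints each command instead of executing it.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_db() {
    db=$1        # database name, e.g. a two-letter forum code (assumed)
    mirror=$2    # hypothetical mirror directory
    # --allowold renames an existing target to <name>_old rather than aborting.
    run mysqlhotcopy --allowold "$db" "$mirror" || return 1
    # Only discard the binary logs once the copy has succeeded.
    run mysql -e "FLUSH LOGS; PURGE BINARY LOGS BEFORE NOW();"
}
```

Running the purge only after a successful copy means a failed backup never leaves the server without either a snapshot or the logs needed to roll forward from the previous one.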

I have been brooding about this whole issue for some time to decide on an approach which:

  • is simple to implement
  • is simple to operate
  • has minimal impact on the online service
  • achieves the overall disaster resilience requirements.

One of the issues that has influenced my decision is that I use an Oracle VirtualBox VM copy of the live system (albeit on a LAMP platform) for both support and development of any upgrades. I run these VMs on both my laptop and my local server (both Ubuntu). So I have decided to use a dedicated mirror VM as my DR copy of live, as this will also give me a local system for support:

  • I run a daily database backup script on-server which uses mysqlhotcopy to copy all databases to a mirror file-set. This script runs as a cron job under user forumadmin. This user has been added to the mysql group, with the mysql data directories for the forum databases given g:rx access and the database table files g:r access. The mirror set is a subdirectory under user forumadmin. The script also compresses the mirror set into a daily backup subdirectory (at nice -n 20), one BZ2 tarball per country D/B. These are retained on-server for one week, except for the tarball for the first of the month, which is currently retained indefinitely (or manually pruned). The script also executes the prune policy to remove the aged backup sets.
  • My main mechanism for live-to-VM replication is rsync over SSH for:
    1. The current /var/lib/phpBB/comXXX file structure which contains the shared phpBB code-base.
    2. The various /var/www/XX language trees. Note that the material content here is the various uploaded attachments.
    3. Optionally, the database mirror file-set. Because rsync uses block-based fingerprinting and on-the-fly compression, and because database files are block oriented, syncing the uncompressed mirrors generates far less network traffic than attempting to sync the already compressed BZ2 tarballs, if the resync is done weekly. After that a straight scp of the tarball is simpler.
  • I have a VM-side script which executes this sync, and a host script which resumes the VM, executes this rsync process and suspends the VM again. This lets me sync to the last nightly backup with a single command (or by a cron-initiated version).
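The retention rule for the daily tarballs (keep one week, keep first-of-month indefinitely) can be sketched with find. This is an illustration only: the function name and the XX-YYYY-MM-DD.tar.bz2 naming scheme are assumptions, not the naming used by the real script.

```shell
#!/bin/sh
# Sketch only: prune daily backup tarballs older than about a week,
# but never touch the tarball taken on the first of a month.
# Assumes (hypothetically) names of the form XX-YYYY-MM-DD.tar.bz2.

prune_backups() {
    dir=$1    # directory holding the daily tarballs
    find "$dir" -type f -name '*.tar.bz2' \
        ! -name '*-01.tar.bz2' \
        -mtime +7 -exec rm -f {} +
}
```

Because the first-of-month exclusion is done on the file name rather than the timestamp, a tarball that is copied or touched later is still protected from pruning.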

Note that I don't back up the MYI index files on the server side as this takes extra time and space, so I need to recreate these on the VM side. When rsync has updated the MYD and FRM files, the MYIs are out of sync and need to be repaired by myisamchk. However, this doesn't work if the MYIs are missing, but REPAIR TABLE with the USE_FRM option does, so this magic does the trick:

mysql --skip-column-names information_schema -e \
      "select concat('REPAIR TABLE ',table_schema,'.',table_name,' USE_FRM;')  from tables where length(table_schema)=2" \
      | mysql

Enabling Atom Feeds

A number of the power users on the FR forum used the previous RSS feed. This functionality has been dropped and an equivalent needs to be reinstituted. The previous implementation used the Simple RSS mod for phpBB3. However, this mod has significant limitations that I can't ignore; most importantly it bypasses the phpBB access control system, allowing unregistered users to browse closed forums. I didn't get this working properly, but this functionality is now superseded by a standard phpBB V3.0.6 feature which provides ATOM feeds.

  • We need to investigate its implementation for those forums that want it, as it requires enabling / configuring through ACP.
  • I need to discuss this further with the FR proponents of this mod.
  • Note that for some reason the ATOM feed link isn't being displayed when the feed is enabled. I need to investigate this.

Forum watchdog

For some reason MySQL stalls and the number of open threads explodes. When this happens the Apache2 threads, which are all invoking phpBB transactions, run out of cursor connects. This seems to be a latching state. The simplest remedy is to stop apache2, stop and restart mysql, then restart apache2. I have written a simple php script which probes the database and returns an "OK" if it can connect to the MySQL engine and run a simple query against the EN forum. Note that this only responds to fetches from localhost. The watchdog script can then invoke it to obtain this status:

  phpBB_status=$(wget -O - -o /dev/null -T 10  http://localhost/db_check_loopback.php)

The watchdog is kicked off daily at midnight and polls phpBB through Apache every 15 mins. If OK is not returned, one retry is done 5 mins later; if that also fails, apache/mysql is bounced and email warnings are sent to TerryE / CCornell. Note that this clearly only addresses apache/mysql stalls.
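The probe / retry / bounce logic can be sketched as below. This is not the deployed watchdog: the function names, the init script paths and the notify placeholder are all assumptions, and only the wget probe is taken from the text above.

```shell
#!/bin/sh
# Sketch only: one watchdog check with a single delayed retry.

PROBE_URL=${PROBE_URL:-http://localhost/db_check_loopback.php}
RETRY_DELAY=${RETRY_DELAY:-300}    # seconds to wait before the one retry

probe() {
    # Prints the page body; prints nothing if the fetch fails or times out.
    wget -O - -o /dev/null -T 10 "$PROBE_URL"
}

bounce_services() {
    # Hypothetical service commands; substitute whatever the platform uses.
    /etc/init.d/apache2 stop
    /etc/init.d/mysql restart
    /etc/init.d/apache2 start
}

notify() {
    # Placeholder: replace with a mail command to send the email warning.
    echo "watchdog: $*"
}

check_once() {
    [ "$(probe)" = "OK" ] && return 0
    sleep "$RETRY_DELAY"
    [ "$(probe)" = "OK" ] && return 0
    notify "phpBB probe failed twice; bouncing apache/mysql"
    bounce_services
    return 1
}
```

A driver would then call check_once and sleep 15 minutes between calls. Keeping the probe, the bounce and the notification in separate functions makes it easy to dry-run the decision logic without touching the live services.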

A preliminary version is deployed, but I am awaiting review comments from the Oracle sysadmin.

Logfile Retention Policy

At the moment I've configured weekly logfile rotation on the Apache access and error logs. I periodically compress old logs and delete access logs older than 3 months. Rotation of the MySQL logs is also done on an ad hoc basis. I need to implement a simple weekly cron task which:

  • Rotates the MySQL log (I have turned off user logging -- no point on our prod system)
  • BZ2 compresses Apache logs older than one week (remember to use nice)
  • Deletes the compressed Apache access log files older than 3 months and error logs older than 6 months

I have binary logs enabled on MySQL but these are flushed and purged after the nightly backup (see above).
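The Apache part of such a weekly task might look like the following sketch. The function name and the access.log.N / error.log.N naming of rotated logs are assumptions; adjust the patterns to the actual rotation scheme.

```shell
#!/bin/sh
# Sketch only: compress and expire rotated Apache logs.
# Assumes (hypothetically) rotated logs named access.log.N and error.log.N.

compress_and_prune_logs() {
    logdir=$1
    # Compress rotated logs older than a week at the lowest CPU priority.
    find "$logdir" -type f \( -name 'access.log.*' -o -name 'error.log.*' \) \
        ! -name '*.bz2' -mtime +7 -exec nice -n 19 bzip2 {} +
    # bzip2 keeps the original modification time, so the age tests below
    # still reflect when the log was written, not when it was compressed.
    find "$logdir" -type f -name 'access.log.*.bz2' -mtime +90 -exec rm -f {} +
    find "$logdir" -type f -name 'error.log.*.bz2' -mtime +180 -exec rm -f {} +
}
```

The MySQL log rotation could be added to the same cron task with mysqladmin flush-logs, after moving the old log file aside.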

Basic Management Reporting

Some of the NL forums are clearly active and healthy. Some clearly less so. However, we have no objective measures in place to identify any issues that are arising and any trends that need to trigger additional support for any individual forum that might be having problems. One option here might be to introduce some basic health measures so that we can maintain a simple forum dashboard.

Standardising Forum Configurations

With 10 forums any degree of divergence can cause problems on upgrade and through-life maintenance, so we need to standardise where possible. I am not talking about language packs or basic forum structure but things like the extra BBcodes, permissible file extensions for upload, etc. If they make sense in one OOo forum then they should be in all.

Write up phpBB 3.0.7 configuration

I have decided to do this in the phpBB wiki because this will be of more interest to phpBB admins than to OOo Volunteers and NL Admins. I will post the link here when I've done it.
