User:TerryE/phpBB3.0.7 Upgrade/Issues and Actions

From Apache OpenOffice Wiki
Revision as of 18:23, 27 July 2010 by TerryE (Talk | contribs)

I have included below the list of open issues and activities that I intend to carry out as part of the phpBB V3.0.7 upgrade.

The Core Build

This work is now complete and all ten live NL databases have been migrated to 3.0.7. See Closed Issues and Actions for details on the process.

Round-off and archive

See Closed Issues and Actions for details on the following completed activities:

  • Automated administrator email responder
  • Automated unactivated users prune
  • Oracle branding
  • Retiring custom fields

I still need to clean out any temporary archive files, and archive the core set of sh and patch files so that I can pick this up as the basis of a future 3.0.7 -> 3.x upgrade.

Automated Backup

I had a set of scripts that ran as a cron job on the old server with the PostgreSQL D/B, but I turned these off when I moved to MySQL and we stopped using the OUCV server as a second site. I have a script which I run manually about once a week (on top of the standard Solaris backups), but I would prefer to have a nightly incremental dump.

There are still some open issues that I want to get to the bottom of:

  • mysqlhotcopy vs mysqldump. One of the issues here is the lock impact on the system for end users. mysqlhotcopy takes a binary snapshot, which has about the lowest lock impact on the database: 15 secs in the case of the biggest forum (EN). This can be dropped to a few secs by separating out the posts table (~85% of the D/B volume), as we don't need transactional integrity here. mysqldump generates a SQL load file but hangs the EN forum for a couple of minutes. Again, splitting the four largest tables (posts, topics, users, logs) into separate SELECT ... INTO OUTFILE statements and dumping the rest helps a bit, but as long as we do the backup in the early hours CET, the big loads (ER / FR) are minimal.
  • Incremental vs full. It is quite difficult to do a true incremental backup of the database because the schema / phpBB architecture isn't really designed with this in mind, though an incremental to optimise network transfer can be implemented by file delta at the dump level. Given that the bz2-compressed copy of the current mysqlhotcopy set is ~70 MB, there is little point in doing file deltas to save host space.
  • Off-site copy or not. Note that the server is already reasonably protected by the common data-centre backup. However, I have had enough good and bad experiences of using such backups after server / application failure to know that it is prudent to ensure that (i) closed file-based backup sets are maintained on the server to enable recovery and (ii) routine off-site copies are maintained in case of disaster. In this last case a weekly snapshot is probably good enough: losing the last week's posts following a disaster is acceptable; losing the entire forum content is not.
  • Binary logs. I currently have binary logging enabled, and flush then purge these after any database backup. Clearly if I fully automate the backup process, then this becomes a non-issue as long as I include this flush / purge in the backup script.
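A minimal sketch of tying the flush / purge to the backup, so that the binary logs are only discarded once a backup has actually succeeded. The function name and the MYSQL override are hypothetical, and it assumes the mysql client can authenticate without prompting (for example through an option file).

```shell
#!/bin/sh
# Sketch only. MYSQL may be overridden (for example for a dry run);
# by default the ordinary mysql command-line client is used.
MYSQL=${MYSQL:-mysql}

# Run the backup command given as arguments. Only if it succeeds,
# switch to a fresh binary log and purge the older log files.
backup_then_purge() {
    "$@" || { echo "backup failed; binary logs left in place" >&2; return 1; }
    $MYSQL -e 'FLUSH LOGS; PURGE BINARY LOGS BEFORE NOW();'
}
```

It would be called as `backup_then_purge some_backup_command args`, where the backup command stands for whatever script takes the nightly copy.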

I have been brooding about this whole issue for some time to decide on an approach which:

  • is simple to implement
  • is simple to operate
  • has minimal impact on the online service
  • achieves the overall disaster resilience requirements.

One of the issues that has influenced my decision is my use of an Oracle VirtualBox VM copy of the live system (albeit on a LAMP platform) for both support and development of any upgrades. I run copies of this on my laptop and my local server (both Ubuntu). So I have decided to use a dedicated mirror VM as my DR copy of live. This will also give me a local system for support. In this light:

  • I run a daily database backup script on-server based on a mysqlhotcopy of all databases to a mirror file-set. The script runs as a cron job under user forumadmin, which is added to the mysql group (the forum directories have g:rx access and the database table files g:r access). The mirror set is a subdirectory under user forumadmin. The script then compresses the mirror set to a daily backup subdirectory, one BZ2 tarball per country D/B. These are retained on-server for one week, except for that of the first of the month, which will currently be retained indefinitely (or manually pruned). The script also executes the prune policy to remove aged backup sets. These currently take up about 80 MB per daily backup.
  • My main mechanism for live-to-VM replication is rsync of:
    1. The current /var/lib/phpBB/comXXX file structure, which contains the shared phpBB code-base.
    2. The various /var/www/XX language trees. Note that the material content here is the various uploaded attachments.
    3. Optionally, the database mirror file-set. Because rsync uses block-based fingerprinting and on-the-fly compression, and because a database is block-oriented, syncing the uncompressed mirrors generates far less network traffic than attempting to sync the already compressed BZ2 tarballs, so long as the existing copy of the D/B is less than a week or so old. After that a straight scp of the tarball is simpler.
  • I have a VM-side script which executes this sync and a host script which resumes the VM, executes this rsync process and suspends the VM again. This lets me sync to the last nightly backup with a single command (or from a cron-initiated version).
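The compress-and-prune part of the daily script could be sketched roughly as follows. This is an illustration, not the deployed script: the function names, the directory arguments and the `<db>-YYYY-MM-DD.tar.bz2` naming scheme are all assumptions, and it presumes the mirror set holds one directory per country database.

```shell
#!/bin/sh
# Sketch only: names, paths and the tarball naming scheme are assumptions.

# Compress each database directory in the mirror set into its own BZ2
# tarball. The body runs in a subshell so its variables stay local.
# usage: compress_mirror MIRROR_DIR DEST_DIR YYYY-MM-DD
compress_mirror() (
    mirror=$1; dest=$2; stamp=$3
    for path in "$mirror"/*/; do
        [ -d "$path" ] || continue
        db=$(basename "$path")
        (cd "$mirror" && tar -cf - "$db") | nice bzip2 > "$dest/$db-$stamp.tar.bz2"
    done
)

# Prune policy: remove tarballs more than a week old, but keep any that
# are dated the first of a month.
# usage: prune_backups DEST_DIR
prune_backups() {
    find "$1" -type f -name '*.tar.bz2' ! -name '*-01.tar.bz2' \
         -mtime +7 -exec rm -f {} +
}
```

A nightly run would then call `compress_mirror` with today's date from `date +%Y-%m-%d`, followed by `prune_backups` on the same destination directory. Piping tar into bzip2 rather than relying on a tar compression flag is a choice made here only to avoid depending on a particular tar implementation.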

Note that I don't back up the indices on the server side, as this takes extra time and space, so these need recreating on the VM side. For undocumented reasons myisamchk doesn't work if the MYIs aren't present, but REPAIR TABLE with the USE_FRM option does, so the following magic does the trick:

mysql --skip-column-names information_schema \
      -e "select concat('REPAIR TABLE ',table_schema,'.',table_name,' USE_FRM;') \
          from tables where length(table_schema)=2" | mysql

Enabling Atom Feeds

A number of the power users on the FR forum used the previous RSS feed. This functionality has been dropped and an equivalent needs to be reinstituted. The previous implementation used the Simple RSS mod for phpBB3. However, this mod has significant limitations that I can't ignore; most importantly it bypasses the phpBB access control system, allowing unregistered users to browse closed forums. I didn't get this working properly, but it is now superseded by a standard phpBB 3.0.6 feature that provides ATOM feeds.

  • We need to investigate its implementation for those forums that want it, as it requires enabling / configuring through ACP.
  • I need to discuss this further with the FR proponents of this mod.
  • Note that for some reason the ATOM feed link isn't displayed when the feed is enabled. I need to investigate this.

Forum watchdog

For some reason MySQL stalls and the number of open threads explodes. When this happens the Apache2 threads, which are all invoking phpBB transactions, run out of cursor connects. This seems to be a latching state. The simplest remedy is to stop apache2, stop and restart mysql, then restart apache2. I have written a simple php script which probes the database and returns an "OK" if it can connect to the MySQL engine and run a simple query against the EN forum. Note that this only responds to fetches from localhost. The watchdog script can then invoke it to obtain this status:

  phpBB_status=$(wget -O - -o /dev/null -T 10  http://localhost/db_check_loopback.php)

The watchdog is kicked off daily at midnight and polls phpBB through Apache every 15 mins. If OK is not returned, one retry is done 5 mins later; if that also fails, apache/mysql is bounced and email warnings are sent to TerryE / CCornell. Note that this clearly only addresses apache/mysql stalls.
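The poll, single retry and bounce could be sketched as below. This is not the deployed watchdog: bounce() and notify() are placeholders for the host's real service and mail commands, the function names are hypothetical, and an exact "OK" reply from the probe page is assumed.

```shell
#!/bin/sh
# Sketch only: bounce() and notify() are placeholders, and an exact "OK"
# reply from the probe page is assumed.
PROBE_URL=${PROBE_URL:-http://localhost/db_check_loopback.php}
RETRY_DELAY=${RETRY_DELAY:-300}     # seconds to wait before the single retry

# Fetch the loopback status page; prints whatever the page returns.
probe() {
    wget -O - -o /dev/null -T 10 "$PROBE_URL"
}

# Placeholder: stop apache2, stop and restart mysql, then start apache2,
# using whatever service commands the host provides.
bounce() {
    echo "bounce requested at $(date)" >&2
}

# Placeholder: send the warning email to the forum admins.
notify() {
    echo "warning: apache/mysql bounced" >&2
}

# One poll: probe, retry once after RETRY_DELAY, then bounce and notify.
# Returns 0 if the forum answered OK, 1 if a bounce was needed.
check_once() {
    [ "$(probe)" = "OK" ] && return 0
    sleep "$RETRY_DELAY"
    [ "$(probe)" = "OK" ] && return 0
    bounce
    notify
    return 1
}
```

A driver started from cron at midnight would call `check_once` in a loop, sleeping 15 minutes between polls. Keeping the bounce and the notification in separate functions makes it easy to dry-run the logic without touching the live services.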

A preliminary version is deployed but I am awaiting review comments from the Oracle sysadmin.

Logfile Retention Policy

At the moment I've configured logfile rotation on the Apache access and error logs on a weekly basis. I periodically compress old logs and delete access logs older than 3 months. Rotation of the MySQL logs is also done on an ad hoc basis. I need to implement a simple weekly cron task which:

  • Rotates the MySQL log (I have turned off user logging -- no point on our prod system)
  • BZ2 compresses Apache logs older than one week (remember to use nice)
  • Deletes the compressed Apache access log files older than 3 months and error logs older than 6 months
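The Apache part of this weekly task might look something like the sketch below. The function names and the `access*` / `error*` file-name patterns are assumptions about the log layout, and 3 and 6 months are approximated as 90 and 180 days.

```shell
#!/bin/sh
# Sketch only: the access*/error* patterns are assumptions about how the
# rotated Apache logs are named.

# BZ2-compress rotated Apache logs older than one week, at low priority.
# usage: compress_old_logs LOG_DIR
compress_old_logs() {
    find "$1" -type f \( -name 'access*' -o -name 'error*' \) \
         ! -name '*.bz2' -mtime +7 -exec nice bzip2 {} +
}

# Delete compressed access logs older than about 3 months and compressed
# error logs older than about 6 months.
# usage: purge_old_logs LOG_DIR
purge_old_logs() {
    find "$1" -type f -name 'access*.bz2' -mtime +90  -exec rm -f {} +
    find "$1" -type f -name 'error*.bz2'  -mtime +180 -exec rm -f {} +
}
```

The MySQL log rotation is left out here; one common approach is to rename the current log file and then run `mysqladmin flush-logs` so that the server reopens it.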

I have binary logs enabled on MySQL but these are flushed and purged after the nightly backup (see above).

Basic Management Reporting

Some of the NL forums are clearly active and healthy. Some clearly less so. However, we have no objective measures in place to identify any issues that are arising and any trends that need to trigger additional support for any individual forum that might be having problems. One option here might be to introduce some basic health measures so that we can maintain a simple forum dashboard.

Standardising Forum Configurations

With 10 forums, any degree of divergence can cause problems on upgrade and through-life maintenance, so we need to standardise where possible. I am not talking about language packs or basic forum structure but things like the extra BBcodes, permissible file extensions for upload, etc. If they make sense in one OOo forum then they should be in all.

Write up phpBB 3.0.7 configuration

I have decided to do this in the phpBB wiki because this will be of more interest to phpBB admins rather than OOo Volunteers and NL Admins. I will post the link here when I've done it.
