User:TerryE/phpBB3.0.7 Upgrade/Closed Issues and Actions

From Apache OpenOffice Wiki
Jump to: navigation, search

The Core Build

This work is now complete and all databases migrated to 3.0.7.

Code Base

Language
Language files
Prosilver GIFs
English (U.K.) (built-in) (built-in)
English (U.S.) lang_en_us.tar.gz prosilver_en_us.tar.gz
French lang_fr.tar.gz prosilver_fr.tar.gz
Catalan lang_ca.tar.gz prosilver_ca.tar.gz
Catalan (Valencian) lang_ca_pv.tar.gz prosilver_ca_pv.tar.gz
Dutch lang_nl.tar.gz prosilver_nl.tar.gz
Hungarian lang_hu.tar.gz prosilver_hu.tar.gz
Italian lang_it.tar.gz prosilver_it.tar.gz
Japanese lang_ja.tar.gz prosilver_ja.tar.gz
Mandarin Chinese (Traditional script) lang_zh_cmn_hant.tar.gz prosilver_zh_cmn_hant.tar.gz
Mandarin Chinese (Simplified script) lang_zh_cmn_hans.tar.gz prosilver_zh_cmn_hans.tar.gz
Polish lang_pl.tar.gz prosilver_pl.tar.gz
Spanish lang_es.tar.gz prosilver_es.tar.gz
Spanish (Argentina) lang_es_ar.tar.gz prosilver_es_ar.tar.gz
Vietnamese lang_vi.tar.gz prosilver_vi.tar.gz

I've pulled these down to a kits directory from the phpBB server.

for co in ca ca_pv en_us es es_ar fr hu it ja nl pl vi zh_cmn_hant zh_cmn_hans; do
  wget http://www.phpbb.com/files/language_packs_30x/lang_$co.tar.gz
  wget http://www.phpbb.com/files/language_packs_30x/prosilver_$co.tar.gz
done

On top of this I also have two tarballs and one custom module for my OOo specific additions:

  • OOoImages.tar.bz2 which includes the extra avatar libraries, smileys etc. used on the forums.
  • OOoProsilver.tar.bz2 which includes the forum logos for each of the NL forums and Oracle GIFs.
  • acm_mysqli.php a module which uses MySQL to implement the phpBB cache.

Forum Customisation

The following customisations have been carried over from Version 3.0.5, and the patch file which applies these changes has been regressed from the 3.0.5 code base to the 3.0.7 code base:

  1. Reorder window title to make multi-tab title more usable. All browser tabs are labelled with their title. The phpBB3 default is to begin the title with the forum name then page category, hence if you have a lot of tabs open they are called openOffi... By putting the specific title first, it makes it a lot easier to pick the correct tab.
    viewforum.php, viewtopic.php
  2. Add Expand option to Code Block. phpBB3 now puts each code block in a scrollable window. This makes it a pain if you want to see all of the code in the browser. This mod adds an Expand option to the code block to allow you to blow it up to full size.
    template/bbcode.html, template/forum_fn.js, theme/content.css, theme/bidi.css
  3. Tweak Access key codes to remove conflicts. For those that use keyboard accelerators within the browser, the lack of a shortcut for Preview Post is a pain. This makes <Accel>+P a shortcut for Preview and <Accel>+G a shortcut for graphics.
    template/posting_buttons.html, template/index_body.html, template/posting_editor.html
  4. Add Active Post count to View Active Topics. This just adds a count of new posts in the Active topics heading.
    index.php
  5. Add truncated title of last topic to Forum View. Lots of OOo volunteers asked for this one. It enables you to recognise at a glance if a forum has any new posts in it.
    includes/functions_display.php, template/forumlist_body.html
  6. Customise Headers and Footers to OpenOffice.org branding. Add OOo Community logo and adjust framing colours to match. Add OOo favicon.
    includes/functions.php, template/overall_footer.html, template/overall_header.html, theme/colours.css, theme/bidi.css, theme/common.css, XX/common.php
  7. Add OpenOffice.org Qs to BB FAQ. The FAQ button goes to a phpBB3 standard FAQ which is all about using phpBB3. I've added a Q0 which points the user to the wiki and forum FAQs for openOffice.org info.
    en/help_faq.php
  8. Add instruction to “read Survival Guide before posting” to post form. The posting form now includes an instruction to read the survival guide before you post. (This is only shown if you have less than 15 posts).
    posting.php, template/posting_layout.html, XX/common.php
  9. Optional Mandatory Subject on Registration. This is a customisation which was introduced in the French forum, and we have now rolled this out across all forums.
    includes/functions_user.php, includes/ucp/ucp_register.php, template/ucp_register.html, XX/common.php
  10. Change move topic default. This no longer leaves a shadow topic by default.
    template/mcp_move.html
  11. Add Poster's warning count for moderators. This was a specific request from the FR moderators. If a poster has outstanding warnings then moderators will see his/her warning count in the post heading column.
    template/viewtopic_body.html
  12. Allow Cron to be run as batch script. This allows cron jobs to be scheduled using a cron script.
    cron.php
  13. Runtime optimisations. This is replace the file based caching with SQL caching.
    includes/acm/acm_memory.php
  14. Miscellaneous OOo configuration. The batch equivalent of the interactive phpBB setup.
    config.php, install/database_update.php, fr/mcp.php, fr/acp/common.php
  15. Customisations that are configurable through phpBB. These don't need code or style changes:
    • Add custom fields to define users, main OS, installed OOo version and supplementary OSs.
    • Add additional BBcodes for [edit], [pre], [strike], [center], [sub], [super], [hr]

Note that the XX above refers to all 15 language variants ca, etc.

The build process

This is a modified rerun of what I did for the 3.0.4 and 3.0.5 upgrades. I use a VirtualBox/ customised LAMP VM appliance based on Ubuntu JeOS. OK, this is Linux rather than Solaris, but in terms of the AMP stack elements it is functionally the same, and contains a full copy of the live database. I build and dress rehearse the 3.0.7 build and 3.0.5 → 3.0.7 upgrade on this. I then move and blow the complete 307 tarball into the live system and repeat the 3.0.5 → 3.0.7 upgrade live. (I prefer to do it this say since I prefer the Linux utilities and the gnu extensions to diff, patch, etc.)

The painful bit in all this is regressing the 3.0.5 patch file to the 3.0.7 code base as this is a largely manual activity, but once this has been done I have scripted the various steps to automate the build. These are pretty simple and self-explanatory so I won't go into details here. However, I have changed the installation in line with the Debian standard of phpBB:

  • The phpBB install is now homed in /var/lib/phpBB with two separate sub-directories for each version, e.g. ref307 and com307. The first of these is an "out of the box" phpBB install with no modifications. The second is the OOo customised install. OK, this incurs a 20Mb overhead, but this is a very simple method of tracking any uncontrolled changes in prod as diffing the refXXX and comXXX will throw these up.
  • All templates and code are designed to be mappable into all code bases.
  • Each of the forums is contained within its own directory hierarchy under the Apache root directory /var/www, so the English instance is under /var/www/en in the /var/www/en/forum directory root.
  • Within each forum, the relevant PHP files and directories are symlinked across to the corresponding files or directory in the /var/lib/phpBB/com307 hierarchy. In general, directory symlinks are used, but due to a quirk of the phpBB implementation, each module picks up the root path of the phpBB install from the directory of the invoked module, so the English forum's View Topic module must be /var/www/en/forum/viewtopic.php and this is a symlink to /var/lib/phpBB/com307/viewtopic.php, and so on.
  • Whilst this may seem a convolved and inefficient, remember that all of the PHP code is cached through APC. APC is symlink aware, and therefore treats all of the copies of viewtopic.php as the same cached item /var/lib/phpBB/com307/viewtopic.php, etc., so this symlinking is crucial to good cache performance when running ten separate language instances.
  • I have a simple script maps the necessary symlinks from this directory hierarchy to the common production directory, that is /var/lib/phpBB/com307 for version 3.0.7.

The scripts does this rehook only takes a split second. The main time delay is the phpBB script to upgrade the databases from the 3.0.5 schema to the 3.0.7 schema and this only takes seconds to run for all databases except the EN and FR ones, with these taking ~90sec. So the whole upgrade for a given language instance can, in practice, be carried without a scheduled service drop.

Oracle Branding

Now that Oracle has acquired Sun, some of the Sun Footer logos have been replaced with Oracle branding ones. I will sort this out as part of this upgrade. I've done this earlier by request of Clayton.

Automated Administrator Email Responder

V1 of this was developed about a year ago. I upgraded V2 to Python for deployment with phpBB V305. This fails to part about 5% of valid email queries and rejects these as errors. I need to fix this as part of the upgrade to V307:

  • Don't reply if the message subject contains "Re: "
  • Don't reply if the message body or subject contains "out of office".

I also don't post the HTML-only messages as 95% of these are SPAM. This is now completed.

Automated Unactivated Users Prune

This is another batch script that I have developed to automate a tedious twice weekly task. There were two approaches to this one:

  • Use a screen-scrape approach. This is what I've done in the Mailbox robot. Though in this case it would involve interrogating the ACP->Inactive User form after going through ACP authentication.
  • Use a dedicated minimal housekeeping function. This would be board-specific piece of PHP code somewhat similar to the pseudo cron task coding. It would take no parameters as the algo would be fixed and autolog-on.

There are pros and cons to both approaches. Programmaticly accessing ACP functions especially without authentication just involved too much effort, so I fell back to my old python based screen-scrape approach for carrying out the updates themselves since this also ensures that necessary audit is carried out. This involved:

  1. Establish an authenticated session context through http://localhost/XX/forum/ucp.php?mode=login and screen scraped response. The session is then hoisted the session to ACP access, via direct database update (a lot easier than doing this via screen-scrape).
  2. Enumerating the inactive users through http://localhost/XX/forum/adm/index.php?i=inactive&sid=SIDa&mode=list&start=0 to acquire form credentials for delete request.
  3. Interogate D/B to get list of users to delete.
  4. Reply to inactive form with list of users to delete.
  5. Confirm deletion and log-out.

Automated Backup

I had a set of scripts that ran as a cron job for the old server with the PostgresSQL D/B, but I turned these off when I moved to MySQL and we decommissioned using the OUCV server as a second site. I have a script which I use manually about once per week (on top of standard Solaris backups), but I would be more comfortable with a nightly incremental dump, though there are some issues that I influence my decision:

  • Mysqlhotcopy vs mysqldump. One of the issues here is the lock impact on the forums for users. Mysqlhotcopy is a binary snapshot which has about the lowest read lock impact on the database, with about 14secs in the case of the biggest forum (EN). Mysqldump generates a SQL load file, but hangs the EN forum for a couple of minutes. Splitting the largest table (posts) into a separate "select to outfiles" and dumping the rest helps a lot and brings the lock time down to some similar to the hot copy, but this complicates both backup and recovery. Either way it makes sense to do the backup in the early hours CET, as the loads on the heavy forums (ER / FR) are minimal.
  • Incremental vs full. It is difficult to do a true incremental backup of the database because the schema / phpBB architecture isn't really designed with this in mind, though an incremental to optimise any network transfer can be implemented by file delta of the dump files. Given that the bz2 compressed copy of the current mysqlhotcopy set is ~70Mb, there isn't any material benefit in doing file deltas to save host space.
  • Off-site copy or not. The server is already reasonable protected by the common data-centre backup. However, I have had both good and bad experiences of the use of such backups after server / application failure to know that it is prudent to ensure that closed file based backup sets should be maintained on the servers to enable recovery and that routine off-site copies are maintained in the case of disaster. In this last case a weekly snapshot is probably good enough as losing the last week's post following disaster is acceptable; losing the entire forum content is not.
  • Binary logs. I currently have binary logging enabled, and flush then purge these after any database backup. Clearly if I fully automate the backup process, then this becomes a non-issue as long as I include this flush / purge in the backup script.

I have been brooding about this whole issue for some time to decide on an approach which:

  • is simple to implement
  • is simple to operate
  • that has minimal impact to the online service
  • whilst achieving the overall disaster resilience requirements.

One of the issues that has influenced my decision is that I use a Oracle VirtualBox VM copy of the live system (albeit on a LAMP platform) for both support and development of any upgrades. I run these VMs both on my Laptop and my local server (both Ubuntu). So I have decided to use a dedicated mirror VM as my DR copy of live, as this will also give me a local system for support:

  • I run a daily database backup script on-server based on the mysqlhotcopy of all databases to a mirror file-set. This script runs as a cron job in user forumadmin. This user has been added to the mysql group, with mysql data directories for the forum databases given g:rx access and the database table files g:r access. The mirror set is a subdirectory to user forumadmin. This script also compresses the mirror set to a daily backup subdirectory (at nice -n 20), one BZ2 tarball per country D/B and these are retained on-server for one week, excepting that the tarball for the first of the month which is currently retained indefinitely (or manually pruned). The script also executes the prune policy to remove these aged backup sets.
  • My main mechanism for live-to-VM replication is by rsync over SSH for:
    1. The current /var/lib/phpBB/comXXX file structure which contains the shared phpBB code-base.
    2. The various /var/www/XX language trees. Note that the material content here are the various uploaded attachments.
    3. Optionally for the database mirror file-set. Because rsync uses block based fingerprinting and one-the-fly compression plus the block oriented nature of a database, syncing the uncompressed mirrors generates far less network traffic than attempting to sync the already compressed BZ2 tarballs, if the resync is done weekly. After that a straight scp of the tarball is simpler.
  • I have a VM-side script which executes this sync and a host script which can resume the VM, executes this rync process and suspend the VM again. This enables my syncing to the last nightly backup by a single command (and by a cron initiated version).

Note that I don't backup the MYI index files on the server-side as this takes extra time and space, so I need to recreate these on the VM side. When rsync has updated the MYD and FRM, the MYIs are out of sync and need to be repaired my myisamchk. Howver, this doesn't work if the MYIs are missing, but REPAIR TABLE with the USE_FRM option does, so this magic does the trick:

mysql --skip-column-names information_schema -e \
      "select concat('REPAIR TABLE ',table_schema,'.',table_name,' USE_FRM;')  from tables where length(table_schema)=2" \
      | mysql

Forum watchdog

For some reason Mysql stalls and the number of open threads explodes. When this happens the Apache2 threads which are all invoking phpBB transactions run out of cursor connects. This seems to be a latching state. The simplest remedy is to stop apache2, stop and restart mysql then restart apache. I have written a simple php script which probes the database and returns an "OK" if it can connect to the Mysql engine and run a simple query against the EN forum. Note that this only responds to fetches from localhost. This can then be invoked by the watchdog script to return this status:

  phpBB_status=$(wget -O - -o /dev/null -T 10  http://localhost/db_check_loopback.php)

The watchdog is kicked off daily at midnight and polls phpBB through Apache every 15 mins. If OK is not returned then one retry is done 5 mins later and at this point apache/mysql is bounced; Email warnings are sent to TerryE / CCornell. Note that this clearly only addresses apache/mysql stalls.

A preliminary version is deployed but I am waiting review comment from the Oracle sysadmin.

Retiring Custom Fields

This functionality has been moved into the standard signature format. I had already done this for the EN forums and switched off the use of custom fields in the creation of new accounts on all forums, but I've had about existing 800 accounts across three NL forums where I need to move this data into the signatures. Three other NL databases still have custom fields configured:

  • HU: elsodleges_opr, ooo_valtozat, masodlagos_os
  • ZH: primary_os, ooo_version, secondary_os
  • PL: ishuman

The first two mirror the previous EN implementation. These currently have 268 and 560 entries resp. where users have both a signature and custom fields set. I will need to reconcile these with the help of the NL admins, but in nearly all cases the signature is more accurate and current than the fields (people update their sigs but not their profile fields), so I have dropped the information in the fields. In the cases where the sigs are blank, then I used the custom fields to create a signature, and then deleted the fields and truncated the inactive fields_data table. Out of interest here is an example of the SQL used to create these signatures for the ZH forum:

CREATE TEMPORARY TABLE xx AS 
SELECT     u.user_id, concat(f1.lang_value, unhex('20E280A220'), f2.lang_value, unhex('20E280A220'), pf_secondary_os) AS user_sig
FROM       phpbb_zh_users u
INNER JOIN phpbb_zh_profile_fields_data d  ON u.user_id = d.user_id
INNER JOIN phpbb_zh_profile_fields_lang f1 ON (pf_primary_os  = f1.option_id+1 AND f1.field_id = 1 AND f1.lang_id = 2)
INNER JOIN phpbb_zh_profile_fields_lang f2 ON (pf_ooo_version = f2.option_id+1 AND f2.field_id = 2 AND f2.lang_id = 2)
WHERE      u.user_sig = ;
  
UPDATE phpbb_zh_users u, xx t
SET    u.user_sig = t.user_sig
WHERE  u.user_id = t.user_id;

Any the wiser?

The last is an earlier anti-spam implementation which is a custom field the that user must select as "yes" to register. All custom fields have now been retired.

Watchdog and Logfile Retention

I've configured Logfile rotation on the Apache access and error logs on a weekly basis. I have modified the daily management / watchdog script to:

  • Do a nightly hotcopy of the MySQL schemas for each NL forum.
  • BZ2 compress Apache logs older than one week
  • Deletes the compressed Apache access logs files older than 3 months and error logs older than 6 months.
  • Poll the forum every 5 minutes to ensure that nothing has stalled. After two consecutive time-outs, the AMP stack is bounced to clear any latched error states.
Personal tools