User:TerryE/phpBB3.0.4 Migration/Standardisation Strategy
OOo Community Forums Migration Strategy
The current openOffice.org (“OOo”) forums (“Community Forums”) started with the creation of the English forum in Nov 2007 running on phpBB 3.0.0 over a Sun Coolstack platform and PostgreSQL 8.1 running on Solaris x86 infrastructure in the Sun Munich Facility. We subsequently upgraded to phpBB 3.0.1 and PostgreSQL 8.2 and then added the Hungarian, French, Spanish and Japanese forums. We are now in the process of adding the Vietnamese and Chinese forums. Given the proliferation of forums, the general growth in transactional volumes and the fact that phpBB is now at version 3.0.4, we have decided to upgrade the H/W infrastructure and install a clean re-baseline for all forums. This wiki page describes the process and approach for migration the OOo forums from their current platform to the new installation.
Assumptions and Constraints
- The Community Forums are running on a number of separate phpBB instances all at version phpBB 3.0.1 (but with disjoint code hierarchies) over a common PostgreSQL 8.2 database on a Solaris/Coolstack platform.
- The content of the Community Forums is largely maintained in the central database, though attachment and image data are maintained in the file separate file hierarchies. All such content will be preserved on migration.
- A new hardware platform will be running Solaris 10, plus Coolstack 1.3.1 including the optional CSKPython package. All instances will use the 32bit CSK packaged MySQL 5.0.
- Each instance of the Community Forums implements a language-specific forum (a “Language Forum”). At cut-over to the new hardware, these will comprise the following Language Forums: English (en), Spanish (es), French (fr), Hungarian (hu), Japanese (ja), Vietnamese (vi) and Chinese (zh).
- Each Language Forum will have a designated lead “Administrator” who is responsible for final acceptance and configuration of the forum as follows:
- en: Hagar de l'Est
- es: jza
- fr: Bidouille
- hu: r4zoli
- ja: khirano
- vi: nnam
- zh: forestcheng
- The target Community Forums configuration will be a largely “out of the box” configuration of phpBB version 3.0.4 designed to facilitate the easy addition of extra language instances.
- All Language Forums will adopt the phpBB Prosilver template with minor tweaks to implement OOo branding. The Subsilver will not be supported.
- “Supported Languages” at cut-over to the new hardware will be English (UK and US dialects), French, Spanish (Spain, Argentina, Catalan, Valencia dialects), Hungarian, Mandarin (Traditional and Simplified Scripts), Japanese and Vietnamese.
- All Supported Languages will be available to all Language Forums. However it is up to the lead Administrator of each forum to decided which languages are enabled for that forum. This will normally be UK English and the local languages. The lead Administrator will also decide the default Language for the forum.
- Each Language Forum will contain an administration sub-forum, access to which will be limited to trusted moderators and administrators.
- We want to move to as "Zero Admin" an approach as possible. Therefore all routine system administration functions will be scripted by means of automated (usually crontab based) scripts.
- A common code base will be shared by all instances by sym-linking the individual instance hierarchies to a common directory set. This will be based on a fresh install of phpBB from the phpBB Downloads reference site.
- The Community Forums will retain a small number of functional changes as detailed in section Forum Customisations below. These changes are that subset of the changes applied to current individual Language Forums, filtered by criteria based on proven user benefit of the change compared to the incremental support effort,
- Python will be used as the standard system administration scripting Language.
Code Base
- The core phpBB code base from phpBB 3.0.4 Download
- The following language packs from phpBB 3.0.x Languages repository. These are maintained in the site /files/language_packs_30x/ sub-directory
English (U.K.) | (built-in) | (built-in) |
English (U.S.) | lang_en_us.tar.gz | prosilver_en_us.tar.gz |
French | lang_fr.tar.gz | prosilver_fr.tar.gz |
Catalan | lang_ca.tar.gz | prosilver_ca.tar.gz |
Catalan (Valencian) | lang_ca_pv.tar.gz | prosilver_ca_pv.tar.gz |
Spanish | lang_es.tar.gz | prosilver_es.tar.gz |
Spanish (Argentina) | lang_es_ar.tar.gz | prosilver_es_ar.tar.gz |
Hungarian | lang_hu.tar.gz | prosilver_hu.tar.gz |
Mandarin Chinese (Traditional script) | lang_zh_cmn_hant.tar.gz | prosilver_zh_cmn_hant.tar.gz |
Mandarin Chinese (Simplified script) | lang_zh_cmn_hans.tar.gz | prosilver_zh_cmn_hans.tar.gz |
Japanese | lang_ja.tar.gz | prosilver_ja.tar.gz |
Vietnamese | (unsupported translation) | (unsupported translation) |
- Note that the Vietnamese is not yet available through the phpBB site.
Overview of the Migration Philosophy
To analyse the differences in the current Language Forum implementations, I have written some scripts to do a smart delta of the seven current language-specific directory hierarchies. Essentially the configurations are an exercise of 'entropy in action': every configuration is slightly different, both in terms of language packs and code modifications. This means than any upgrade of the baseline phpBB package has to address seven different versions. The consequences for long term maintenance are just a killer and this is going to get worse as we add support for extra languages, so we must move to a single common code base for all forums.
Previous to this migration I did a migration of the Virtualbox Community Forums from phpBB V2 to V3.0.4, and used our OOo configuration and customisations as a starting point for this target phpBB implementation. I reviewed all of the OOo customisations in terms of their value and volunteer feedback, and junked any that didn't have a proven benefit. This "junked" list includes my experimental kB patch; the heavy OOo customisation of the standard phpBB FAQ customisation; the beginners template; the email link to the forumadmin on all pages. I then regressed all remaining changes onto the standard phpBB V3.0.4 to derive a patch to implement the required changes. With the exception of the VirtualBox vs. OOo specific branding in the headers and footers, this work largely applies to our target configuration as well.
The main difference between the OOo and VirtualBox implementations is that VirtualBox has a single production forum, whereas we maintain a number of separate Language Forums within OOo. Nonetheless, the base software configuration will be shared by all forums and the configuration set ups will be largely consistent across all forums. This will enable a common set of language templates to be maintained across all forums, and hence all languages will be available to the individual language forum Administrators (though these Administrators might only chose to enable a subset of languages within their own forum). Certain configuration items are forum specific rather than language specific. An example of this is that an advisory message is displayed in the post form asking users to read the Survival Guide before posting. We will have a Survival Guide in every forum, so the language-specific instruction can a direct translation of a common instruction, that this the wording can be equivalent in content for every language and forum, so this text will be in the language template. However, the link to actual guide is forum specific and this will be maintained in the forum's system configuration table:
- Language specifics will be maintained in the language template
- Forum specifics will be maintained in the forum phpBB_XX_config table.
The way that phpBB establishes its root with Apache is by examining the current directory of any script that is directly executed through a URI. These are as follows:
- forum: cron.php, faq.php, file.php, index.php, listforums.php, mcp.php, memberlist.php, posting.php, report.php, search.php, style.php, ucp.php, viewforum.php, viewonline.php, viewtopic.php
- forum/adm:index.php, swatch.php
- forum/download: file.php
Each forum will therefore have private copies of these directories along with the sub-directories which contain forum-specific files: cache, download, files, images, store as well as the config.php file which defines the database context. The remain subdirectories will be sym-linked to a common repository:
- docs -> ../../common/forum/docs
- language -> ../../common/forum/language
- styles -> ../../common/forum/styles
- includes -> ../../common/forum/includes
The individual files in the top level directory, etc., will be sym-linked to the common copy:
- forum/common.php -> ../../common/forum/common.php
- forum/cron.php -> ../../common/forum/cron.php
- forum/faq.php -> ../../common/forum/faq.php
- forum/file.php -> ../../common/forum/file.phpx`
- forum/index.php -> ../../common/forum/index.php
- forum/listforums.php -> ../../common/forum/listforums.php
- forum/mcp.php -> ../../common/forum/mcp.php
- forum/memberlist.php -> ../../common/forum/memberlist.php
- forum/posting.php -> ../../common/forum/posting.php
- forum/report.php -> ../../common/forum/report.php
- forum/search.php -> ../../common/forum/search.php
- forum/style.php -> ../../common/forum/style.php
- forum/ucp.php -> ../../common/forum/ucp.php
This may sound inefficient, but thanks to the phpBB caching system (which is local to each forum) the material overhead of this redirection is removed. (Though I still need to evaluate the image directory tree and update this section in the light of this evaluation.)
Hence the software configuration of the new forum versions will be a clean patched install. Only the forum content will be migrated. This content is largely contained in a set of forum-specific tables, though it also comprises a set of files in the cache, download, files, images and store directories. I have written a small shell script (as this runs on the old hardware which doesn't have Python installed) which can dump either all tables relating to a specific language forum or the subset of tables that contain the user and post data. I also have a second python script which converts this PostgreSQL V3.0.1 schema and dump into a MySQL V3.0.4 compatible load.
One facet of the migration is the process of Live Synchronisation. This exploits the natural structure of the phpBB data model, and specifically that the phpBB tables can be grouped into the following two categories. This enables the overlay of the current user and post data onto a previous migration:
- Tables which define the forum configuration. These largely define the structure and configuration forum, and are only changed when this configuration is changed. These tables can be migrated and refined during the pilot runs to finalise the new database layout. As long as the configuration is frozen during the migration period, e.g. no changes to the forum hierarchy, these tables don’t need to be updated. The one exception here is the forums table which contains some denormalised data elements (e.g. the number of topic in the forum) which need special treatment as discussed below.
- acl_groups, acl_options, acl_roles, acl_roles_data, bbcodes, bots, config, disallow, extension_groups, extensions, forums, forums_access, groups, icons, knowledge_base, lang, moderator_cache, modules, profile_fields, profile_fields_lang, profile_lang, ranks, search_wordlist, search_wordmatch, sitelist, smilies, styles, styles_imageset, styles_imageset_data, styles_template, styles_template_data, styles_theme, words, zebra
- Tables which contain post and user related data. These contain the user content that evolve over time as users register and post to topics. These need to be migrated again in the second phase and then copied over the target database.
- acl_users, attachments, banlist, bookmarks, confirm, drafts, forums_track, forums_watch, log, poll_options, poll_votes, posts, privmsgs, privmsgs_folder, privmsgs_rules, privmsgs_to, profile_fields_data, reports, reports_reasons, search_results, sessions, sessions_keys, topics, topics_posted, topics_track, topics_watch, user_group, users, warnings
By splitting the tables this way we can migrate the database, then take some days getting it configured correctly whilst the live (phpBB v3.0.1) database is being updated. We can then remigrate the post and user related data, so that you can tweak the rest of the configuration during testing in parallel to live use of the database. It therefore does not mirror groups, role based access to forums, etc. so if these are out of sync you will need to manually resynchronise them.
On a first load the full schema is loaded and all tables. On resynchronisation of post and user related data, only the tables in the second group are truncated and reinserted.
The forum table is denormalised to include information on total topic counts, details on the last post, etc. The merge script include SQL updates to update these fields from the topics table.
Migration Process
The migration will take place in three phases:
- The first involves building a complete forum set in a virtual LAMP appliance. It's just easier for me to do this on a LAN connected VM than on the target Solaris hardwar and zone in the Sun Labs in Germany over the internet. Then each of the current language databases will be snapped as a SQL backup into the forum backups directory, and then wget pulled to the LAMP appliance for loading and configuring into the test system. I will test and tune this LAMP configuration to validate the basic configuration.
- Once I have tested this configuration, I will then transfer it to the OOoTest system in Germany. Prior to the transfer I will truncate the tables in group (1) before exporting the database. I then dump the group (1) tables and refresh the test database to resync it to live. At this point the individual language forum leaders will be given Administrator access to each forum to validate functionality and configuration.
- Once the Administrators have confirmed the system configuration of their respective forums, we are then ready to go live. The cut over itself will be based on a process that I developed and tested in the VirtualBox migration. This is to bring the live forums offline, then backup the current user and post data and then overlay this onto the target OOoTest server. This involves truncating and replacing the content of the following tables from live. Once this import is complete the forums are available for go-live.
Forum Customisation
The delivered customised instances contains the following changes:
- Reorder window title to make multi-tab title more usable. All browser tabs are labelled with their title. The phpBB3 default is to begin the title with the forum name then page type, hence if you have a lot of tabs open they are called openOffi... By putting the specific title first, it makes it a lot easier to pick the correct tab.
- Add Expand option to Code Block. phpBB3 now puts each code block in a scrollable window. This makes it a pain if you want to see all of the code in the browser. This mod adds an Expand option to the code block to allow you to blow it up to full size.
- Tweak Access key codes to remove conflicts. For those that use keyboard accelerators within the broswer, the lack of a shortcut for Preview Post is a pain. This makes <Accel>+P a shortcut for Preview and <Accel>+G a shortcut for graphics.
- Add Active Post count to View Active Topics. This just adds a count of new posts in the Active topics heading.
- Add truncated title of last topic to Forum View. Lots of OOo volunteers asked for this one. It enables you to recognise at a glance if a forum has any new posts in it.
- Customise Headers and Footers to OpenOffice.org branding. Add OOo Community logo and adjust framing colours to match. Add OOo favicon.
- Rebasline to phpBB 3.0.4. The install is a patch file which updates the standard 3.0.4 unpacked tarball.
- Add OpenOffice.org Qs to BB FAQ. The FAQ button goes to a phpBB3 standard FAQ which is all about using phpBB3. I've added a Q0 which points the user to the wiki and forum FAQs for OpenOffice info.
- Add “Request to read Survival Guide before posting” to post form. The posting form now includes an instruction to read the FSG before you post. (This is only shown if you have less than 15 posts).
- Add custom fields do define users, main OS, installed OOo version and supplementary OSs. (This is actually a configuration of standard phpBB functionality.)