Mercurial Pilot

From Apache OpenOffice Wiki
Revision as of 06:10, 1 October 2009 by Os (Talk | contribs)

Caution: The Mercurial pilot has been successfully concluded, and we are now in the implementation phase. The information presented here is still useful and will be part of the upcoming OOo Mercurial documentation.

Purpose

To find out whether our DSCM tool of choice can withstand the harsh realities of the OOo development process. The OOo code base is huge, and we want to know about any scalability and usability issues before we commit ourselves to a (hopefully) very long-lasting SCM tool for OpenOffice.org. After all, SCM migrations are no fun.

Timeline

The OOo Mercurial pilot is expected to last at least two months. This is necessary to ensure that some "old enough" hg-hosted child workspaces exist, so that we can check how well updating/re-synchronizing/re-basing works, something which is considerably painful with SVN. If the pilot doesn't expose any hidden Mercurial problems, we will switch over completely to Mercurial as soon as the necessary infrastructure for full-scale hg usage is in place. If substantial problems surface during the pilot, we might extend the pilot time frame to see whether they can be overcome or, in an extreme case, even set up a new pilot with git or bazaar.

The details

  • The pilot will only cover the main development code line (aka trunk in SVN).
  • The master repository can be found here: http://hg.services.openoffice.org/hg/DEV300. This repository is updated hourly from the SVN trunk.
  • After a CWS is finished and QAed, pull/merge/push one final time if conflicts are expected, so that the REs get clean patches.
  • RE will pull from your hg-hosted CWS, export your changes and apply them as a patch to the SVN trunk.

What the pilot covers

  • The complete CWS life cycle from a developer's point of view. This includes cloning, committing, pulling, merging, re-basing and transplanting changesets from other CWSs.
  • Parts of the CWS handling for RE: pulling from developers, merging.

What is not covered

  • Transplanting changesets from a release code line to the main development line. RE will set up separate tests for this.
  • Integration is done via SVN, so pushing to the master repository on the remote server by REs is not tested. This is not a problem: the master repository differs from developer repositories only by convention.

Caveats

Since every change will still have to be integrated via SVN there are a few caveats which you will need to consider if you plan to participate in the pilot:

  • Your changesets will lose their identity during integration. This might create problems if you cross-merged changesets between several CWSs.
  • Integration will lump all changesets of your CWS into one single changeset.
  • The single integration changeset will of course contain only a single commit log. If possible I'll lump all commit logs of your changesets together via scripting. Alternatively, REs will accept a detailed commit log supplied by you.
  • The author of the hg changesets is lost as well, because an RE will do the integration commit to SVN. We include the original author info in the commit log.
  • Be careful with renaming files. If you do a lot of renaming you'll most probably put a lot of stress on the integrator. Remember, RE integrates your CWS via diff and patch. Also, all restrictions of SVN regarding the renaming of files/directories still apply. If you plan to do a lot of renaming, please do it on an SVN-based repository. Or better, if possible, wait for the final switch to hg.
  • Tinderboxes, buildbots etc. will have to be adapted to Mercurial; this will take some time.

I'm an OOo domain developer, what to do if I want to participate?

The "outgoing" repositories on hg.services.openoffice.org will be created automatically for all child workspaces which are:

  • in state 'new' or later
  • flagged for Mercurial

The Mercurial flag can be set either via the EIS web interface or, starting with DEV300 m57, with the "--hg" switch to "cws create" (see example). If you are an OOo domain developer you can use your SVN public key to access the outgoing repositories via SSH. It's not mandatory that you use this server to publish your changes, but at integration time at the latest the RE will naturally need to access your repository somewhere. It is planned to offer this server as the central publishing point for OOo CWSs to ease development coordination. An overview of active hg-hosted CWSs can be found here: http://hg.services.openoffice.org/hg/cws.

Example

Let's say you want to participate in the pilot with your new CWS 'mycws', based on DEV300 m58. The steps are as follows (see OOo and Mercurial for general Mercurial workflow info):

  • clone the repository from the master server into a pristine local copy (best w/o working tree)
$ hg clone -U -r DEV300_m58 http://hg.services.openoffice.org/hg/DEV300 local_DEV300

Why use '-r DEV300_m58' here? During the pilot the master repository is updated hourly from /trunk of the SVN server. Since /trunk may not be buildable at all times (for instance during integrations), the tip of the master hg repository may not be buildable either. This will change after the switch to Mercurial: RE will then push only complete milestones to the repository, so you can safely use its tip.

Why use a pristine intermediate local repository? The current repository size is ~1.3 GB, and cloning from the server can take from about 50 minutes up to several hours if you are unlucky. You want to do this only once.

  • clone into your cws
$ hg clone local_DEV300 mycws

Note: don't use '-r DEV300_m58' here. If you followed the steps above, you know that the tip of repository local_DEV300 is at milestone DEV300_m58. Without '-r', hg can use optimizations like hard-linking the repository if both repositories are on the same disk. Besides, you would run into the one unintuitive bugfeature I have seen so far in Mercurial: if the clone is created with -r <tag>, the clone will contain everything up to the tag, excluding the tag itself.

  • configure, etc
  • register your CWS with EIS
$ cws create --hg -m m58 DEV300 mycws
  • A server-side clone of the master repository will be created automatically to serve as your outgoing repository. This saves time (270000 fewer changesets to push over the line) and server disk space (hard links).
  • hack away in your CWS ...
  • push your changes to the outgoing repository for publishing whenever you feel like it
$ hg push ssh://hg@hg.services.openoffice.org/cws/mycws
  • pull/merge from the repository to re-synchronize your CWS from a new milestone, say DEV300_m62
$ hg pull -r DEV300_m62 http://hg.services.openoffice.org/hg/DEV300
$ hg merge

Note: if you haven't published your repository yet, 'hg rebase' is an option to consider.

  • QA, fixes, etc.
  • final pull/merge/push cycle to ease the integration into SVN trunk.
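
The steps above can be condensed into a few shell helpers. This is only a minimal sketch and not part of the OOo tooling: the function names new_cws, resync_cws and publish_cws and the HG variable are hypothetical. The explicit 'hg commit' after the merge is an addition, since Mercurial records a merge only once it has been committed. The sketch also assumes that an already existing pristine clone sits at the milestone you want.

```shell
#!/bin/sh
# Hypothetical helpers (names are assumptions, not OOo tools).
# Override HG, e.g. HG="echo hg", to print commands instead of running them.
HG=${HG:-hg}
MASTER=http://hg.services.openoffice.org/hg/DEV300
PRISTINE=local_DEV300

# new_cws MILESTONE_TAG CWS_NAME
new_cws() {
    # Clone from the server only once; later CWSs reuse the pristine copy.
    if [ ! -d "$PRISTINE" ]; then
        $HG clone -U -r "$1" "$MASTER" "$PRISTINE" || return 1
    fi
    # No -r here, so hg can hard-link the store on the same disk.
    $HG clone "$PRISTINE" "$2"
}

# resync_cws MILESTONE_TAG   (run inside the CWS working copy)
resync_cws() {
    $HG pull -r "$1" "$MASTER" || return 1
    # On conflicts hg merge exits non-zero; resolve by hand, then commit.
    $HG merge || return 1
    $HG commit -m "Resync to $1"
}

# publish_cws CWS_NAME   (run inside the CWS working copy)
publish_cws() {
    $HG push "ssh://hg@hg.services.openoffice.org/cws/$1"
}
```

With HG="echo hg" set, a call such as "new_cws DEV300_m58 mycws" only prints the hg commands it would run, which is a cheap way to check the sequence before touching a 1.3 GB repository.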

Mercurial documentation and other resources

Mercurial extensions you might want to have a look at

  • hg rebase: git style rebasing instead of pull/merge
  • hg transplant: cherry picking of changesets from other CWSs
  • hg win32text: if you work on Windows
  • hg purge: purge your tree from untracked items
  • hg mq: quilt style patch queue, also enables the 'hg strip' command
  • hg collapse: collapses a sequence of commits into one
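
Extensions are switched on in the [extensions] section of your hgrc. The following is a minimal sketch with a hypothetical helper name (enable_hg_extensions) that appends such a section to a file. It assumes that rebase, transplant, purge, mq and win32text ship with your Mercurial installation, and that collapse is a separately downloaded extension whose path (a placeholder here) you have to fill in yourself.

```shell
#!/bin/sh
# Hypothetical helper: append an [extensions] section to an hgrc file.
# Usage: enable_hg_extensions [FILE]   (default: ~/.hgrc)
enable_hg_extensions() {
    hgrc=${1:-$HOME/.hgrc}
    cat >> "$hgrc" <<'EOF'
[extensions]
# Assumed to be bundled with Mercurial; an empty value enables them.
rebase =
transplant =
purge =
mq =
# Only useful on Windows:
# win32text =
# Assumed not to be bundled; placeholder path to a downloaded copy:
# collapse = /path/to/collapse.py
EOF
}
```

Check the file by hand before running this if it already contains an [extensions] section, so that you don't end up with duplicate entries.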

Things that may surprise former CVS or SVN users

  • hg clone -r <tag> <from_rep> <to_rep> will clone everything up to the tagged revision, excluding the tag itself. See explanation above.
  • hg resolve does a very different thing than svn resolve. Read the man page.
  • Many hg commands work on the whole repository if no path is specified, no matter where your current working directory is inside the source tree. Example: "cd DEV300/sw/source; hg status" will compute the status of the whole source tree (which might take a minute depending on IO bandwidth). Use "cd DEV300/sw/source; hg status ." to restrict the command to the current directory.

Tips and Tricks

  • After pulling/merging:
    • "hg log --follow-first" is a convenient way to display your CWS changes at the top of the log.
    • "hg log --follow-first -P <original_clone_milestone>" will show only the CWS changesets.
    • "hg outgoing" works as well, of course.
  • Combined diff which contains exactly the changes of your CWS without anything pulled from master
    • If your current milestone tag is locally available: hg diff -r <current_milestone_tag>
    • If your current milestone tag is *not* locally available (might happen, see above):
      • search for your last pull/merge from the master with "hg log -m"
      • "hg diff -r <second_parent_of_last_master_merge>"
    • Alternatively, if you just did a pull/merge from master: "hg export --switch-parent tip"
  • The default output format of some commands does not suit you? Use styles:
    • "hg log --style=compact"
    • "hg outgoing --style=changelog"
  • Or create your own format with the template engine:
    • "hg outgoing --template '{date|shortdate} {author|person} {desc}\n' --newest-first"
  • Accessing a repository over ssh using TortoiseHg on Windows
    • If you want to use the ssh client of your Cygwin shell (and also ssh-agent), add the following to the [ui] section of your mercurial.ini: ssh = ssh
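
The combined-diff tip above can be wrapped in a small helper that first checks whether the milestone tag is known locally. This is a sketch under assumptions: the function name cws_diff and the HG variable are hypothetical, and the fallback only prints a hint instead of finding the merge parent for you.

```shell
#!/bin/sh
# Hypothetical helper; override HG with a stub for a dry run.
HG=${HG:-hg}

# cws_diff MILESTONE_TAG   (run inside the CWS working copy)
# Prints the diff between the milestone and the working copy.
cws_diff() {
    tag=$1
    # 'hg log -r REV' exits non-zero if REV is unknown in this repository.
    if $HG log -r "$tag" >/dev/null 2>&1; then
        $HG diff -r "$tag"
    else
        echo "tag '$tag' is not available locally;" \
             "look up the last master merge with: hg log -m" >&2
        return 1
    fi
}
```

Called as "cws_diff DEV300_m62 > mycws.diff", it writes the combined diff to a file, or fails with a hint if the tag is missing from the local clone.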

Mercurial on Windows/cygwin (experimental)

Using hg from cygwin sometimes fails miserably. TortoiseHg (http://bitbucket.org/tortoisehg/stable/wiki/Home) is worth a try.

To get rid of Cygwin's hg, rename /usr/bin/hg.

You can decide which ssh support you use in the Cygwin shell. If you want to use ssh-agent, you have to specify the following line in the [ui] section of ~/.hgrc: ssh = <Windows path to cygwin>\bin\ssh.exe

For the graphical interface of TortoiseHg, PuTTY's Pageant is the default ssh key provider. TortoiseHg allows easy graphical access to hg repositories from within Windows Explorer.
