SCM Requirements

From Apache OpenOffice Wiki
Revision as of 14:11, 16 January 2007 by Hr (Talk | contribs)

The problem

CVS has been an invaluable SCM (Software Configuration Management) tool for the past 6 years for OpenOffice.org, but it's showing its age. There have been calls to replace it with a modern SCM solution. The stated reasons vary, but the following topics are mentioned most often:

Branching and tagging is an O(n) operation
The OpenOffice.org CWS (Child Workspace) development model relies on heavy use of branches and tags. CVS branching and tagging scales with the number of files affected and is so slow that it actually hinders development for a project the size of OpenOffice.org.
Versioned renaming of files and directories
History-preserving renaming or moving of files is awkward in CVS, and renaming directories is plainly impossible.
Proper handling of binary files
The CVS way to handle binary files is clumsy and prone to errors.
Atomic commits
CVS has no atomic commits, something which practically all modern SCMs have. Interestingly, I never felt that the missing atomicity is a problem for "commit" operations, but it is badly needed for tagging and branching. It's quite common that a tag run is interrupted (for example during "cwsadd"), and then the repository has to be cleaned up before the operation can be retried.
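To make the first and last points concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the classes `PerFileRepo` and `SnapshotRepo`, the tag names and the file paths are all hypothetical, and neither class reflects how CVS or any particular SCM is really implemented. It merely contrasts a tag that must be written into every file with a tag that is a single name for a whole tree.

```python
# Toy model only: hypothetical classes, not the implementation of CVS or any real SCM.

class PerFileRepo:
    """Every file carries its own tag table, so a tag touches every file."""

    def __init__(self, paths):
        self.revision = {path: "1.1" for path in paths}
        self.tags = {path: {} for path in paths}  # path -> {tag name: revision}

    def tag(self, name, fail_after=None):
        """Write the tag file by file; optionally simulate an interrupted run."""
        writes = 0
        for path in self.tags:
            if fail_after is not None and writes == fail_after:
                raise RuntimeError("tag run interrupted")
            self.tags[path][name] = self.revision[path]
            writes += 1
        return writes  # grows with the number of files


class SnapshotRepo:
    """A tag is one name pointing at an immutable snapshot of the whole tree."""

    def __init__(self, paths):
        self.head = tuple(sorted(paths))  # stands in for a tree identifier
        self.tags = {}

    def tag(self, name):
        self.tags[name] = self.head  # single write, all or nothing
        return 1


# 10 hypothetical modules with 100 files each
paths = [f"module{m}/file{f}.cxx" for m in range(10) for f in range(100)]

writes_per_file = PerFileRepo(paths).tag("MILESTONE_1")
writes_snapshot = SnapshotRepo(paths).tag("MILESTONE_1")

# An interrupted per-file tag run leaves the repository half tagged.
broken = PerFileRepo(paths)
try:
    broken.tag("MILESTONE_2", fail_after=250)
except RuntimeError:
    pass
partially_tagged = sum(1 for tags in broken.tags.values() if "MILESTONE_2" in tags)

print(writes_per_file, writes_snapshot, partially_tagged)
```

In this model the per-file tag needs one write per file and can stop halfway, which is the situation that requires cleanup before a retry; the snapshot tag is a single write regardless of tree size.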

Development model

Within the child workspace development model, all development is done in private branches. That way a change can be developed and QAed completely insulated from changes in the trunk. In the SCM literature this kind of branch is also called a "feature branch"; we extend that concept even to the vast majority of bug fixes. The CWS tools provide the means to update a branch to a newer version ("milestone") of the trunk, while release engineering is responsible for the integration (merging) of the branches into the trunk. There seems to be universal agreement that this is a good model for OpenOffice.org; no one wants to go back to the bad old days of everyone committing directly to the trunk or a release branch. We just need a tool that supports this kind of development model better than CVS does.

There is much more to the CWS development model than just the branches and their handling, but this is outside the scope of the SCM system.
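The branch lifecycle described above (create a branch from a milestone, resync it to newer milestones, integrate it into the trunk) can be sketched as a small Python model. This is an illustration only: the classes `Trunk` and `ChildWorkspace`, the milestone names `m1` to `m4` and the change ids are hypothetical, changes are modelled as plain set members, and conflicts are ignored entirely. It does not describe the real CWS tools.

```python
# Hypothetical model of the branch lifecycle; not the real CWS tooling.

class Trunk:
    def __init__(self):
        self.head = frozenset()   # all changes integrated so far
        self.milestones = {}      # milestone name -> frozenset of change ids

    def commit_milestone(self, name, new_changes=()):
        """Integrate changes and record the result as a named milestone."""
        self.head = self.head | frozenset(new_changes)
        self.milestones[name] = self.head
        return self.head


class ChildWorkspace:
    def __init__(self, name, trunk, milestone):
        self.name = name
        self.trunk = trunk
        self.base = milestone     # milestone the branch is currently based on
        self.changes = set()      # work done privately on the branch

    def commit(self, change_id):
        self.changes.add(change_id)

    def resync(self, milestone):
        """Merge only what the trunk gained since the current base, then move the base."""
        delta = self.trunk.milestones[milestone] - self.trunk.milestones[self.base]
        self.base = milestone
        return delta

    def contents(self):
        return self.trunk.milestones[self.base] | self.changes


trunk = Trunk()
trunk.commit_milestone("m1", {"t1"})

cws = ChildWorkspace("fix01", trunk, "m1")   # branch created from milestone m1
cws.commit("c1")                             # insulated from the trunk

trunk.commit_milestone("m2", {"t2"})
first = cws.resync("m2")                     # only t2 is merged

trunk.commit_milestone("m3", {"t3"})
second = cws.resync("m3")                    # only t3 is merged, t2 is not merged again

trunk.commit_milestone("m4", cws.changes)    # integration by release engineering
```

The point of recording the base milestone is that a repeated resync merges each trunk change exactly once, no matter how long the branch lives.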

Requirements for the next OpenOffice.org SCM system

We want an SCM tool which fits our development model, not the other way around. With that in mind, we arrive at the following preliminary list of requirements:

Mandatory requirements

  1. The repository format must be stable enough to support a code base the size of OpenOffice.org (this is self-evident).
  2. The new SCM should support, in a reasonable way, the subset of CVS which is used in everyday work, such as "status", "diff", "annotate", "log" etc. I'm certain that every modern SCM does this; CVS sets the lower bar here.
  3. The general operation of the SCM should not be significantly slower than CVS, at least for the important operations: "commit", "diff", "log" etc. If some seldom-used operations like "annotate" or "history" are slower, then this is probably not much of a problem.
  4. The SCM tool must easily support the concept of branches.
    1. Branch creation must be lightweight. We create branches even for one-line fixes in single files (bugfix CWS).
    2. It must be possible to repeatedly update (resync) a branch to a newer version (milestone) of the trunk or a release branch. We create branches that live for many months with constant work on them (huge feature CWS).
    3. If the update (resync) operation is expensive (in terms of merge time and repository size), then branching and resyncing must be possible on a subset of the tree, let's say only on a number of modules.
  5. The SCM must easily support the concept of tags.
    1. Tag creation must be lightweight. We regularly create new milestones which need to be tagged (milestone tags). If the repeated update mechanism for the branches requires tagging to prevent multiple merging, as it does with CVS, then this is even more important (anchor tags).
  6. There must be an easy way to share changes on a branch even before the branch is ready for integration. There is a need to do cross merging between branches from time to time. Usually it is not a complete changeset that is cross merged, but just a few selected pieces.
    1. With a centralized SCM this requirement is inherent, but for a distributed SCM this requires the setup of

Things we'll consider strongly in favor of a new SCM
