Education ClassRoom/Previous Logs/OOo svn migration
On October 8 2008, Heiner Rechtien (Sun) presented the OpenOffice.org migration to Subversion.
Éric Bachard opened the classroom at 17:00, Paris time.
Éric Bachard
Thanks a lot for agreeing to be here :)
Heiner Rechtien
Thank you for the invitation Eric.
I'm going to say a bit about the OOo Subversion migration. I've prepared a (very) few slides and linked them to the education page in the Wiki.
Éric Bachard
The link : http://tools.openoffice.org/scm_migration/subversion_migration.pdf
Heiner Rechtien
We had a long discussion about what to use, a DSCM or Subversion. We evaluated a number of DSCMs and found them all lacking in one aspect or another, so we finally chose Subversion, but we'll keep our eyes open and reevaluate the DSCMs from time to time.
1st slide
The CVS server on CN is not exactly known as the fastest one, so we made sure that our new server is fast. It should never be a bottleneck. We hope that the 4 CPUs, 8 cores and 64 GB will suffice for a while. The network connectivity is also quite good, but of course we can't really control it. The server is located in Las Vegas.
Over time we'll probably add a few services on that machine; the first one will be OpenGrok, a fast indexer and LXR replacement.
We've got a backup server in case something goes seriously wrong. We hope that we can achieve pretty good reliability with this setup.
2nd slide
One of the critical features we need from an SCM system is merge tracking. This is where the DSCMs (Mercurial, Git, Bazaar) really shine, because merge tracking is inherent in them. Subversion introduced merge tracking with version 1.5, so we really need to use 1.5 SVN clients. It is best to always use the latest one (1.5.2 at the moment).
Branching and tagging are implemented in Subversion as "paths" into the repository, so the repository layout deserves a few words. All "paths" in SVN are accessed via a URL. For instance, the trunk or head revision can be accessed via svn://svn.services.openoffice.org/ooo/trunk. When we create a milestone, we do an svn copy operation from trunk to tags, i.e.:
svn copy <...>/ooo/trunk <...>/tags/DEV300_m32
This copy is a "copy-on-write" operation, so it takes only a tiny bit of space in the repository. Branches are created with the same command, i.e.:
svn copy <...>/ooo/trunk <...>/branches/OOO310
So, if you are looking for a milestone, check the .../tags path. For a (major) branch it's .../branches. And for a CWS, you guessed it, .../cws. There are a few other paths in there, like .../contrib, .../vendor and .../patches, which are for stuff that isn't really meant to go into the main development line.
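The layout described above can be sketched as a small URL helper. This is only an illustration: the host and the /ooo prefix come from the trunk URL quoted in the talk, the function name ooo_url is made up for this example, and placing branches directly under /ooo (like tags and cws) is an assumption.

```shell
#!/bin/sh
# Sketch only: builds repository URLs for the layout described in the talk.
# Assumption: branches live under /ooo next to trunk, tags and cws.
OOO_REPO="svn://svn.services.openoffice.org/ooo"

# ooo_url KIND [NAME] -> prints the URL of trunk, a milestone, a branch or a CWS.
# (ooo_url is a hypothetical helper, not part of the CWS tooling.)
ooo_url() {
    case "$1" in
        trunk)     printf '%s/trunk\n' "$OOO_REPO" ;;
        milestone) printf '%s/tags/%s\n' "$OOO_REPO" "$2" ;;
        branch)    printf '%s/branches/%s\n' "$OOO_REPO" "$2" ;;
        cws)       printf '%s/cws/%s\n' "$OOO_REPO" "$2" ;;
        *)         echo "unknown kind: $1" >&2; return 1 ;;
    esac
}

# Print the URLs for the examples used in the talk; nothing is contacted.
ooo_url trunk
ooo_url milestone DEV300_m32
ooo_url branch OOO310
ooo_url cws mycws
```

Each printed URL can be passed to svn checkout or svn list; the helper itself never talks to the server.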
We do not employ path-based access restrictions. If you have write access to the repository you can change things everywhere, but we think it's good enough to rely on a few conventions. We'll see if this works.
3rd slide
You can access the repository via the svn://, http:// and svn+ssh:// methods. The first two are read-only; the last one is the read/write access method. You'll need to provide a public key for write access. svn+ssh:// is basically the SVN protocol over an SSH tunnel.
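As a rough illustration of the access methods, the sketch below picks a repository root by the kind of access needed. The svn@ user in the read/write URL follows the svn+ssh URL quoted later in the session; the function name repo_root and the working copy directory ooo are assumptions for this example. The http:// variant is left out because its path is not given in the talk.

```shell
#!/bin/sh
# Sketch only: choose a repository root by the kind of access needed.
OOO_HOST="svn.services.openoffice.org"

# repo_root MODE -> MODE is "ro" (read-only) or "rw" (read/write, needs a public key).
# (repo_root is a hypothetical helper name.)
repo_root() {
    case "$1" in
        ro) printf 'svn://%s/ooo\n' "$OOO_HOST" ;;
        rw) printf 'svn+ssh://svn@%s/ooo\n' "$OOO_HOST" ;;
        *)  echo "usage: repo_root ro|rw" >&2; return 1 ;;
    esac
}

# Print (not run) a trunk checkout for each access method.
echo "read-only:  svn checkout $(repo_root ro)/trunk ooo"
echo "read/write: svn checkout $(repo_root rw)/trunk ooo"
```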
4th slide
We had to modify the CWS tooling, of course, because it was full of CVS-specific things. Now there is just one script, which replaces the many cwsxxx tools we had. cws create is for the creation of CWSs: it creates the CWS branch in the repository and registers the CWS with EIS. With cws fetch you can check out your CWS; alternatively you can also just use svn checkout <...>/ooo/cws/mycws, of course. cws rebase is the replacement for cwsresync and makes use of the Subversion merge tracking feature.
5th slide
The other cws commands are there as well.
6th slide
Finally I want to say a few things about what exactly has been migrated. Naively one would think that a migration involves the conversion of the whole project history. This is not really possible, because it would take almost forever for a project the size of OOo. I broke off my trials after about a month or so of conversion time :) Next, one could think of a full migration minus all the "finished" CWS branches (5000 or so). This is feasible, but it still takes a week or so and results in a 90 GB repository.
Far easier is to migrate only the trunk history, which means that no historic branch or tag will be in the SVN history. Also, only active files are migrated, and only the last revision of binary files.
And the most radical approach is of course to migrate no history at all. To make a long story short, we chose the "trunk only" approach. The main reason for this is that it didn't require any downtime for the developers.
The next slides contain a discussion about the pros and cons of the migration methods, but I think we should break here for a Q&A session.
Q&A session.
<chacha_chaudhry>
So the CVS server will remain up?
Heiner Rechtien
Yes. All old branches will be maintained via CVS.
Éric Bachard
So the history is safe?
Heiner Rechtien
Yes. There will always be at least a read-only server.
<chacha_chaudhry>
Just curious, how many active CWSs are there now?
Heiner Rechtien
I think there were some 700 or so. The question is, how many of them are really active.
Éric Bachard
...or waiting for QA resp :) I have several CWSs in that situation.
[17:00] <ericb2> blauwal: let's start ?
[17:00] <blauwal> ericb2: yes, let's start
[17:00] <ericb2> blauwal: thanks a lot for accepting to be there :)
[17:00] <blauwal> Thank you for the invitation Eric
[17:00] <ericb2> blauwal: you're welcome !
[17:00] * fardad is all ears (eyes)!
[17:01] <blauwal> I'm going to say a bit about the OOo subversion migration
[17:01] * ronyf (n=chatzill@abt-wi-018.wu-wien.ac.at) has joined #education.openoffice.org
[17:01] <blauwal> I've prepared a (very) few slides and linked them to the education page in the Wiki
[17:01] * chacha_chaudhry (n=dev@gnu-india/supporter/rakeshpandit) has joined #education.openoffice.org
[17:01] * ChanServ gives channel operator status to chacha_chaudhry
[17:02] <blauwal> I'm a lousy typist, so please bear with me
[17:02] <ericb2> blauwal: :)
[17:02] <blauwal> CVS is aging as you all know and we got a lot of pressure to work with something better
[17:03] <ericb2> The link : http://tools.openoffice.org/scm_migration/subversion_migration.pdf
[17:03] <blauwal> We had a long discussion about what to use, a DSCM or Subversion
[17:03] <blauwal> We evaluated a number of DSCMs and found them all lacking in one aspect or another, so we finally chose SVN
[17:04] <blauwal> but we'll keep the eyes open and reevaluate the DSCMs from time to time
[17:04] <blauwal> 1) slide
[17:05] <blauwal> The CVS server on CN is not exactly known as the fastest one so we made sure that our new server is fast.
[17:05] <blauwal> It should never be a bottleneck.
[17:06] <blauwal> We hope that the 4 CPU 8 cores 64 GB should suffice for a time
[17:06] <blauwal> The network connectivity is also quite good, but can't be really controlled by us, of course. The server is located in Las vegas
[17:07] <blauwal> Over time we'll probably add a few services on that machine, the first one will be OpenGrok, a fast indexer and LXR replacement
[17:08] <blauwal> We've got a backup server in case something goes seriously wrong. We hope that we can achieve a pretty good reliance with this setup.
[17:08] <blauwal> next slide ...
[17:09] <blauwal> One of the critical features we need from an SCM system is merge tracking. This is where the DSCMs (mercurial, git, bazaar) really shine, because merge tracking is inherent in them
[17:10] * rtimm (n=Ruediger@sd-socks-197.staroffice.de) has joined #education.openoffice.org
[17:10] <blauwal> Subversion introduced merge tracking with Subversion 1.5, so we really need to use 1.5 SVN clients
[17:10] <blauwal> Best is to always use the latest one (1.5.2 at the moment)
[17:11] * humph (n=dave@cdot.senecac.on.ca) has joined #education.openoffice.org
[17:11] <blauwal> Branching and tagging are implemented in Subversion as "paths" into the repository. Thus the repository layout deserves a few words.
[17:13] <blauwal> All "paths" in SVN are accessed via an URL. For instance the trunk or head revision can be accessed via sv://svn.services.openoffice.org/ooo/trunk
[17:13] <blauwal> s/sv/svn
[17:13] <blauwal> when we create a milestone, then we do an svn copy operation from trunk to tags, i.e.
[17:14] <blauwal> svn copy <...>/ooo/trunk <...>/tags/DEV300_m32
[17:14] <blauwal> this copy is a "copy-on-write" operation, it takes only a tiny bit of space in the repository
[17:15] <blauwal> branches are created with the same command, ie
[17:15] <blauwal> svn copy <...>/ooo/trunk <...>/branches/OOO310
[17:16] <blauwal> so, if you are looking for a milestone, check the .../tags path
[17:16] <blauwal> for a (major) branch it's .../branches
[17:16] <blauwal> and for a CWS, you guessed it, .../cws
[17:17] <blauwal> there are a few other paths in there like .../contrib .../vendor .../patches which are for stuff which isn't really meant to go in the main development line
[17:18] <blauwal> We do not employ path-based access restrictions. If you have write access to the repository you can change things everywhere, but we think it's good enough to use just some conventions
[17:19] <blauwal> we'll see if this works.
[17:19] <blauwal> next slide ...
[17:20] <blauwal> You can access the repository via svn://, http:// and svn+ssh:// methods, the first two are read-only, the last one is the read/write access method. You'll need to provide a public
[17:20] <blauwal> key for write access
[17:21] <blauwal> svn+ssh:// is basically a svn protocol over a ssh tunnel
[17:21] <blauwal> next slide
[17:21] <blauwal> We had to modify the CWS tooling of course ...
[17:21] <blauwal> because it was ingrained with CVS special things
[17:22] * riddle28 (i=ca9929ab@gateway/web/ajax/mibbit.com/x-fa6fef1e35ff3f4d) has joined #education.openoffice.org
[17:22] <blauwal> Now there is just one script which replaces the number of cwsxxx tools we had
[17:23] <blauwal> "cws create" is for the creation of CWSs, it creates the CWS branch in the repository and registers a CWS with EIS
[17:24] <blauwal> With "cws fetch" you can check out your CWS, alternatively you can also just use "svn checkout <.../ooo/cws/mycws" of course
[17:24] <blauwal> "cws rebase" is the replacement for cwsresync and makes use of the subversion mergetracking feature.
[17:24] <blauwal> next slide
[17:25] <blauwal> the other cws commands are there as well
[17:25] <blauwal> next slide
[17:25] * rtimm (n=Ruediger@sd-socks-197.staroffice.de) has left #education.openoffice.org
[17:25] <blauwal> finally I want to say a few things about what exactly has been migrated.
[17:26] <blauwal> Naively one would think that a migration involves the conversion of the whole project history
[17:27] <blauwal> This is not really possible, because it would take almost forever for a project of the size of OOo.
[17:27] <blauwal> I broke off my trials after about a month or so of conversion time :)
[17:28] <blauwal> Next one could think of a full migration minus all the "finished" CWS branches (5000 or so). This is feasible but still takes a week or so and results in a 90 MBytes repository
[17:29] <blauwal> sorry 90 GB repository
[17:30] <blauwal> Way easier is to migrate only the trunk history, that means no historic branch, or tag will be in the SVN history. Also, only active files are migrated and only the last revision of binary files
[17:30] <blauwal> And the most radical approach is of course to migrate no history at all.
[17:31] <blauwal> To make a long story short, we chose the "trunk only" approach. The main reason for this is that it didn't require any downtime for the developers.
[17:32] <blauwal> the next slides contain a discussion about the pros and cons of the migration methods, but I think we should break here for a Q&A session
[17:32] <ericb2> blauwal: ok
[17:33] <blauwal> OK keep the questions coming :)
[17:33] <chacha_chaudhry> blauwal: so CVS server will remain up ?
[17:33] <blauwal> yes.
[17:34] <blauwal> All old branches will be maintained via CVS
[17:34] <ericb2> blauwal: so the history is safe ?
[17:34] <chacha_chaudhry> blauwal: Just curious, how many active CWS are now ?
[17:34] <blauwal> Yes. There will always be at least a read-only server
[17:35] <blauwal> I think there were some 700 or so
[17:35] <ericb2> uff
[17:35] <blauwal> The question is, how many of them are really active
[17:35] <ericb2> blauwal: or waiting for QA resp :)
[17:35] <blauwal> :)
[17:35] * ericb2 has several cws in this case
[17:36] <chacha_chaudhry> It would have been real pain with 90 GB repo. :)
[17:36] <blauwal> you have to see the number in comparison to the number of closed CWS: >5000
[17:36] <blauwal> yes, hard to sync
[17:36] <blauwal> the trunk only solution resulted in a repository of about 6 GB
[17:37] <ericb2> blauwal: how can I checkout a given milestone and complete with one or several cws ?
[17:37] <chacha_chaudhry> Nice, I made a mistake, creating recently a cws with CVS, how will I have to migrate: I mean some quick basic steps?
[17:38] <ericb2> chacha_chaudhry: sorry
[17:38] <blauwal> ericb2: svn checkout svn://svn.services.openoffice.org/ooo/tags/DEV300_m32 for instance
[17:38] <chacha_chaudhry> ericb2: no problem.
[17:38] <blauwal> chacha_chaudhry: I wrote a migration guide
[17:38] <chacha_chaudhry> blauwal: Nice
[17:38] <blauwal> You can find it on the migration wiki page
[17:38] <chacha_chaudhry> blauwal: okay :)
[17:39] <ericb2> chacha_chaudhry: http://wiki.services.openoffice.org/wiki/OOo_and_Subversion
[17:39] <blauwal> basically: create a diff and apply it to a subversion branch
[17:39] <chacha_chaudhry> ericb2: Thanks
[17:40] <blauwal> If you have a checked out tree, you can use the "svn switch" command for a substantial speedup
[17:40] <blauwal> for switching between milestones or CWSs for example
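The svn switch tip can be sketched as follows. This only prints the command lines; the helper name switch_cmd and the working copy directory ooo are assumptions for illustration, while the host, the tags and cws paths, and the names DEV300_m32 and mycws are taken from the session.

```shell
#!/bin/sh
# Sketch only: print the command that repoints an existing working copy
# at a milestone or a CWS instead of doing a fresh checkout.
OOO_REPO="svn://svn.services.openoffice.org/ooo"

# switch_cmd KIND NAME [WC] -> prints an "svn switch URL PATH" command line.
# (switch_cmd is a hypothetical helper; WC defaults to the current directory.)
switch_cmd() {
    sw_wc="${3:-.}"
    case "$1" in
        milestone) echo "svn switch $OOO_REPO/tags/$2 $sw_wc" ;;
        cws)       echo "svn switch $OOO_REPO/cws/$2 $sw_wc" ;;
        *)         echo "usage: switch_cmd milestone|cws NAME [WC]" >&2; return 1 ;;
    esac
}

switch_cmd milestone DEV300_m32 ooo
switch_cmd cws mycws ooo
```

Because svn switch only transfers the differences between the two repository paths, it is usually much faster than checking out a second tree.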
[17:40] <chacha_chaudhry> blauwal: So, annotating a file will only give us details from m32 onwards, and in case we really want to peek further back than that, the browser will be best ?
[17:40] <chacha_chaudhry> blauwal: okay
[17:40] <blauwal> chacha_chaudhry: no, the trunk history is still there
[17:41] <chacha_chaudhry> blauwal: aah, okay
[17:41] <blauwal> back from 2000 :)
[17:41] <chacha_chaudhry> :)
[17:41] <ericb2> blauwal: and when there are new files , or files on HEAD (like apple_remote new module), are they already in the trunk ?
[17:41] <blauwal> ericb2: it depends:
[17:42] <blauwal> ericb2: everything which was in a DEV300 m31 build (officially) has been migrated
[17:42] <blauwal> ericb2: new modules which have been under development probably not
[17:43] <ericb2> blauwal: so, I'll have to add the files one by one ? FYI, I just resync'ed most of my cws's with m31. (Are they counted in ?)
[17:43] <blauwal> ericb2: in that case, just add them to your CWS in subversion and they will finally go into trunk on migration day
[17:43] <ericb2> blauwal: ok
[17:43] <blauwal> you can add them recursively in SVN
[17:44] <blauwal> kinda similar to cvs import on a subpath
[17:44] <ericb2> blauwal: maybe I'll ask for help
[17:44] <ericb2> blauwal: I fear to do mistakes
[17:44] <blauwal> no problem, just contact me or other REs when you are ready
[17:45] <ericb2> blauwal: thanks !
[17:45] * ericb2 noticed
[17:46] <chacha_chaudhry> blauwal: may you explain bit about merge tracking, how it will help us as compared to CVS, just curious ?
[17:46] <blauwal> You might remember that with CVS we had to do it by hand
[17:47] <blauwal> We worked with the branch tag and a so called anchor tag
[17:47] <chacha_chaudhry> blauwal: okay
[17:47] <chacha_chaudhry> blauwal: yeah
[17:48] <blauwal> The anchor tag is needed for preventing the dreaded repeated merge syndrome
[17:48] <blauwal> If you rebase a CWS it means that you merge newer content from trunk into your CWS
[17:48] <chacha_chaudhry> okay
[17:49] <blauwal> This means your branch is no longer pure ... it contains changes by others.
[17:49] <ericb2> blauwal: are there possible conflicts ?
[17:49] <blauwal> The anchor tag tracks what has been merged from trunk ... so that you can still do a pure diff
[17:49] <blauwal> ericb2: yes
[17:49] <chacha_chaudhry> blauwal: okay
[17:49] <ericb2> blauwal: and how to solve them ? Manually like with cvs ?
[17:50] <blauwal> chacha_chaudhry: merge tracking in subversion does the same ... just automatically :)
[17:50] <chacha_chaudhry> blauwal: Nice :)
[17:50] <blauwal> ericb2: conflicts need to be solved manually, this has not changed
[17:50] <ericb2> blauwal: ok
[17:51] <blauwal> ericb2: well no tool can do that :) but now there is a "svn resolve" command
[17:51] <ericb2> blauwal: ahh . .but when is the dev informed about the existing conflict ? When doing rebase ?
[17:51] <blauwal> After svn has noticed a conflict in your working copy you have to explicitly resolve it. Otherwise you can't commit
[17:52] <ericb2> blauwal: ok. Force clean merge
[17:52] <blauwal> ericb2: Yes. rebasing is nothing other than doing: svn merge svn+ssh:/..../ooo/trunk
[17:52] <blauwal> to your working copy
[17:53] <blauwal> you solve the conflicts, and then commit the result of the merge
[17:55] <chacha_chaudhry> ericb2: some time one want to keep his parts only in every conflict, without looking. I used to run a emacs macro for that. Is Force clean merge same ?
[17:55] <blauwal> svn resolve lets you choose what you do: keep theirs, keep mine or keep modified
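The three choices mentioned here can be mapped onto the --accept option of svn resolve (available since Subversion 1.5). The mapping of "keep modified" to --accept working is an interpretation, and the helper name resolve_cmd and the file path are placeholders for this sketch.

```shell
#!/bin/sh
# Sketch only: translate "keep theirs / keep mine / keep modified" into
# an "svn resolve --accept" command line. Nothing is executed.

# resolve_cmd CHOICE FILE -> prints the matching svn resolve command.
# (resolve_cmd is a hypothetical helper name.)
resolve_cmd() {
    case "$1" in
        theirs)   rc_accept="theirs-full" ;;  # take the incoming version
        mine)     rc_accept="mine-full" ;;    # take my pre-merge version
        modified) rc_accept="working" ;;      # take the file as edited by hand
        *) echo "usage: resolve_cmd theirs|mine|modified FILE" >&2; return 1 ;;
    esac
    echo "svn resolve --accept $rc_accept $2"
}

# path/to/file.cxx is a placeholder, not a real file in the repository.
resolve_cmd theirs path/to/file.cxx
resolve_cmd mine path/to/file.cxx
resolve_cmd modified path/to/file.cxx
```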
[17:55] <ericb2> blauwal: if I can, would be kind to put some examples to the wiki. e.g. provide one example of the needed command lines for a resync from a cws from m31 to m33
[17:56] <chacha_chaudhry> blauwal: ^^^ sorry ericb2 :)
[17:56] <ericb2> blauwal: and some other little things
[17:56] <ericb2> chacha_chaudhry: no problem :)
[17:56] <chacha_chaudhry> blauwal: okay, Nice
[17:56] <blauwal> ericb2: Yes I'll do. I'm a bit hesitating currently because there is still a bug in the script :(
[17:56] <ericb2> blauwal: ah, ok. I'll wait
[17:56] <ericb2> blauwal: but I'll start to think because I'd like to not miss 3.1 gate :)
[17:57] <blauwal> ericb2: :)
[17:57] * soneca (n=dconte@189.0.87.234) has joined #education.openoffice.org
[17:57] <blauwal> essentially you can do it without CWS tooling at all: just merge from trunk
[17:57] <soneca> good afternoon
[17:57] <blauwal> CWS tooling only updates the EIS infos
[17:58] >soneca< : we currently are doing a ClassRoom. the link for the slides : http://wiki.services.openoffice.org/wiki/Education_ClassRoom/Agenda -> look on right, at Heiner expose ( green line )
[17:58] <blauwal> it will not mangle your CWS if you do it without updating EIS
[17:58] <blauwal> because all necessary information is in Subversion
[17:59] <chacha_chaudhry> blauwal: Is CWS tooling all Perl ?
[17:59] <blauwal> so, if you are pressed to rebase: svn merge svn+ssh://svn@svn.services.openoffice.org/ooo/trunk ooo
[17:59] <blauwal> and you are done
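Put together, a manual rebase as described here might look like the sequence below. It is a sketch that prints the steps rather than running them: the merge line is the one quoted in the session, while the svn status check, the commit message and the helper name rebase_steps are additions for illustration.

```shell
#!/bin/sh
# Sketch only: print the steps of a manual CWS rebase without the CWS tooling.
TRUNK_URL="svn+ssh://svn@svn.services.openoffice.org/ooo/trunk"

# rebase_steps WC -> prints the commands for rebasing the working copy WC.
# (rebase_steps is a hypothetical helper; the commit message is an example.)
rebase_steps() {
    echo "svn merge $TRUNK_URL $1"
    echo "svn status $1    # conflicted files are flagged with C and must be resolved"
    echo "svn commit -m 'Rebase CWS to current trunk' $1"
}

rebase_steps ooo
```

As noted in the session, doing it this way does not update the EIS information; only the CWS tooling does that.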
[17:59] <blauwal> chacha_chaudhry: yes
[17:59] <blauwal> chacha_chaudhry: I was thinking about going to python or whatever, but this way I could reuse the EIS part
[18:00] <chacha_chaudhry> blauwal: :)
[18:00] <ericb2> blauwal: I'm not sure I have completly understood. How the svn knows my cws name ? the CWS_WORK_STAMP ?
[18:00] <blauwal> ericb2: CWS_WORK_STAMP is still needed for building
[18:01] <blauwal> svn is only interested in the repository path
[18:01] <blauwal> for instance: ....openoffice.org/ooo/cws/foo42
[18:01] <blauwal> so if you do a "svn info" in your working copy and you see the CWS URL you are doing the right thing
[18:02] <ericb2> blauwal: important. Thanks
[18:03] <blauwal> "svn info" is your friend anyway ... I use it all the time when I do things with many different working copies
[18:04] <blauwal> the CWS tooling is supposed to hide the URLs a bit from the user, but it's best to be aware of them anyway
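The "svn info" check can be scripted roughly like this. The sketch parses canned text standing in for real svn info output, so it runs without a working copy; the helper names wc_url and is_cws_url are made up, the sample revision number is invented, and the parsing assumes the English "URL: " label that svn info prints in an untranslated locale.

```shell
#!/bin/sh
# Sketch only: check whether a working copy points at a CWS path.

# wc_url: reads "svn info" output on stdin and prints the value of the URL line.
wc_url() {
    sed -n 's/^URL: //p'
}

# is_cws_url URL -> succeeds if URL points below .../ooo/cws/
is_cws_url() {
    case "$1" in
        */ooo/cws/?*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Canned sample standing in for real "svn info" output (illustrative values).
sample='Path: .
URL: svn+ssh://svn@svn.services.openoffice.org/ooo/cws/foo42
Revision: 1'

url=$(printf '%s\n' "$sample" | wc_url)
if is_cws_url "$url"; then
    echo "working copy is on a CWS: $url"
else
    echo "working copy is NOT on a CWS: $url"
fi
```

In a real working copy the sample would be replaced by a pipe, for example svn info | wc_url.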
[18:06] <chacha_chaudhry> blauwal: the wiki says all files from /Attic have been pruned. I am not aware of what /Attic had ? I suppose deprecated files.
[18:06] <blauwal> chacha_chaudhry: Files which are not active on HEAD are in Attic
[18:07] <chacha_chaudhry> blauwal: okay.
[18:07] <blauwal> either because they have been removed .... or because they never left the branch on which they were created
[18:09] <ericb2> @ALL Other questions ? Just shoot ;-)
[18:12] <ericb2> blauwal: We don't want to abuse, but I'd like to say It was a great ClassRoom, thank you very much for coming !!
[18:13] <blauwal> Feel free to contact me per email or IRC if you have problems
[18:13] <blauwal> Thank you for listening
[18:13] <ericb2> blauwal: no problem, we'll do. will your slides be always available ?
[18:13] <blauwal> Yes. At least as long the CN server lives :)
[18:14] <ericb2> @ALL : the log of the ClassRoom is online : http://wiki.services.openoffice.org/wiki/Education_ClassRoom/Previous_Logs/OOo_svn_migration
[18:14] <fardad> blauwal: thank you
[18:14] <chacha_chaudhry> blauwal: Thanks for this nice important session.
[18:14] <ericb2> blauwal: ok ! thanks again, and see you soon !!
[18:14] <blauwal> Bye!