Difference between revisions of "Git"

From Apache OpenOffice Wiki

Latest revision as of 00:40, 16 December 2009

Git is a popular version control system designed to handle very large projects with speed and efficiency. See http://git.or.cz/ for more info.

Windows users might be interested in the MinGW git port (binaries).

Git and OpenOffice.org

A functional git tree with the entire OOo history for testing purposes is here: http://go-oo.org/git. It is an imported CVS tree that was split into two parts:

  • The sources themselves - ooo.git
  • The 3rd party stuff (binary mozilla, zlib, berkeleydb, ...) - 3rdparty.git

The size of the sources is about 1.3G, the size of the 3rd party stuff is 591M. Please follow the instructions on http://go-oo.org/git to get the tree.

For testing purposes, a git tree without history is also available at git://go-oo.org/git/without-history/src680-m211.git. It is a full import of src680-m211 (with the 3rdparty libraries, localizations, etc.). The plan is to start the OOo git tree as a tree without history, with the possibility to 'graft' the history onto it (message, sample script).

Transformations

These transformations are done while converting from CVS:

  • The OOo repository is split into the sources and 3rd party sources as described above
  • 'cws_src680_xyz' branches are renamed to simple 'xyz'
  • 'CWS_SRC680_XYZ_ANCHOR' tags are renamed to simple 'XYZ'
  • 'INTEGRATION: CWS xyz' commits are grouped into one commit (they are generated by CWS tooling per-file), and treated as a merge in the git tree
  • Tabs are converted to 4 spaces at the beginning of the lines in .c/.cxx/.h/.hxx/.mk/.src
  • 'RESYNC:.*FILE MERGED', and 'RESYNC:.*FILE REMOVED' are grouped inside branches (with single 'RESYNC' log entry)
    • This may result in multiple 'RESYNC' commits inside the branch when another commit happened in the middle of the resync
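
The renaming and tab-conversion rules above could be sketched in shell roughly as follows. This is only an illustration of the rules, not the actual conversion tooling; the function names `rename_branch`, `rename_tag` and `expand_leading_tabs` are hypothetical, and the tab rule is assumed to apply to every tab in a line's leading whitespace.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the CVS-to-git rules listed above.

# 'cws_src680_xyz' -> 'xyz'
rename_branch() {
    printf '%s\n' "$1" | sed -e 's/^cws_src680_//'
}

# 'CWS_SRC680_XYZ_ANCHOR' -> 'XYZ'
rename_tag() {
    printf '%s\n' "$1" | sed -e 's/^CWS_SRC680_\(.*\)_ANCHOR$/\1/'
}

# Read text on stdin and replace each tab in the leading whitespace
# of a line with 4 spaces; tabs later in the line are left alone.
expand_leading_tabs() {
    tab=$(printf '\t')
    # Loop: convert the first tab that is preceded only by spaces,
    # then retry until no such tab is left on the line.
    sed -e ':a' -e "s/^\( *\)$tab/\1    /" -e 'ta'
}
```

For example, `rename_branch cws_src680_unxsplash` prints `unxsplash`.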

After creating the tree, it is worth repacking it, for example:

git repack -a -f --depth=50 --window=250

If the repack runs out of memory, its memory use can be limited:

git config pack.deltaCacheLimit 1
git config pack.deltaCacheSize 1
git config pack.windowMemory 4g
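
The two steps can be combined into one helper that also reports the pack size before and after, using `git count-objects -v`. This is a minimal sketch: the function name `repack_with_limits` is hypothetical, the values are simply the ones quoted above, and `git -C` assumes a reasonably recent git.

```shell
#!/bin/sh
# Hypothetical helper: apply the memory limits, then repack the repository
# given as $1 (default: current directory).
repack_with_limits() {
    repo=${1:-.}

    # Pack size before the repack (size-pack is reported in KiB).
    git -C "$repo" count-objects -v | grep '^size-pack'

    # Memory limits from the text above; adjust to the machine at hand.
    git -C "$repo" config pack.deltaCacheLimit 1
    git -C "$repo" config pack.deltaCacheSize 1
    git -C "$repo" config pack.windowMemory 4g

    # Recompute all deltas (-f) and put everything into one pack (-a).
    git -C "$repo" repack -a -f --depth=50 --window=250

    # Pack size after the repack.
    git -C "$repo" count-objects -v | grep '^size-pack'
}
```

Note that without `-d` the old, now redundant packs stay on disk, so the reported size only shrinks once they are removed.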

Requirements/TODO

  • Convert CollabNet account names into real names
    • maybe use the data from DomainDeveloper (complete that where necessary) if there's no easy way to extract the names from CollabNet
  • Delete merged branches (from 'heads', not from history!)
  • Provide the very old history as a 'graft' - see e.g. http://repo.or.cz/w/elinks.git?a=blob;f=contrib/grafthistory.sh
  • Translations to a separate git tree as well?
  • URE to a separate git tree?
  • ODF Toolkit to a separate git tree?
  • The .pdf version of the developer's guide consumes quite some space as well - any chance to do something about it?
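
Deleting merged branches from 'heads' could look roughly like the sketch below. The function name `delete_merged_branches` is hypothetical, and the sketch assumes that the target branch (here 'master') is currently checked out, since `git branch -d` only deletes branches that are already merged.

```shell
#!/bin/sh
# Hypothetical helper: delete every local branch that is fully merged
# into "$1" (default: master). The commits stay reachable from the
# target branch, so only the branch heads disappear, not the history.
delete_merged_branches() {
    target=${1:-master}
    # 'git branch --merged' prefixes each name with a 2-character
    # marker column ("* " for the current branch); strip it.
    git branch --merged "$target" | sed -e 's/^..//' |
    while read -r branch; do
        # Never delete the target branch itself.
        [ "$branch" = "$target" ] && continue
        git branch -d "$branch"
    done
}
```

Branches that still carry unmerged work are not listed by `git branch --merged` and are therefore left untouched.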

Comparison

General

Links to Git comparison with other SCMs: http://git.or.cz/gitwiki/GitLinks#comparison

Comparison of git with Subversion: http://git.or.cz/gitwiki/GitSvnComparsion

Machines used for the testing

CVS tests:

  •  ???

Git tests [let's call this one 'git machine' ;-)]:

  • CPU: AMD Athlon(tm) 64 Processor 3200+
  • RAM: 1G
  • Disk (info from bonnie):
              ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
one    1*2000 37819 77.6 44296 16.8 16982  5.1 35203 63.9 45915  6.6  152.4  0.4
  • OS: SUSE 10.1
  • Filesystem: ext3
  • Net connection: ~20Mbit

SVN tests:

  •  ???

Notes

The git repository could [should! ;-)] be tuned for better results:

  • Delete integrated branches - the history will still be preserved; only the number of open heads will be reduced (by about 3000)
  • Graft history - the old development can be 'hidden' and available just to those who really need it using a simple script, like http://repo.or.cz/w/elinks.git?a=blob;f=contrib/grafthistory.sh . This way we can save about 1G of download!
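
As an illustration of the graft idea, the sketch below attaches an old history to the root commit of a tree without history. It uses `git replace --graft`, which is a newer mechanism than the grafts file used by the linked script, so this is a swapped-in technique rather than what that script does; the function name `graft_history` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make commit $2 (tip of the old history) the
# parent of commit $1 (root of the tree without history).
# The graft is stored locally as a replacement ref; people who do not
# create it keep seeing the short history only.
graft_history() {
    git replace --graft "$1" "$2"
}

# Possible usage, assuming the old history has been fetched into
# FETCH_HEAD from some separate repository (placeholder URL):
#   git fetch <url-of-old-history> master
#   graft_history "$(git rev-list --max-parents=0 HEAD)" FETCH_HEAD
```

After the graft, `git log` on the new tree continues into the old commits.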

The Results

{| border="1" cellspacing="0" cellpadding="5"
!What
!CVS
!git
!SVN
|-
|Size of data on the server [OOo sources]
|8.5G
|1.3G
|
|-
|Size of data on the server [3rd party]
|1.1G
|591M
|
|-
|Size of checkout [OOo sources]
|1.4G
|2.8G [files you can hack on (contains localize.sdf's) + the history]
|3.3G [files you can hack on + localize.sdf's from data-trunk + .svn directories]
|-
|Size of checkout [3rd party]
|98M
|688M [files you can hack on + the history]
|199M [files you can hack on + .svn directories]
|-
|Initial checkout time [OOo sources]
|117 minutes (Linux, 2MBit DSL), 26 minutes (Linux, 2MBit DSL, with compression (-z 6))
|130 minutes, (51 min for a pull) (Linux, 2MBit DSL) [from go-oo.org]<br/>
100min (Linux, 2MBit DSL, Wireless, no proxy) [from go-oo.org] (1586669 objects (counting, deltifying, indexing) 1144663 deltas to resolve)<br/>
44min (faster machine than the [git machine], but with the same connection) [from go-oo.org]
|60 minutes (Windows, 34Mbit Line)<br/>
58 min [git machine]
|-
|Initial checkout time [3rd party]
|
|
|
|-
|Branch creation
|
|Immediately
|Immediately with local svn server, 25 sec with collab.net server
|-
|Branch switch
|
|<15sec [to newly created], 3min to an old one
|12min 40sec [git machine] ??
|-
|Diff
|
|Immediately
|4min 13sec [git machine]
|-
|Commit
|
|13-25sec
|
|-
|Merge
|
|10sec [new branch with few changes], <3min [long living branch, harder scenario]
|
|-
|Resync
|
|Same as 'Merge' - it's a merge from 'master' to the branch.
|
|-
|Integration
|
|Same as 'Merge' - it's a merge from a branch to the 'master'.
|
|-
|Push
|Not necessary
|push back one branch in local network: 9 sec, push back repository 40 min
|Not necessary
|}
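
As a rough sanity check on the initial checkout times, the lower bound imposed by the network link alone can be estimated from the repository size. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: 1G is taken as 1024 MB, 1 MB as 8 Mbit, and protocol overhead and server-side work are ignored. The function name `min_transfer_minutes` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: lower bound, in minutes, for transferring
# $1 megabytes over a link of $2 megabits per second.
min_transfer_minutes() {
    LC_ALL=C awk -v mb="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", mb * 8 / rate / 60 }'
}

# 1.3G git repository (1.3 * 1024 = 1331.2 MB) over a 2MBit DSL line
min_transfer_minutes 1331.2 2     # prints 88.7

# The same repository over the ~20Mbit connection of the git machine
min_transfer_minutes 1331.2 20    # prints 8.9
```

Under these assumptions, the 100 to 130 minutes measured for git over 2MBit DSL are not far above the limit of the line itself, whereas on the faster connection the transfer alone would not explain the measured time.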

'3rd party' in this context means the following modules: agg, beanshell, berkeleydb, bitstream_vera_fonts, boost, curl, dictionaries, epm, expat, freetype, hsqldb, icu, jpeg, libwpd, libxml2, moz, msfontextract, nas, neon, np_sdk, portaudio, python, sablot, sane, sndfile, stlport, vigra, xalan, xt, zlib.

Commands used for the tests:

{| border="1" cellspacing="0" cellpadding="5"
!What
!CVS
!git
!SVN
|-
|checkout [OOo sources]
|<tt>cvs -d:pserver:anoncvs@anoncvs.services.openoffice.org:/cvs co OpenOffice2</tt>
|<tt>git clone git://go-oo.org/git/openoffice.org/ooo.git openoffice.org</tt> (How does this work with a proxy)
|<tt>svn checkout http://svn.stage.openoffice.org/svn/svn/trunk svn</tt><br/>
(This tree does not contain localize.sdf's, they are in <tt>trunk-data</tt>.)
|-
|Branch creation
|
|[all the following commands were issued in the openoffice.org subdir]<br/>
<tt>git branch test</tt>
|[all the following commands were issued in the svn subdir]
|-
|Branch switch
|
|<tt>git checkout test</tt>
|<tt>svn switch http://svn.stage.openoffice.org/svn/svn/vendors/sun-cvs/tags/SRC680_m172</tt>
|-
|Diff
|
|<tt>vim vcl/unx/kde/salnativewidgets-kde.cxx</tt> [to do some changes] <tt>; git diff</tt>
|<tt>vim vcl/unx/kde/salnativewidgets-kde.cxx</tt> [to do some changes] <tt>; svn diff</tt>
|-
|Commit
|
|[with the changes from 'Diff']<br/>
<tt>git commit -a</tt>
|
|-
|Merge [the simple scenario]
|
|<tt>git branch test2 ; git checkout test2 ; vim vcl/unx/kde/salnativewidgets-kde.cxx</tt> [another changes] <tt>; git commit -a ; git checkout test</tt> [preparation to have something to merge]<br/>
<tt>git pull . test2</tt> [the merge itself]
|
|-
|Merge [the harder scenario]
|
|<tt>git pull git://go-oo.org/git/openoffice.org/ooo.git unxsplash</tt><br/>
[an old CWS of mine - called cws_src680_unxsplash in the CVS]
|
|-
|Resync
|
|[it's usually not necessary to do resynces with git; but when needed to get a feature a branch would depend on, it's just a merge from remote 'master']
|
|-
|Integration
|
|<tt>git checkout master ; git pull . test</tt><br/>
[or alternatively: <tt>git checkout master ; git merge 'merging test into master' master test</tt>]
|
|}
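
The git column of the table can be replayed on a small throw-away repository, which is handy for trying the workflow without cloning the full OOo tree. This is a sketch under assumptions: the function name `demo_workflow`, the file `file.txt` and the commit messages are made up, and `git merge <branch>` is used in place of `git pull . <branch>`, which is assumed to have the same effect for local branches.

```shell
#!/bin/sh
# Hypothetical demo: branch, switch, diff, commit, merge and integrate
# in a scratch repository. Runs in a subshell so the caller's working
# directory is not changed; the scratch directory is left behind.
demo_workflow() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q .
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Example User"
    git config user.email "example@example.org"
    echo "base" > file.txt
    git add file.txt
    git commit -q -m "initial import"

    git branch test                          # Branch creation
    git checkout -q test                     # Branch switch
    echo "change 1" >> file.txt
    git diff --stat                          # Diff
    git commit -q -a -m "change on test"     # Commit

    # Preparation to have something to merge
    git branch test2
    git checkout -q test2
    echo "change 2" >> file.txt
    git commit -q -a -m "change on test2"
    git checkout -q test
    git merge -q test2                       # Merge [the simple scenario]

    git checkout -q master
    git merge -q test                        # Integration

    # Number of commits now reachable from master
    git rev-list --count master
)
```

Both merges are fast-forwards here, so no merge commits are created and 'master' ends up with the three commits made above.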
