Git Side Benefit: Reducing Disk Usage

A side benefit of switching from Subversion to Git for source control is that Git does not keep shadow copies of your files in hidden .svn directories to find out what has changed.
For each of several svn projects I created two checkouts: one using traditional svn and one using git (which can actually clone an svn repository; see the explanation below).

[code lang="bash"][~/my-projects] du -hd 0 \
> berammelse berammelse.git \
> dokboks dokboks.git \
> intrastat intrastat.git \
> kompetenceregister k2.git \
> ourpeople ourpeople.git \
> let0004 let0004.git

43M berammelse
38M berammelse.git
63M dokboks
11M dokboks.git
53M intrastat
34M intrastat.git
144M kompetenceregister
47M kompetenceregister.git
118M ourpeople
151M ourpeople.git
163M let0004
98M let0004.git
[/code]
In five of the six projects I tried, the Git clone uses less disk space, in most cases significantly less; only ourpeople came out larger (151M versus 118M). It’s just a side effect, I know. More importantly, the git version has the entire commit history available offline, which makes rolling forward and back much faster :)
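To put numbers on the difference, here is a minimal Ruby sketch that turns the sizes from the listing above into percentages. The sizes are copied by hand from the du output; the constant and method names (SIZES_MB, percent_saved) are made up for this illustration.

```ruby
# Sizes in MB, copied from the du listing: [svn checkout, git clone].
SIZES_MB = {
  "berammelse"         => [43, 38],
  "dokboks"            => [63, 11],
  "intrastat"          => [53, 34],
  "kompetenceregister" => [144, 47],
  "ourpeople"          => [118, 151],
  "let0004"            => [163, 98]
}

# Percentage saved by the git clone relative to the svn checkout.
# A negative value means the git clone is larger.
def percent_saved(svn_mb, git_mb)
  ((svn_mb - git_mb) * 100.0 / svn_mb).round(1)
end

SIZES_MB.each do |project, (svn_mb, git_mb)|
  puts format("%-20s %4dM -> %4dM  %6.1f%%",
              project, svn_mb, git_mb, percent_saved(svn_mb, git_mb))
end

# Totals across all six projects.
svn_total, git_total = SIZES_MB.values.transpose.map(&:sum)
puts format("%-20s %4dM -> %4dM  %6.1f%%",
            "total", svn_total, git_total, percent_saved(svn_total, git_total))
```

With these figures let0004 comes out roughly 40% smaller and ourpeople roughly 28% larger. Summed over all six projects, the git clones take 379M against 584M for the svn checkouts, about 35% less.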
FYI, I’m using my own version of Dr. Nic’s “gitify” command, which wraps a “git svn clone” command. This way I can use git locally and then commit back into svn when done.

[code lang="ruby"]#!/usr/bin/env ruby -wKU
# command from
# http://drnicwilliams.com/2007/11/22/going-offline-without-your-favourite-subversion-repository/
# Install with
# sudo port install git-core +svn

# get the repository URL from the "URL:" line of `svn info`
svnurl = `svn info | grep "^URL:"`.gsub('URL: ', '').chomp

# project = basename of the current directory
project = File.basename(Dir.pwd)

# clone into a sibling directory named <project>.git
puts cmd = "git svn clone #{svnurl} ../#{project}.git"

`#{cmd}`

puts "('git svn rebase' merges changes from svn.\n 'git svn dcommit' updates svn with your git commits)"
[/code]
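
The script above interpolates the URL straight into a shell string, which can break on paths containing spaces. Below is a sketch of one possible variant, not Dr. Nic’s original: it parses the text printed by svn info in Ruby and builds the command as an argument list. The helper names (svn_url_from_info, clone_command) and the example.com URL are hypothetical and only there for illustration.

```ruby
require "shellwords"

# Pull the repository URL out of the text printed by `svn info`.
# Returns nil when no line starts with "URL:".
def svn_url_from_info(info_text)
  line = info_text.lines.find { |l| l.start_with?("URL:") }
  line && line.sub(/\AURL:\s*/, "").strip
end

# Build the clone command as an argument list, so no shell quoting is needed.
def clone_command(svn_url, project)
  ["git", "svn", "clone", svn_url, "../#{project}.git"]
end

# Sample `svn info` output with a made-up repository URL.
SAMPLE_INFO = <<~INFO
  Path: .
  URL: http://svn.example.com/repos/myproject/trunk
  Repository Root: http://svn.example.com/repos/myproject
INFO

cmd = clone_command(svn_url_from_info(SAMPLE_INFO), "myproject")
puts cmd.shelljoin # prints the command in a form safe to paste into a shell
```

In a real working copy you would pass the output of `svn info` instead of SAMPLE_INFO, use File.basename(Dir.pwd) as the project name, and run the result with system(*cmd), which skips the shell entirely.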

5 Responses to “Git Side Benefit: Reducing Disk Usage”

  1. Jeet Says:

    I am not entirely sure about git reducing disk usage. Since it has to keep ‘whole’ copies of the commits, I am inclined to think that the total size required by a well-used git repository would be higher than that of an svn/cvs repository, which only keeps differences.

    Might try something locally before I can conclude.. but yeah, I have switched to git already :))

  2. Jesper Rønn-Jensen Says:

    @Jeet I follow your thoughts, but the repositories I tested above actually show a reduction in disk usage in all but one case.

    The biggest SVN repository above had more than 16,000 commits and the Git version was 40% smaller even though it contained the entire project history.

    So even though the svn checkouts here are only snapshots in time (the history remains on the server), there is a significant difference.

    BTW, Git packs its objects with delta compression, so in practice the Git repos also mostly store differences between revisions.

  3. Andrew Says:

    See https://git.wiki.kernel.org/index.php/GitSvnComparison
    They say there that the Git version of the Mozilla repository is 420 MB, compared with 12 GB in SVN.
    But that’s the remote repository, which does not really matter to me :-)
    What matters to me is that when I download it to my machine, with GIT I have to download the entire 420 MB :-( while with SVN only a snapshot of the latest sources.
    Distributed VCSs are good for large decentralized projects.

  4. subbarao Says:

    Hey guys, I want to know the exact procedure used by git to reduce disk usage i.e. the algorithm used by it to store changes. Any suggestions ???

  5. justaddwater.dk | Using Git for SVN Repositories Workflow Says:

    […] the last months I have been using Git to work with my Subversion repositories. Besides from reducing disk usage, Git also makes my work slightly faster and independent of network access — I can make […]