Last year I was in charge of the SVN to Git migration at the company I work for. We wanted to migrate the history as well: in our case, about 40,000 revisions made over the last 8 years. To minimize developer downtime, I scripted as much of the preparation as possible ahead of time. The actual switch from SVN to Git took less than 2 hours. Here are the steps we took.
1. Retrieve a list of all committers
You'll need a list of every user who has committed to the SVN repository, converted to the Git author format, because Subversion records only the committer's username, not a name and email address. To retrieve the list from SVN, create a new folder, right-click it and select Git Bash Here to open a Git command window. Run the following command:
svn log http://url/to/svn/repository -q | awk -F '|' '/^r/ {sub("^ ", "", $2);
sub(" $", "", $2);
print $2" = "$2" <"$2">"}' | sort -u > users.txt
Note: this can take a couple of minutes, depending on the size of your repository, the number of commits, and the number of committers.
The text file will have one line per committer. Each line needs to be transformed from vkarpach = vkarpach <vkarpach> to vkarpach = Viktar Karpach <vkarpach@company.com>
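Editing users.txt by hand is fine for a handful of committers. For a longer list the rewrite can be scripted. The following is a minimal sketch, not the method we used: it assumes a hypothetical names.txt file with one `username|Full Name` line per committer, and it assumes every address follows the username@domain pattern from the example above.

```shell
# Sketch: turn "user = user <user>" lines into "user = Full Name <user@domain>".
# names.txt (hypothetical) holds "username|Full Name" lines. Users without an
# entry keep their username as the display name.
map_authors() {
    names_file=$1   # lookup table: username|Full Name
    users_file=$2   # output of the svn log pipeline
    domain=$3       # assumed mail domain, e.g. company.com
    awk -F'|' -v domain="$domain" '
        FILENAME == ARGV[1] { full[$1] = $2; next }   # load the lookup table
        {
            split($0, parts, " = ")                   # parts[1] is the username
            user = parts[1]
            name = (user in full) ? full[user] : user
            printf "%s = %s <%s@%s>\n", user, name, user, domain
        }
    ' "$names_file" "$users_file"
}
```

Usage would look like `map_authors names.txt users.txt company.com > users.mapped.txt`. Review the result before passing it to git svn, since not every committer necessarily follows the same address pattern.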
2. Clone the repository using git-svn
Note: this step will take hours to complete, so I suggest running it overnight on a dedicated box. Run the following command to convert the repository to a Git repository:
git svn clone --stdlayout --no-metadata -A users.txt http://url/to/svn/repository dest_dir-tmp
3. Make a copy of this folder
git svn clone takes a lot of time. For our main project it took 48 hours for about 18,000 commits. Make a copy of this folder so you don't need to do it again. Create scripts for the next steps, so that when you are ready to switch you can do it quickly.
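The copy itself can be scripted too, so a botched clean-up run never costs another 48-hour clone. A minimal sketch, assuming a hypothetical backup directory name and a `cp` that supports `-R` and `-p`:

```shell
# Sketch: copy the git-svn clone to a backup directory, refusing to overwrite
# an existing backup. The destination name is an assumption, not from the steps above.
snapshot_clone() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    cp -Rp "$src" "$dest"   # -R: recurse, -p: preserve timestamps and modes
}
```

For example, `snapshot_clone dest_dir-tmp dest_dir-backup` before running any of the clean-up steps.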
4. Fetch latest commits
The team continued to use Subversion until the very last moment, so while working on the migration scripts I had to fetch the latest commits from time to time.
git svn fetch
git reset --hard trunk
5. Clean up script
Convert the SVN tag branches to Git tags, then delete the remote tag branches:
for t in `git branch -r | grep 'tags/' | sed s_tags/__` ; do
    git tag "$t" "tags/$t^"
    git branch -d -r "tags/$t"
done
Delete trunk, since we will use master from now on.
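For the trunk step, a minimal sketch, assuming git-svn created the remote-tracking ref as `trunk` (newer git-svn versions may name it `origin/trunk` instead; check with `git branch -r` first). The helper name is hypothetical.

```shell
# Sketch: delete the remote-tracking ref left over for trunk.
# Assumption: run inside the cloned repository; the ref name defaults to "trunk".
drop_trunk() {
    ref=${1:-trunk}
    if git show-ref --verify --quiet "refs/remotes/$ref"; then
        git branch -d -r "$ref"
    else
        echo "no remote-tracking ref named $ref" >&2
        return 1
    fi
}
```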
Remove SVN references
git config --remove-section svn-remote.svn
rm -rf .git/svn .git/{logs/,}refs/remotes/svn/
And finally convert the remaining remote branches to local branches
git config remote.origin.url .
git config --add remote.origin.fetch +refs/remotes/*:refs/heads/*
git fetch
Remove remote branches:
for t in `git branch -r` ; do
    git branch -d -r "$t"
done
Git doesn't allow spaces in branch names, so git svn fetch replaced them with %20. I think an underscore looks better than %20:
for t in `git branch -a|grep '%20'` ; do
    newName=`echo $t | sed 's/%20/_/g'`
    git branch -m $t $newName
done
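The loop above relies on shell word splitting of the `git branch` output. A slightly more defensive variant reads branch names one per line from `git for-each-ref`. This is a sketch under the assumption that the branches are already local (under refs/heads); the function name and the replacement-character argument are hypothetical.

```shell
# Sketch: rename every local branch containing "%20", replacing it with a
# separator character (underscore by default).
rename_encoded_branches() {
    sep=${1:-_}
    git for-each-ref --format='%(refname:short)' refs/heads |
    grep '%20' |
    while IFS= read -r old; do
        new=$(printf '%s\n' "$old" | sed "s/%20/$sep/g")
        git branch -m "$old" "$new"
    done
}
```

Calling `rename_encoded_branches` with no argument uses an underscore; `rename_encoded_branches -` would use a hyphen instead.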
You might want to delete some unused branches:
for t in `cat ../list_of_branches_for_deletion.txt`; do
    git branch -D "$t"
done
Here list_of_branches_for_deletion.txt contains the names of the branches to delete. Use the following command to populate the file:
git branch -a > ../list_of_branches_for_deletion.txt
Then manually edit list_of_branches_for_deletion.txt, leaving only the branches that you want to delete. Make sure to remove the line for the current branch, which git branch marks with an asterisk.
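When deciding which branches to keep, it can help to see when each one was last committed to. A minimal sketch using `git for-each-ref`; the function name is hypothetical and this was not part of the original migration scripts.

```shell
# Sketch: print "last-commit-date branch-name" for every local branch,
# stalest branch first, to help pick deletion candidates.
list_branches_by_age() {
    git for-each-ref --sort=committerdate \
        --format='%(committerdate:short) %(refname:short)' refs/heads
}
```

Branches at the top of the output have seen no commits for the longest time and are the most likely candidates for the deletion list.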
6. Replace any svn externals with git submodules
git submodule add ssh://git@git.company.com:7999/ProjectName/external_repo.git ExternalFolderName
git commit -m "Added submodules"
Use submodules only for external projects that don't change very often. We had to combine our internal projects into one Git repository, since it is hard to maintain submodules for rapidly changing projects. Each project gets its own directory in the Git repository:
Before migration:
svn_main_project
    external_1
        external_1_folder_1
        external_1_folder_2
    external_2
        external_2_folder_1
        external_2_folder_2
    svn_main_project_folder_1
    svn_main_project_folder_2
Here svn_main_project has two externals, external_1 and external_2.
After migration:
git
    svn_main_project
        svn_main_project_folder_1
        svn_main_project_folder_2
    external_1
        external_1_folder_1
        external_1_folder_2
    external_2
        external_2_folder_1
        external_2_folder_2
You can use the following bash script to move everything into sub_folder, so that later you can combine repositories. The script rewrites the commit history as well.
git filter-branch --index-filter \
'git ls-files -s | sed "s-\t\"*-&sub_folder/-" |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info &&
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE" || true' HEAD
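After the rewrite it is worth checking that nothing was left outside the new folder. A minimal sketch, assuming the folder name contains no regular-expression metacharacters; the function name is hypothetical.

```shell
# Sketch: list tracked paths in a commit that are NOT under the given folder.
# Empty output means every file in that commit lives under the folder.
paths_outside() {
    folder=$1
    rev=${2:-HEAD}
    git ls-tree -r --name-only "$rev" | grep -v "^$folder/"
}
```

For example, `paths_outside sub_folder` checks the current commit, and `paths_outside sub_folder HEAD~100` spot-checks an older one to confirm that history was rewritten too.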
7. Get your repository onto the server
Create a repository on your git server.
Init local repository
Use the following if you are combining repositories:
git remote add external_1 ../external_1/
git pull external_1 master
git remote rm external_1
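Since Git 2.9, merging two repositories that share no history requires the `--allow-unrelated-histories` flag, and recent versions also want to be told explicitly to merge rather than rebase. A hedged sketch of the three commands above wrapped in a function (the function name is hypothetical; the flags are standard `git pull` options):

```shell
# Sketch: pull another repository's branch into the current one, then drop
# the temporary remote. Assumes the two histories are unrelated.
merge_external() {
    name=$1              # temporary remote name, e.g. external_1
    path=$2              # path or URL of the other repository
    branch=${3:-master}  # branch to merge
    git remote add "$name" "$path" &&
    git pull --no-rebase --no-edit --allow-unrelated-histories "$name" "$branch" &&
    git remote rm "$name"
}
```

With the layout above this would be called as `merge_external external_1 ../external_1/ master`, once per repository being combined.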
Add gitignore
cp ../gitignore.txt .gitignore
git add .
git commit -m "Added .gitignore"
Push all branches in one shot:
git remote add origin ssh://git@git.company.com:7999/repo.git
git push --all origin
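Note that `git push --all` pushes branches only. The tags created in step 5 need a separate push. A minimal sketch that does both (the function name is hypothetical):

```shell
# Sketch: push every local branch, then every tag, to the given remote.
push_branches_and_tags() {
    remote=${1:-origin}
    git push --all "$remote" &&
    git push --tags "$remote"
}
```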