I was recently working on a migration project that included moving our codebase from SVN to GitHub. After some trial and error and plenty of hours searching the web for best practices, I finally arrived at a process that did not cause any tears, and I thought it would be nice to share it with you.
I have broken down this migration process into 4 simple steps:
- Prepare your environment
- Convert your SVN repository to a local git repository
- Convert any large files to LFS objects (if needed)
- Push the new git repository to GitHub
Prepare your environment
On your local machine, create a GitMigration folder that will host your new Git repo: mkdir -p ~/GitMigration
In another terminal, update your SVN working copy to make sure that you have the latest revisions. It would also be wise to tell your team that you are starting the migration and that no further commits should be made to SVN until the migration is complete.
Once that is done you will need to create an authors.txt file. This will map the SVN usernames to the desired Git usernames in the following format:
jdoe = John Doe <john.doe@gmail.com>
esmith = Emma Smith <emma.smith@gmail.com>
If you don't want to collect all the SVN authors manually, you can pull them from the SVN repository with the following command. It uses each username as a placeholder for the name and email, so edit the result until every line follows the structure above:
svn log -q | awk -F '|' '/^r/ {gsub(/ /, "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors.txt
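Since the generated file only contains placeholders, a quick format check can catch entries you forgot to edit. Here is a minimal sketch, assuming each line should look like `svnuser = Full Name <email@host>`; the `check_authors` name is my own and not part of git or svn:

```shell
#!/bin/sh
# check_authors FILE (hypothetical helper, not part of git or svn)
# Prints every line that does not look like "svnuser = Full Name <email@host>",
# prefixed with its line number. Returns 0 if all lines match, 1 otherwise.
check_authors() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
  # -n: show line numbers, -v: invert the match, -E: extended regex
  if grep -nvE '^[^ =]+ = .+ <[^<>@ ]+@[^<>@ ]+>$' "$1"; then
    return 1
  fi
  return 0
}

# Example:
#   check_authors authors.txt && echo "authors.txt looks good"
```

The pattern is deliberately loose (it only insists on an `@` inside the angle brackets), so treat it as a sanity check rather than real email validation.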
Convert your SVN repository to a local git repository
If you only want to migrate commits from a certain date onwards, you will need to find the matching revision number in the SVN repo. To do this, execute the following command in the SVN working copy (SVN expects the date in year-month-day order):
svn log -r {YYYY-MM-DD}:HEAD --limit 1
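To feed that revision into the clone step, you can strip the number out of the log output. This is a small sketch, assuming the usual `rNNN | author | date` header lines that svn log prints; `first_revision` is a hypothetical helper name:

```shell
#!/bin/sh
# first_revision (hypothetical helper): reads `svn log` output on stdin and
# prints the number of the first revision header it sees, without the "r".
first_revision() {
  awk '/^r[0-9]+ \|/ { print substr($1, 2); exit }'
}

# Example (date in YYYY-MM-DD form; needs to run inside an SVN working copy):
#   svn log -q -r "{2020-01-15}:HEAD" --limit 1 | first_revision
```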
Within the GitMigration folder execute the following command:
git svn clone <svn-repo>/<project> <git-repo-name> --authors-file=authors.txt -r <revision-number>:HEAD
Where <svn-repo> is the URI of the SVN repository that you want to migrate, <project> is the name of the project that you want to import, <revision-number> is the revision that you want to migrate from (leave out the -r option to import everything), and <git-repo-name> is the directory name of the new Git repository. This process might take a while, depending on the size of the SVN history.
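Because the clone can run for a long time, I find it useful to print the exact command before committing to it. The sketch below wraps the same command in a dry-run switch; the `run` and `migrate_clone` helpers, the `DRY_RUN` variable, and the example.com URL are all made up for illustration:

```shell
#!/bin/sh
# run (hypothetical helper): print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# migrate_clone SVN_URL TARGET_DIR [START_REVISION] (hypothetical helper)
# Builds the same git svn clone command as above; -r is added only when a
# start revision is given.
migrate_clone() {
  mc_url="$1"
  mc_target="$2"
  mc_start="${3:-}"
  if [ -n "$mc_start" ]; then
    run git svn clone "$mc_url" "$mc_target" --authors-file=authors.txt -r "$mc_start:HEAD"
  else
    run git svn clone "$mc_url" "$mc_target" --authors-file=authors.txt
  fi
}

# Example (placeholder URL):
#   DRY_RUN=1; migrate_clone https://svn.example.com/repo/project project-git 1234
```

Unset `DRY_RUN` (or set it to 0) once the printed command looks right.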
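Now comes the clean-up: turning the tags and any remote refs into proper Git tags and local branches. The commands below use bash syntax and assume the clone imported tags and branches as remote refs directly under refs/remotes (newer versions of git svn add an origin/ prefix by default, so adjust the paths if yours did). To turn the tags into proper Git tags, execute the following within the Git repo directory: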
for t in $(git for-each-ref --format='%(refname:short)' refs/remotes/tags); do git tag ${t/tags\//} $t && git branch -D -r $t; done
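The `${t/tags\//}` expansion in that loop is bash-specific; it removes the `tags/` part so that a ref such as `tags/v1.0` becomes the tag `v1.0`. Here is a POSIX-shell sketch of a similar renaming that also tolerates a leading `origin/` (an assumption about how your clone named its refs); `tag_name` is a hypothetical helper:

```shell
#!/bin/sh
# tag_name REF (hypothetical helper): drop everything up to and including the
# first "tags/" so that both "tags/v1.0" and "origin/tags/v1.0" yield "v1.0".
# A name without "tags/" is printed unchanged.
tag_name() {
  printf '%s\n' "${1#*tags/}"
}

# Example: preview the tag names before creating anything
#   git for-each-ref --format='%(refname:short)' refs/remotes/tags |
#     while read -r ref; do echo "$ref -> $(tag_name "$ref")"; done
```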
Next, turn any remaining references under refs/remotes into local branches, then delete the leftover branches with an @ in their name:
for b in $(git for-each-ref --format='%(refname:short)' refs/remotes); do git branch $b refs/remotes/$b && git branch -D -r $b; done;
for p in $(git for-each-ref --format='%(refname:short)' | grep @); do git branch -D $p; done;
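These loops delete refs as they go, so it can be reassuring to preview the outcome first. The sketch below reads short ref names and prints what the clean-up would do with each one; `classify_refs` is a hypothetical helper, and it assumes the refs sit directly under refs/remotes without a prefix:

```shell
#!/bin/sh
# classify_refs (hypothetical helper): reads short ref names on stdin and
# prints the net effect of the clean-up loops on each one.
classify_refs() {
  while IFS= read -r ref; do
    case "$ref" in
      tags/*) printf 'tag %s\n' "${ref#tags/}" ;;   # becomes a Git tag
      *@*)    printf 'delete %s\n' "$ref" ;;        # removed by the last loop
      *)      printf 'branch %s\n' "$ref" ;;        # becomes a local branch
    esac
  done
}

# Example: run this before the loops above
#   git for-each-ref --format='%(refname:short)' refs/remotes | classify_refs
```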
Convert any large files to LFS objects
GitHub has a file size limit of 100MB, and any push containing a larger file will be rejected. This is where Git Large File Storage (LFS) comes in. Git LFS replaces large files with text pointers inside Git while storing the file contents on a remote server. There are many benefits to using LFS, such as:
- Large file versioning
- More repository space
- Faster cloning and fetching
- Same Git workflow
- Same access controls and permissions
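To make the "text pointer" idea concrete, here is a sketch that prints roughly what Git LFS stores in the repository in place of a large file: a version line, the SHA-256 of the content, and the size in bytes. It follows the pointer layout from the Git LFS specification as I understand it; `lfs_pointer` is a hypothetical helper for illustration only, since git lfs generates pointers for you:

```shell
#!/bin/sh
# lfs_pointer FILE (hypothetical helper): print a pointer in the style Git LFS
# stores in place of FILE. Illustration only; git lfs does this for you.
lfs_pointer() {
  # Use whichever SHA-256 tool is installed (sha256sum on Linux, shasum on macOS).
  if command -v sha256sum >/dev/null 2>&1; then
    oid=$(sha256sum < "$1" | cut -d ' ' -f 1)
  else
    oid=$(shasum -a 256 < "$1" | cut -d ' ' -f 1)
  fi
  # wc pads its output with spaces on some systems, so strip them.
  size=$(wc -c < "$1" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
}

# Example:
#   lfs_pointer big-video.mp4
```

The pointer is only three short lines no matter how big the real file is, which is why clones and fetches stay fast.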
To find out how many files are above the 100MB limit, run the following command:
git lfs migrate info --everything --above=99MB
This prints information about the file types in your repo that have files above the 99MB mark. To convert those files to LFS objects, run the following command:
git lfs migrate import --everything --above=99MB
This rewrites the history of the Git repo wherever it contained files above 99MB and converts those files to LFS objects.
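After the import, it is worth checking which paths ended up tracked by LFS. The sketch below lists the entries in a .gitattributes file that carry the `filter=lfs` attribute, assuming the import recorded its tracking rules there (the usual way Git LFS does it); `lfs_tracked` is a hypothetical helper:

```shell
#!/bin/sh
# lfs_tracked FILE (hypothetical helper): list the patterns in a .gitattributes
# style file that are routed through the LFS filter.
lfs_tracked() {
  awk '
    /^[ \t]*#/ { next }                  # skip comment lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i == "filter=lfs") { print $1; break }
      }
    }
  ' "$1"
}

# Example:
#   lfs_tracked .gitattributes
```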
Push the new git repository to GitHub
The final step is to add the remote git server and push the changes.
git remote add origin git@my-git-server:myrepository.git
git push origin --all;
git push origin --tags;
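Before announcing victory, you can compare how many branches and tags exist locally with what the remote reports. The sketch below counts them from a ref listing; `count_refs` is a hypothetical helper, and `origin` is the remote name used above:

```shell
#!/bin/sh
# count_refs (hypothetical helper): reads ref listings on stdin, either one ref
# per line or "<hash><TAB><ref>" as printed by git ls-remote, and counts
# branches and tags. Peeled tag entries ending in "^{}" are skipped.
count_refs() {
  awk '{
    ref = $NF
    if (length(ref) > 3 && substr(ref, length(ref) - 2) == "^{}") next
    if (index(ref, "refs/heads/") == 1) heads++
    else if (index(ref, "refs/tags/") == 1) tags++
  } END { printf "branches=%d tags=%d\n", heads, tags }'
}

# Example: the two lines should print the same counts
#   git for-each-ref --format='%(refname)' | count_refs
#   git ls-remote origin | count_refs
```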
And that's it! Congratulations, you have migrated your SVN repo into Git. The next step is to inform your team that the migration is complete and provide them access to the new remote git server.