If you committed large files (such as images or videos) to a git repository, deleting them in a later commit will not reduce the size of the repo.

Because git is designed to keep the full history, the large files remain stored in the earlier commits. The following procedure removes the chosen files from every commit, so that they no longer take up space in the repo.

Note: This procedure rewrites history, so every affected commit gets a new hash. If your repo is built on top of commits from a forked repo, those shared commits may be rewritten as well. Fetching from that repo will then not work properly, because git can no longer find a common history.

Rewriting History with git filter-branch

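Finding the largest files in the repo

Use the following command to list the 10 largest objects stored in the repo's pack files. Change the number after tail to show more or fewer files. The output shows each object's hash and path in the order produced by git rev-list, not sorted by size. The command relies on process substitution, so it needs bash or a compatible shell, and it only sees objects that are already packed (run git gc first if .git/objects/pack is empty).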

$ git rev-list --objects --all | grep -f <(git verify-pack -v .git/objects/pack/*.idx| sort -k 3 -n | cut -f 1 -d " " | tail -10)
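As an alternative sketch, the same question can be answered with git cat-file --batch-check, which reports object sizes directly and therefore also works when nothing has been packed yet. The function name largest_blobs is hypothetical, and the sizes are uncompressed blob sizes in bytes rather than on-disk pack sizes.

```shell
# List the N largest blobs reachable from any ref, smallest first.
# Output columns: size in bytes, object id, path.
# (largest_blobs is an illustrative name, not a git command.)
largest_blobs() {
  local n="${1:-10}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    grep '^blob ' |      # keep blobs, drop commits and trees
    cut -d' ' -f2- |     # strip the leading object type
    sort -n -k1,1 |      # sort by size
    tail -n "$n"
}

# Usage, from inside a repository:
#   largest_blobs 10
```

Unlike the grep -f pipeline, this output is sorted by size, so the last line is the largest file.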

git filter-branch to remove large files from the history

The filter-branch command rewrites the history of the repo by applying a given filter to every commit. The following command removes the images under assets/ (*.jpg, *.png, and *.gif) from every commit in the history.

$ git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch "assets/*.jpg" "assets/*.png" "assets/*.gif"' --prune-empty --tag-name-filter cat -- --all

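The command above runs even if a backup from an earlier filter-branch run is still present (-f), applies the filter given after --index-filter to the index of each commit, drops commits that the filter leaves empty (--prune-empty), keeps the tag names while pointing them at the rewritten commits (--tag-name-filter cat), and processes all refs (-- --all).

Tip: Use git count-objects -v to check the number of objects and the disk space they use (size-pack is the size of the pack files in KiB). The numbers drop only after the cleanup described in the next section, once the old objects have been pruned.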
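One way to check that the filter did its job is to count the commits that still touch the removed paths. The sketch below assumes the pathspecs from the command above; commits_touching is a hypothetical helper name. It looks only at branches and tags, because filter-branch keeps the pre-rewrite commits reachable under refs/original as a backup.

```shell
# Count commits on local branches and tags that still touch the given pathspecs.
# --full-history keeps git log from skipping side branches at merges.
# (commits_touching is an illustrative name, not a git command.)
commits_touching() {
  git log --branches --tags --full-history --format=%H -- "$@" |
    wc -l | tr -d ' '
}

# Usage, after the rewrite (the count should be 0):
#   commits_touching "assets/*.jpg" "assets/*.png" "assets/*.gif"
```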

Cleaning up the repo

Remove the backup refs and the reflogs that still point to the old commits, then garbage-collect the objects that are no longer referenced.

$ rm -Rf .git/refs/original
$ rm -Rf .git/logs/
$ git gc --aggressive --prune=now
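A gentler variant of the same cleanup uses git's own commands instead of deleting directories under .git by hand. This is a sketch, not a drop-in replacement: the function names pack_size_kib and cleanup_rewritten_repo are hypothetical, and the reflog expiry discards the reflog entries of every ref, just as removing .git/logs does.

```shell
# Print the size of the pack files in KiB (the "size-pack" line of count-objects).
pack_size_kib() {
  git count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Drop filter-branch's backup refs, expire all reflogs, and repack.
# (Both function names are illustrative, not git commands.)
cleanup_rewritten_repo() {
  git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git update-ref -d "$ref"
    done
  git reflog expire --expire=now --all
  git gc --quiet --aggressive --prune=now
}

# Usage, from inside the rewritten repository:
#   pack_size_kib            # size before cleanup
#   cleanup_rewritten_repo
#   pack_size_kib            # size after cleanup
```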

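Updating the remote repo

Push the rewritten history to the remote server. A forced push is required because the rewritten commits no longer descend from the commits on the remote.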

$ git push origin --force --all
$ git push origin --force --tags
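After the push, a quick check can confirm that every local branch tip matches the branch of the same name on the remote. The sketch below assumes the remote is called origin unless another name is passed; remote_matches_local is a hypothetical helper name.

```shell
# Succeed only if every local branch tip equals the same-named branch on the remote.
# (remote_matches_local is an illustrative name, not a git command.)
remote_matches_local() {
  local remote="${1:-origin}"
  git for-each-ref --format='%(objectname) %(refname)' refs/heads/ |
    while read -r sha ref; do
      remote_sha=$(git ls-remote "$remote" "$ref" | cut -f1)
      if [ "$sha" != "$remote_sha" ]; then
        echo "mismatch: $ref" >&2
        exit 1   # leaves the pipeline subshell with a failure status
      fi
    done
}

# Usage:
#   remote_matches_local origin && echo "remote is up to date"
```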

