If you’re working with large files (such as images and videos) in a git repository and have committed them to your repo, simply removing them in a commit will not reduce the size of the repository.

Git is designed to keep track of historical changes, so the large files remain in the repo's history even after a commit deletes them. The following procedure removes the unwanted files from the history entirely.

Note: Rewriting history changes commit IDs. If other repos are forked from or built upon this one, avoid using the fetch command between them after the rewrite, as git may not be able to find the common history.

Rewriting History with git filter-branch

Finding the largest files in the repo

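Use the following command to list the 10 largest objects in the repo's packfiles. git verify-pack is used to pick the largest objects, but the final output appears in git rev-list order rather than sorted by size. The command reads only packed objects, so run git gc first if .git/objects/pack contains no .idx files, and it needs a shell that supports <( ... ) process substitution, such as bash.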

$ git rev-list --objects --all | grep -f <(git verify-pack -v .git/objects/pack/*.idx| sort -k 3 -n | cut -f 1 -d " " | tail -10)

The number after tail (e.g., -10) determines the number of files displayed. Change this value to view a different number of files.
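
As an alternative sketch, the listing can also be built with git cat-file --batch-check, which sees loose objects as well as packed ones and makes it easy to sort the final output by size. The function name largest_blobs is hypothetical, chosen only for illustration, and the sketch assumes a Git version whose --batch-check accepts a custom format with %(rest).

```shell
# largest_blobs [N]: print the N largest blobs reachable from any ref as
# "<size-in-bytes> <object-id> <path>", largest first (default N=10).
# The function name is illustrative; it is not part of git.
largest_blobs() {
  n="${1:-10}"
  # rev-list emits "<object-id> <path>"; cat-file keeps the path in %(rest).
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    awk '$1 == "blob"' |   # keep blobs only (drop commits and trees)
    cut -d' ' -f2- |       # drop the object-type column
    sort -k1,1 -n -r |     # sort numerically by size, descending
    head -n "$n"
}
```

Run it from inside the repo, for example largest_blobs 20 to see the 20 largest files.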

git filter-branch to remove large files from the history

The filter-branch command rewrites the history of the repo by applying a given filter to every commit. The following command deletes the image files (*.jpg, *.png, and *.gif) under assets/ from every commit in the history.

$ git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch "assets/*.jpg" "assets/*.png" "assets/*.gif"' --prune-empty --tag-name-filter cat -- --all

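The above command forces the rewrite to start even if backup refs from an earlier filter-branch run exist (-f), applies the filter (the string after --index-filter) to the index of each commit, removes commits left empty by the filter (--prune-empty), updates tags to point to the rewritten commits while keeping their names (--tag-name-filter cat), and processes all refs (-- --all).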
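
To confirm that the rewrite worked, one option is to ask git log whether any commit on a local branch or tag still touches the removed paths. The helper below is a minimal sketch; the name paths_in_history is hypothetical. It deliberately uses --branches --tags instead of --all, because --all would also include the backup refs that filter-branch keeps under refs/original, which still point at the old commits.

```shell
# paths_in_history PATHSPEC...: list each path matching the given pathspecs
# that some commit on a local branch or tag still touches.
# Prints nothing when the history is clean. The name is illustrative.
paths_in_history() {
  git log --branches --tags --pretty=format: --name-only -- "$@" |
    sed '/^$/d' |   # drop the blank separator lines between commits
    sort -u
}
```

For the command above, paths_in_history "assets/*.jpg" "assets/*.png" "assets/*.gif" should print nothing after a successful rewrite.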

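Tip: Use git count-objects -v to check the number of objects and the disk space they use (count and size for loose objects, in-pack and size-pack for packed ones). The old objects stay on disk until the cleanup step below, so compare the numbers before the rewrite and after the cleanup.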

Cleaning up repo

Remove the backup refs and reflogs that still point to the old commits, then garbage-collect the objects that are no longer referenced. Note that removing .git/logs/ deletes every reflog in the repo.

$ rm -Rf .git/refs/original
$ rm -Rf .git/logs/
$ git gc --aggressive --prune=now
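
The same cleanup can also be done with git's own commands instead of deleting files under .git by hand. The following is a sketch along the lines of the checklist in the git filter-branch manual; the function name purge_rewritten_history is hypothetical.

```shell
# purge_rewritten_history: drop filter-branch's backup refs, expire all
# reflogs, and garbage-collect objects that are no longer reachable.
# The function name is illustrative; it is not part of git.
purge_rewritten_history() {
  # Delete every backup ref stored under refs/original/.
  git for-each-ref --format='delete %(refname)' refs/original |
    git update-ref --stdin
  # Expire all reflog entries now so they stop pinning the old commits.
  git reflog expire --expire=now --all
  # Repack and prune unreachable objects immediately.
  git gc --quiet --aggressive --prune=now
}
```

Like the rm commands above, this is destructive: once it has run, the pre-rewrite commits can no longer be recovered from this repo.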

Updating remote repo

Push the rewritten repo to the remote server.

$ git push origin --force --all
$ git push origin --force --tags
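
Because a forced push overwrites the remote's branches and tags, it can help to preview it first. The sketch below wraps the two push commands with git push's --dry-run option, which reports what would be updated without sending anything; the function name preview_force_push is hypothetical.

```shell
# preview_force_push [REMOTE]: show which branches and tags a forced push
# would overwrite on REMOTE (default: origin) without changing the remote.
# The function name is illustrative; it is not part of git.
preview_force_push() {
  remote="${1:-origin}"
  git push "$remote" --force --all --dry-run &&
    git push "$remote" --force --tags --dry-run
}
```

Lines marked as forced updates in the output are the refs that the real push will replace.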
