git diffing gzipped files
Recently I found myself in a position where I wanted to diff a gzipped file across branches. The default git behavior was not helpful. A few config tweaks and I was on my way.
Here I have an example project with branches master and dev.
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$ git branch
dev
* master
You can see below that there is a compressed foo.gz file that differs across branches.
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$ ls
foo.gz
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$ gunzip --stdout foo.gz
Hello, World!
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$ git checkout dev
Switched to branch 'dev'
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$ gunzip --stdout foo.gz
Hello, Nick
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$
Now, I would like to get a line-by-line diff of the two files. However, git diff is not giving me what I had hoped for. It only tells me that the files are different.
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$ git diff master -- foo.gz
diff --git a/foo.gz b/foo.gz
index 155b5c0..0eedbc5 100644
Binary files a/foo.gz and b/foo.gz differ
To get the diffing behavior I want, it is up to me to define a custom diff driver. Use your favorite text editor to add an entry to your ~/.gitconfig that looks like this:
[diff "gz"]
binary = true
textconv = gunzip --stdout
The above block of text defines a diff driver, gz. The binary and textconv lines are settings of that driver (diff.gz.binary and diff.gz.textconv). Setting binary = true tells git to treat matching files as binary, and textconv names a command, here gunzip --stdout, that git runs on each file to convert it to text before diffing. Put succinctly, the driver decompresses our files before diffing.
Great, we now have a driver we can use to diff our gzipped files. We should probably tell git to use this driver when dealing with .gz files. An entry in ~/.gitattributes will accomplish this.
*.gz diff=gz
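The attribute line can also be added from the shell. A small sketch, assuming the attributes file lives at ~/.gitattributes as in this post; the attrs variable is just a name chosen for illustration, and the grep guard keeps the line from being appended twice.

```shell
# Location of the global attributes file used in this post.
attrs="$HOME/.gitattributes"

# Append the pattern only if that exact line is not already present.
grep -qxF '*.gz diff=gz' "$attrs" 2>/dev/null || echo '*.gz diff=gz' >> "$attrs"

# Show the resulting file.
cat "$attrs"
```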
Lastly, if it is not already set, be sure git knows the location of your attributes file. Add the following to your ~/.gitconfig if needed.
[core]
attributesfile = ~/.gitattributes
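Here is one way to set that option from the shell and then ask git which diff driver it would pick for a .gz path. It is a sketch under a couple of assumptions: git check-attr has to run inside a repository, so a throwaway one is created with mktemp, and foo.gz is only a path name here (the file does not need to exist).

```shell
# Point git at the global attributes file. The quoted ~ is stored literally
# and expanded by git when it reads the setting.
git config --global core.attributesFile '~/.gitattributes'
git config --global --get core.attributesFile

# check-attr must run inside a repository, so use a throwaway one.
cd "$(mktemp -d)" && git init -q

# Ask git which value the diff attribute has for a .gz path.
git check-attr diff -- foo.gz
```

Once the *.gz line from the previous step is in place, the last command should report gz as the value of the diff attribute for foo.gz; without it, the attribute shows up as unspecified.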
Now, let us try our diff again.
[~/Scratchpad/git_gz_diff]
nickphair@Nicks-MacBook-Pro:$ git diff master -- foo.gz
diff --git a/foo.gz b/foo.gz
index 155b5c0..0eedbc5 100644
--- a/foo.gz
+++ b/foo.gz
@@ -1 +1 @@
-Hello, World!
+Hello, Nick
Success!
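If you want to try the whole thing without touching your real configuration, the steps above can be strung together roughly as follows. This is a sketch, not the exact commands I ran: the scratch directory, the temporary HOME, and the placeholder user.name and user.email are all assumptions made so the script is self-contained.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory with a throwaway HOME so the real
# ~/.gitconfig and ~/.gitattributes are left alone.
scratch=$(mktemp -d)
export HOME="$scratch/home"
export GIT_CONFIG_NOSYSTEM=1   # ignore system-wide git config
unset XDG_CONFIG_HOME          # ignore any XDG-style git config
mkdir -p "$HOME" "$scratch/repo"
cd "$scratch/repo"

# Placeholder identity so git commit works in the fresh HOME.
git config --global user.name "Example"
git config --global user.email "example@example.com"

# The gz diff driver and the attribute that selects it.
git config --global diff.gz.binary true
git config --global diff.gz.textconv "gunzip --stdout"
git config --global core.attributesFile "$HOME/.gitattributes"
echo '*.gz diff=gz' > "$HOME/.gitattributes"

# Recreate the example project: master and dev with different foo.gz.
git init -q
git symbolic-ref HEAD refs/heads/master
printf 'Hello, World!\n' | gzip > foo.gz
git add foo.gz
git commit -q -m "Add foo.gz"

git checkout -q -b dev
printf 'Hello, Nick\n' | gzip > foo.gz
git commit -q -am "Change greeting"

# Diff dev's foo.gz against master's; the driver decompresses both sides.
git --no-pager diff master -- foo.gz
```

Keep in mind that a textconv diff is meant for reading. It describes the decompressed text, not the bytes stored in foo.gz.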