The best kittens, technology, and video games blog in the world.

Wednesday, May 04, 2016

Sensible git diff for json files

cat #1227 by K-nekoTR from flickr (CC-NC-ND)
JSON is rarely great for anything, but it's very often good enough, so one thing you'll often run into is JSON in git repositories.

Human-edited JSON (for which you should probably use something like JSON5 instead) is reasonable to work with.

Unfortunately, typical machine-generated JSON is completely undiffable. It's all one big line, so running git diff or git log -p shows the full content on both the before and after sides, and you'll have no idea what actually changed without copying both versions to external files, formatting them, and diffing the result, which is a fairly slow and messy process.

Fortunately, this is solvable. First, get the latest version of my unix-utilities repository for the json_pp script for pretty-printing JSON (or use any other JSON pretty-printer), and put it somewhere in your $PATH.

Then tell git to treat JSON as a special file type for the purpose of diffing.

To do it globally:

echo "*.json diff=json" >> ~/.gitattributes
git config --global core.attributesfile ~/.gitattributes
git config --global diff.json.textconv json_pp

Or instead you can do it for just one project:

echo "*.json diff=json" >> .gitattributes
git config diff.json.textconv json_pp
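
To see the effect without touching a real project, here is a minimal sketch that applies the per-project setup in a throwaway repository. It assumes python3 is installed and uses "python3 -m json.tool" as a stand-in pretty-printer instead of json_pp; the file name data.json and its contents are made up for the demo.

```shell
#!/bin/sh
# Sketch: per-project JSON textconv in a throwaway repository.
# Assumption: python3 is on $PATH; "python3 -m json.tool" stands in for json_pp.
set -e

cd "$(mktemp -d)"
git init -q .
git config user.email "demo@example.com"   # local identity so commit works anywhere
git config user.name "Demo"

# The same two settings as above, with a different pretty-printer.
echo "*.json diff=json" >> .gitattributes
git config diff.json.textconv "python3 -m json.tool"

# Confirm git maps *.json files to the "json" diff driver.
attr_output=$(git check-attr diff -- data.json)
printf '%s\n' "$attr_output"

# Commit a one-line JSON file (hypothetical content), then change one value.
printf '{"name":"kitten","lives":9}\n' > data.json
git add .gitattributes data.json
git commit -q -m "add data"
printf '{"name":"kitten","lives":8}\n' > data.json

# The diff is computed on the pretty-printed text, so only the changed key shows up.
diff_output=$(git diff -- data.json)
printf '%s\n' "$diff_output"
```

The diff should show one removed and one added line for the "lives" key rather than the whole file on both sides.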

You can use a similar technique for other sort-of-text-but-not-really files, like machine-generated XML.
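
For XML, the setup might look like the sketch below. It assumes xmllint (from libxml2) as the pretty-printer, which is my choice for illustration; any tool that takes a file name and writes formatted XML to stdout should work the same way. The throwaway repository and the file name report.xml are only there so the snippet can be run as-is.

```shell
# Sketch: the same trick for XML, in a throwaway repository.
# Assumption: xmllint is installed; swap in any other XML pretty-printer.
cd "$(mktemp -d)" && git init -q .

echo "*.xml diff=xml" >> .gitattributes
git config diff.xml.textconv "xmllint --format"

# Check that a (hypothetical) report.xml would use the "xml" diff driver.
xml_attr=$(git check-attr diff -- report.xml)
printf '%s\n' "$xml_attr"
```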

textconv is only applied to human-readable command output, so it doesn't affect any of git's internal workings. If you want to see raw diffs for any reason, you can always pass the --no-textconv argument to git diff.
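
A small sketch comparing the two outputs side by side, again in a throwaway repository. It assumes python3 is available and uses "python3 -m json.tool" in place of json_pp; the file name and contents are invented for the demo.

```shell
#!/bin/sh
# Sketch: the same change viewed with and without textconv.
# Assumption: python3 is on $PATH; "python3 -m json.tool" stands in for json_pp.
set -e

cd "$(mktemp -d)"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
echo "*.json diff=json" >> .gitattributes
git config diff.json.textconv "python3 -m json.tool"

printf '{"name":"kitten","lives":9}\n' > data.json
git add .gitattributes data.json
git commit -q -m "add data"
printf '{"name":"kitten","lives":8}\n' > data.json

# Default: the diff is taken over the pretty-printed text.
pretty=$(git diff -- data.json)
# Raw: the diff is taken over the file exactly as stored, one big line.
raw=$(git diff --no-textconv -- data.json)

printf '%s\n\n%s\n' "$pretty" "$raw"
```

The raw diff still contains the original single-line content, which shows that the stored file itself is untouched.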


Anon said...

No need for custom ruby binaries, by the way. Any machine with Python has a simple json formatter in the stdlib, just replace the json_pp with "python -m json.tool"

taw said...

Yeah, json_pp seems to do pretty much the same thing as python's json.tool (other than 2 vs 4 space indentation).
Nice find.