I often find myself searching for certain commits using git log and friends. While I really love
the power and flexibility that come with Git and other Unix command-line tools, sometimes it can be more
convenient to use a database to filter and aggregate commit data.
I gave it a quick try yesterday and imported the commit history of ArangoDB’s Git repository into ArangoDB
and ran some queries on the data. While the query results for our repository may not be interesting for everyone,
I think it is still worth sharing what I did. Even though I didn’t try it, I think the overall procedure is
applicable to any other Git repository.
Converting the Git history to JSON
The way to extract history and commit data from a local repository is to use git log. Though its output
is highly customizable, it does not provide an out-of-the-box way to produce JSON. So I wrote a simple
wrapper script (in PHP) around it. The script can be found here.
Here’s how to run it:
php git-log-to-json.php > arango-commits-master-201503.json
The script may run for a few minutes on bigger repositories such as ours. In the end, it should produce a JSON
file named arango-commits-master-201503.json.
I have also uploaded the JSON file here. Note that the
file only contains commits from the master branch, not every commit ever made in the ArangoDB repository.
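For readers who would rather not use PHP, the general idea of wrapping git log can be sketched in Python. This is not the script mentioned above, only a minimal illustration under a few assumptions: the field names (sha, author, email, date, message) and the function names are made up for this example, the output is one JSON document per line, and the ASCII control characters used as separators are assumed not to occur in commit subjects.

```python
import json
import subprocess
import sys

# Separators assumed not to appear in commit data (an assumption of this sketch).
FIELD_SEP = "\x1f"   # ASCII unit separator, placed between fields
RECORD_SEP = "\x1e"  # ASCII record separator, placed after each commit

# Hypothetical attribute names; the real script may use different ones.
FIELDS = ["sha", "author", "email", "date", "message"]

# git log placeholders: full hash, author name, author email,
# author date (strict ISO 8601), subject line.
FORMAT = FIELD_SEP.join(["%H", "%an", "%ae", "%aI", "%s"]) + RECORD_SEP


def parse_git_log(raw):
    """Turn the separator-delimited output of git log into a list of dicts."""
    commits = []
    for record in raw.split(RECORD_SEP):
        record = record.strip("\n")  # git puts a newline between commits
        if not record:
            continue
        commits.append(dict(zip(FIELDS, record.split(FIELD_SEP))))
    return commits


def read_commits(repo_path="."):
    """Run git log in repo_path and return the parsed commits."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:" + FORMAT],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # tolerate commit data that is not valid UTF-8
        check=True,
    )
    return parse_git_log(result.stdout)


def dump_commits(repo_path=".", out=sys.stdout):
    """Write one JSON document per commit, one per line."""
    for commit in read_commits(repo_path):
        out.write(json.dumps(commit) + "\n")
```

Calling dump_commits() from inside a local clone and redirecting the output to a file should give a line-per-commit JSON file. Keeping the parsing in parse_git_log, separate from the git invocation, makes it possible to check the conversion without a repository at hand.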
Importing the commits into ArangoDB
The simplest way to get the commits into ArangoDB is to use arangoimp: