Tuesday, June 15, 2010

Using automake to generate ChangeLog from git

Projects that follow the GNU conventions are expected to ship a change log describing the changes made to the code base, and automake in its default "gnu" strictness mode complains if that file is missing. This file is normally named ChangeLog.
In the olden days, before today's fancy revision control systems, people often had to edit the ChangeLog files themselves for each and every change to the code. This is what we did in the Worldforge project when we used CVS for our source code needs.

However, for the modern developer used to distributed revision control systems such as Git this seems archaic. Why provide a separate list of all changes when the Git repository already contains a complete log? Instead of keeping the ChangeLog file updated with each commit, wouldn't it make more sense to generate it from the repository history when a new release is made?

When a new release is created in a project using the autotools, the "make dist" target is invoked (or, in practice, the "make distcheck" target). This packages the source and produces a tar archive of it all. Automake provides the "dist-hook" target so that we can hook into this process. So what we want to do is provide some shell scripting that generates a ChangeLog file from the git history. It will look something like this:
cd ${top_srcdir} && git log --stat --name-only --date=short --abbrev-commit > ${distdir}/ChangeLog
This will generate a log of all the changes, in a condensed format not unlike the ones suggested by GNU.

We could be done here, but a few things could be improved. For one thing, this only works when the "make dist" target is invoked in the original git directory. Some distributions might want to take the dist release, add some distro-specific patches, and then make a new dist from that. It would therefore be better to add some logic that recognizes whether a generated ChangeLog already exists, and if so doesn't try to generate a new one from the git log. This needs a bit more logic, so we'll split the functionality out into a separate script and call that from the dist-hook target. We'll also add a ChangeLog with exactly one line:
This file is autogenerated from the Git history when the "dist" make target is invoked. If you find this file in an official release something has gone wrong and you should contact [maintainer_email]. It needs to be exactly one line long in order for the ChangeLog generating script to work.
Our script will check the length of the ChangeLog file. If it's exactly one line, we know we should generate it from the git log, else we know it's already been generated and we shouldn't do anything. Our script (generate-ChangeLog.sh) will look something like this:
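#! /bin/sh
# Usage: generate-ChangeLog.sh <top_srcdir> <distdir>
top_srcdir=$1
distdir=$2
# The placeholder is a single line, so wc -l reports 0 or 1 depending on the trailing newline.
if [ "$(wc -l < "${distdir}/ChangeLog" | tr -d ' ')" -le 1 ]; then
  # Run git in a subshell so a relative ${distdir} is still valid for the redirect.
  chmod u+w "${distdir}/ChangeLog" && (cd "${top_srcdir}" && git log --stat --name-only --date=short --abbrev-commit) > "${distdir}/ChangeLog"
fi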
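The line count is the only thing the script inspects, so it may help to see how wc -l treats a one-line file. The following is a minimal sketch for experimenting, not part of the script itself; the function name changelog_is_placeholder is made up for illustration. Since wc -l counts newline characters, a one-line file reports 1 with a trailing newline and 0 without one, so the sketch accepts both.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): succeeds when the given
# ChangeLog is still the one-line placeholder, i.e. has at most one newline.
changelog_is_placeholder() {
    # Reading via stdin keeps the file name out of wc's output; tr strips
    # the padding that some wc implementations put around the number.
    lines=$(wc -l < "$1" | tr -d ' ')
    [ "$lines" -le 1 ]
}

# Try it on throwaway files.
tmp=$(mktemp -d)
printf 'placeholder without trailing newline' > "$tmp/a"
printf 'placeholder with trailing newline\n'  > "$tmp/b"
printf 'entry one\nentry two\n'               > "$tmp/c"

for f in a b c; do
    if changelog_is_placeholder "$tmp/$f"; then
        echo "$f: placeholder, would generate"
    else
        echo "$f: already generated, would skip"
    fi
done
rm -rf "$tmp"
```

Files a and b are treated as placeholders, while c, which already has several lines, is left alone.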
The dist-hook will now look like this:
dist-hook:
	sh $(top_srcdir)/generate-ChangeLog.sh $(top_srcdir) $(distdir)
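The line-count check covers re-running "make dist" from an unpacked release. As an optional extra safeguard, one could also confirm that git is installed and that the source directory really belongs to a repository before calling git log. This is a sketch of a possible addition, not something the script described here does; the function name have_git_history is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: succeeds only if git is installed and git can find
# a repository starting from the given directory.
have_git_history() {
    command -v git >/dev/null 2>&1 || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$1" 2>/dev/null && git rev-parse --is-inside-work-tree >/dev/null 2>&1 )
}

# Check the directory given as first argument, or the current one.
if have_git_history "${1:-.}"; then
    echo "Git history available, ChangeLog can be generated"
else
    echo "no Git history here, leaving ChangeLog untouched"
fi
```

With such a guard, a dist made from a patched tarball would simply keep whatever ChangeLog it already contains.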
Now we're pretty set. With Worldforge, however, there's still an issue with the older CVS ChangeLog. When the source was migrated from CVS to Git, the log messages from CVS were imported directly into Git. These were in a quite verbose format, with the date and authors included. Since this data is also available as metadata in the Git repo, these log entries would carry a lot of redundant data, making the ChangeLog both very large and confusing. To prevent this, we'll store a copy of the old ChangeLog with the CVS entries alongside the source (as ChangeLog-CVS) and combine it with the Git log history. Looking at the Git history, we can see that the commit with id "f12012e7616c191a8926432faf866c8e43854062" marks where the transition from CVS to Git happened. In the dist we'll also replace the ChangeLog-CVS with a notice about its previous use (as it's not needed anymore, but must be present in the dist). Our script will then look like this:
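#! /bin/sh
# Usage: generate-ChangeLog.sh <top_srcdir> <distdir>
top_srcdir=$1
distdir=$2
# Only generate when the ChangeLog is still the one-line placeholder.
if [ "$(wc -l < "${distdir}/ChangeLog" | tr -d ' ')" -le 1 ]; then
  # Git entries since the CVS migration come first, then the old CVS-era log.
  chmod u+w "${distdir}/ChangeLog" && (cd "${top_srcdir}" && git log f12012e7616c191a8926432faf866c8e43854062..HEAD --stat --name-only --date=short --abbrev-commit) > "${distdir}/ChangeLog" && cat "${top_srcdir}/ChangeLog-CVS" >> "${distdir}/ChangeLog"
  # The dist copy of ChangeLog-CVS is no longer needed, so replace it with a notice.
  chmod u+w "${distdir}/ChangeLog-CVS" && echo "This file was needed for generating the proper ChangeLog as an aggregate of the code held in git and older code in CVS. It's now empty, but needs to be included in the source distribution to not upset automake." > "${distdir}/ChangeLog-CVS"
fi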
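The ordering matters here: the Git-derived entries are written first and the CVS-era ChangeLog is appended after them, so the combined file reads from newest to oldest. Below is a minimal sketch of just that append-and-stub step, with the Git part replaced by a pre-written file so it runs without a repository. The function name append_cvs_log, the notice text, and the file names in the demo are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: append the CVS-era log to an already written
# Git-derived ChangeLog, then replace the distributed ChangeLog-CVS
# with a short notice.
#   $1 = ChangeLog in the dist dir (already holds the Git entries)
#   $2 = ChangeLog-CVS in the source tree
#   $3 = ChangeLog-CVS in the dist dir
append_cvs_log() {
    # Files copied into the dist dir may be read-only.
    chmod u+w "$1" "$3" || return 1
    cat "$2" >> "$1" || return 1
    echo "Historic CVS entries are now part of ChangeLog." > "$3"
}

# Demo on throwaway files.
tmp=$(mktemp -d)
printf 'git entry (newer)\n' > "$tmp/ChangeLog"
printf 'cvs entry (older)\n' > "$tmp/ChangeLog-CVS.src"
cp "$tmp/ChangeLog-CVS.src" "$tmp/ChangeLog-CVS"
chmod a-w "$tmp/ChangeLog" "$tmp/ChangeLog-CVS"

append_cvs_log "$tmp/ChangeLog" "$tmp/ChangeLog-CVS.src" "$tmp/ChangeLog-CVS"
cat "$tmp/ChangeLog"
rm -rf "$tmp"
```

The copy of ChangeLog-CVS in the source tree is never modified; only the copy inside the dist directory is replaced.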
And now we're done. We have an automated system which generates the ChangeLog from a combination of the old CVS-provided ChangeLog and the Git log history. It also makes sure that if the "make dist" target is run again on an already generated dist, the ChangeLog is left as it is. Just remember to also list all relevant files in the EXTRA_DIST variable of the Makefile.am.
Up to date versions of how this is used in Ember can be found in the Ember repository.
