Saturday, June 19, 2010

Simplify tolua++ with autotools

In Ember we use Lua for our scripting needs. The bindings to the C++ parts of the client are provided by the tolua++ library. Tolua++ works by generating C++ source code from .pkg files, which are simplified .h files. This works out extremely well; the developer only needs to copy the .h file to a similarly named .pkg file and remove those methods and fields that shouldn't be exported. The command line tool "tolua++" is then run to produce the C++ source.

Previously in Ember we handled this by keeping the generated C++ source checked into the repository and requiring each developer to rerun the tolua++ command whenever a Lua binding was added or changed. This had a couple of downsides, chief among them that the generated code differs slightly depending on the version of tolua++ used. It also resulted in some very large Git commits of generated code, which tended to pollute the Git history.
A much better solution is to automate the generation of the binding source and bake it into the normal build system. Fortunately, the Autotools already provide all the facilities for making this happen.

An example of how this is handled in Ember through a Makefile.am can be found here:

SUFFIXES: .cxx .pkg .lo .la .cpp .o .obj

.pkg.cxx:
	cd $(srcdir) && TOLUAXX=${TOLUAXX} $(abs_top_srcdir)/scripts/update_lua_bindings.sh `basename $@ .cxx` `basename $@ .cxx`.pkg $(abs_builddir)/`basename $@` $<

INCLUDES = -I$(top_srcdir)/src -I$(srcdir) -I$(top_builddir)/src -DPREFIX=\"@prefix@\"

noinst_LIBRARIES = liblua_EmberServices.a
liblua_EmberServices_a_SOURCES = EmberServices.cxx

CLEANFILES = EmberServices.cxx
TOLUA_PKGS = ConfigService.pkg EmberServices.pkg IInputAdapter.pkg Input.pkg InputService.pkg LoggingService.pkg MetaserverService.pkg ScriptingService.pkg ServerService.pkg
EXTRA_DIST = $(TOLUA_PKGS)
EmberServices.cxx: $(TOLUA_PKGS)

The files referenced in TOLUA_PKGS are the .pkg files which define the bindings. These will be fed through the update_lua_bindings.sh script to generate the file EmberServices.cxx, which is then compiled and added to the liblua_EmberServices.a archive. Note that we need to add EmberServices.cxx to the CLEANFILES variable to make sure that it's deleted when we're cleaning up. Since it's generated through the tolua++ tool, Automake can't keep track of it itself.
The update_lua_bindings.sh script looks like this:

#! /bin/sh
#tolua++ will for some reason translate "const std::string" into "const,std::string", so we need to remove these incorrect commas from the final code
#some versions will also for some unexplainable reason not correctly remove the tolua specific directive tolua_outside, so we need to clean that out also
#We'll also replace the inclusion of "tolua++.h" with our own version which has better support for building on win32.
echo "Updating lua bindings."

#If the TOLUAXX environment variable isn't set default to using the command "tolua++".
if [ "x${TOLUAXX}" = x ]; then
    TOLUAXX=tolua++
fi
${TOLUAXX} -n $1 $2 > $3
grep -q '** tolua internal error' $3 && cat $3 && exit 1
sed -i -e 's/const,/const /g' -e 's/tolua_outside//g' -e 's/tolua++\.h/components\/lua\/tolua++\.h/' $3

This script runs the tolua++ command named by the TOLUAXX environment variable (falling back to "tolua++") and then applies a few replacements to work around issues we've had with the generated code.
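To make the effect of those replacements concrete, here is a minimal sketch that applies the same three sed expressions to a few sample lines. The function name fix_tolua_output and the sample input are made up for illustration; real tolua++ output will look different. Unlike the script, it filters stdin rather than editing a file in place.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ember): applies the same three
# substitutions as update_lua_bindings.sh, reading stdin and writing stdout.
fix_tolua_output() {
    sed -e 's/const,/const /g' \
        -e 's/tolua_outside//g' \
        -e 's/tolua++\.h/components\/lua\/tolua++\.h/'
}

# Made-up sample lines that contain the three patterns the script cleans up.
printf '%s\n' \
    '#include "tolua++.h"' \
    'const,std::string name;' \
    'tolua_outside void example();' \
    | fix_tolua_output
```

The include line gets the components/lua/ prefix, the stray comma after const becomes a space, and the leftover tolua_outside directive is removed.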
The TOLUAXX environment variable is set at configuration time. The default is "tolua++", but we'll provide the option to use an alternative command. Our acinclude.m4 file has this snippet (which is called from configure.ac):

AC_DEFUN([AM_CHECK_TOLUAXX],
[
AC_ARG_WITH(tolua++,AS_HELP_STRING([--with-tolua++=CMD],[Tolua++ command (default=tolua++)]),
toluaxx_command="$withval", toluaxx_command="tolua++")

AC_CHECK_TOOL(TOLUAXX, $toluaxx_command)

if test "x$TOLUAXX" = "x"; then
AC_MSG_ERROR([Could not find a working tolua++ command (tried '$toluaxx_command'). Use the --with-tolua++ switch to set the proper command to use.])
fi
])

This setup will allow the script bindings to be generated at compile time, but only the first time compilation occurs, or if any of the .pkg files have changed. More examples of how this is used can be found in the Ember sources.

Tuesday, June 15, 2010

Using automake to generate ChangeLog from git

When deploying a GPL application it's required that you provide a change log containing information on all the changes to the code base. This file is normally named ChangeLog.
In the olden days, before today's fancy revision control systems, people often had to edit the ChangeLog files themselves for each and every change to the code. This is what we did in the Worldforge project when we used CVS for our source code needs.

However, for the modern developer used to distributed revision control systems such as Git this seems archaic. Why provide a separate list of all changes when the Git repository already contains a complete log? Instead of keeping the ChangeLog file updated with each commit, wouldn't it make more sense to generate it from the repository history when a new release is made?

When a new release needs to be created in a project using the autotools, the "make dist" target is invoked (or in reality the "make distcheck" target). This packages the source and produces a tar archive of it all. The Makefile target "dist-hook" is provided to let us hook into this process. So what we want to do is provide some shell scripting which generates a ChangeLog file from the git history. It will look something like this:
cd ${top_srcdir} && git log --stat --name-only --date=short --abbrev-commit > ${distdir}/ChangeLog
This will generate a log of all the changes, in a condensed format not unlike the ones suggested by GNU.

We could be done here, but there are some things that could use improvement. For one thing, this will only work when the "make dist" target is invoked in the original git directory. Some distributions might want to take the dist release, add some distro-specific patches, and then make a new dist from that. It would therefore be better to add some logic which recognizes whether a generated ChangeLog already exists, and if so doesn't try to generate a new one from the git log. This needs a bit more logic, so we'll split this functionality out into a separate script and call that from the dist-hook target. We'll also add a ChangeLog with exactly one line:
This file is autogenerated from the Git history when the "dist" make target is invoked. If you find this file in an official release something has gone wrong and you should contact [maintainer_email]. It needs to be exactly one line long in order for the ChangeLog generating script to work.
Our script will check the length of the ChangeLog file. If it contains only that single line, we know we should generate it from the git log; otherwise it has already been generated and we leave it alone. Note that "wc -l" counts newline characters, so the comparison against "0" in the script only matches if the one-line placeholder has no trailing newline. Our script (generate-ChangeLog.sh) will look something like this:
#! /bin/sh
top_srcdir=$1
distdir=$2
if [ `cat ${distdir}/ChangeLog | wc -l` = "0" ]; then
    chmod u+w ${distdir}/ChangeLog && cd ${top_srcdir} && git log --stat --name-only --date=short --abbrev-commit > ${distdir}/ChangeLog
fi
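The check in the script relies on how wc -l counts lines: it reports the number of newline characters, not the number of visible lines. The following sketch illustrates this with a stand-in placeholder text; the helper name count_newlines is hypothetical, and the tr call only strips the padding that some wc implementations add.

```shell
#!/bin/sh
# Hypothetical helper: prints how many newline characters arrive on stdin.
count_newlines() {
    wc -l | tr -d ' \t'
}

# One visible line without a trailing newline: wc -l reports 0.
printf 'placeholder notice' | count_newlines

# The same line with a trailing newline: wc -l reports 1.
printf 'placeholder notice\n' | count_newlines
```

If your editor appends a newline to the placeholder ChangeLog, the comparison value in the script would need to be "1" rather than "0".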
The dist-hook will now look like this:
dist-hook:
	sh $(top_srcdir)/generate-ChangeLog.sh $(top_srcdir) $(distdir)
Now we're pretty much set. With Worldforge, however, there's still an issue with the older CVS ChangeLog. When the source was migrated from CVS to Git, the log messages from CVS were imported directly into Git. These were in a quite verbose format with the date and authors included. Since this data is also available as metadata in the Git repo, these log entries carry a lot of redundant data, making the ChangeLog both very large and confusing. To prevent this, we'll store a copy of the old ChangeLog with the CVS entries alongside the source (as ChangeLog-CVS) and combine it with the Git log history. Looking at the Git history, we can see that the commit with id "f12012e7616c191a8926432faf866c8e43854062" marks where the transition from CVS to Git happened. In the dist we'll also replace the contents of ChangeLog-CVS with a notice about its previous use (it's not needed anymore, but must be present in the dist). Our script will then look like this:
#! /bin/sh
top_srcdir=$1
distdir=$2
if [ `cat ${distdir}/ChangeLog | wc -l` = "0" ]; then
    chmod u+w ${distdir}/ChangeLog && cd ${top_srcdir} && git log f12012e7616c191a8926432faf866c8e43854062..HEAD --stat --name-only --date=short --abbrev-commit > ${distdir}/ChangeLog && cat ${top_srcdir}/ChangeLog-CVS >> ${distdir}/ChangeLog
    chmod u+w ${distdir}/ChangeLog-CVS && echo "This file was needed for generating the proper ChangeLog as an aggregate of the code held in git and older code in CVS. It's now empty, but needs to be included in the source distribution to not upset automake." > ${distdir}/ChangeLog-CVS
fi
And now we're done. We have an automated system which generates the ChangeLog from a combination of the old CVS-provided ChangeLog and the Git log history. It also makes sure that if the "make dist" target is run again on an already generated dist, the ChangeLog is left as it is. Just remember to also include all relevant files in the EXTRA_DIST variable of the Makefile.am.
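The "leave it alone on a second run" behaviour can be tried out without a Git repository. The sketch below keeps the same line-count guard but swaps git log for a stub function, so everything in it (fake_git_log, generate_changelog, the temporary directory, the sample text) is hypothetical and only meant to illustrate the idea.

```shell
#!/bin/sh
# Stub standing in for `git log`, so the sketch runs outside a repository.
fake_git_log() {
    printf 'commit abc1234\n    Example change\n'
}

# Same guard as generate-ChangeLog.sh: only regenerate while the file
# still holds the newline-free, one-line placeholder.
generate_changelog() {
    changelog=$1
    if [ "$(wc -l < "$changelog" | tr -d ' \t')" = "0" ]; then
        fake_git_log > "$changelog"
        echo generated
    else
        echo skipped
    fi
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf 'placeholder notice' > "$workdir/ChangeLog"  # no trailing newline
generate_changelog "$workdir/ChangeLog"  # first run replaces the placeholder
generate_changelog "$workdir/ChangeLog"  # second run leaves the file untouched
```

The first call reports that it generated the file and the second that it skipped, which mirrors what happens when a distribution runs "make dist" again on an unpacked release.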
Up to date versions of how this is used in Ember can be found in the Ember repository.