Wednesday, 11 November 2009

Setting up Hudson CI for GWT Development

Hudson CI for GWT Development (made with Gliffy)

GWT development requires a 32 bit Sun JDK.

We decided to do this by using a 32 bit VM, which could be moved to another server at a later date.

  1. Install VMware Player (ensure networking is enabled during installation)

  2. Download a 32 bit Ubuntu ISO (or 32 bit Debian)

  3. Create a new VM from the ISO.

  4. Log in to the VM and establish its IP address.

  5. Add the Hudson repo to /etc/apt/sources.list
    deb http://hudson.gotdns.com/debian binary/
  6. Install Sun JDK
    apt-get install sun-java6-jdk 
    update-alternatives --config java
    update-alternatives --config javac
  7. Install Hudson
    apt-get install hudson
  8. Change the Hudson port and AJP port by editing /etc/init.d/hudson to add
    HUDSON_ARGS="--httpPort=8081  --ajp13Port=8102" 
  9. Install Apache
    apt-get install apache2
    a2enmod proxy
    a2enmod proxy_http
  10. Set up /etc/apache2/httpd.conf (or similar)
    ServerName hudson
    ProxyPass / http://localhost:8081/
    ProxyPassReverse / http://localhost:8081/
    ProxyRequests Off
    <Proxy http://localhost:8081/*>
    Order deny,allow
    Allow from all
    </Proxy>
  11. Install Maven
    apt-get install maven2
  12. Configure Hudson: set up SMTP and set MAVEN_HOME to /usr/share/maven2
  13. Add individual projects to Hudson
  14. Download GWT toolkit 1.7.1, extract it, then install gwt-dev-linux.jar into your local repo
    mvn install:install-file \
    -DgroupId=com.google.gwt -DartifactId=gwt-dev \
    -Dversion=1.7.1 -Dclassifier=linux \
    -Dpackaging=jar -Dfile=gwt-dev-linux.jar

Friday, 4 September 2009

Software Development with Certainty

My most recent project was to turn an impressive prototype developed by one person into a library, and applications based upon it, that can be simultaneously developed and used by a team.

The development process gains a new dimension once you have published your first version or have your first user. The task is made trickier by the absence of the original author.
In other words: situation normal, don't start from here.

The Erewhon applications, Gaboto library and its main dependencies ng4j and Jena are all changing rapidly. New functionality is being added and code and dependencies are being refactored and changed. The challenge is to enable this change without breaking installed systems or at least not breaking them unknowingly. This is ensured by establishing a contract between the code and the design by the use of tests. The tests guarantee that the system actually does do what it claims. Or, more properly, the tests are exactly what the system claims to do.

The minimum requirement to ensure that a project remains alive is that it keeps up with the current versions of all the libraries it depends upon. There are two approaches to the management of dependencies: Saltation and Continuous Integration.
Saltation is when you leave updating the dependencies until a blitz, in which you update all of them and re-test. The problem with Saltation is that it appears to the outside world that nothing is going on and the project is approaching stagnation. When you do address the issue there is a lot of work to do, there is no obvious connection to the change that caused the problem, and the developers of the library who caused the problem have moved on and will not immediately know what they did to break your build.

By contrast, using a Continuous Integration methodology one can expect to identify the particular commit that broke the build!

Continuous Integration relies upon repeating a repeatable build process. For this we turned to Maven as it addresses the other problem which we and our dependencies have: Dependency Management.

Dependency Management is a much re-invented wheel, addressing a problem which has re-occurred time and time again. In the world of Windows programming the problem is known as DLL Hell. In the Linux world there are two predominant systems for managing package dependencies: Deb from Debian and RPM from Red Hat. Quite astonishingly, in the Java world the problem was re-invented, as jar hell, by the practice of not versioning jar files. This anti-pattern was thought to be extinct by about 2001; however, it has clung on in the ecosystem surrounding Jena.

In addition to a repeatable build process with dependency management, Maven offers code quality tools such as static analysers and style checkers, and runs tests, so enabling code coverage metrics.

Unlike Ant, which is a Turing complete scripting language with XML syntax, Maven is a project build system guided by an explicit definition of best practice and a widely used set of conventions which all Java programmers can now be expected to know.

Maven project definitions do not quite take us to Continuous Integration nirvana: we have a repeatable build, but now we need to repeat it and to define when it should be repeated.

To repeat the build we use Hudson, which can schedule, monitor and record builds defined as Ant scripts, bash scripts and commands as well as Maven builds. Based upon experience with Continuum and Cruise Control, Hudson is currently the best of breed of the Java CI servers: it just works.

The build should be repeated whenever any code within the project is changed or whenever any code in any dependency is changed. This is achieved by publishing SNAPSHOT builds after every commit.

A Maven project defines a single artefact, usually a jar but possibly a war, ear or website.
The artefacts are published to a repository which has the same structure as the main Maven repository.
Projects are versioned in the usual way; however, between explicit versions a SNAPSHOT version is published, usually nightly.

  1. 0.1
  2. 0.2-SNAPSHOT
  3. 0.2
  4. 0.3-SNAPSHOT
  5. 0.3
  6. 0.4-SNAPSHOT


A non-SNAPSHOT build should not depend upon any SNAPSHOT artefacts.

The process of publishing a new version is now straightforward and should only involve incrementing the version, removing the SNAPSHOT qualifier from this project and all its dependencies, tagging the source control system and deploying the artefact. Immediately after publication the SNAPSHOT qualifier is added back to this project and all its dependencies.
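
The version sequence above can be sketched in a few lines of Java. This is an illustration only: MavenVersionSketch and its methods are hypothetical names, not part of Maven, and it assumes simple dotted versions such as 0.2 or 0.2-SNAPSHOT.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: hypothetical helper, not a Maven API.
// Assumes versions look like "0.2" or "0.2-SNAPSHOT".
final class MavenVersionSketch {
    private static final String SNAPSHOT = "-SNAPSHOT";

    static boolean isSnapshot(String version) {
        return version.endsWith(SNAPSHOT);
    }

    /** "0.2-SNAPSHOT" becomes "0.2": the version that is tagged and deployed. */
    static String release(String snapshotVersion) {
        if (!isSnapshot(snapshotVersion)) {
            throw new IllegalArgumentException("Not a SNAPSHOT: " + snapshotVersion);
        }
        return snapshotVersion.substring(0, snapshotVersion.length() - SNAPSHOT.length());
    }

    /** "0.2" becomes "0.3-SNAPSHOT": the version development continues on. */
    static String nextSnapshot(String releasedVersion) {
        if (isSnapshot(releasedVersion)) {
            throw new IllegalArgumentException("Already a SNAPSHOT: " + releasedVersion);
        }
        int lastDot = releasedVersion.lastIndexOf('.');
        String prefix = releasedVersion.substring(0, lastDot + 1);
        int last = Integer.parseInt(releasedVersion.substring(lastDot + 1));
        return prefix + (last + 1) + SNAPSHOT;
    }

    /** A non-SNAPSHOT build should not depend upon any SNAPSHOT artefacts. */
    static boolean releasable(String version, List<String> dependencyVersions) {
        if (isSnapshot(version)) {
            return false;
        }
        for (String dependency : dependencyVersions) {
            if (isSnapshot(dependency)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String released = release("0.2-SNAPSHOT");
        System.out.println(released + " then " + nextSnapshot(released));
        System.out.println(releasable(released, Arrays.asList("1.0", "2.1-SNAPSHOT")));
    }
}
```

In practice the version lives in the pom.xml rather than in code; the sketch only makes the rule explicit.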

In this state of bliss the creation of a new version is very little work and any breach of the contract between tests and code is caught as it is committed.

Tuesday, 18 August 2009

Rendering ancient languages with CSS3 font-face

Re-reading The Decipherment of Linear B led me to revisit the CSS3 feature @font-face and to address my ignorance of Unicode.

As always it is just about to be ready, really. The browsers which work: Firefox 3.5.2, Android cupcake, Safari (Mac). The browsers which fail: IE6, Firefox 3.0, Konqueror, Safari (iPhone). Opera works, but assumes one font per character sequence.

Unicode has Linear B sorted and will have Egyptian done sometime soon (October 2009).

There are some very nice, free fonts created by George Douros.

See the resulting Browser Unicode rendering test page.

Monday, 27 July 2009

Bye then Ben and Nevis. Coo Cows.


This morning Herbie and the man from the abattoir will despatch Ben and Nevis. They were brothers and were bought some eighteen years ago. They have kept their fields from turning into woodland and have become local landmarks. Although they are bullocks, their size and six-foot horn span mean that they are best viewed from a distance.

Whilst actively seeking out human company, especially in winter when they were fed, they have never been handled and so are impossible to treat for ailments. What has led the vet to say that they must be put down is an insect infestation under their fur.

They were bought by Noel to maintain the bio-diversity of the grassland, which they did spectacularly: the field is thought to have the highest score in the West Country, though this is likely to be as much to do with its bewildering geology as its management.

Noel also chose them because they themselves do not require the attention which he lavished on them anyway, as he just liked Highland cattle. The twice daily walk to feed them undoubtedly added years to his life, so their ending is even more sad, as another connection to Noel passes.

Friday, 24 July 2009

Hell is other people's code

Refactoring java packages

Maintenance programming on open source projects is much, much more pleasant than using non-writable libraries, as you are able to read with a pen; by which I mean that as you understand the code you can document that understanding. If code is difficult to understand because it lacks structure or has unevocative naming, you can change it.

The maintenance programmer has a very different relation to the code than does the original author. The original author understands the intention of the code and inhabits the conceptual framework it embodies. The maintenance programmer strives to intuit.

As the understanding of the code grows so does the desire to rewrite it. Once I have tests for the main use cases for a library I am working with I am able to think about giving it a vigorous refactoring.

Refactoring is odd: it is hard and dangerous work, yet nothing is meant to change. The functionality of the code should, indeed must, not change. It is tricky to come up with a defense against the injunctions "if it ain't broke don't fix it" and "change for change's sake". I claim that unloved code is dying code and unchanging software is no longer soft. At the end of the process there should be some increase in beauty and hence comprehensibility and maintainability.

These qualities cannot be measured; however, there are things which can be measured: the links between packages (Java classes live within a namespace tree where each node is called a package), the number of classes and the proportion of interfaces and abstract classes to concrete classes. These and other metrics are created by the JDepend code metrics tool.
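
For orientation, the three headline figures JDepend reports per package are abstractness (A), instability (I) and distance from the main sequence (D). A minimal sketch of the arithmetic follows; PackageMetricsSketch is a hypothetical name, the counts are supplied by hand rather than read from class files, and returning 0 for an empty package is my assumption.

```java
// Sketch only: hypothetical class illustrating the arithmetic, not JDepend's API.
final class PackageMetricsSketch {

    /** A = abstract classes and interfaces / all classes in the package. */
    static double abstractness(int abstractTypes, int concreteTypes) {
        int total = abstractTypes + concreteTypes;
        return total == 0 ? 0.0 : (double) abstractTypes / total; // assumption: empty package scores 0
    }

    /**
     * I = Ce / (Ca + Ce), where Ca (afferent) counts packages that depend on
     * this one and Ce (efferent) counts packages this one depends on.
     */
    static double instability(int afferent, int efferent) {
        int total = afferent + efferent;
        return total == 0 ? 0.0 : (double) efferent / total; // assumption: isolated package scores 0
    }

    /** D = |A + I - 1|: how far the package sits from the line A + I = 1. */
    static double distance(double abstractness, double instability) {
        return Math.abs(abstractness + instability - 1.0);
    }

    public static void main(String[] args) {
        double a = abstractness(1, 3);
        double i = instability(1, 3);
        System.out.println("A=" + a + " I=" + i + " D=" + distance(a, i));
    }
}
```

A refactoring that reduces inter-package imports lowers Ce, and therefore I, for the packages concerned.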

Why package at all?

One of the code smells that refactoring should aim to reduce is the number of import statements within classes. There is a way in which to reduce the number of inter-package linkages: create all classes in the default package and be done with it. However, you cannot control, or even be aware of, name clashes in the default namespace, so it should NEVER be used. Another possibility is to put all of your code into one single package in a namespace you control. This too gets pretty silly pretty quickly for all but the most trivial project.

The purpose of Java package namespaces should be to impose a conceptual structure upon your code which helps you and other programmers understand it. Java packages have no functional effect, given the prevailing practice of restricting visibility modifiers to public or private and eschewing the default package visibility. The compiler does not care how you group your classes. There are no tools that I am aware of which will make any comments about how you do it. The only measurable thing is the number of import statements required in classes within the package. My claim is that for two package schemas with an equal number of packages one is better than the other if it results in fewer import statements.
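
Counting import statements is easy to approximate. The following is a rough sketch under stated assumptions: ImportCountSketch is a hypothetical name, it works on source text with a regular expression, and it does not try to skip imports that appear inside comments or strings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: a crude text-based count, not a parser.
final class ImportCountSketch {

    // A line beginning with "import" (optionally "import static"),
    // followed by a dotted name, an optional ".*", and a semicolon.
    private static final Pattern IMPORT_LINE = Pattern.compile(
            "^\\s*import\\s+(static\\s+)?[\\w.]+(\\.\\*)?\\s*;", Pattern.MULTILINE);

    /** Returns the number of import statements found in one source file's text. */
    static int countImports(String source) {
        Matcher matcher = IMPORT_LINE.matcher(source);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String source = "package net.mylib;\n"
                + "import java.util.List;\n"
                + "import java.util.*;\n"
                + "class Entity {}\n";
        System.out.println(countImports(source));
    }
}
```

Summing this over every class in a package, before and after a refactoring, gives the comparison between two package schemas suggested above.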

Anti Patterns


Separating all exceptions into a separate package
This will necessarily increase the number of inter-package links, as each exception will have to be imported from the exceptions package. Exceptions should be declared in the package they are thrown in, or in the nearest parent node if they are thrown in two packages.

A new package for a new concept
"Today is a new day", "I am adding new stuff" and "I want my stuff to be separate from the previous work" are all bad justifications for creating a new package.

Overuse of checked exceptions
Checked exceptions should only be used for events which the application programmer can respond to. Programming bugs should be unchecked exceptions.

Meaningless names and abbreviations
Naming is the most important aspect of writing maintainable code. Good naming removes the need to remember meaning.

Package names should be singular
This is analogous to the naming of tables in an SQL database: you might think that you store people records in a table called people, but everything falls out much more neatly if you call the table person, and not just because you avoid English irregular plurals.

Do not repeat yourself (DRY)
In fully qualified names, as in much else in programming, you should not repeat yourself. So com.mycompany.myproject.mypackage.MyProjectEntity should probably be named com.mycompany.myproject.mypackage.Entity

Use your own default package space
If you are writing a modelling library and its root package is to be net.mylib then use that space for the main guts, not net.mylib.model.
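
The checked versus unchecked distinction above can be illustrated with a small sketch. All names here (ExceptionSketch, EntityNotFoundException, lookup) are hypothetical; the exception is nested next to the code that throws it, standing in for "declared in the package it is thrown in".

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: hypothetical names, one file standing in for one package.
final class ExceptionSketch {

    /** Checked: a missing entity is an event the caller can respond to. */
    static class EntityNotFoundException extends Exception {
        EntityNotFoundException(String key) {
            super("No entity named " + key);
        }
    }

    static String lookup(Map<String, String> store, String key)
            throws EntityNotFoundException {
        if (key == null) {
            // A null key is a programming bug, so it is reported unchecked.
            throw new IllegalArgumentException("key must not be null");
        }
        String value = store.get(key);
        if (value == null) {
            throw new EntityNotFoundException(key);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<String, String>();
        store.put("oxford", "Oxford");
        try {
            System.out.println(lookup(store, "oxford"));
            System.out.println(lookup(store, "cambridge"));
        } catch (EntityNotFoundException e) {
            // The application can recover, for example by asking for another name.
            System.out.println("Recovered: " + e.getMessage());
        }
    }
}
```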

My latest refactoring with JDepend metrics before, nearly there and after.

Wednesday, 17 June 2009

Ant Rant - Version filtering deprecated

Filtering is one of those features which Ant makes available which was overused in the initial excitement when Ant was new.

By using version filtering you are tying your codebase into being built by Ant, or at least a build system which has the same filtering mechanism as Ant.

In Melati, WebMacro and now ten years later in Jena there was a feature to filter the code and insert the version number. The uses are so similar I suspect they were in some popular "Introduction to Ant".

We even had tests to ensure that the version variable had been successfully filtered.

assertNotEquals("@version@", Product.VERSION);

When I came to convert to Maven I wanted to keep this handy feature, but of course the test was failing.

I asked how to do filtering on the Maven lists way back. They were very scathing about the notion of a build process modifying production code. It is not the Maven way.

The question that needs a very convincing answer is: what is this for? Does it fit with how people use and develop the software?

As a developer, in your IDE, do you want to see

public static final String NAME = "@Name@";

or more sanely

public static final String NAME = "MyProject";

Filtering may work for the users but not for the developers.

Let's look at the actual use cases, taking Jena as a typical project. A sensible place for these strings is Jena.java. For each of the static strings in it, the following references are found.

PATH - 15 references - interestingly hardcoded, not interpolated.
NAME - 0 references
WEBSITE - 0 references
VERSION - 0 references
MAJOR_VERSION - 0 references
MINOR_VERSION - 0 references
REVISION_VERSION - 0 references
VERSION_STATUS - 0 references
BUILD_DATE - 0 references

We can see that after 6 years this facility has not actually been used. It was put in because the coder knew how to do it and thought it might come in useful. It hasn't. YAGNI.

If you REALLY want to use filtering then filter a properties file using Maven Resource Filtering.
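
As a rough sketch of the reading side: assume a resource file, here called version.properties (a hypothetical name), containing the line version=${project.version}, which Maven replaces at build time when filtering is enabled for that resource. The class and method names below are also hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Sketch only: hypothetical class; the code stays unmodified by the build,
// only the properties file is filtered.
final class FilteredVersionSketch {

    // What the value looks like if the build did not filter the file.
    static final String UNFILTERED = "${project.version}";

    /** Reads the "version" key from an already filtered properties file. */
    static String readVersion(Reader filteredProperties) throws IOException {
        Properties properties = new Properties();
        properties.load(filteredProperties);
        String version = properties.getProperty("version");
        if (version == null || version.equals(UNFILTERED)) {
            throw new IllegalStateException("version.properties was not filtered");
        }
        return version;
    }

    public static void main(String[] args) throws IOException {
        // In an application the Reader would wrap a classpath resource;
        // a StringReader keeps the sketch self-contained.
        System.out.println(readVersion(new StringReader("version=0.2-SNAPSHOT\n")));
    }
}
```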

However, if you do just want to provide, for example, a version command in your application then explicitly add the version string as a static variable in a prominent place in your code and add a step to your release procedure document to update this variable.
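
A minimal sketch of that explicit approach follows. ProductSketch, the --version flag and the value 1.0 are placeholders of mine; the constants are edited by hand as a step in the release procedure.

```java
// Sketch only: hypothetical class; nothing here is touched by the build system.
final class ProductSketch {

    public static final String NAME = "MyProject";

    // Placeholder value: update by hand as part of the release procedure.
    public static final String VERSION = "1.0";

    /** The text printed by the version command. */
    static String versionLine() {
        return NAME + " " + VERSION;
    }

    public static void main(String[] args) {
        if (args.length > 0 && args[0].equals("--version")) {
            System.out.println(versionLine());
        }
    }
}
```

What you see in the IDE is exactly what ships, and no test for unfiltered placeholders is needed.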

Code should not depend upon the build system: rip it out.

Friday, 29 May 2009

You are using Continuous Integration, aren't you?

This is one of those questions expecting the answer yes, like "you have lost your virginity, haven't you?", which has a motivational impact.

Another prod in the right direction was when someone used Hudson as the term for CI server, as you would use Hoover for vacuum cleaner.

I have tried three times to use Continuum, only once with any degree of success. Hudson just works, straight out of the box.

I am proud to say I am, at last and later than most, using Continuous Integration.