In a recent development project the team decided to create two versions of the same generated artifact: a JAR file containing a message-processing framework that other systems will use. The full JAR would depend on weblogic.jar, a 50+MB gorilla, but since the features that create this dependency are not used by all our clients, we thought it would be unfair to make everyone pay the price of having BEA’s monster among their transitive dependencies. As BEA’s thin client is not… er… reliable… we had no option but to split the artifact into two versions.
Now, how to do that? The system doesn’t follow basic Layering practices and the JMS dependency was spread all over the place. To better understand the problem, look at the diagram below. Suppose that the redder a class is, the more tightly coupled it is to JMS (and thus to weblogic.jar).

Now let’s suppose those classes are packaged like this.

Looks bad. Although most classes were blue in the first diagram, the lack of proper dependency management grouped them with red classes, and that grouping creates red packages. We have only one blue package (i.e. one package that does not depend on weblogic.jar directly or indirectly). Now, if that looks bad, consider the inter-class dependencies shown in the diagram below.

And now inter-package dependencies caused by those.

Yuck! Not a single blue package. In our case that meant we had to cherry-pick the classes that were suitable for the leaner JAR file, which looks something like this:
<jar jarfile="my-leaner.jar">
  <fileset dir="build/classes">
    <include name="**/package1/**/ClassThatDoesNotDependOnJms.class"/>
    <include name="**/package1/**/YeaAnotherClassThatDoesNotDependOnJms.class"/>
    <include name="**/package2/**/NoJmsHere.class"/>
    <!-- … -->
  </fileset>
</jar>
Now what’s wrong with this? Everything. Probably the most awful consequence is that this build file has to be updated frequently, yet the only way to test whether the JAR file is properly formed is to run it. Instead of verifying our changes at the unit-test level, we have to wait for a 10-minute build to find out whether we forgot to add any of the needed classes.
The cherry picking above is not the problem, it’s a symptom. Unfortunately, in enterprise development (and by enterprise I mean non-academic/research, not Enterprisey) we tend to treat things like this as “corner cases”: we don’t pay attention to them, we don’t try to solve the problem, and we assume a quick-and-dirty solution here and there won’t be a big deal.
It wouldn’t be a big deal if these situations didn’t repeat over and over again, but they do. Packages are important units of abstraction. To avoid problems like the one we just saw, we have to design our packages carefully.
Note: I use “package” and “component” interchangeably in this text but this doesn’t mean they’re synonyms. Here we are not talking about packages or components in any technology-specific or language construct context but about “packaging” and dependency management in general.
There is a large number of publications on software packaging and dependency principles. My favorite one, focusing on Java (1st edition) and C# (2nd edition), is Uncle Bob’s. This text gives the principles of packaging as some clear and simple rules of thumb.
Granularity and Cohesion
The Reuse/Release Equivalence Principle:
This rule guarantees the atomicity of the package. I could summarize it as: the package should always be handled as a whole. If you release a component you have to make sure that all classes in it are reusable. No patching, no hacking, no partials.
So, if we change the version of class ABC in our component, we can’t release only that class; we have to release the whole component.
Achieving this is not as hard as it may look. One not-that-hard way is to make sure that your component explicitly separates what is part of its interface (i.e. what is its API) from what is part of its internals. Controlling what your clients can depend on will help avoid breaking the black box.
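As a minimal sketch of that API/internals separation in Java: the names below (Messaging, MessageProcessor, newProcessor, TrimmingProcessor) are invented for illustration and are not part of the framework discussed here. Everything sits in one class only to keep the example self-contained; in a real component the published types would normally live in their own package, apart from the internals.

```java
// Hypothetical names throughout; this is an illustration, not the real framework.
final class Messaging {
    private Messaging() {}

    /** The published contract: the only type clients are meant to depend on. */
    public interface MessageProcessor {
        String process(String rawMessage);
    }

    /** The only way for clients to obtain an implementation. */
    public static MessageProcessor newProcessor() {
        return new TrimmingProcessor();
    }

    /** Internal detail: private, so client code cannot name it or depend on it. */
    private static final class TrimmingProcessor implements MessageProcessor {
        @Override
        public String process(String rawMessage) {
            // Stand-in for real processing: strip surrounding whitespace.
            return rawMessage.trim();
        }
    }
}
```

Because clients can only see MessageProcessor and newProcessor(), the implementation class can be renamed, split or replaced between releases without any client noticing.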
The Common Reuse Principle (CRP):
It says that if something depends on a component it should depend on each and every class in that component. Now, that may sound too draconian, but the goal is to define what should and what should not be packaged together.
If you can depend on only half of a component, that probably means you should have more than one component. This principle is very useful to avoid assembling pseudo-components that are nothing but a bunch of common code packed together.
It’s clear, then, that the component above should be split into at least two: one with the JMS dependency and another with whatever is left. So, this:

Should probably become something like this:

Notice that the dependencies now are expressed between packages. Instead of creating two versions of the same JAR we would split our component into two: one containing the classes that depend on weblogic.jar and the other with our core classes.
The Common Closure Principle (CCP):
This is the Single-Responsibility Principle applied to coarser-grained artifacts. A component should not have multiple reasons to change.
If the component in our previous example is a message-processing framework (which probably means something that gets messages from somewhere and delivers them to your application in a nice way), it would be very strange if clients had to ask you for changes in that component in order to, say, upgrade their database to a new version. The only reason for it to change must be a change in the message processing.
Stability and Coupling
The Acyclic Dependencies Principle (ADP):
This is a fairly well-known issue in software development: dependency cycles are evil. In the diagram below we have a nice scenario with an acyclic graph.
Now let’s suppose the JMS library is actually managed by the client application. We have a cycle.

What’s the problem here? First, your component, the reusable bit of code, suddenly depends on the client. This looks odd for a very good reason: if the component is supposed to be reused by a large number of clients, each client now depends on every other client. If the guys at the HR department change their Weblogic to JBoss (or even change the Weblogic version!), it is very likely that everyone else will have to do the same.
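Cycles like this one are easy to check for mechanically. The sketch below is a plain depth-first search over a map of "component -> components it depends on". The class name DependencyCycles and the component names in the usage note are made up for illustration, and the code assumes Java 9 or later for List.of.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not taken from any build tool.
final class DependencyCycles {
    private DependencyCycles() {}

    /** Returns true if the "depends on" graph contains at least one cycle. */
    static boolean hasCycle(Map<String, List<String>> dependsOn) {
        Set<String> done = new HashSet<>();       // fully explored, known cycle-free
        Set<String> inProgress = new HashSet<>(); // on the current search path
        for (String component : dependsOn.keySet()) {
            if (visit(component, dependsOn, done, inProgress)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> dependsOn,
                                 Set<String> done, Set<String> inProgress) {
        if (done.contains(node)) {
            return false;
        }
        if (!inProgress.add(node)) {
            return true; // we came back to a node on the current path: a cycle
        }
        for (String dependency : dependsOn.getOrDefault(node, List.of())) {
            if (visit(dependency, dependsOn, done, inProgress)) {
                return true;
            }
        }
        inProgress.remove(node);
        done.add(node);
        return false;
    }
}
```

With a graph where "client" depends on "framework" and "framework" depends on "jms", hasCycle returns false. Once "framework" depends on "client" (because the JMS library lives there) while "client" still depends on "framework", it returns true.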
The usual way to solve this cycle is to apply the Dependency Inversion Principle. It is described in the book as well, and it basically means that we do not make the component depend on the client but let both depend on a common concept; in this case, a common contract for JMS libraries.

The Stable-Dependencies Principle (SDP):
Stability is a very important concept for packages. We’re not talking about stability as in working without hiccups (as in “Windows is unstable, Linux is stable”) but about how often a package changes its interface and/or behavior.
The goal of this principle is to avoid the awkward situation where a package that never changes depends on a package that changes often. It is OK to have unstable packages; the problem is that anything that depends on an unstable package becomes unstable as well.
It is very common for a business process to be unstable. A proper Layering architecture, where the business-process-related code sits at the top, helps in this situation. A change in the business process should be confined to the upper Layers and not affect more stable components.
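Uncle Bob’s book pairs the Stable-Dependencies Principle with an instability metric, I = Ce / (Ca + Ce), where Ce counts outgoing dependencies and Ca incoming ones; 0 means maximally stable and 1 maximally unstable. The sketch below is a simplified, package-level reading of that metric (the book counts classes rather than packages). The class name Instability and the package names in the usage note are invented for illustration, and the code assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: a package-level simplification of the instability metric.
final class Instability {
    private Instability() {}

    /** I = Ce / (Ca + Ce); a package with no dependencies at all counts as 0. */
    static double of(String pkg, Map<String, List<String>> dependsOn) {
        int efferent = dependsOn.getOrDefault(pkg, List.of()).size(); // outgoing
        int afferent = 0;                                             // incoming
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            if (!entry.getKey().equals(pkg) && entry.getValue().contains(pkg)) {
                afferent++;
            }
        }
        int total = afferent + efferent;
        return total == 0 ? 0.0 : (double) efferent / total;
    }

    /** Lists every dependency that points from a more stable package to a less stable one. */
    static List<String> violations(Map<String, List<String>> dependsOn) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            for (String target : entry.getValue()) {
                if (of(entry.getKey(), dependsOn) < of(target, dependsOn)) {
                    result.add(entry.getKey() + " -> " + target);
                }
            }
        }
        return result;
    }
}
```

If "business-process" and "reporting" both depend on "framework-core", which depends on nothing, violations() comes back empty: the dependencies point toward the stable package. If a widely used "core" package starts depending on "business-process", that edge is reported.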
The Stable-Abstractions Principle (SAP):
This is about separating contract from implementation. It says that the more abstract a component is, the more stable it gets.
Abstract ideas and concepts don’t change often. If you think about Computer Science principles, for example, most of them have been around for decades. What changes (and changes a lot) is the implementation of those concepts. The actual implementation will vary but the contract is stable.
In our example, the very concept of sending and receiving messages is pretty stable. This is what other classes should depend on, not on how this concept is implemented.
For example, now we have something like this:

But what’s the stability of the JMS implementation? The system we are talking about in this text is a message-processing framework, not a JMS framework. Letting our business components depend on JMS itself may be a problem, as it is not as stable as we would expect. Even worse, if we change the implementation, what’s the impact?

So, a safer approach is to rely on something more stable. The very concept of message receiving and sending is pretty stable. If we extract this into an abstract component we can depend on that:

As the concept is more abstract, it is more stable. The implementations vary but the concept stays the same. It’s the same technique as the Dependency Inversion Principle we used above, but with a different goal.
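In Java, that extraction could look roughly like the sketch below. All names (MessageSender, OrderNotifier, InMemorySender, the "orders.shipped" destination) are invented for illustration; a JMS-backed implementation of the same interface would live in the component that depends on weblogic.jar and is deliberately left out here.

```java
import java.util.ArrayList;
import java.util.List;

/** The stable abstraction: what "sending a message" means, not how it is done. */
interface MessageSender {
    void send(String destination, String payload);
}

/** Business code depends only on the abstraction, never on JMS classes. */
final class OrderNotifier {
    private final MessageSender sender;

    OrderNotifier(MessageSender sender) {
        this.sender = sender;
    }

    void orderShipped(String orderId) {
        sender.send("orders.shipped", "Order " + orderId + " shipped");
    }
}

/** One possible implementation: keeps messages in memory, handy for tests. */
final class InMemorySender implements MessageSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String destination, String payload) {
        sent.add(destination + ": " + payload);
    }
}
```

Swapping the in-memory sender for a JMS one changes nothing in OrderNotifier, which is the whole point: the volatile part is isolated behind the stable contract.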
Nothing New
I’m pretty sure you realise that there’s nothing new in the rules above. These are the same rules we apply when designing our classes and objects. And that’s the idea: the same rules that avoid a terrible micro-design can avoid a terrible macro-design.
And, of course, those are Rules of Thumb and not laws. It is very nice to have them in mind when designing components so you don’t fall into traps like the one in the example presented, but it is extremely important not to become an Architecture Astronaut.

Very good article!
Seems like the “big ones” are learning this now:
http://in.relation.to/Bloggers/HibernateCoreModules33
http://java.sun.com/developer/technicalArticles/javase/java6u10/
Hi pcalcado! The Uncle Bob’s link is broken
Hi Phillip, very nice post.
I’ve gone through this problem from both sides. Once I developed an application using Axis 2, and to develop a web services client, the framework imposed the use of a mere 51 jars!!! That’s because they didn’t want to give you a list of jars you need for each topology. With a brave effort I was able to cut the client dependencies down to 10 jars, but it was still more than I wanted.
In my team at Globo.com, we develop the Register and Authentication applications, both of which have several dozen client applications. We offer them a client jar that carries some dependencies inside it, but only a small set (3 or 4 jars).
Several months ago we discovered a very helpful tool. Maven’s Shade plugin helps a lot with this problem. The plugin lets you pack several jars together while changing the package names.
For example, our client currently needs commons-http-client to access RESTful services. However, many applications already have this jar inside them, in varying versions. In order to avoid conflicting versions of the jar, we can use the Shade plugin to rename the package names in the jar and also in our client’s source code. So instead of importing org.apache.httpclient.* we import for example org.maven.apache.httpclient.*.
This is a plugin that acts during Maven’s build process, so we code using the normal package names and let the plugin handle the renaming. This is very helpful to avoid imposing dependencies on client applications, so maybe some other projects could use it too.
Regards,
Bruno Pereira
Bruno,
I have to say that I strongly disagree with the use of the words ‘maven’ and ‘useful’ in the same sentence.
Anyway, this reminds me of a project where we had server-side JavaScript processing using Rhino just to find out that weblogic.jar (who else?) contains Rhino inside it. Sweet.
cheers
Hi Phillip, I know Maven has some problems and I’m not the best guy to talk about them, as I’ve only been using Maven for several months.
However, when you have to build an application that has several components and a big list of dependencies, it’s very hard to do it properly using only Ant.
I’m not in love with Maven, but what else do you suggest? Apache Ivy? I think this might be an option, but not yet. I’m not preaching for Maven, but I don’t know any better tool for the job right now. If you tell me a nice alternative, I’ll start looking at it very soon.
I’ve been using Ivy for quite a while and it’s awesome. Not only does it have the few things that Maven is good at (dependency management), but it’s also compatible with Maven repositories. Give it a shot.
Ok Phillip, I’ll surely do it. Thanks for the tip.