Everyday Tales: Anatomy of a Refactoring

I’ve been extremely busy with project after project in the past few months, leaving me no time to do any research or play around with interesting things. Even though I prefer to write about whatever really interests me at a given moment, I think that writing about some smaller, simpler problems and solutions is better than not blogging at all. Here is the first of my Everyday Tales.

My team is delivering an integration Layer for a large corporation. This client has many different systems, and it is extremely common for its clients and employees to spend the day jumping from one system to another, depending on which of the many services provided they happen to be interested in at that very moment. Classic growth-over-the-years problem.

In this integration piece we are building a web services API and a web interface for users in multiple roles. We decided to go for a rich Domain Model. This is not very common in integration projects; most people in those scenarios would just create coarser-grained data structures composed of cherry-picked data coming from multiple back-end systems. We opted for the richer domain model mainly because:

  1. Our integration Layer will not only act as a single gateway to multiple systems, it will also apply business rules and manage local state.
  2. The client is a classic example of an environment where the business language was derived from the systems language. More than building some piece of software, our work as consultants is to help the client solve the impedance mismatches across different departments and domains. Domain-Driven Design is a great tool for making sure that business, developers and everyone else speak the same language. To better apply Domain-Driven Design you should have a Domain Model.

The problem with integration projects is often that no one is really sure which of the multiple systems we talk to has which piece of information; we discover new things every day. In agile projects that means that we learn a lot during the project and, therefore, we refactor a lot so that our code always reflects our understanding of the multiple models and how they interact with each other.

I will try to describe a sequence of refactorings we conducted over the past weeks. All changes described here were executed following baby steps while we were working on user stories. I will try to get as close as I can to the real world without breaking NDAs or letting irrelevant details spoil the core message.

Let’s say that instead of integrating multiple business systems we were integrating multiple social networks. A requirement for our system is that we should be able to see a single User object in our Layer but the data for this object should come from multiple sources.

We first started with something like this:

001.JPG

That worked fine for a while, and we managed to get many stories done using this structure, but there was something funny about this model. For starters, we had the Domain Model depending on the infrastructure, which always rings an alarm bell for me. We also had the PasswordSynchronisationService (something that makes sure you have the same password in all social networks) depending on the UserRepository. This relationship caused the repository to have funny methods like getAllSocialNetworks() and getSocialNetworksForUser(User).

Clearly this class didn’t get the Separation of Concerns memo, and the result was that it was class #1 in number of conflicts and merges: every day at least two pairs would touch it, regardless of what stories they were playing.

For some time now I have been avoiding the word Repository. Not only does it cause all sorts of trouble when people understand it as an alias for DAO, but it also has some really weird semantics.

According to Eric Evans in the Domain-Driven Design book:

For each type of object that needs global access, create an object that can provide the illusion of an in-memory collection of all objects of that type. Set up access through a well-known global interface. Provide methods to add and remove objects, which will encapsulate the actual insertion or removal of data in the data store. Provide methods that select objects based on some criteria and return fully instantiated objects or collections of objects whose attribute values meet the criteria, thereby encapsulating the actual storage and query technology.

So we decided to follow the suggestion of Rodrigo Yoshima (link in Portuguese) and drop the *Repository suffix. If they should behave like in-memory lists let’s give them a decent name! So here we are:

002.jpg

We created an AllUsers interface and let UserRepository implement it. Now it is easier to spot our modelling problem: it is not easy to say there’s something wrong when a UserRepository has a method getAllSocialNetworks(), but a class named AllUsers with such a method is really peculiar.
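As a rough sketch of this step (not the project's actual code), the rename could look like the Java below. Only AllUsers, UserRepository and getAllSocialNetworks() are names from the project; the add and withId methods, the in-memory map and the bare-bones User are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical, stripped-down domain class for illustration only.
class User {
    private final String id;

    User(String id) { this.id = id; }

    String id() { return id; }
}

// Reads like an in-memory collection of every user (method names are assumed).
interface AllUsers {
    void add(User user);

    Optional<User> withId(String id);
}

// The existing class keeps its name for now and simply implements the new interface.
class UserRepository implements AllUsers {
    private final Map<String, User> users = new HashMap<>();

    @Override
    public void add(User user) { users.put(user.id(), user); }

    @Override
    public Optional<User> withId(String id) { return Optional.ofNullable(users.get(id)); }

    // Not part of AllUsers, which is what makes it look out of place here.
    List<String> getAllSocialNetworks() { return List.of("Facebook", "Twitter"); }
}
```

Client code that only needs users can now be written against AllUsers, and anything that still calls getAllSocialNetworks() has to name UserRepository explicitly, which makes the misplaced responsibility easy to find.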

While we think about what to do with our class, let’s think a bit about the domain we have now. I said before that the fact that the classes belonging to the Ubiquitous Language depend on classes from the infrastructure was sounding really funny to my ears. With our last refactoring we can actually spot an easy way out of this weird dependency relationship between Layers. Let’s change the model a bit more…

003.jpg

So now what we have is an AllSocialNetworks interface. The PasswordSynchronisationService doesn’t depend on AllUsers anymore, and that means that we can move all methods related to Social Networks to the new interface. As we wanted to take baby steps we did not split the UserRepository; we just let it implement the new interface as well.

Talking about UserRepository, we solved the model-should-not-depend-on-infrastructure problem in a very simple way: we just moved the UserRepository out of the domain. In practical terms, that means that we moved it to a different package (it’s a Java project) and made sure that no class in the Domain Model depends on it. The Domain Model only knows about the All* interfaces, and instances of those are injected through a Dependency Injection container.
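A minimal sketch of the resulting dependency direction, assuming hypothetical method names (all, forUser, withId, synchronise) and hand-written wiring standing in for the Dependency Injection container. In the real project the two halves live in separate packages; here they share one file and the split is marked with comments.

```java
import java.util.List;

// --- domain package: knows only the All* interfaces ---
class SocialNetwork {
    private final String name;

    SocialNetwork(String name) { this.name = name; }

    String name() { return name; }
}

class User {
    private final String id;

    User(String id) { this.id = id; }

    String id() { return id; }
}

interface AllUsers {
    User withId(String id);                    // assumed method name
}

interface AllSocialNetworks {
    List<SocialNetwork> all();                 // stands in for getAllSocialNetworks()
    List<SocialNetwork> forUser(User user);    // stands in for getSocialNetworksForUser(User)
}

class PasswordSynchronisationService {
    private final AllSocialNetworks socialNetworks;

    PasswordSynchronisationService(AllSocialNetworks socialNetworks) {
        this.socialNetworks = socialNetworks;
    }

    // Hypothetical operation: reports how many networks would receive the new password.
    int synchronise(User user, String newPassword) {
        return socialNetworks.forUser(user).size();
    }
}

// --- infrastructure package: depends on the domain, never the other way round ---
class UserRepository implements AllUsers, AllSocialNetworks {
    private final List<SocialNetwork> networks =
            List.of(new SocialNetwork("Facebook"), new SocialNetwork("Twitter"));

    @Override
    public User withId(String id) { return new User(id); }

    @Override
    public List<SocialNetwork> all() { return networks; }

    @Override
    public List<SocialNetwork> forUser(User user) { return networks; }
}

// --- wiring: done by hand here, by the container in the project ---
class WiringDemo {
    public static void main(String[] args) {
        UserRepository repository = new UserRepository();
        PasswordSynchronisationService service = new PasswordSynchronisationService(repository);
        System.out.println(service.synchronise(repository.withId("42"), "s3cret"));
    }
}
```

The service never mentions UserRepository, so the repository can later be split into smaller classes without touching the domain code.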

A new story is played and suddenly we have something that sounds like a simple change. In order to bypass some technical limitations (we are calling HTTP APIs in this system, there’s no such thing as lazy-loading and ORM) we have to change the domain:

004.jpg

Now the user depends on the AllSocialNetworks list in order to get the Social Networks it belongs to. Any instance of User must have an instance of AllSocialNetworks inside it to be valid; this is part of the User’s invariant. This is not great, but at least we will not have to implement our own lazy-loading mechanism (yet).

Just by introducing those two interfaces we simplified our domain. Now, whatever needs Social Networks points to the specific interface and doesn’t bother with the Users. Keep in mind that this is just a tiny subset of a much larger domain and system; we have many other classes that depend on and collaborate with those pictured here.

This new requirement created an interesting problem, though. To understand the problem, let’s see how a Facebook user was retrieved before we introduced the dependency from User to AllSocialNetworks:

005.jpg

A call to UserRepository asking for a given user would make it first go to the FacebookGateway and ask it to perform an (HTTP) request for the user profile. The user profile is brought back in, say, JSON and this data is then sent to a parser. The parser interprets the JSON representation, creates an instance of the User, populates the instance with the parsed data and returns it. All used to work fine.
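The flow just described can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the gateway returns a canned JSON string instead of issuing an HTTP request, the parser extracts a single made-up name field with a regular expression, and the method names fetchProfile, parse and withId are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class User {
    private String name;

    User() { }                                   // no dependencies before the new story

    void setName(String name) { this.name = name; }

    String name() { return name; }
}

// Stand-in for the real gateway: returns a canned profile instead of calling an HTTP API.
class FacebookGateway {
    String fetchProfile(String userId) {
        return "{\"id\":\"" + userId + "\",\"name\":\"Alice\"}";
    }
}

// Interprets the JSON representation and builds a populated User.
class FacebookMessageParser {
    // Naive extraction, good enough for the canned payload; a real parser would use a JSON library.
    private static final Pattern NAME = Pattern.compile("\"name\"\\s*:\\s*\"([^\"]*)\"");

    User parse(String json) {
        Matcher matcher = NAME.matcher(json);
        if (!matcher.find()) {
            throw new IllegalArgumentException("profile has no name: " + json);
        }
        User user = new User();
        user.setName(matcher.group(1));
        return user;
    }
}

class UserRepository {
    private final FacebookGateway gateway;
    private final FacebookMessageParser parser;

    UserRepository(FacebookGateway gateway, FacebookMessageParser parser) {
        this.gateway = gateway;
        this.parser = parser;
    }

    User withId(String userId) {
        String json = gateway.fetchProfile(userId);   // 1. ask the gateway for the profile
        return parser.parse(json);                    // 2. the parser creates and populates the User
    }
}
```

Note that the parser is the place where User objects are instantiated, which is why any change to the User constructor lands on the parser first.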

In the new model, we have a slightly different flow:

006.jpg

Can you spot the difference? To be clearer, before the requirement we just introduced, the User class had a constructor like this:

public class User {
    // Before the story: nothing is required to create a User.
    public User() {
    }
}

But now its constructor looks like this:

public class User {
    // After the story: a User cannot be created without an AllSocialNetworks.
    public User(AllSocialNetworks socialNetworks) {
    }
}

Why? Because, as we said before, the user story we played required us to change the invariant of a User. Before that story, a User didn’t need anything special to be created. After the story was played, a User started requiring an instance of AllSocialNetworks; without that dependency the User we just created is not a valid object and must not be used in the system. To enforce this invariant we want to make the dependency part of the constructor: a User object with no AllSocialNetworks instance to query is not reliable and should not even exist!
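One way to enforce such an invariant in Java is to reject a missing dependency in the constructor and keep the field final. The sketch below assumes a hypothetical forUser query on AllSocialNetworks and a socialNetworks() accessor on User; neither name comes from the project.

```java
import java.util.List;
import java.util.Objects;

class SocialNetwork {
    private final String name;

    SocialNetwork(String name) { this.name = name; }

    String name() { return name; }
}

interface AllSocialNetworks {
    List<SocialNetwork> forUser(User user);   // assumed query method
}

class User {
    private final AllSocialNetworks socialNetworks;

    User(AllSocialNetworks socialNetworks) {
        // Fail fast: a User with nothing to query must never exist.
        this.socialNetworks =
                Objects.requireNonNull(socialNetworks, "socialNetworks is required");
    }

    // Looked up on demand through the collection, so no lazy-loading proxy is needed.
    List<SocialNetwork> socialNetworks() {
        return socialNetworks.forUser(this);
    }
}
```

Because the field is final and checked on construction, every User that exists can safely answer socialNetworks() without null checks scattered through the domain.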

That all makes sense, but here is our problem:

007.jpg

In order to be able to create a valid instance of User, the parser class now must use an instance of AllSocialNetworks. Can you describe the problem we have here?

The FacebookMessageParser class needs an instance of AllSocialNetworks to work. This relationship is part of its invariant and, as we said before, should be enforced in its constructor (i.e. we will be using constructor-based injection). The problem here is that the only known instance of AllSocialNetworks is a UserRepository, but a UserRepository needs an instance of FacebookMessageParser to be valid! It’s a chicken-and-egg problem, also known as a circular dependency.

008.jpg
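The circular dependency can be reproduced in a few lines. In this sketch the two constructors are an assumption about how the invariants would be enforced (with a null check), and the rejects helper exists only for the demonstration.

```java
import java.util.Objects;

interface AllSocialNetworks { }

// Needs an AllSocialNetworks in order to build valid User objects.
class FacebookMessageParser {
    private final AllSocialNetworks socialNetworks;

    FacebookMessageParser(AllSocialNetworks socialNetworks) {
        this.socialNetworks =
                Objects.requireNonNull(socialNetworks, "socialNetworks is required");
    }
}

// The only AllSocialNetworks implementation, and it needs a parser to be valid.
class UserRepository implements AllSocialNetworks {
    private final FacebookMessageParser parser;

    UserRepository(FacebookMessageParser parser) {
        this.parser = Objects.requireNonNull(parser, "parser is required");
    }
}

class CircularDependencyDemo {
    // Returns true when the constructor rejected the missing collaborator.
    static boolean rejects(Runnable construction) {
        try {
            construction.run();
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Whichever object we try to build first, its collaborator does not exist yet,
        // so the only value we could pass is null, which the invariant refuses.
        System.out.println(rejects(() -> new FacebookMessageParser(null)));
        System.out.println(rejects(() -> new UserRepository(null)));
    }
}
```

With both invariants enforced in constructors there is no valid order in which to create the two objects, and a Dependency Injection container faces the same dead end when it tries to wire them.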

Were this an isolated problem, we could potentially just compromise our invariants. We could change that class to setter-based injection, add it to our tech debt and keep going for now. This would be pretty reasonable in a real-world project, with a real-world budget and deadlines. The problem is that this was not the first time we had this very problem in this project. It was actually the third time this happened.
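For reference, the setter-based compromise would look roughly like the sketch below. The setter name setAllSocialNetworks, the parse and withId methods and the IllegalStateException guard are assumptions for illustration; the point is only the three-step wiring order.

```java
import java.util.Objects;

interface AllSocialNetworks { }

class User {
    User(AllSocialNetworks socialNetworks) {
        Objects.requireNonNull(socialNetworks, "socialNetworks is required");
    }
}

class FacebookMessageParser {
    private AllSocialNetworks socialNetworks;   // no longer final: the invariant is weakened

    void setAllSocialNetworks(AllSocialNetworks socialNetworks) {
        this.socialNetworks = socialNetworks;
    }

    User parse(String json) {
        if (socialNetworks == null) {
            // The price of the compromise: a half-built parser can exist at runtime.
            throw new IllegalStateException("parser used before AllSocialNetworks was set");
        }
        return new User(socialNetworks);
    }
}

class UserRepository implements AllSocialNetworks {
    private final FacebookMessageParser parser;

    UserRepository(FacebookMessageParser parser) {
        this.parser = Objects.requireNonNull(parser, "parser is required");
    }

    User withId(String id) {
        return parser.parse("{\"id\":\"" + id + "\"}");
    }
}

class SetterInjectionDemo {
    public static void main(String[] args) {
        FacebookMessageParser parser = new FacebookMessageParser();   // 1. parser first, incomplete
        UserRepository repository = new UserRepository(parser);      // 2. repository gets the parser
        parser.setAllSocialNetworks(repository);                     // 3. close the cycle
        System.out.println(repository.withId("42") != null);
    }
}
```

Between steps 1 and 3 the parser is an invalid object that the compiler can no longer protect us from, which is exactly the kind of debt that becomes expensive when the same pattern shows up a third time.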

Clearly something in our design was wrong and some core change was necessary before we ended up breaking all our contracts and moving from “a couple of circular dependencies” to a full dependency-spiral-of-death structure.

In the next post we shall talk more about the possible solutions we came up with.

11 Responses to “Everyday Tales: Anatomy of a Refactoring”


  1. 1 Leandro Herrera Feb 24th, 2010 at 11:54 pm

    Hi Philip!

    Why don’t you let the UserRepository create the user and pass it to the Parser so it can fulfill the user’s attributes?

    The Parser won’t be needing to know anything about the AllSocialNetworks.

    Do you need any information from the JSON profile in order to inject AllSocialNetworks into the User object?

  2. 2 tucaz Feb 25th, 2010 at 5:12 am

    My Java is very bad, but as far as I understand packages are pretty much like .NET DLLs (or projects), am I correct?

    It’s a very common practice to have a package with domain classes and another with infrastructure stuff. However, if you do that, one day or another you will end up with this kind of circular reference. I guess this happens because this is the way the system should work (you said that yourself about the invariants) and that’s how life is. If it walks like a duck, flies like a duck and speaks like a duck then it’s a duck. Your infrastructure layer should consume your domain layer.

    Unfortunately I think you have three ways of solving this (and I really want to be wrong so I can learn something new) and none being easy:

    1) Find out that you made a really big mistake understanding concepts and fix it so you won’t need this kind of reference anymore (generally this won’t happen)

    2) Live with it and use DI to be able to solve the circular reference injecting the Domain class into the Infrastructure assuming that your Domain package references the Infrastructure package and that Java does not allow circular references (Idk if it does)

    3) Put all your classes into one package and organize them using namespaces. With this option your layers will be a little mixed (bottom layers will reference upper layers) in these specific spots but it’s far less complicated than having a complicated injection configuration.

    Option 3 is my personal choice and so far it’s working better than other options.

  3. 3 Samer Kanjo Feb 25th, 2010 at 7:47 am

    It seems like you actually split your original repository into two repositories, which I will call SocialNetworkRepository and a new UserRepository, while the old UserRepository implementation, implementing the two new interfaces, should be called something else like SocialNetworkGateway.

    I was also confused as to why you would create one User object from each social network when the intent was to have one User object with data bits from all or several social networks. Perhaps my understanding is lacking in that area.

    My first thought was to create separate user classes, one for each social network on the infrastructure side and provide a Factory, again on the infrastructure side, used by your UserRepository to reconstitute a User domain object. Since your UserRepository is coordinating these activities and implements the AllSocialNetworks interface it can simply pass in itself to the User constructor.

  4. 4 Rodrigo Yoshima Feb 25th, 2010 at 2:07 pm

    Nice to help you guys from TW.

    (BTW #UML #FTW ;))

  5. 5 Luis Sergio Oliveira Feb 25th, 2010 at 3:16 pm

    Hi Phillip,

    very nice and clear write-up of the circular dependency problems, which normally occur in almost all real world projects I’ve seen so far.

    The solution will be to take the plunge and break-up UserRepository in pieces, right?

  6. 6 Luis Sergio Oliveira Feb 25th, 2010 at 3:31 pm

    To elaborate a bit more on my comment, I think that UserRepository continued to be a class with more than one responsibility, although you had the nice interfaces for it. So, it is like having one object that is actually two objects and has two sets of needs to satisfy two related, but different, responsibilities…

    But, I don’t want to spoil the story, please give your readers the 2nd part in the same step by step and clear style :-)

  7. 7 Leandro Feb 25th, 2010 at 11:31 pm

    I enjoy this ‘Everyday Tales’ new journey!
    “writing about some smaller/simpler problems and solutions would be better than not blogging at all.” :) yeah… I must confess that I had to read it twice to understand the solution. My brain is very slow. The “models” (#UML) helped a lot. Keep telling us more of the ‘Everyday Tales’!

  8. 8 Camilo Telles Feb 27th, 2010 at 9:18 pm

    Dear Mr. Calçado,

    What is your contact? I want to hire you (ThoughtWorks) for a job in Brasil.

    Regards

    Camilo Telles

  9. 9 Leandro Herrera Mar 9th, 2010 at 11:07 pm

    Hey Phillip! How did you solve the problem? I’m curious about it… :P

  1. 1 Everyday Tales: Anatomy of a Refactoring – Part 2 at Fragmental.tw Pingback on Mar 10th, 2010 at 10:44 pm
  2. 2 Palestra BDD – Unifor 2010 - Milfont Consulting Pingback on May 29th, 2010 at 3:49 am

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.