Friday, October 24, 2014

Packages and separation of concerns

Recently I had a short discussion with a coworker about how to package Java code, and it brought back some of my earlier thoughts on the subject, mostly because some of my opinions about it have changed significantly over time.

Within a single code base, packages are the highest-level way to modularize code, so they play a significant role when one wants to grasp what the code is all about. The thing is, each class (or interface, or really any other code artifact represented as a file) can often be looked at from multiple points of view, but since we have to place that class in one and only one package, we have to decide which point of view (out of many) is most significant to us.

For example, in the Java community, DAOs (data access objects) are a common "type" of class that encapsulates data access logic. Thus, we commonly have something like UserDao that contains the logic for CRUD operations on user entities. Actually, following the good rule of separating interface from implementation, we most often have a UserDao interface and something like a HibernateUserDaoImpl implementation (if Hibernate is our ORM framework) or maybe JdbcUserDaoImpl (if plain JDBC is used).
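To make the interface/implementation split concrete, here is a minimal sketch. Only the names User and UserDao come from this post; the method names, the fields of User, and InMemoryUserDaoImpl are hypothetical, and the map-backed implementation is just a stand-in for what a HibernateUserDaoImpl or JdbcUserDaoImpl would do against a real database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical entity; its fields are assumptions made for this example.
final class User {
    private final long id;
    private final String name;

    User(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }
}

// The interface describes CRUD operations without saying how users are stored.
interface UserDao {
    void save(User user);              // create or update
    Optional<User> findById(long id);  // read
    void delete(long id);              // delete
}

// Stand-in implementation backed by a map. A Hibernate or JDBC based
// implementation would implement the same interface.
final class InMemoryUserDaoImpl implements UserDao {
    private final Map<Long, User> users = new HashMap<>();

    @Override
    public void save(User user) {
        users.put(user.getId(), user);
    }

    @Override
    public Optional<User> findById(long id) {
        return Optional.ofNullable(users.get(id));
    }

    @Override
    public void delete(long id) {
        users.remove(id);
    }
}

final class UserDaoDemo {
    public static void main(String[] args) {
        UserDao dao = new InMemoryUserDaoImpl(); // callers depend only on the interface
        dao.save(new User(1L, "alice"));
        System.out.println(dao.findById(1L).map(User::getName).orElse("not found"));
        dao.delete(1L);
        System.out.println(dao.findById(1L).isPresent());
    }
}
```

Because callers only see UserDao, swapping the implementation touches one class, whichever package it ends up living in.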

However, this UserDao class can be viewed from two points of view:
  • it is a DAO (technical point of view)
  • it is user-related (business domain point of view)

Maybe our app even has multiple deployments for different customers, so we might have something like AcmeJdbcUserDaoImpl. In that case, besides the two aspects mentioned above, the class has a third one: it is ACME-deployment-related functionality.

To keep things simple, let's just look at the two-aspect version of this DAO. So basically we need to decide how to package our code: by technical aspect (layering), or by business aspect.
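As a rough sketch, the two options might look like the layouts below. The root package com.example.app, the dao/model/service package names, and the company-related classes are hypothetical; the user-related class names are the ones used in this post.

```
By technical aspect (layering):

com.example.app
├── dao
│   ├── UserDao
│   ├── HibernateUserDaoImpl
│   └── CompanyDao
├── model
│   ├── User
│   └── Company
├── service
│   ├── UserManager
│   └── CompanyManager
└── web
    ├── UserWebController
    └── CompanyWebController

By business aspect:

com.example.app
├── users
│   ├── User
│   ├── UserDao
│   ├── HibernateUserDaoImpl
│   └── UserManager
├── companies
│   ├── Company
│   ├── CompanyDao
│   └── CompanyManager
└── web
    ├── UserWebController
    └── CompanyWebController
```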

Actually, the latter approach isn't packaged by business domains in the fullest sense, but I'll explain that in a moment.

Earlier in my career I used the former approach, but over time I have mostly adopted the latter one. Maybe it's just a matter of preference, but my reasoning is as follows.

As already mentioned, in both examples the code is structured around some concern, technical or domain-related, so in both cases some sense of order is kept. But following the reasoning that Robert C. Martin gave when explaining the single responsibility principle (a close cousin of the principles of coupling, cohesion and separation of concerns), we should group pieces of code that are likely to change for the same reason.

Now, in our UserDao example, if you frequently have to change the DAO implementations across the whole app, for example because you want to use a different persistence framework, then you would certainly find it useful to have all DAOs under one package. But I find these cases quite rare. On the other hand, if some business feature changes, say one related to the user "module", and for that reason you have to change the User entity, the UserDao which persists it, and also the UserManager that exposes service operations related to user functionality, then you should certainly look to group those code artifacts together. And in my experience, these kinds of changes are constantly present.

I can see that some other people have already blogged about this kind of "vertical" modularization, so here is one example. It also goes hand in hand with SOA/microservices architectures, where you identify pieces of your application from a functional perspective, and those pieces, once they grow, can be split out into their own services and deployed separately.

Now, I said before that I cheated a bit regarding the second approach. I still keep the web layer separated, even though it contains user-related web functionality that sits outside the "users" package. I guess I still consider the web layer such a different concern from the business layer that I like to keep it separate. Lots of times we have changes to the web layer that don't touch the business side, hence this separation. But it definitely got me thinking that it might be a nice experiment to try, just once, putting even UserWebController within the "users" package, because it is often updated when some change to user-related functionality is required.

Another thing: this whole story reminded me of a time in the past when I was frequently using Tapestry as my web framework of choice. A common thing in web development back then, as it is now, was to keep HTML template files separated from web controller classes, usually under the web app root directory. But I occasionally heard that some Tapestry users liked to keep HTML templates on the classpath, in other words, placed within packages together with other web-related code. Thus, one would have each page class and its HTML template side by side in the same package.

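A rough sketch of such a package could look like this. The package name and the .java and .html file names are hypothetical; the per-page .properties files are the localization message files discussed further down.

```
com.example.app.web.pages
├── EditUser.java
├── EditUser.html
├── EditUser_en.properties
├── EditCompany.java
├── EditCompany.html
└── EditCompany_en.properties
```
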
I resisted this idea for some time because it felt so wrong to keep HTML files together with the code, but when I eventually gave it a try, I was enlightened. It was so much easier not having to jump between project directories and search for files whenever I had to change some web feature, since I almost always had to change the HTML template together with the related controller code.

Of course, if I had been working on some other type of project, one with a dedicated web designer working solely on HTML templates, the situation would have been different and this way of packaging would probably have been totally unsuitable. But since I was both the HTML designer and the coder, it really made sense.

In such a layout, localization messages (.properties files) can also be placed within the package, separated per HTML page (EditUser_en.properties, EditCompany_en.properties...). I never practiced this much myself and kept using a single localization file for all web messages, as usual, but some people did it for the benefit of having messages as close as possible to the related web artifacts (page controller class, page HTML template), which eases maintenance. Of course, the benefit of the usual approach is that it is certainly easier to translate all web messages to a new language if they are all in a single file (web_messages_en.properties, web_messages_de.properties...). But even nowadays, whenever I work on a larger project, I stumble upon single localization files that contain a lot of old garbage messages. That garbage is left behind precisely because it is so easy to forget to clean up localization messages once some web code has been changed or removed. With per-page files you rarely forget, because the localization file sits right next to the file you just changed or removed.