Batch processing best practices


Most applications have at least one batch processing task, executing a particular logic in the background. Writing a batch job is not complicated but there are some basic rules you need to be aware of, and I am going to enumerate the ones I found to be most important.

From an input type point of view, the processing items may come through polling a processing item repository or by being pushed them into the system through a queue. The following diagram shows the three main components of a typical batch processing system:

  • the input component (loading items by polling or from an input queue)
  • the processor: the main processing logic component
  • the output component: the output channel or store where results will sent

Continue reading “Batch processing best practices”

21st century logging

I think logging should get more attention than we are currently giving it. When designing an application a great deal of effort goes into modeling the customer business logic, making sure all use cases are covered and handled properly. The business model is mapped to a persistence storage (be at a RDBMS or a NoSQL solution), frameworks are being chosen: web, middle-ware, batch jobs, and probably SLF4J with log4j or logback.

This has been the case of almost all applications I’ve been involved with, and logging was always a second-class citizen, relying on good old string logging frameworks for that.

But recently I come to realize there is much more to logging than the current string based logging systems. Especially if my system gets deployed in the cloud and takes advantage of auto-scaling, then gathering text files and aggregating them to a common place smells like hacking.

In my latest application, we implemented a notification mechanism that holds more complex information since the String based log wasn’t sufficient. I have to thank one of my colleagues I work with who opened my eyes when saying “Notifications are at the heart of our application”. I haven’t ever thought of logging as the heart of any application. Business Logic is the heart of the application, not logging. But there is a lot of truth in his words since you can’t deploy something without a good mechanism of knowing if your system is actually doing what it was meant for.

Continue reading “21st century logging”

How to implement Equals and HashCode for JPA entities


Every Java object inherits the equals and hashCode methods, yet they are useful only for Value objects, being of no use for stateless behavior-oriented objects.

While comparing references using the “==” operator is straightforward, for object equality things are a little bit more complicated.

Continue reading “How to implement Equals and HashCode for JPA entities”

How to fetch entities multiple levels deep with Hibernate


It’s quite common to retrieve a root entity along with its children associations on multiple levels.

In our example, we need to load a Forest with its Trees and Branches and Leaves, and we will try to see have Hibernate behaves for three collection types: Sets, Indexed Lists, and Bags.

Continue reading “How to fetch entities multiple levels deep with Hibernate”

A beginner’s guide to Hibernate fetching strategies


When it comes to working with an ORM tool, everybody acknowledges the importance of database design and Entity-to-Table mapping. These aspects get a lot of attention, while things like fetching strategy might be simply put-off.

In my opinion, the entity fetching strategy shouldn’t ever be separated from the entity mapping design, since it might affect the overall application performance unless properly designed.

Before Hibernate and JPA got so popular, there was a great deal of effort put into designing each query, because you had to explicitly select all the joins you wanted to select from, and all the columns you were interested in. And if that was not enough, the DBA would optimize the slow running queries.

In JPA times, the JPQL or HQL queries are fetching Entities along with some of their associated relationships. This eases development, as it frees us from manually choosing all table fields we are interested in, and sometimes joins or additional queries are automatically generated for serving our needs.

This is a double-edged sword. On one hand, you can deliver features faster, but if your automatically generated SQL queries are not efficient, your overall application performance might suffer significantly.

Continue reading “A beginner’s guide to Hibernate fetching strategies”