Vlad Mihalcea

Hibernate integration testing strategies

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Hypersistence Optimizer is that tool!

Introduction

I like integration testing. As I explained in this article, it’s a good way to check what SQL queries are generated by Hibernate behind the scenes. But integration tests require a running database server, and this is the first choice you have to make.

Using a production-like local database server for Integration Testing

For a production environment, I always prefer using incremental DDL scripts, since I can always know what version is deployed on a given server, and which scripts still need to be deployed. I’ve been relying on Flyway to manage the schema updates for me, and I’m very content with it.

On a small project, where the number of integration tests is rather small, you can employ a production-like local database server for testing as well. This is the safest option since it guarantees you’re testing against an environment that is very similar to the production setup.

Some people believe that using a production-like environment would affect the test execution time, but that’s not the case. Nowadays, you can use Docker with tmpfs to speed up your tests and run them almost as fast as with an in-memory database.

MongoDB Facts: 80000+ inserts/second on commodity hardware

Introduction

While experimenting with some time series collections I needed a large data set to check that our aggregation queries don’t become a bottleneck in case of increasing data loads. We settled for 50 million documents since beyond this number we would consider sharding anyway.

Each time event looks like this:

{
        "_id" : ObjectId("5298a5a03b3f4220588fe57c"),
        "created_on" : ISODate("2012-04-22T01:09:53Z"),
        "value" : 0.1647851116706831
}

As we wanted to get random values, we thought of generating them using JavaScript or Python (we could have tried it in Java, but we wanted to write it as fast as possible). We didn’t know which one would be faster, so we decided to test them.
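
The post itself used JavaScript and Python for this. Purely as an illustration, here is a minimal Java sketch of generating such time events. The class name `TimeEventGenerator`, the seed, and the date range are assumptions made for the example. The `_id` field is left out because the MongoDB driver assigns one on insert when it is missing, and the actual insert call is omitted too.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not the script used in the original experiment.
final class TimeEventGenerator {

    private final Random random;
    private final long startEpochSecond;
    private final long rangeSeconds;

    TimeEventGenerator(long seed, Instant start, Instant end) {
        this.random = new Random(seed);
        this.startEpochSecond = start.getEpochSecond();
        this.rangeSeconds = end.getEpochSecond() - startEpochSecond;
        if (rangeSeconds <= 0) {
            throw new IllegalArgumentException("end must be after start");
        }
    }

    // Builds one event with the same two fields as the sample document.
    Map<String, Object> next() {
        long offset = (long) (random.nextDouble() * rangeSeconds);
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("created_on", Instant.ofEpochSecond(startEpochSecond + offset));
        event.put("value", random.nextDouble());
        return event;
    }

    public static void main(String[] args) {
        TimeEventGenerator generator = new TimeEventGenerator(
                42L,
                Instant.parse("2012-01-01T00:00:00Z"),
                Instant.parse("2013-01-01T00:00:00Z"));
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.next());
        }
    }
}
```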

JVM Boolean Options

Introduction

While trying to generate a Java Heap Dump, I remembered there is one JVM option I could use for this purpose. Since I can’t always remember these options’ names, I went to the Oracle documentation.

The problem

So I could extract the following arguments:

-XX:-HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/logs/jvm/dumps/

After limiting my Java Heap size to a value that I knew was too low, I was expecting the Heap Dump to be generated whenever I got an OutOfMemoryError. But no dump got generated. I googled the issue and checked for JVM bugs, but the only reported issue was a misuse where the JVM options are given after the Java Main class, and that wasn’t my case.

The fix

Then I stumbled on a slightly different version of my original setting (the one that I copy-pasted from the Oracle site):

-XX:+HeapDumpOnOutOfMemoryError

Then I remembered I had once read about Boolean JVM options, and the very same Oracle site details this usage:

“Boolean options are turned on with -XX:+ and turned off with -XX:-.”

I think the Oracle JVM options table should show the “+” version, since that’s usually what you are looking for, especially because it’s disabled by default (so the “-” version behaves like not giving it at all).
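
To make the “+” versus “-” rule concrete, here is a small Java sketch that classifies a command-line argument according to the quoted rule. The class `BooleanJvmOption` and its `parse` method are hypothetical names for this illustration. This is not how the JVM parses its arguments internally; it only mirrors the documented syntax.

```java
import java.util.Optional;

// Hypothetical helper that mirrors the documented -XX:+ / -XX:- syntax.
final class BooleanJvmOption {

    private static final String PREFIX = "-XX:";

    final String name;
    final boolean enabled;

    private BooleanJvmOption(String name, boolean enabled) {
        this.name = name;
        this.enabled = enabled;
    }

    // Returns the option name and state, or empty if the argument
    // does not use the Boolean -XX:+Name / -XX:-Name form.
    static Optional<BooleanJvmOption> parse(String arg) {
        if (arg == null || !arg.startsWith(PREFIX) || arg.length() <= PREFIX.length() + 1) {
            return Optional.empty();
        }
        char sign = arg.charAt(PREFIX.length());
        if (sign != '+' && sign != '-') {
            return Optional.empty();
        }
        String name = arg.substring(PREFIX.length() + 1);
        return Optional.of(new BooleanJvmOption(name, sign == '+'));
    }

    public static void main(String[] args) {
        String[] samples = {
            "-XX:-HeapDumpOnOutOfMemoryError",
            "-XX:+HeapDumpOnOutOfMemoryError",
            "-XX:HeapDumpPath=/logs/jvm/dumps/"
        };
        for (String sample : samples) {
            String description = parse(sample)
                    .map(o -> o.name + (o.enabled ? " is turned on" : " is turned off"))
                    .orElse("not a Boolean option");
            System.out.println(sample + " -> " + description);
        }
    }
}
```

Run against the arguments from this post, the copy-pasted “-” variant is reported as turned off, which is exactly why no dump was generated.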

Conclusion

The HeapDumpPath should point to a folder, but if your setting is something like /logs/jvm/dumps/, and your OS only contains /logs/jvm, then you won’t get a “java_pid<pid>.hprof” file within the /logs/jvm/dumps/ folder, but a dump file in /logs/jvm/, as the JVM doesn’t create the missing folders (a.k.a. mkdirs).
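
Since the JVM does not create the missing folders, one option is to create them yourself before a dump is ever needed, for example from a startup script or early during application startup. The following is a minimal sketch of that idea; the class `HeapDumpDirectory` is a hypothetical name, and the default path is simply the one used in this post.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical startup helper: makes sure the heap dump folder exists.
final class HeapDumpDirectory {

    // Files.createDirectories creates every missing parent folder and
    // does nothing if the directory is already there.
    static Path ensureExists(String heapDumpPath) throws IOException {
        return Files.createDirectories(Paths.get(heapDumpPath));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the same value that is passed to -XX:HeapDumpPath.
        String dumpPath = args.length > 0 ? args[0] : "/logs/jvm/dumps/";
        Path directory = ensureExists(dumpPath);
        System.out.println("Heap dump directory ready: " + directory.toAbsolutePath());
    }
}
```

The process running this needs write permission on the parent folder, otherwise `createDirectories` throws an `IOException`.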

Teaching is the best way to learn

Introduction

Software development is all about knowledge, and nowadays the number of things a programmer needs to know has skyrocketed. Most of the time developers are hired by matching their current skills with some project requirements. The project eventually ends, and the developer is assigned to a new project, sometimes using different technologies than what he was previously hired for. What’s the policy for training this guy to deliver his best as soon as possible?

Usually, training and coaching are left out, so each programmer is on his own. Every time we leave things to chance, a huge risk is implicitly undertaken. I believe we should invest more in training and coaching and see them as an investment rather than an expense.

Why I never blame open source projects

Every now and then I get to read someone’s bad thoughts about a given open-source framework. When I started programming, the Struts web framework was in its prime and everybody loved it. But then, little by little, people started blaming it, and then hate followed.

Then people started blaming Hibernate and recently MongoDB. I’ve even read that “I shouldn’t use MongoDB”. Well, I delivered projects on Struts, Hibernate and MongoDB, and none of those was ever a blocker.

If there is someone to blame it’s usually us, not the frameworks we use. If you download and browse the source code of a given open-source project, you’ll be pleasantly surprised by the quality of code. Most of the time I find it at least as good as I’d do it myself. Many open-source projects are the result of endless hard-working hours of many passionate developers, so why should we blame their frameworks then?

Like any other thing on earth, all of those have strengths and weaknesses, and it’s up to us to decide which features fit in our projects, or whether we should even consider employing the framework after all.

While learning Struts, Spring or jQuery wasn’t that difficult, things get trickier when it comes to databases, Hibernate and now NoSQL. Both Hibernate and MongoDB are quality products, and I know many successful projects built on top of them, but that doesn’t mean they are easy to use. If you want to employ them, be prepared to learn a lot; there is no other way.

When I started using Hibernate I was overwhelmed by its complexity. I soon understood I couldn’t catch up without thoroughly studying it, and that’s why I decided to read all 900 pages of Java Persistence with Hibernate. And that was just the beginning, as even now I continue reading and checking its source code every now and then.

Then MongoDB seemed like a good fit in many of our projects, and since I knew nothing of NoSQL, I had to invest quite some time to be productive. Luckily MongoDB offers free online classes, and once again I had to get back to studying. If you have ever had to work with the MongoDB aggregation framework, you know what I mean.

How to retry JPA transactions after an OptimisticLockException

Introduction

This is the third part of the optimistic locking series, and I will discuss how we can implement the automatic retry mechanism when dealing with JPA repositories.

You can find the introductory part here and the MongoDB implementation here.

Optimistic locking retry with MongoDB

In my previous post I talked about the benefit of employing optimistic locking for MongoDB batch processors. As I wrote before, the optimistic locking exception is a recoverable one, as long as we fetch the latest Entity, update it, and save it again.

Because we are using MongoDB we don’t have to worry about local or XA transactions. In a future post, I’ll demonstrate how you can build the same mechanism when using JPA.

The Spring framework offers very good AOP support, which makes it easy to implement an automatic retry mechanism, and this is how I did it.
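
The following is not the Spring AOP aspect itself, only a plain-Java sketch of the retry idea under stated assumptions: `OptimisticRetry` and `VersionConflictException` are hypothetical names, and the exception stands in for whatever optimistic locking exception your persistence layer throws. The action passed in must fetch the latest Entity on every attempt, otherwise the retry would fail on the same stale version again.

```java
import java.util.function.Supplier;

// Hypothetical retry helper; a hand-written loop instead of an AOP aspect.
final class OptimisticRetry {

    // Stand-in for the optimistic locking exception of the persistence layer.
    static final class VersionConflictException extends RuntimeException {
        VersionConflictException(String message) {
            super(message);
        }
    }

    // Runs the action up to maxAttempts times, retrying only on version conflicts.
    // Any other exception is propagated immediately.
    static <T> T retry(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        VersionConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // The action is expected to load, update and save in one go.
                return action.get();
            } catch (VersionConflictException e) {
                last = e;
            }
        }
        // All attempts failed: rethrow the last conflict to the caller.
        throw last;
    }
}
```

A caller would wrap the whole load-update-save sequence, for example `OptimisticRetry.retry(3, () -> updateDiscount(productId))`, where `updateDiscount` is again a hypothetical method of your own service.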

MongoDB optimistic locking

Introduction

When moving from JPA to MongoDB you start to realize how many JPA features you’ve previously taken for granted. JPA prevents “lost updates” through both pessimistic and optimistic locking. Optimistic locking doesn’t end up locking anything, and it would have been better named optimistic locking-free or optimistic concurrency control because that’s what it does anyway.

Lost updates

So, what does it mean to “lose updates”?

A real-life example would be when multiple background tasks update different attributes of some common Entity.

In our example, we have a Product Entity with a quantity and a discount which are resolved by two separate batch processors.

  1. the Stock batch loads the Product with {quantity:1, discount: 0}
  2. the Stock changes the quantity, so we have {quantity:5, discount: 0}
  3. the Discount batch loads the Product with {quantity:1, discount: 0}
  4. the Discount changes the discount, so we have {quantity:1, discount: 15}
  5. Stock saves the Product {quantity:5, discount: 0}
  6. Discount saves the Product {quantity:1, discount: 15}
  7. the saved quantity is 1, and the Stock update is lost

In JPA you may provide the @Version field (usually an auto-incremented number) and Hibernate takes care of the rest. Behind the scenes there is a safety mechanism that checks the number of updated rows for a given version. If no row was updated, then the version has changed and an optimistic locking exception is thrown.
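
The version check can be illustrated without a database. The sketch below is an in-memory imitation, not Hibernate’s implementation: `VersionedProductStore`, `Product` and `StaleVersionException` are hypothetical names, and the `save` method only mimics an update that matches on both the id and the version that was read.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store that imitates a version-checked update.
final class VersionedProductStore {

    // Immutable snapshot of a row, as read by one of the batch processors.
    static final class Product {
        final long id;
        final int quantity;
        final int discount;
        final long version;

        Product(long id, int quantity, int discount, long version) {
            this.id = id;
            this.quantity = quantity;
            this.discount = discount;
            this.version = version;
        }

        Product withQuantity(int newQuantity) {
            return new Product(id, newQuantity, discount, version);
        }

        Product withDiscount(int newDiscount) {
            return new Product(id, quantity, newDiscount, version);
        }
    }

    static final class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) {
            super(message);
        }
    }

    private final Map<Long, Product> rows = new HashMap<>();

    synchronized void insert(Product product) {
        rows.put(product.id, product);
    }

    synchronized Product load(long id) {
        return rows.get(id);
    }

    // Updates the row only if its version still equals the version that was read.
    synchronized void save(Product changed) {
        Product current = rows.get(changed.id);
        int updatedRows = (current != null && current.version == changed.version) ? 1 : 0;
        if (updatedRows == 0) {
            throw new StaleVersionException("Product " + changed.id + " was changed by someone else");
        }
        rows.put(changed.id,
                new Product(changed.id, changed.quantity, changed.discount, changed.version + 1));
    }

    public static void main(String[] args) {
        VersionedProductStore store = new VersionedProductStore();
        store.insert(new Product(1L, 1, 0, 0L));

        Product stockCopy = store.load(1L).withQuantity(5);      // steps 1 and 2
        Product discountCopy = store.load(1L).withDiscount(15);  // steps 3 and 4

        store.save(stockCopy);                                   // step 5 succeeds
        try {
            store.save(discountCopy);                            // step 6 is now rejected
        } catch (StaleVersionException e) {
            // Recover by re-reading the latest state and reapplying the change.
            store.save(store.load(1L).withDiscount(15));
        }

        Product result = store.load(1L);
        System.out.println("quantity=" + result.quantity + ", discount=" + result.discount);
    }
}
```

With the check in place, the Discount batch can no longer overwrite the Stock update, and the program prints `quantity=5, discount=15`.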

Open-minded architect

While chit-chatting with one of my colleagues, I was surprised to hear they use a PHP team for developing their front-end application, while the back-end services are implemented using Java. Since their project is doing great, this really got me thinking about why I had never considered such an architecture.

Most large Java web applications I’ve been involved with have shone on the server-side part, while the client-side has been the Achilles heel.

While you can find great Java web developers, not every Java developer has web-based skills. But PHP developers are great when it comes to web programming, and they don’t have a zillion frameworks to specialize in. PHP development is pretty much standard, as opposed to Java web programming. I have always been anxious when joining a project using a new web framework I didn’t know anything about (e.g. Wicket), but that’s not the case for a PHP developer. They can always join a new project, and the learning curve is not that steep.

I remember reading many comparison tests for Java vs PHP or Python, and I don’t remember seeing a single test not aiming to pick a winner. Such tests target only the language, but disregard the community and especially its developers.

Sometimes the winning solution is not a single technology but a clever mix of those that are best suited within a given context. A similar concept is polyglot persistence.

So as an architect you always have to stay open-minded and be objective about any technology you happen to love. After all, I love Java, but I also know it’s not always the best solution to all my clients’ problems.

Batch processing best practices

Introduction

Most applications have at least one batch processing task, executing a particular logic in the background. Writing a batch job is not complicated but there are some basic rules you need to be aware of, and I am going to enumerate the ones I found to be most important.

From an input type point of view, the processing items may come through polling a processing item repository or by being pushed into the system through a queue. The following diagram shows the three main components of a typical batch processing system:

  • the input component (loading items by polling or from an input queue)
  • the processor: the main processing logic component
  • the output component: the output channel or store where results will be sent
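
The three components listed above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `BatchPipeline` is a hypothetical class, the input is an in-memory queue rather than a polled repository, and error handling, transactions and threading are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical wiring of the input, processor and output components.
final class BatchPipeline<I, O> {

    private final BlockingQueue<I> input;    // input component: items pushed into a queue
    private final Function<I, O> processor;  // processor: the main processing logic
    private final Consumer<O> output;        // output component: where results are sent

    BatchPipeline(BlockingQueue<I> input, Function<I, O> processor, Consumer<O> output) {
        this.input = input;
        this.processor = processor;
        this.output = output;
    }

    // Processes every item currently waiting in the queue and
    // returns how many items were handled.
    int drain() {
        int processed = 0;
        I item;
        while ((item = input.poll()) != null) {
            output.accept(processor.apply(item));
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("first");
        queue.add("second");

        List<String> results = new ArrayList<>();
        BatchPipeline<String, String> pipeline =
                new BatchPipeline<>(queue, String::toUpperCase, results::add);

        int processed = pipeline.drain();
        System.out.println(processed + " items processed: " + results);
    }
}
```

Keeping the three parts behind separate interfaces means the queue could later be swapped for a polling loader, or the list for a real output store, without touching the processing logic.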