In this article, we are going to discuss several data pagination best and worst practices.
Data pagination is omnipresent in enterprise applications. Yet most solutions not only offer a bad user experience, but they are also inefficient.
If you only had a dozen entries in your database, you could simply fetch all the data and display it to the user. However, this is almost never the case. Most often, database tables range from tens of rows to billions of records.
Fetching a large amount of data takes a significant amount of time. That’s because a lot of work needs to be done to move large volumes of data from the database server to the user interface.
Not only is fetching large volumes of data extremely inefficient, but it also hurts the user experience. Imagine having to load a list of tens of thousands of entries on your mobile phone over a metered connection. Loading such a large list will be slow and expensive, and the list will be impossible to navigate on a mobile phone with a very limited viewport.
So, for all these reasons, pagination is used so that only a small subset of the entire dataset is fetched and displayed at once.
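As a minimal sketch of this idea, assuming the entries are already held in memory, the following Java snippet returns only the requested slice of a list. The `PageSlicer` class and its method names are hypothetical and are not part of JPA or Hibernate. In a real application, the slicing would normally be pushed down to the database query so that only the requested rows leave the database server.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, used only to illustrate fetching one page at a time.
final class PageSlicer {

    // Returns the entries of the given 1-based page, or an empty list
    // when the page lies past the end of the data.
    static <T> List<T> page(List<T> entries, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        // Use long arithmetic so that large page numbers cannot overflow.
        long offset = (long) (pageNumber - 1) * pageSize;
        if (offset >= entries.size()) {
            return Collections.emptyList();
        }
        int from = (int) offset;
        int to = (int) Math.min(offset + pageSize, (long) entries.size());
        return new ArrayList<>(entries.subList(from, to));
    }

    // Number of pages needed to display totalEntries (ceiling division).
    static long pageCount(long totalEntries, int pageSize) {
        return (totalEntries + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        List<Integer> posts = IntStream.rangeClosed(1, 95)
            .boxed()
            .collect(Collectors.toList());

        System.out.println(page(posts, 4, 25).size()); // prints 20
        System.out.println(page(posts, 5, 25));        // prints []
        System.out.println(pageCount(95, 25));         // prints 4
    }
}
```

The page size of 25 in the usage example is an arbitrary value chosen for illustration.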
Now, the most common way of sampling a result set is to split it into multiple pages or subsets of data. One such example can be seen on the old Hibernate forum, where over 66k posts are split into 2600 pages. While you can practically navigate to any of those pages, in reality, this is not very useful.
Why would I want to go to page number 1758? Or, how easily could I find the exact topic I’m interested in by jumping thousands of times from one page to the next?
When searching for a less-selective term on Google, you might end up with a very large list of pages matching the searched keywords.
For instance, when searching for “Hibernate”, Google says it has 22 million results. However, Google only provides the 16 or 17 most relevant pages.
The thing is, few users ever navigate to the second or third pages. Most users don’t even need to go from one page to another because there is a much better way to find exactly what you are interested in. You just need more selective search terms.
So, by refining the search terms, we get a much better chance of finding what we are interested in.
Pagination is good, especially because it allows you to fetch only a small subset of data at a time. However, page navigation is only useful if the number of pages is small. If you have tens or hundreds of pages, then it’s going to be very difficult for the users to find what they are interested in.
What the user wants instead is a better filtering tool. Rather than manually scanning each page, it would be much better if the application could do that automatically, as long as the user provides highly-selective search terms.
For application developers, this means they have to provide a way of refining the result set, either by offering more filtering criteria or by supporting a more detailed search term description.
This way, the application can fetch the exact subset of data the user is interested in, and the user will quickly find the right entry from the narrowed result set.
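To illustrate how more selective search terms narrow the result set, here is a small Java sketch. The `FilteredSearch` class, the sample titles, and the `maxResults` parameter are all hypothetical, and the filtering is done in memory only to keep the example self-contained. In a real application, the filtering criteria would typically be applied by the database query itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper, used only to illustrate refining a result set.
final class FilteredSearch {

    // Keeps only the titles containing every search term (case-insensitive),
    // then returns at most maxResults of them.
    static List<String> search(List<String> titles, List<String> terms, int maxResults) {
        List<String> normalizedTerms = terms.stream()
            .map(term -> term.toLowerCase(Locale.ROOT))
            .collect(Collectors.toList());

        return titles.stream()
            .filter(title -> {
                String normalizedTitle = title.toLowerCase(Locale.ROOT);
                return normalizedTerms.stream().allMatch(normalizedTitle::contains);
            })
            .limit(maxResults)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
            "Hibernate second-level cache",
            "Hibernate pagination with JOIN FETCH",
            "Spring Data pagination",
            "Hibernate batch inserts"
        );

        // A less-selective term matches many entries.
        System.out.println(search(titles, Arrays.asList("hibernate"), 10).size()); // prints 3

        // Adding a second term narrows the result set to a single entry.
        System.out.println(search(titles, Arrays.asList("hibernate", "pagination"), 10));
        // prints [Hibernate pagination with JOIN FETCH]
    }
}
```

Each additional term can only shrink the result set, which is why asking the user for more selective terms tends to work better than asking them to browse more pages.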
Although data pagination is a very widespread requirement, many implementations treat the problem only from a mathematical perspective instead of solving it from the user's perspective.
So, limiting the number of pages is actually a very useful trick since it allows the user to focus on providing the right search terms. From the database perspective, limiting the number of pages is a safety measure as well, since a single query fetching billions of records can end up consuming all database resources (CPU, memory, IO bandwidth), therefore affecting application availability.
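One possible way to enforce such a limit is to clamp the requested page number before it is turned into a row offset. The sketch below assumes a hypothetical `PageLimiter` class with arbitrary `PAGE_SIZE` and `MAX_PAGES` values. None of these names or values come from a real library, and they should be tuned to the application.

```java
// Hypothetical helper, used only to illustrate capping page navigation.
final class PageLimiter {

    static final int PAGE_SIZE = 25; // assumed number of entries per page
    static final int MAX_PAGES = 20; // assumed upper bound on navigable pages

    // Clamps the requested 1-based page number into the range [1, MAX_PAGES].
    static int clampPage(int requestedPage) {
        return Math.max(1, Math.min(requestedPage, MAX_PAGES));
    }

    // Row offset of the clamped page, suitable for an offset-based query.
    static int offsetFor(int requestedPage) {
        return (clampPage(requestedPage) - 1) * PAGE_SIZE;
    }

    // Upper bound on the number of rows reachable through page navigation.
    static int maxReachableRows() {
        return MAX_PAGES * PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(clampPage(1758));    // prints 20
        System.out.println(offsetFor(1758));    // prints 475
        System.out.println(maxReachableRows()); // prints 500
    }
}
```

With these assumed values, no request can reach past the first 500 rows, no matter which page number the client sends.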