If you had only a dozen entries in your database, you could simply fetch all the data and display it to the user. However, this is almost never the case. Most often, database tables hold anywhere from tens of rows to billions of records.
Fetching a large amount of data takes a significant amount of time. That’s because a lot of work needs to be done to move large volumes of data from the database server to the user interface:
the data has to be scanned from the disk and loaded into the database server buffer pool
the loaded data is sent over the network
the application server will get the data in tabular fashion (e.g., the JDBC ResultSet)
the application transforms the tabular-based data into tree-based structures (e.g., entities, DTOs)
the tree-based structure is transformed to JSON and sent over the network to the browser
the browser needs to load the entire JSON and use it to build the UI
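The tabular-to-tree step in the list above can be sketched in plain Java. This is only an illustration under stated assumptions: the `PostTree`, `Row`, and `PostDto` names are hypothetical, and the flat rows stand in for what a JDBC `ResultSet` might return from a join between posts and comments. The point is that every fetched row has to be visited and regrouped, so the cost grows with the size of the result set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration; not part of JDBC or Hibernate.
final class PostTree {
    private PostTree() {}

    // One flat row of a joined result: the post columns repeat for every comment.
    static final class Row {
        final long postId;
        final String title;
        final String comment; // null when the post has no comments

        Row(long postId, String title, String comment) {
            this.postId = postId;
            this.title = title;
            this.comment = comment;
        }
    }

    // Tree-shaped DTO: one post holding its list of comments.
    static final class PostDto {
        final long id;
        final String title;
        final List<String> comments = new ArrayList<>();

        PostDto(long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Groups the flat rows by post id, preserving the order in which posts first appear.
    static List<PostDto> toTree(List<Row> rows) {
        Map<Long, PostDto> byId = new LinkedHashMap<>();
        for (Row row : rows) {
            PostDto post = byId.computeIfAbsent(row.postId, id -> new PostDto(id, row.title));
            if (row.comment != null) {
                post.comments.add(row.comment);
            }
        }
        return new ArrayList<>(byId.values());
    }
}
```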
Not only is fetching large volumes of data extremely inefficient, but it also hurts the user experience. Imagine having to load a list of tens of thousands of entries on your mobile phone over a metered connection. Loading this particular large list will be slow, expensive, and impossible to navigate on a mobile phone with a very limited viewport.
So, for all these reasons, pagination is used so that only a small subset of the entire dataset is fetched and displayed at once.
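To make the idea concrete, here is a minimal sketch of the arithmetic behind offset-based pagination, assuming 1-based page numbers. The `PageWindow` class and its methods are hypothetical helpers, and the `slice` method is just an in-memory stand-in for a database query that would return a single page.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of JDBC or Hibernate.
final class PageWindow {
    private PageWindow() {}

    // Zero-based index of the first row on the given 1-based page.
    static long offset(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (long) (pageNumber - 1) * pageSize;
    }

    // Number of pages needed to show totalRows rows (ceiling division).
    static long totalPages(long totalRows, int pageSize) {
        if (totalRows < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalRows must be >= 0 and pageSize >= 1");
        }
        return (totalRows + pageSize - 1) / pageSize;
    }

    // In-memory stand-in for a query that fetches one page only.
    // Returns an empty list when the page lies past the end of the data.
    static <T> List<T> slice(List<T> rows, int pageNumber, int pageSize) {
        long from = Math.min(offset(pageNumber, pageSize), rows.size());
        long to = Math.min(from + pageSize, rows.size());
        return rows.subList((int) from, (int) to);
    }
}
```

In a real application, the offset and page size would be passed to the database so that only those rows are scanned, transferred, and transformed, rather than slicing a list that has already been loaded.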
Classic pagination layout
Now, the most common way of sampling a result set is to split it into multiple pages or subsets of data. One such example can be seen on the old Hibernate forum:
There are over 66k posts which are split into 2600 pages. While you can practically navigate to any of those pages, in reality, this is not very useful.
Why would I want to go to page number 1758? Or, how easily could I find the exact topic I’m interested in by jumping thousands of times from one page to the next?
Page number limit
When searching for a less-selective term on Google, you might end up getting a very large list of pages matching the searched keywords.
So, when searching for “Hibernate”, Google says it has 22 million results. However, Google only provides the most relevant 16 or 17 pages:
The thing is, few users ever navigate to the second or third pages. Most users don’t even need to go from one page to another because there is a much better way to find exactly what you are interested in. You just need more selective search terms.
So, when refining the search terms, we get a much better chance of finding what we were interested in:
Better ways of filtering
Pagination is good, especially because it allows you to fetch only a small subset of data at a time. However, page navigation is only useful if the number of pages is small. If you have tens or hundreds of pages, then it’s going to be very difficult for the users to find what they are interested in.
Instead, what the user wants is a better filtering tool. Instead of manually scanning each page, it would be much better if the application could do that automatically as long as you provide highly-selective search terms.
For application developers, this means they have to provide a way of refining the result set, either by offering more filtering criteria or by supporting a more detailed search term description.
This way, the application can fetch the exact subset of data the user is interested in, and the user will quickly find the right entry from the narrowed result set.
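As a rough illustration of how more selective search terms narrow the result set, the following sketch filters a list of titles in memory and returns at most one page of matches. The `TitleFilter` class and its `search` method are hypothetical, and the matching rule (every term must appear in the title, ignoring case) is an assumption made for the example. In a real application the filtering would be pushed down to the database or a search engine so that non-matching rows are never fetched.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; the matching rule is an assumption.
final class TitleFilter {
    private TitleFilter() {}

    // Returns at most pageSize titles that contain every search term, ignoring case.
    static List<String> search(List<String> titles, List<String> terms, int pageSize) {
        List<String> needles = terms.stream()
                .map(term -> term.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());

        return titles.stream()
                .filter(title -> {
                    String haystack = title.toLowerCase(Locale.ROOT);
                    // Each additional term can only shrink the set of matches.
                    return needles.stream().allMatch(haystack::contains);
                })
                .limit(pageSize)
                .collect(Collectors.toList());
    }
}
```

With a single broad term the result may still span many pages, while adding a second term typically leaves a list short enough to scan at a glance.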
Although data pagination is a very widespread requirement, many implementations treat the problem only from a mathematical perspective instead of solving it from the user perspective.
So, limiting the number of pages is actually a very useful trick since it allows the user to focus on providing the right search terms. From the database perspective, limiting the number of pages is a safety measure as well, since a single query fetching billions of records can end up consuming all database resources (CPU, memory, IO bandwidth), thereby affecting application availability.
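One possible way to enforce such a limit is to cap both the number of pages shown and the page number a request may ask for. The sketch below assumes offset-based pagination with 1-based page numbers; the `PageLimit` class, its method names, and the `maxPages` parameter are hypothetical and only illustrate the idea.

```java
// Hypothetical helper for illustration; the cap is supplied by the caller.
final class PageLimit {
    private PageLimit() {}

    // Number of pages to expose: the real page count, but never more than maxPages.
    static long cappedPageCount(long totalRows, int pageSize, int maxPages) {
        if (totalRows < 0 || pageSize < 1 || maxPages < 1) {
            throw new IllegalArgumentException("invalid pagination arguments");
        }
        long pages = (totalRows + pageSize - 1) / pageSize; // ceiling division
        return Math.min(pages, maxPages);
    }

    // Forces a requested page number into the range [1, last exposed page].
    static int clampPage(int requestedPage, long totalRows, int pageSize, int maxPages) {
        long lastPage = Math.max(1, cappedPageCount(totalRows, pageSize, maxPages));
        return (int) Math.max(1, Math.min(requestedPage, lastPage));
    }
}
```

With a cap in place, the largest offset any request can reach is bounded by `maxPages * pageSize`, no matter how many rows match the query.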