The most common mistake in the implementation is the way in which data is retrieved from the database, and especially an excessive usage of database calls. Even if a single database call could be very simple and maybe only take one or a few milliseconds to complete when multiplying it with a large number it will instantly grow into seconds.
A call to the database, first of all, has a static overhead of its own that each call has to take, and second it is usually easier and less consuming in total for the database to do the same query on multiple objects at once instead of having to redo the same query over and over again for each single object.
Example of Inefficient Data Retrieval
The example below tries to illustrate what the difference could be by making the implementation in different ways, even though the visual end result for the user is the same.
In this example, a table containing Y number of columns and X number of rows is used, but the same idea of data retrieval is applicable for a lot of other scenarios as well.
Let us assume that each column contains a program to retrieve the data. A minimum of one retrieved value per cell is needed. In many cases probably multiple values are needed to render the correct data.
Approach 1) Each cell:
Each column’s program loops the rows in the table and retrieves the information to show, i.e. each cell is responsible for retrieving its own data. → Minimum calls to the database are Y columns * X rows (e.g. 10 columns and 1000 rows gives 10.000 database calls).
Based on numerous reviews made by TechniaTranscat (previously Technia), this approach is heavily used in customers’ applications.
Approach 2) Each column:
Each column’s program makes a combined call to the database asking for the same information for all rows at once, i.e. each column is responsible for retrieving its own data.
→ Minimum calls to the database are Y columns (e.g. 10 columns and 1000 rows gives 10 database calls).
Approach 3) Each table:
Each column’s program needed calls are combined together for retrieving the data, i.e. each table is responsible for retrieving its own data.
→ Minimum calls to the database are N tables (e.g. 10 columns and 1000 rows gives 1 database call).
Approach 1 above can very quickly grow into a lot of calls. If the number of rows is double, the number of database calls are doubled, the same if the columns are doubled. If both the columns and the rows are doubled the cells and thereby the database calls are quadrupled.
It might be the easiest approach from a development point of view, writing the retrieval code in a sequential way mixed with the rendering code but it has a huge impact on performance.
The second approach is much better, letting each column make the same query/queries for all the rows at the same time. The implementation effort for this approach should not be higher than on approach number one.
It is still not optimal. Often similar or even the same calls are made in more than one column as part of what is going to be rendered. It would be beneficial not just to combine the select calls into one, but do not make the same select call multiple times.
The third approach uses as little database calls as possible to make the retrieval. In that approach, there is also the possibility to not have to make duplicate calls as well as combining similar to one. This approach is the most complex one to design but will give the best results in the end.
The conclusion of this is that trying to combine calls to the database to as few as possible will in most cases improve performance and system load rapidly.
Questions or comments? Feel free to post something in the comments below.