Divante.com Blog

Creating stores in Magento is a constant strive for performance. It can be improved thanks to caching tools (e.g. Varnish, Redis – on the caching layer), server load balancing or smart database management. Still, the best performance is achieved with a well-written code.

 

Headless eCommerce - Download FREE Book

 

In this article I am going to focus on the database. In Magento, we use two configurations of ORM for database communication, that is the flat (flat table structure) and EAV (Entity Attribute Value). Frequently, the developers perceive no difference between the two types. The truth is, however, that if it is not understood correctly, the code is exposed to significant efficiency reduction.

The article shall discuss both configurations, specifying the advantages and disadvantages of each. I will also include the SQL functions executions during the standard queries to the database which should solidify your view of the more useful configuration in a specific case.

ORM – Object Relational Mapping

ORM is a software development technique used for converting various data types into objects and vice versa. In Magento, ORM takes place by means of models in two configurations:

  • Flat – flat table structure
  • EAV (Entity Attribute Value)

Data retrieval or saving in both ORM types is either identical or very similar in Magento which may make you think there is no difference in the access or committing the data. The differences are, however, significant as you will surely notice further on.

Even the inexperienced developer knows what flat table structure is. But what is EAV?

EAV (Entity Attribute Value)

EAV is a development pattern, saving data depending on the value type. The data distribution is presented in the diagram below:
1
Source: http://www.magentocommerce.com/knowledge-base/entry/magento-for-dev-part-7-advanced-orm-entity-attribute-value

The diagram depicts EAV table configuration describing goods in the store.

The first table contains a set of product entities (entity_id) that is base product indexes with the type (entity_type_id) indicating the entity type in EAV structure, together with base keys (e.g. for products this can be their SKUs).

The middle table contains all the attributes found in Magento. By filtering the tables by the entity type (entity_type_id) we can learn all the possible product attributes (e.g. price, dimensions, colour etc.).

The tables to the right present the final values of a given attribute. The column where a given value is saved depends on the value type. That is when you wish to record “colour” product attribute, you should record it in catalog_product_entity_varachar table as we know that the “colour” attribute values are of the string type.

To describe a product in the flat table structure, we could convert the above diagram into one table with the number of columns corresponding to the number of attributes required. So why EAV?

Flat vs EAV

EAV possesses something the flat table is deprived of, that is flexibility. At any time of the store development we can add any number of new attributes which will be new records in eav_attribute table. In the flat table, you would have add a new column to the table.

When you look at all that from the Magento perspective, EAV is an excellent choice as various stores or applications may have different characteristics and thanks to EAV you can play with the attributes with no need to redesign the database.

What is more, flat tables have a specific maximum number of columns (e.g. inoDB can have max. 1000 columns) which limits the maximum number of attributes. In EAV, this is not a problem.

EAV, however, has another significant drawback, that is performance. When you wish to get to a specific attribute, you must get data from several tables which is connected with the need to make a couple of SQL queries in Magento. When we also reach for additional attributes with another data type, new JOINs will appear in the queries. When we add an inexperienced developer who retrieves much data in batches via a load in a loop, the linear capacity fall is certain.

Let’s look at both types in practice.

ORM – flat table structure

Let’s create a flat table containing information on the hotel guests:
2

The table contains basic information on the guests and we will use only two types of data, that is INT and VARCHAR.

Next, let’s fill in the table with data:
3

and retrieve the data for the second guest:

4

The above code refers to divante_training/residents model and retrieves the data for the second guest using the load method. The result of executing the above Magento code is displaying the name and surname of the guests:

5

ORM has done everything for us. Let’s us not trust it and check what queries were executed:

6

As we see, ORM created a simply database query, nothing surprising.

ORM – EAV

Now, let’s create an analogous table structure presenting hotel guests but this time in the EAV context.

7

8

9

As we see above, this time I have had to create three tables: The first will contain guest’s entities (indexes) while the second and third will contain the values depending on the data type (INT → residents_entity_int, and VARCHAR → residents_entity_varchar).

An alert reader should have noticed we miss one more table with attributes (eav_attribute – described while discussing EAV theory). That was, however, created earlier together with the entire Magento database as all EAV Magento table types use it (products, orders etc.).

Let’s fill in the tables with the data analogous to the ones in the flat table.

10

11

12

and let’s retrieve the same guest:

13


This time we used divante_barcamp/residents model and retrieved the data for 14th guest:

14

As we see, both the reference context and the result is identical but what ORM did is different:

15

Retrieving data, ORM executed several queries and in the last of them we see the operation of combining the results for two queries by means of SQL UNION ALL command.

It could not have been different as ORM had to retrieve EAV entity type from tabeli eav_entity_type table, then data from our major residents_entity table, information on the attributes we need (eav_attribute), and finally the values of those attributes (residents_entity_int and residents_entity_varchar).

At this stage, the performance decrease can be seen for EAV with an identical number of database queries.

In the next part, I’ll make the situation even more complicated, by retrieving multiple data and committing the gravest of Magento developer’s sins. Can you feel the chills already?

Go to Efficient Magento code – database (Flat vs EAV). Part 2.

Marek Borzęcki

Senior programmer and Magento expert at Divante eCommerce Software House | LinkedIn | Twitter

Share your comment

Comments

  1. Reply

    Where is the next part of this tutorial? i did not find any link. can you provide me or you can send me in my email id: swadesh.singh08@gmail.com

  2. Reply

    Thanks a lot for the tutorial, the instructions are very nice and clear!

  3. Reply

    Do you have any suggestions on how to overcome the performance concerns on an EAV model ? Especially when you want to index on certain columns(rows in EAV) it gets challenging when you are building filters on top of it.

  4. Reply

    At the end of Jon Bentley’s book “Writing Efficient Programs”, the author warns that making code more efficient generally also makes it harder to understand and maintain, and so one does not rush in and tweak code unless one has first determined that there is a performance problem, and measures such as code profiling have pinpointed the exact location of the bottleneck. Once you have done so, you modify only the specific code that needs to run faster. Similar considerations apply to EAV modeling: you apply it only to the sub-system where traditional relational modeling is known