Divante.com Blog

In the previous post we looked at the tools that can be used to find performance bottlenecks in a PHP application, Magento in particular. Now it's time to deal with the bottlenecks we found.

Application layer

We started with the cache layer and found that Redis is a better choice for caching than the popular Memcached: it is faster as a Magento cache backend. Redis is harder to scale out (only master-slave replication is available out of the box), but for us that is enough.

Using Varnish as a caching proxy, as we do, will probably be a good choice for you too. It is useful not only for static content (product images, CSS files; about 70% of HTTP requests can be served by Varnish without hitting the application server) but also for output caching of Magento-generated pages and blocks. If you use a Varnish-powered caching module, you can use the ESI mechanism in Varnish to refresh dynamic blocks while all other page content stays cached.
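
The split between a cached page and per-user blocks can be illustrated with a small, language-neutral sketch in Python. This is not Varnish code and not the API of any Magento module. The `<esi:include src="..."/>` tag is the standard ESI include form, while `PAGE_CACHE`, `render_block`, `serve`, and the URLs are hypothetical names chosen only for illustration.

```python
import re

# Hypothetical cached page body: everything is static except one ESI placeholder.
PAGE_CACHE = {
    "/product/42": '<html><body><h1>Product 42</h1>'
                   '<esi:include src="/esi/cart"/></body></html>',
}

ESI_TAG = re.compile(r'<esi:include src="([^"]+)"\s*/>')


def render_block(src, session):
    """Stand-in for the backend call that renders one dynamic block."""
    if src == "/esi/cart":
        return "<div>Cart items: %d</div>" % session.get("cart_items", 0)
    return ""


def serve(url, session):
    """Return the cached page with each ESI placeholder filled per request."""
    cached = PAGE_CACHE[url]  # cache hit: the application does not render the page
    return ESI_TAG.sub(lambda m: render_block(m.group(1), session), cached)
```

Calling `serve("/product/42", {"cart_items": 3})` reuses the cached page body and renders only the cart block for that visitor, which is the idea behind ESI-based block refreshing.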

In the next step we added a second application server and put HAProxy in front as the load balancer. HAProxy itself is also replicated using IP failover, so that it does not become a single point of failure (SPoF). Magento puts a heavy load on application servers because a lot of calculations are done in application logic, and the application loads hundreds of classes and PHP files, which causes heavy I/O load. Of course, before adding servers we checked that opcode caching with APC was enabled (with PHP 5.5 it is better to use the native OPcache, because APC is unstable in some cases and can cause segfaults).

We added data-layer caching to Mage_Catalog_Model_Product::load. When caching such a low-level method, it is crucial to handle cache invalidation (using observers and cache tagging). As we wrote in the previous post, we cannot use flat tables, so each $product->load() triggers a lot of SELECT queries on the database to load attributes (Magento uses EAV heavily). It is also worth considering a cache in Mage_Eav_Model_Entity_Abstract, which could bypass the EAV database operations.
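
The idea of caching a product load and invalidating it by tag can be sketched as follows. This is a minimal Python illustration, not our Magento code: `TaggedCache`, `load_from_db`, `load_product`, `on_product_save`, and the tag naming are all hypothetical, and the in-memory dictionaries stand in for a real cache backend such as Redis.

```python
class TaggedCache:
    """In-memory cache where every entry carries tags used for invalidation."""

    def __init__(self):
        self._data = {}   # key -> value
        self._tags = {}   # tag -> set of keys saved with that tag

    def save(self, key, value, tags):
        self._data[key] = value
        for tag in tags:
            self._tags.setdefault(tag, set()).add(key)

    def load(self, key):
        return self._data.get(key)

    def clean(self, tag):
        """Drop every entry that was saved with the given tag."""
        for key in self._tags.pop(tag, set()):
            self._data.pop(key, None)


cache = TaggedCache()
stats = {"db_queries": 0}


def load_from_db(product_id):
    """Stand-in for the many EAV SELECT queries behind a product load."""
    stats["db_queries"] += 1
    return {"id": product_id, "name": "Product %d" % product_id}


def load_product(product_id):
    """Serve the product from cache, hitting the database only on a miss."""
    key = "product_%d" % product_id
    product = cache.load(key)
    if product is None:
        product = load_from_db(product_id)
        cache.save(key, product, tags=["catalog_product_%d" % product_id])
    return product


def on_product_save(product_id):
    """Plays the role of an observer fired after a product is saved."""
    cache.clean("catalog_product_%d" % product_id)
```

Two consecutive `load_product(1)` calls hit the database once; after `on_product_save(1)` the next load goes to the database again, so stale data is not served.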


As we have 2400 attributes, we were bound by the InnoDB limit on columns per table, in our case about 900. So we added 900 attributes to the flat table: the most popular ones, which are used for filtering, product listings, etc. We noticed a huge decrease in query count on the DB server.

Sessions should be stored in a cache (Redis/Memcached) with a fallback to the database (on write you can write to both the DB and the cache, or queue the writes). There are A LOT of queries about session state, so there is no sense in querying the DB when a fast cache lookup will do. Sessions are, after all, key-value data.
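
A cache-first session store with a database fallback can be sketched like this. It is a minimal Python illustration under stated assumptions: `SessionStore` is a hypothetical class, and plain dictionaries stand in for the cache (Redis/Memcached) and for the database table.

```python
class SessionStore:
    """Read-through, write-through session storage: cache first, database second."""

    def __init__(self, cache, database):
        self.cache = cache        # fast key-value store, may lose entries
        self.database = database  # durable store, slower
        self.db_reads = 0         # counts how often the fallback was needed

    def write(self, session_id, data):
        self.database[session_id] = data  # durable copy (could be queued instead)
        self.cache[session_id] = data

    def read(self, session_id):
        data = self.cache.get(session_id)
        if data is not None:
            return data
        # Cache miss (eviction or restart): fall back to the database.
        self.db_reads += 1
        data = self.database.get(session_id)
        if data is not None:
            self.cache[session_id] = data  # repopulate for the next request
        return data
```

As long as the cache is warm, reads never touch the database; the fallback is used only after an eviction or a cache restart.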

Consider using the Fast-Async Reindexing module. We had problems with frequent reindexing. These operations lock the database (with InnoDB only one row at a time, of course, but with huge record sets this can be dangerous).

We checked the Magento indexing code, and it looks like this:

[Screenshot of the indexing code, 2014-01-04 23:48:10]

So the transaction starts (the row is now locked) -> calculations are performed (which can take even a few seconds, as we saw) -> the row is unlocked. This is not optimal in terms of latency and locking. One safe way to deal with this problem is master-slave replication of the database. Indexing is then done on the master (which is locked), while reads are served from the slaves (which are not). Replication works by copying the binary log, so the slaves hold locks only while the INSERT/UPDATE/DELETE operations are applied, with no additional locking time for the computations.
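
The read/write split that replication makes possible can be sketched as a tiny router. This is a simplified Python illustration, not Magento's connection handling: `ReplicatedConnection` and the server names are hypothetical, and classifying a statement by its first keyword is a deliberate simplification.

```python
import itertools


class ReplicatedConnection:
    """Route writes to the master and reads to the slaves in round-robin order."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def pick(self, sql):
        """Return the server that should execute the given statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._slaves)
```

With `ReplicatedConnection("master", ["slave1", "slave2"])`, an UPDATE issued by the indexer goes to the master, while frontend SELECT queries alternate between the two slaves and do not wait on the indexer's row locks.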

Caveats

If you run a multi-server environment, you are probably using some kind of distributed file system (DFS) to handle user uploads and code deployment between machines. This is OK; we used GlusterFS. But you have to be careful. On some DFSes the stat() and open() syscalls can be much slower than on a local file system. Gluster uses kernel I/O buffers, so in this case the problem doesn't exist.
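
One way to check whether a given mount suffers from slow stat() calls is to time them directly. The helper below is a rough Python sketch, not a proper benchmark; `time_stat_calls` is a hypothetical name, and repeated calls may be served from caches, so treat the numbers as indicative only.

```python
import os
import time


def time_stat_calls(directory, repeat=3):
    """Return (entry_count, seconds_per_stat) for entries directly in `directory`."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    if not paths:
        return 0, 0.0
    start = time.perf_counter()
    for _ in range(repeat):
        for path in paths:
            os.stat(path)  # the syscall that PHP issues when it checks a file
    elapsed = time.perf_counter() - start
    return len(paths), elapsed / (repeat * len(paths))
```

Running it once against a directory on the local disk and once against the same code on the DFS mount gives a quick comparison of per-call cost.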

APC is not 100% stable with PHP 5.4; we found some segfaults in the logs. It's better to use Zend Optimizer+ or OPcache (in PHP 5.5).

It is better to try nginx + PHP-FPM than Apache. The main reasons? RAM usage, speed, stability. More on this here: http://info.magento.com/rs/magentocommerce/images/MagentoECG-PoweringMagentowithNgnixandPHP-FPM.pdf

How to deal with EAV

If, like us, you have problems with flat tables and cannot turn them on, consider using tools like Lucene Solr or Sphinx Search to bypass EAV on the frontend. Removing EAV from Magento is hard, as this mechanism is at the heart of the Magento architecture. You can bypass it, for example, by overriding the search and filtering models on the frontend. This also lets you use full-text search. Ready-to-use modules for Solr and Sphinx are easy to find.
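
Why a search index sidesteps EAV can be shown with a toy example. The Python sketch below is not Solr or Sphinx code: the rows, attribute codes, and function names are hypothetical. It only illustrates that the expensive EAV-to-document step happens once at index time, after which filtering is a lookup.

```python
from collections import defaultdict

# Hypothetical EAV rows: (entity_id, attribute_code, value).
EAV_ROWS = [
    (1, "name", "Red shirt"),
    (1, "color", "red"),
    (1, "size", "M"),
    (2, "name", "Blue shirt"),
    (2, "color", "blue"),
    (2, "size", "M"),
]


def build_documents(rows):
    """Collapse EAV rows into one flat document per product (done at index time)."""
    docs = defaultdict(dict)
    for entity_id, code, value in rows:
        docs[entity_id][code] = value
    return dict(docs)


def build_filter_index(docs):
    """Map each (attribute, value) pair to the set of matching product ids."""
    index = defaultdict(set)
    for entity_id, doc in docs.items():
        for code, value in doc.items():
            index[(code, value)].add(entity_id)
    return index


def filter_products(index, **criteria):
    """Intersect the id sets for every requested attribute value."""
    id_sets = [index.get(item, set()) for item in criteria.items()]
    return set.intersection(*id_sets) if id_sets else set()
```

Once the index is built, a layered-navigation style query such as `filter_products(index, size="M", color="red")` is answered without a single EAV join.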

What’s next?

Stay tuned. In the next post in this series I will describe how to deal with the database layer of Magento. That post will be the hard one!

 


Comments

  1. Reply

    > Redis is harder to scale up (only master-slave replication is possible out-of-the box)

    There is twemproxy, which lets you shard your data automatically, so scaling horizontally is also not so hard :-)
    https://github.com/twitter/twemproxy

    > As we have 2400 attributes – we were bounded to InnoDB limit of columns per table – in our cause about 900. So we’ve added 900 attributes to flat.

    Is it possible to use some NoSQL database, e.g. MongoDB, as the main database in Magento? Wouldn't it fit nicely, as we need a schema-less structure?

    Also, have you tried implementing attribute storage in Redis using the Hash data type? It could be super fast and solve the problems with EAV in MySQL. Going that way, one must additionally use Solr/Sphinx etc. to be able to search by product attributes.

    > Sessions have to be stored in cache (Redis/Memcached) with fallback to databse (on write You could use DB and cache or queue writes).

    Why is there a need to fall back to the database? Using Redis you can achieve fully persistent and reliable storage. Or are there other caveats with sessions in Magento?

    1. Reply

      Thanks for the great answer, Antoni. I didn't know twemproxy; thanks, I'll check it out. Unfortunately, it is not possible to use a non-relational database as the Magento backend at this time. Ivan Chempurnyi from EComDev (http://www.ecomdev.org/blog/magento) is working on it, but when I asked him about 6-7 weeks ago, the work was still in progress. Magento models rely deeply on SQL-specific operations (grouping and so on). Ivan is trying to rewrite the Magento flat mechanism to store data in Sphinx search (Solr, ElasticSearch and so on would probably also be good choices; we used FactFinder in one of our projects to bypass all product-browsing functions on the frontend). As we discovered, the simplest way is to rewrite the backends of the Mage_Eav_Model_* classes to use some kind of cache. Redis would be great here to avoid hitting the database on EAV queries.

      Of course, the fallback to the database (for sessions) is required only when you use Memcached. Redis doesn't require it because of its persistence, which works great.

  2. Reply

    Love the post. Would you share the code in your load() function? I’m curious to see the caching code you wrote inside.

  3. Reply

    @Dimitry, thanks for your comment. The caching code in Product::load() is very simple, and I'll send you more details via e-mail.
