E-commerce platforms usually consist of multiple systems, such as CRM or ERP. Every day, each of these systems generates millions of log entries that provide not only technical data about the state of the software but can also deliver a lot of business information, e.g. about user behavior in our store.
In this article I want to show you how we deal with logs at Divante so that the information they contain is used effectively, both in emergency situations and for carrying out advanced analyses and statistics. However, before we go any further, let us consider the problems we encounter in our daily work with logs.
Lack of consistent log format
Unfortunately, logs don’t have a coherent structure; it is rare to find two different systems with the same log format. One of the main reasons is that logs of different systems have to describe completely different entities and events, which is why it is so difficult to unify their structure.
Difficult log analysis
Classic logs are accessible and comprehensible only to those with adequate technical knowledge, such as administrators or software developers. Even for them, analyzing logs is often not an easy task, especially in a distributed IT system. Let’s not forget that when problems occur in a production environment, every second counts, so the sooner a team diagnoses the problem, the faster it is able to take appropriate corrective action. This is especially important in e-commerce, where every second of online store unavailability causes financial losses.
Lack of constant monitoring
We already know that our logs don’t have a common format and are not easy to analyze. How, then, can we continuously monitor all the parameters of our application? When will we know that our system has encountered an unexpected problem? There are solutions such as Zabbix or New Relic, which are very useful as monitoring tools but don’t allow us to investigate a problem in depth.
Writing a log entry is usually a synchronous write to a file that blocks subsequent program instructions. When saving large volumes of logs, we can also hit the inode limit of the file system, so we are forced to implement additional mechanisms for cleaning up old logs. And how can we determine which logs are old and unnecessary?
The answer to these and many other problems we encounter in our daily work with logs is a set of tools called the ELK Stack, offered by Elastic, which provides a highly scalable, centralized log management system.
What is ELK Stack?
The ELK Stack is a set of the following tools: Elasticsearch, Logstash and Kibana.
E for Elasticsearch
Elasticsearch is a modern and efficient data indexer and search engine based on Apache Lucene, which allows you to search and analyze the collected data.
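As a quick illustration, data indexed in Elasticsearch can be searched with its JSON-based Query DSL. The message text and the time window below are hypothetical, chosen just to show the shape of such a query:

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "message": "checkout error" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ]
    }
  }
}
```

Sent to a search endpoint such as `GET /logstash-*/_search`, a query of this shape would return the log entries from the last hour whose message matches the given text.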
L for Logstash
Logstash is a tool for collecting logs from indicated sources and structuring them into the desired format through a number of filters.
The last step of a Logstash pipeline is sending the prepared data to a specified output, which, in this case, is Elasticsearch.
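A minimal Logstash configuration reflecting these stages might look as follows; the log file path is an assumption, and plugin options can differ slightly between Logstash versions:

```
input {
  file {
    # follow the web server's access log; the path is an example
    path => "/var/log/nginx/access.log"
  }
}

filter {
  grok {
    # parse each raw line into structured fields
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```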
As a source of logs we can configure one of 41 input plugins, e.g. elasticsearch, imap, rabbitmq, redis, sqlite, syslog, tcp, twitter or varnishlog.
We can send the prepared data to one of 55 different output plugins, e.g. elasticsearch, mongodb, redis, email, file, csv, hipchat, http, jira, redmine, rabbitmq, tcp or zabbix.
Logstash also puts 50 filter plugins at our disposal to help us parse the collected data, e.g. csv, date, fingerprint, geoip, grep, grok, json, ruby, split, urldecode, useragent or xml.
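To sketch how these filters compose, here is an example chain that parses an Apache-style access log line, takes the event timestamp from the log itself, and enriches the entry with location and browser data. The field names (`clientip`, `timestamp`, `agent`) come from the COMBINEDAPACHELOG grok pattern:

```
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    # use the request time from the log line as the event timestamp
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    # resolve the client IP address to a geographic location
    source => "clientip"
  }
  useragent {
    # split the User-Agent string into browser, OS and device fields
    source => "agent"
  }
}
```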
K for Kibana
Kibana is a user-friendly and fully customizable interface which allows us to browse, search and visualize the collected data through a variety of graphs.
ELK Stack – how does it work?
- Suppliers, i.e. Logstash instances installed on the particular servers from which we want to collect log files, send them to a fast Redis database. Redis serves as middleware; its job is to queue the aggregated data. At this point it is worth noting that the logs don’t need to be saved to files on each server; an equally good solution would be to write our logs directly to the Redis database, which has very fast write speeds.
- The next step is normalizing the logs stored in Redis through an indexer; another Logstash instance serves as the indexer in this model. The filtered logs are then sent directly to Elasticsearch.
- We can explore the data prepared in this way using the Kibana panel, which reads data from the Elasticsearch database and presents it to the user through a dashboard.
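The steps above can be sketched as two small Logstash configurations, one for the shippers and one for the indexer; the host name, file path and Redis key are assumptions for illustration:

```
# Shipper – runs on each application server
input {
  file { path => "/var/log/app/*.log" }
}
output {
  redis {
    host      => "redis.internal"
    data_type => "list"
    key       => "logstash"
  }
}

# Indexer – a central Logstash instance
input {
  redis {
    host      => "redis.internal"
    data_type => "list"
    key       => "logstash"
  }
}
filter {
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```

The `list` data type makes Redis behave as a simple queue: shippers push entries onto the list and the indexer pops them off, so a temporary Elasticsearch outage does not lose logs.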
The Kibana dashboard, with its great potential for individual configuration, searching, filtering and analyzing the collected data, is, in my opinion, the best reward for all the labor we need to put into deploying a complete, centralized logging system. See it and judge for yourself whether you would like to have such a solution in your project.