by HEIG-Cloud
Posted on Wed, Dec 16, 2015
We are talking about Elasticsearch, Logstash and Kibana.
This is a very powerful combination of applications that allows you to centralize all of your logs, index them and consult them very easily and efficiently.
This stack is quite powerful but is not exactly easy to install. We won’t show you the whole process, but there are a few points of interest that we will try to cover. Stick with us ;-)
First of all, let’s see what these applications can do.
That’s it for the stack. Of course, this is a very simple and naive way to describe it, but you don’t need to know much more than that.
As the name implies, OpenStack is a stack of components, which means that each of them can be managed separately. One of the big drawbacks is that every component creates its own logs. Why is this a problem, you may ask? Well, let’s say we have a problem somewhere: for instance, users can’t create new instances. If you are using Horizon as the dashboard, you probably know that the error messages are not always very self-explanatory. You will need to check every log file on every compute node (you could have a lot of them), the controller and, who knows, maybe even the network node.
It would be nice if you could just see the messages that contain the keyword error, for example, right? Well guess what, you can :)
We can use Kibana to search the logs by specifying the time and a filter. By adding this filter:
message:error
We make sure to only get the logs that contain the word error (which is a simple way to find easy errors). You can filter by host, time, type of the message, etc. There are so many possibilities, and we are only scratching the surface here. This should have convinced you to try ELK for your installation.
We’ve been following the tutorial on DigitalOcean for our own installation. It’s very well made and up to date. We will not provide a step-by-step installation in this post, but we may do it if there is a demand ;-)
As usual, the biggest problem we encountered is that we had to modify a lot of different files on different hosts, so it’s quite easy to make a mistake and not so easy to find it. As usual, we used Ansible to install the ELK stack, which saves us from a lot of silly mistakes. In this setup, we have:
It means that there are only two different setups. All the nodes except for the controller, which is also the ELK server, have the same configuration. That’s why it makes even more sense to automate the process with Ansible.
This is probably the biggest problem you’ll encounter while installing the stack. The Logstash configuration is often split across multiple files, so you may have three or five different files configuring your Logstash service. When Logstash loads its configuration, it takes all of these files and builds a single one by appending their contents one after another. Example:
File A
Content of file A
File B
Content of file B
File C
Content of file C
In the end, the conf file will be like this:
Content of file A
Content of file B
Content of file C
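The merge described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Logstash's actual loader: the function name `merge_conf_files` is made up for this post, and it assumes that every file in the directory is picked up and appended in sorted filename order.

```python
import tempfile
from pathlib import Path


def merge_conf_files(conf_dir):
    """Join the contents of every file in conf_dir, in sorted filename order.

    Illustrative only: this mimics the "append one file after another"
    behaviour described in the post; it is not Logstash code.
    """
    paths = sorted(p for p in Path(conf_dir).iterdir() if p.is_file())
    return "\n".join(p.read_text().rstrip("\n") for p in paths)


if __name__ == "__main__":
    # Build three throwaway files and print the merged result.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("A", "B", "C"):
            (Path(tmp) / f"file-{name}.conf").write_text(f"Content of file {name}\n")
        print(merge_conf_files(tmp))
```

Note that the sketch does not look at file extensions at all, so a stray copy sitting in the same directory would be merged in like any other file.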
That is fine in itself. The trouble starts when the same setting appears in two different files, for example a directive that binds a port. The following example configures the input sources for Logstash:
02-input-filebeat.conf
input {
  beats {
    port => 5044
    type => "logs"
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}
Now, what happens if by any chance you have this file TWICE? You may wonder why that would happen. Well, it does sometimes. If your OS makes a copy of this file (let’s say a backup, for whatever reason), you’ll have the same directive twice. It becomes a problem with a line like this one:
port => 5044
Basically, Logstash will try to bind the same port twice, which will obviously end with an error. So the moral of the story is:
ALWAYS check in your /etc/logstash/conf.d/ directory that you don’t have twice the same file!
This has happened to a lot of people (us too) and can easily make you waste a few hours :(
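If you would rather not check by eye, a small script can do it for you. The following is a rough sketch, not an official tool: `find_duplicates` is a name invented for this post, and the `port =>` regular expression is a simple heuristic that would also flag a port legitimately used by both an input and an output.

```python
import hashlib
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Heuristic: matches lines such as "port => 5044".
PORT_RE = re.compile(r"^\s*port\s*=>\s*(\d+)", re.MULTILINE)


def find_duplicates(conf_dir):
    """Return (groups of files with identical content, ports declared more than once)."""
    by_hash = defaultdict(list)
    by_port = defaultdict(list)
    for path in sorted(Path(conf_dir).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        by_hash[hashlib.sha256(text.encode()).hexdigest()].append(path.name)
        for port in PORT_RE.findall(text):
            by_port[int(port)].append(path.name)
    same_content = [names for names in by_hash.values() if len(names) > 1]
    same_port = {port: names for port, names in by_port.items() if len(names) > 1}
    return same_content, same_port


if __name__ == "__main__":
    # Demo on a throwaway directory containing a file and its backup copy.
    with tempfile.TemporaryDirectory() as tmp:
        body = "input {\n  beats {\n    port => 5044\n  }\n}\n"
        (Path(tmp) / "02-input-filebeat.conf").write_text(body)
        (Path(tmp) / "02-input-filebeat.conf.bak").write_text(body)
        same_content, same_port = find_duplicates(tmp)
        for names in same_content:
            print("identical files:", ", ".join(names))
        for port, names in same_port.items():
            print(f"port {port} declared in:", ", ".join(names))
```

On a real server you would call `find_duplicates("/etc/logstash/conf.d")` instead of using the temporary directory from the demo.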
The ELK Stack is a really good thing to have on your OpenStack cluster to be efficient and find errors that you probably wouldn’t find otherwise (or with a lot more effort).