Visualizing server performance with Elasticsearch and Kibana: Part 2, Visualization

Now that I have my data from part 1, I wish to build some visualizations from it using Kibana. I like Kibana 4 as it does not require me to set up some separate web servers or whatever. I always was a simple kind of a guy..

EDIT: I somehow managed to lose all the figures/images when moving around this blog here so I had to redo them. I still had the minecraft data around so that is correct here. The system resource data was long gone, so I created some synthetic data. It should illustrate how to use the tool etc. But trust me it was real once.. for sure it was.. 🙂 The data was still collected using the same real probes, just using synthetic load generators not the real server with real users. Anyway..

So I start up Kibana 4 and get something like this:

add index

The box “logstash-*” is where I need to put the pattern that matches the index I have created. In this case, I created two indices in part 1. One is called “minecraft” and holds the data Prism collected about the inner workings of the Minecraft server. The other one is called “session1” and holds the resource usage data for the Ubuntu VM hosting the Minecraft server. I have already created separate patterns for these, shown in the top left of the image.

After choosing an index, something like this is displayed, showing the index fields (mapping):

index properties

When setting up the index pattern in Kibana, it should ask whether the index contains time-series data and offer a choice of the field to represent the time. If the selection box is empty, the index most likely does not contain a field of the “date” type. For example, if we just dump Epoch time data (milliseconds since 1970 for ES) into ES without first setting up the mapping (schema) with that field as “date” type, ES/Kibana will not recognize it as a potential date field and will not propose it here. Here the field named “time” has the “date” type as shown above and has been chosen as the field to represent time. The name is just a user-given name (the name “time” here came from the SQL query in part 1, and from the explicit schema/mapping definition I used to create the index in ES before the import).
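As a rough illustration of what such an explicit mapping could look like, here is a minimal sketch in Python that just builds the JSON body. The document type name “event” is my own placeholder, not something from part 1, and only the “time” field is shown. The syntax follows the older ES versions that Kibana 4 was used with, so treat it as an assumption rather than the exact definition I used.

```python
import json
from datetime import datetime, timezone

# "event" is a hypothetical document type name used only for this sketch.
# "time" is the field name used in this post.
DOC_TYPE = "event"


def build_mapping():
    """Return an index-creation body that declares "time" as a date field."""
    return {
        "mappings": {
            DOC_TYPE: {
                "properties": {
                    "time": {"type": "date"},
                }
            }
        }
    }


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since 1970."""
    return int(moment.timestamp() * 1000)


if __name__ == "__main__":
    # The body that would be sent when creating the index, before any import.
    print(json.dumps(build_mapping(), indent=2))
    # The kind of value the "time" field would then hold for each document.
    print(to_epoch_millis(datetime(2015, 8, 18, tzinfo=timezone.utc)))
```

The point is only the order of things: the “date” type has to be declared before the epoch values are imported, otherwise they end up as plain numbers.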

Also of interest here is the “analyzed” property of the field. This is “true” here for two fields, “action” and “player”. These are the two “string” data types. When this is true, ES will analyze the data in this field and split it into “terms”, which can affect the visualizations in different ways, as I will note later for the Minecraft event data.

To start building visualizations from the data, I first select the “discover” tab as shown below.

Screen Shot 2015-08-18 at 11.51.25 PM

In this case, Kibana tells me “no results found” as shown above. This is because I recorded test data for a short time period previously and as shown in top right corner of the figure above, the current analysis time slot is set to “last 15 minutes”. Because this time slot has no data, Kibana says “no results found”. To find some data, I need to set the correct time frame. To do this, I click on the “last 15 minutes” text in the top right corner as shown in the figure above.

Clicking on the time selection opens a screen similar to the one shown below:

discover tab time selection

In the figure above, the “last 15 minutes” is shown as one of the “quick” selections, which are a list of commonly used time ranges to pick from. Since I ran the original test session about half a year ago (and now need to fix these missing images.. hah), I choose here “year to now”. This allows me to see where I have data available as shown by the bars in the figure above.

To set us up with a good set of analytics, I need to choose a timeline where data is available to create a visualization. So I need to refine the timeline to the time when data was collected as evidenced by the green bars.

To do this, I can “mouseover” the green bar in the timeline, and get the timestamp. Kibana is handy in that pretty much every chart it draws can be “mouseovered” in a similar way to get detailed information about different parts of it. I could manually use this information to enter the correct timestamps for the period of interest, and refine from there.

However, these modern webapps are much more handy and we can just click and drag across the green bars in the bar chart to choose the timeframe, as shown below:


Doing this a few times, I end up with the part I want to look at:

time select refined

So now I have data for the whole selected timeline and can create visualizations. What I can see from this (and from having watched the initial session while recording it) is that some data is actually missing. It seems Prism does not actually collect everything I might be interested in. It only collects player actions, and missed a number of events related to use of Minecraft command blocks. So my data quality would need improvement. But more about that later.. However, this illustrates how to set my timeline for visualization to get the initial idea about my data and build the first draft of my visualizations.

So now that the timeline is set up, I can set up the charts. I click on the “visualize” tab and for the first chart choose the “vertical bar chart” as shown below.

vertical bar chart choice

The index pattern defines what I want to visualize, either the Minecraft specific data (minecraft index) or the VM resource use specific data (session1 index):

Screen Shot 2015-08-19 at 12.17.39 AM

Once I choose the index, which in this case is the “minecraft” index, I get to the part where I define my visualization:

Screen Shot 2015-08-19 at 6.54.13 PM

In the figure above, I have first narrowed the scope down a bit more, and chosen to represent the y-axis as the count of events (in this case any document in the “minecraft” index that falls within a given time interval). The x-axis is a time series based on the “time” field chosen when defining the index pattern. Here, Kibana has actually combined data for 1 minute into one “bucket” (represented in the visualization as a single bar in the chart). So each bar in this case shows the number of events observed in a 1 minute time interval.
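To make the “bucket” idea concrete, here is a small sketch of the same counting done by hand in Python. It is only my own imitation of what a date histogram does, not ES or Kibana code, and the timestamps in the example are made up.

```python
from collections import Counter

MINUTE_MS = 60 * 1000


def bucket_start(timestamp_ms, interval_ms=MINUTE_MS):
    """Round a timestamp down to the start of the bucket it falls into."""
    return timestamp_ms - (timestamp_ms % interval_ms)


def count_per_bucket(timestamps_ms, interval_ms=MINUTE_MS):
    """Count events per bucket, keyed by the bucket start time."""
    return Counter(bucket_start(ts, interval_ms) for ts in timestamps_ms)


if __name__ == "__main__":
    # Made-up event times in milliseconds since the epoch.
    events = [0, 1000, 59999, 60000, 125000]
    for start, count in sorted(count_per_bucket(events).items()):
        print(start, count)
```

Each key in the result corresponds to one bar in the chart, and the count is the height of that bar.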

The bar chart so far shows the number of events observed at different times (at 1 minute granularity as noted). However, I would like to see in more detail what these events consist of. The figure below illustrates how to do this:

split bars

The sub-aggregation “split bars” allows me to split each bar into smaller parts, showing what it consists of. Setting this up and visualizing it is shown in the figure above.

In this case, I have chosen the “terms” aggregation over the “action” field, ordered by the number (count) of items for each. So the visualization now shows each bar split into a set of smaller bars stacked on top of each other. That is, it shows how many times different actions appeared in each of the buckets represented by a bar.
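For reference, the bar chart with split bars corresponds roughly to a date histogram with a terms sub-aggregation in the ES query language. The sketch below builds such a request body in Python. I did not capture this from Kibana, so the aggregation names (“per_bucket”, “by_action”) and the top-5 limit are my own assumptions, and newer ES versions use a different parameter name than “interval”.

```python
import json


def bar_chart_request(time_field="time", term_field="action",
                      interval="1m", top_n=5):
    """Build a search body: events per time bucket, split by term."""
    return {
        # No individual documents needed, only the aggregation results.
        "size": 0,
        "aggs": {
            "per_bucket": {
                "date_histogram": {"field": time_field, "interval": interval},
                "aggs": {
                    "by_action": {
                        "terms": {
                            "field": term_field,
                            "size": top_n,
                            # Order the terms by how many documents match.
                            "order": {"_count": "desc"},
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(bar_chart_request(), indent=2))
```

The outer aggregation gives the bars, and the inner one gives the stacked parts inside each bar.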

On the right hand side of this figure is also the legend showing each “term”. Something of note here is the previously mentioned “analyzed” setting for the ES index mapping. If the “action” field is defined as “analyzed”, ES splits each of these action names in the legend into two parts separated by the “-” character. This would totally mess up the visualization, which is why I have defined it as “index: not analyzed” in the ES mapping definition. For some reason, Kibana shows it still as an “analyzed field” but at least it seems to work fine. Initially I made the mistake of not having it correctly defined and the buckets as well as the visualization were messed up. Luckily I could just delete the index, redefine it as “not analyzed” and re-run the import from MariaDB where the original Minecraft events were stored.
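The effect of the “analyzed” setting on the legend can be imitated with a few lines of Python. The splitting function here is only a crude stand-in I wrote for illustration (lowercase, split on anything that is not a letter or digit), not the real ES analyzer, and “block-break” is just an example action name next to the “tnt-explode” one seen in my data.

```python
import re
from collections import Counter


def rough_analyze(value):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", value.lower())


def term_counts(values, analyzed):
    """Count terms the way a terms aggregation would see them."""
    counts = Counter()
    for value in values:
        counts.update(rough_analyze(value) if analyzed else [value])
    return counts


if __name__ == "__main__":
    actions = ["tnt-explode", "tnt-explode", "block-break"]
    print(term_counts(actions, analyzed=True))
    print(term_counts(actions, analyzed=False))
```

With the analyzed variant the legend would list “tnt” and “explode” as separate terms, which is the mess described above. With the not analyzed variant each action name stays in one piece.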

Finally, we can also mouseover the legend or parts of the chart to highlight specific event types in the chart. The figure below shows this for the highest bars where the “tnt-explode” event is the most dominant:

mouseover element

Now, to keep my visualizations around for later (and to add them to a dashboard), I need to save them. This can be done using the disk icon in the top right corner.

Now, to try a different visualization, I wish to see what the server resource use looked like during this same time period. So I create another visualization, this time a line chart:

line chart choice

Note that you might need to click the + icon for new visualization in the chart window (visualize tab) if it is showing the previous chart (minecraft bar chart from before here). Instead of the minecraft index, I pick the session1 index where the resource use data was stored.

The resource data has a field called “percent” to define the percentage of resources used. So I set my Y-axis to represent that:

initial line chart

I set it to “average” to get the average value for the bucket, which is what I want. However, this chart has some problems. The resource probes I use collect data for various resource types (CPU and memory being the main targets), both at the system level and for every process. So this average mixes up different resource types for different targets, as by default ES does not discriminate between different kinds of documents. It just looks for any document with the “percent” field in it and averages them all for me.
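A tiny sketch shows why such an unfiltered average is not very meaningful. The field names “target” and “metric” and all the numbers below are invented for this example, since the actual document layout from part 1 is not repeated here.

```python
def average_percent(docs, predicate=lambda doc: True):
    """Average the "percent" field over the documents the predicate accepts."""
    values = [doc["percent"] for doc in docs
              if "percent" in doc and predicate(doc)]
    return sum(values) / len(values) if values else None


def is_system_cpu(doc):
    """Hypothetical filter: system-level CPU measurements only."""
    return doc["target"] == "system" and doc["metric"] == "cpu"


if __name__ == "__main__":
    # Made-up measurements that all share the "percent" field.
    docs = [
        {"target": "system", "metric": "cpu", "percent": 80.0},
        {"target": "system", "metric": "memory", "percent": 30.0},
        {"target": "java", "metric": "cpu", "percent": 70.0},
    ]
    print(average_percent(docs))                 # everything mixed together
    print(average_percent(docs, is_system_cpu))  # one metric for one target
```

The first number blends CPU and memory for different targets into one value, while the second one is an actual system CPU reading.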

To fix the mixed-up average in the chart, I can set the query to filter only the data I am interested in, as shown below:

system cpu

This chart now shows only system CPU resource usage level, and is correctly between 0 and 100 percent. We can set the filter as we like, and the chart below illustrates this for system memory use:

Screen Shot 2015-08-19 at 8.15.52 PM

In the original data (before I managed to delete the images..), this showed a strange looking graph as it never seemed to go higher than 30. Then I remembered I did not set the Java maximum memory limit when starting the Spigot Minecraft server. And by default the Java VM reserves only a part of the system memory. This time my guess is it reserved about 30 percent of the total system memory. So, while the system is only at about 30 percent capacity, the server is likely at almost top capacity for memory most of the time. This is useful information in itself, but not a very fancy visualization example since it is almost capped out most of the time (hard to see how different events impact memory use). Of course, the figure above is different as I created it for this re-write using synthetic load.

But I would like to see all the resource use (well, CPU and memory anyway) in a single chart. This can be achieved as shown below:

combined lines

Here I have defined a sub-aggregation on the x-axis, moving the two query filters into the sub-aggregation. So the “split lines” sub aggregation splits the chart into two different lines in the same chart, and the query 1 and query 2 define what data those lines should show. Since both have the same field named “percent”, they can both be visualized using the same Y-axis configuration as shown higher above. They also have the same value range of 0-100. These properties allow me to dump them in the same chart this way.
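The “split lines” idea can be sketched in Python as well: every named filter produces its own series over the same time buckets. This is only my own imitation of the behaviour, with invented field names (“metric”) and data, not what ES does internally.

```python
from collections import defaultdict

MINUTE_MS = 60 * 1000


def split_lines(docs, filters, interval_ms=MINUTE_MS):
    """Return {label: {bucket_start: average percent}} for each named filter."""
    sums = {label: defaultdict(lambda: [0.0, 0]) for label in filters}
    for doc in docs:
        bucket = doc["time"] - doc["time"] % interval_ms
        for label, matches in filters.items():
            if matches(doc):
                entry = sums[label][bucket]
                entry[0] += doc["percent"]  # running total for the bucket
                entry[1] += 1               # number of documents in the bucket
    return {
        label: {bucket: total / count
                for bucket, (total, count) in sorted(buckets.items())}
        for label, buckets in sums.items()
    }


if __name__ == "__main__":
    docs = [
        {"time": 0, "metric": "cpu", "percent": 40.0},
        {"time": 30000, "metric": "cpu", "percent": 60.0},
        {"time": 0, "metric": "memory", "percent": 30.0},
        {"time": 60000, "metric": "cpu", "percent": 90.0},
    ]
    filters = {
        "cpu": lambda doc: doc["metric"] == "cpu",
        "memory": lambda doc: doc["metric"] == "memory",
    }
    print(split_lines(docs, filters))
```

Each label in the result is one line in the chart, and because both series hold values between 0 and 100 they fit on the same Y-axis.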

We can do the same thing with an area chart, using a similar sub-aggregation:

Screen Shot 2015-08-19 at 8.32.54 PM

The main difference to note here is the “chart mode” option in the “Options” tab. Using this, it is possible to define how the sub-aggregations are combined. The default setting is “stacked”, which draws the second area stacked on top of the first one, as shown by the small area part on the right hand side of the figure above. That is, where the CPU load ends (green part), the memory part (blue) starts. In this case it looks confusing but for other data might be more appropriate. In this case I choose “overlap” and hit “apply” (the green arrow) to draw them both in the range of 0-100 (on top of each other), as shown below:

Screen Shot 2015-08-19 at 8.36.12 PM
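The difference between the two chart modes comes down to where each area starts. Here is a short sketch of that arithmetic, with made-up CPU and memory values. It is my own illustration of the idea, not how Kibana draws the chart internally.

```python
def stacked_bands(series):
    """Return (bottom, top) per bucket for each series when stacked."""
    bands = []
    baseline = [0.0] * len(series[0])
    for values in series:
        # Each area starts where the previous one ended.
        top = [base + value for base, value in zip(baseline, values)]
        bands.append(list(zip(baseline, top)))
        baseline = top
    return bands


def overlap_bands(series):
    """Return (bottom, top) per bucket for each series when overlapped."""
    # Every area starts from zero, so the values stay in their own range.
    return [[(0.0, value) for value in values] for values in series]


if __name__ == "__main__":
    cpu = [40.0, 90.0]
    memory = [30.0, 30.0]
    print(stacked_bands([cpu, memory]))
    print(overlap_bands([cpu, memory]))
```

In the stacked mode the memory area for the second bucket ends at 120, which makes no sense for a percentage, while in the overlap mode both areas stay within 0-100.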

Choices similar to those given for the area chart can be applied to the bar chart I illustrated in the beginning as well. Since the Minecraft actions are drawn in the same bar for each bucket in my bar chart example, it is a “stacked” bar chart type. Another example would be “grouped”, which would build N bars for each bucket, one for each “term” or “action” in my bar chart example.

Next up, we should build a dashboard or two from these visualizations. The dashboards combine several visualizations in one, allowing me to compare, for example, resource use and server events in one view. Part 3 is for dashboards..

