Data Analytics with Elasticsearch, Logstash and Kibana (ELK)

The ELK stack, which scales well and whose parts work together seamlessly, is a combination of three open-source projects:

  • Elasticsearch: first released in 2010 (the company behind it was founded in 2012); a commercially supported open-source search engine built on top of Lucene that uses JSON and has a rich API
  • Logstash: around since 2009, as a way to stash logs
  • Kibana: around since 2011, to visualize event data

ELK is mostly used for log analysis and end-to-end Big Data analytics. This mini tutorial shows how to set up the ELK stack so that you can build your own solution on top of it.

ELK Stack Installation Steps

  1. Go to the official website https://www.elastic.co/downloads and download Elasticsearch, Logstash and Kibana into a separate directory.
  2. Extract all three downloads. This tutorial uses Windows 10 as the host OS.
  3. To start Elasticsearch
    • Go to <<Elasticsearch>>/bin and run elasticsearch.bat as an administrator.
    • After the Elasticsearch server has started, open http://localhost:9200 in a browser to confirm the startup.
  4. To start Kibana
    • Go to <<Kibana>>/bin and run kibana.bat as an administrator.
    • After the Kibana server has started, open http://localhost:5601 in a web browser.
  5. To start Logstash
    • Go to the bin directory of Logstash, open a command prompt as an administrator and run the following (the Windows command prompt needs double quotes around the pipeline definition):
      logstash -e "input { stdin { } } output { stdout {} }"
    • When the main pipeline starts (“Pipeline main started”), type any message in the command prompt.
    • If everything is working, Logstash will return your message with a timestamp and IP appended.
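The two browser checks above can also be scripted. The following is a minimal Python sketch, not part of the original setup. It assumes the default addresses used in this tutorial and plain HTTP without authentication; newer Elasticsearch releases may enable HTTPS and security by default, in which case the URLs would need adjusting. The names SERVICES, is_up and check_services are illustrative only.

```python
import urllib.error
import urllib.request

# Default addresses used in this tutorial (assumed: plain HTTP, no authentication).
SERVICES = {
    "Elasticsearch": "http://localhost:9200",
    "Kibana": "http://localhost:5601",
}


def is_up(url, timeout=3.0):
    """Return True if something answers HTTP requests at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status (for example 401).
        return True
    except OSError:
        # Connection refused, timeout, DNS failure and similar problems.
        return False


def check_services(services=None, timeout=3.0):
    """Map each service name to True (reachable) or False (not reachable)."""
    services = SERVICES if services is None else services
    return {name: is_up(url, timeout) for name, url in services.items()}


# Example usage once both servers have been started:
#   for name, up in check_services().items():
#       print(name, "is up" if up else "is not reachable")
```

A result of False for a service usually means that its .bat file is not running yet or is still starting up.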

Architectural Description of ELK Stack

[Figure: Architectural description of the ELK stack]

As the architecture above shows, Logstash collects raw data from various sources such as HDFS, logs (system logs, HTTP logs, proxy logs, etc.), Twitter streams and MySQL, and sends it on for further processing. Let’s look at each component of the ELK stack in turn.

1. Elasticsearch

Elasticsearch is a highly scalable, real-time, distributed search engine that is mostly used for indexing and analysing data.

  • It uses the Lucene engine for fast searching and indexing.
  • It supports full-text search.
  • Elasticsearch is a document-oriented store: rather than rows in fixed tables, it keeps data as documents.
  • Elasticsearch runs in cluster mode, and the data is distributed across the nodes.
  • Comparison between a relational database and Elasticsearch:
    Elasticsearch    RDBMS
    Index            Database
    Shard            Shard
    Mapping          Table
    Field            Field
    JSON Object      Tuple
  • An “index” in Elasticsearch is a collection of documents of different types together with their properties. When data is pushed to Elasticsearch, it is arranged into Lucene indexes, which Elasticsearch then uses for read and write operations.
  • To create an index, send a PUT request to http://localhost:9200/index_name
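The PUT request for creating an index can be sketched in Python as follows. This is only an illustration using the standard library; the helper names create_index_request and send are hypothetical, and the address is the default one from this tutorial. The lowercase check reflects the fact that Elasticsearch rejects index names containing uppercase letters.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # default address used in this tutorial


def create_index_request(index_name, base_url=ES_URL):
    """Build (but do not send) the PUT request that creates an index."""
    if index_name != index_name.lower():
        raise ValueError("Elasticsearch index names must be lowercase")
    return urllib.request.Request(f"{base_url}/{index_name}", method="PUT")


def send(request, timeout=5.0):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = create_index_request("index_name")
    print(request.get_method(), request.full_url)
    # Uncomment once Elasticsearch is running:
    # print(send(request))
```

Building the request separately from sending it makes it easy to inspect the method and URL before anything reaches the cluster.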

You can search your data with http://localhost:9200/index_name/_search
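As a rough illustration, the search URL can be built and called from Python with the standard library. The sketch below assumes the default local address and uses the q and size query parameters of the _search endpoint; the function names and the example query are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # default address used in this tutorial


def search_url(index_name, query=None, size=10, base_url=ES_URL):
    """Build the _search URL; `query` uses the query-string syntax, e.g. "status:404"."""
    params = {"size": size}
    if query:
        params["q"] = query
    return f"{base_url}/{index_name}/_search?" + urllib.parse.urlencode(params)


def search(index_name, query=None, size=10, base_url=ES_URL, timeout=5.0):
    """Run the search and return the stored documents of the matching hits."""
    url = search_url(index_name, query, size, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return [hit["_source"] for hit in body["hits"]["hits"]]


if __name__ == "__main__":
    # "status:404" is a made-up query for illustration only.
    print(search_url("index_name", "status:404", size=5))
    # Uncomment once Elasticsearch is running and the index holds data:
    # print(search("index_name", "status:404", size=5))
```

Leaving out the query returns the first documents of the index, which matches opening the plain _search URL in a browser.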

2. Logstash

As shown in the architectural diagram above:

  • Logstash collects logs and events from various sources such as HDFS, MySQL, logs (system logs, application logs, network logs), Twitter, etc.
  • It transforms the data and sends it to Elasticsearch.
  • To do this, Logstash uses a number of input, filter and output plugins, and transforms the raw data according to the filters specified in its configuration file.
  • A Logstash configuration file contains the input location, the output location and the filter that is applied to the data being processed.
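To make the three parts of a configuration file concrete, here is a small Python sketch that renders a minimal pipeline definition. It is not the configuration used in this tutorial: the log path, the index name and the choice of the grok pattern COMBINEDAPACHELOG (which assumes Apache access logs) are placeholders for illustration, as are the names PIPELINE_TEMPLATE, render_pipeline and write_pipeline.

```python
from pathlib import Path
from string import Template

# Minimal pipeline: file input, grok filter, Elasticsearch output.
# All concrete values are filled in by render_pipeline() below.
PIPELINE_TEMPLATE = Template("""\
input {
  file {
    path => "$log_path"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  elasticsearch {
    hosts => ["$es_host"]
    index => "$index_name"
  }
}
""")


def render_pipeline(log_path, index_name, es_host="http://localhost:9200"):
    """Return the text of a Logstash pipeline configuration."""
    return PIPELINE_TEMPLATE.substitute(
        log_path=log_path, es_host=es_host, index_name=index_name
    )


def write_pipeline(target, log_path, index_name):
    """Write the rendered configuration to `target` and return its path."""
    path = Path(target)
    path.write_text(render_pipeline(log_path, index_name), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical log location and index name.
    print(render_pipeline("C:/logs/access.log", "index_name"))
```

The input block says where the data comes from, the filter block says how each event is parsed, and the output block says where the result is sent.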

To start Logstash with a configuration file, pass the file to the logstash command using the -f option. Logstash then starts the pipeline between Logstash and Elasticsearch and begins parsing the data into Elasticsearch. To visualize the data, we use Kibana, the visualization tool.
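If you prefer to launch Logstash from a script, the command line can be assembled as in the following Python sketch. It assumes the Windows launcher bin/logstash.bat from the extracted download; the install directory and configuration file name are placeholders, and the function names are hypothetical. The optional --config.test_and_exit flag asks Logstash to validate the configuration and exit without starting the pipeline.

```python
import subprocess
from pathlib import Path


def logstash_command(config_file, logstash_home=".", check_only=False):
    """Build the argument list for starting Logstash with a configuration file."""
    launcher = Path(logstash_home) / "bin" / "logstash.bat"
    command = [str(launcher), "-f", str(config_file)]
    if check_only:
        # Only validate the configuration, then exit.
        command.append("--config.test_and_exit")
    return command


def run_logstash(config_file, logstash_home="."):
    """Start Logstash and return its exit code (blocks until Logstash stops)."""
    return subprocess.run(logstash_command(config_file, logstash_home), check=False).returncode


if __name__ == "__main__":
    # "pipeline.conf" is a placeholder file name.
    print(" ".join(logstash_command("pipeline.conf", check_only=True)))
```

Running the validation variant first is a cheap way to catch syntax errors in the configuration before any data is sent to Elasticsearch.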

3. Kibana

Kibana is an open-source visualization tool that provides a web interface for visualizing Elasticsearch data.

  • Kibana lets us create real-time dashboards in a browser-based interface.
  • Kibana offers different visualization types such as bar charts, graphs, pie charts, maps, tables, etc.
  • It lets you save, edit, delete and share dashboards.
  • After starting kibana.bat, open http://localhost:5601 in a browser and go to the Management view.
  • There, select your index (“Index_name”) and continue working on that index.
  • The Discover option lets you see the data.
  • The Dashboard option lets you create your own dashboard, which can contain multiple visuals.

Kibana’s “Dev Tools” option lets you interact with Elasticsearch data directly, for example to search the records of an index.
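A search issued from Dev Tools is an HTTP request with a JSON body. The Python sketch below builds such a body with a match query, shows the text you could paste into the Dev Tools console, and prepares the equivalent HTTP request. The field name "message" and the search text are made up for illustration, and the helper names are hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # default address used in this tutorial


def match_query(field, text, size=10):
    """Request body for a full-text match query on one field."""
    return {"size": size, "query": {"match": {field: text}}}


def console_snippet(index_name, body):
    """Text that can be pasted into the Kibana Dev Tools console."""
    return f"GET {index_name}/_search\n{json.dumps(body, indent=2)}"


def search_request(index_name, body, base_url=ES_URL):
    """Equivalent HTTP request (not sent here) for use outside Kibana."""
    return urllib.request.Request(
        f"{base_url}/{index_name}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical field and search text.
    body = match_query("message", "error")
    print(console_snippet("index_name", body))
```

In the console, the first line names the HTTP method and path, and the JSON that follows is the query itself.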

4. Elasticsearch-Hadoop

[Figure: Elasticsearch-Hadoop]

Use Cases or Examples of ELK Implementations

  1. Dell: Powering the search to put the customer first
  2. Facebook: Delivering a better help experience for over a billion users
  3. Microsoft: Providing search on Azure and powering Social Dynamics
  4. IBM: Providing the operational log analysis engine for Bluemix Apps
  5. Salesforce: Empowering businesses with log analysis for usage trends
  6. Accenture: Powering the search for the best client service
  7. Sprint: Analyzing 200 dashboards to search for better retail operations insight
  8. Symantec: Successfully switched from Solr to Elasticsearch with Elastic Support
  9. SunHotels: Scaling anomaly detection across 1000+ bookings a day with Elastic machine learning
  10. BBC: Unlocking yesterday’s content for the future of media search

TatvaSoft, a software development company that has worked on a wide range of projects over time, provides Big Data analytics services and consultancy to clients from various industries. We also delivered a project for the Media & Entertainment industry that used Elasticsearch functionality to support its goals and processes.

To know more about that project, see: Digital Distribution Platform
