Sending PowerStore logs to Syslog

In this article we will explore how to get the logs from a DellEMC PowerStore array to a Syslog server. For this purpose we will use the PowerStore REST API, which is a great piece of engineering and a joy to work with as a developer. If you want to learn more about it, I strongly recommend you skim through the two articles I have written about the REST API.

In particular, the second article demonstrates the capabilities of the REST API query language. This is a great feature that I will use heavily in the last section, so I strongly recommend you read at least that one.

As a side note, Syslog was developed in 1980, so this year it has turned 40 years old! That is a long time by any measure ... many of my colleagues were not even born in 1980. But on the technology scale it looks even scarier: 1980 is closer in time to the ENIAC than to Kubernetes, which is like the Pleistocene in human terms 😃.

For easier navigation, this blog article has been divided into three sections:

  1. Understand the types of Logs in PowerStore
  2. Introduction to Logstash
  3. Sending PowerStore logs to a Syslog server

If you can't wait to see all of this in action you can watch the video now. Otherwise read on!

1 - Understand the logs in PowerStore

PowerStore exposes 4 sources of information that can be considered logs. Depending on your requirements you may want some or all of them:

  • Alerts
  • Events
  • Jobs
  • Audit logs

The first 3 can be seen in the GUI under the Monitoring tab.


Additionally, the audit logs are accessible in the GUI through Settings > Audit Logs.


The PowerStore API exposes all 4 logs mentioned above as 4 different resources. The following are the API calls to extract the collection from each of them:

  • GET /alert
  • GET /event
  • GET /job
  • GET /audit_event

You can use the powerful query features described in my previous article to tweak the output and extract exactly what you need.
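For example, combining the field selection, filtering and sorting syntax that I will use later in this article, a query against the audit log could look something like this sketch (the fields and the filter value are only illustrative, adjust them to your needs):

# Sketch: audit events newer than a given time, newest first, selecting a few fields
curl -k -u admin:P@sw0rdZ --request GET \
  "https://10.1.1.1/api/rest/audit_event?select=id,type,timestamp&order=timestamp.desc&timestamp=gt.2020-01-01T00:00:00.000"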

Most modern tools support extracting data directly from a REST API (Splunk, SolarWinds, SiteScope, …). In fact, chances are that if you are trying to do something smart with the logs you are already using a tool that supports extracting data from a REST API. In that case you don't need Syslog at all.

As an example, you can take a look at what my colleague Kyle Prins has done for PowerStore with Splunk. He created a Splunk “app” that can be accessed directly from “Splunkbase”. The app is very comprehensive, as it allows you to retrieve data from 87 different API endpoints.



2 - Introduction to Logstash

For the purpose of sending the PowerStore logs to Syslog you could use Logstash. This is one of the 3 components of the original ELK stack, which consisted of ElasticSearch, Logstash and Kibana. After a while a fourth component called “Beats” was introduced and the whole package was simply renamed to the “Elastic Stack”.


ElasticSearch is a powerful open source search and analytics solution. It plays in the same space as Splunk. ElasticSearch does the indexing, but you interact with it mostly through its REST API. In other words, it doesn't provide a GUI. For that you need to install Kibana, and that's where the “K” in ELK comes from.

The third member of the family is Logstash, which is a clever piece of software that collects data from multiple data sources, transforms it on the fly and ships it to one or multiple destinations. This flow (collect > transform > ship) is appropriately referred to as a “pipeline”. Of course, the natural destination for the data is ElasticSearch itself, but Logstash supports many other destinations outside the Elastic stack, and in fact it can be used in isolation from the other stack components.


If you are going to play with this you can follow the installation steps on the official Logstash page.


It is a 5-minute installation for most operating systems once you have the Java prerequisites installed. There is even a Docker installation available. In my environment I am using CentOS 7.
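For reference, on CentOS 7 a minimal sketch of the installation via the Elastic yum repository looks roughly like this. Check the official page for the current repository definition and versions, as the exact repo contents below are only illustrative:

# Install the Java prerequisite
yum install -y java-1.8.0-openjdk
# Add the Elastic GPG key and the 7.x yum repository, then install Logstash
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
cat <<'EOF' > /etc/yum.repos.d/logstash.repo
[logstash-7.x]
name=Elastic repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
EOF
yum install -y logstash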

Let's quickly introduce Logstash. If you are already proficient with Logstash you can skip ahead to section 3.

Once Logstash is installed you need to tell it what to do, i.e. you need to define the pipelines (what to collect, how to transform it and where to send it). Logstash requires a “pipeline configuration file” for each pipeline. These files must have a “.conf” extension and they need to be placed in the “/etc/logstash/conf.d/” directory.

When the service starts it will load them (files with an extension other than “.conf” will be ignored).

The pipeline file has 3 sections: input, filter and output. The filter stage is optional, i.e. you can ship the data without modifying it. The actions for each section go between the curly brackets:

# This is the skeleton of a pipeline configuration file
input {}
# The filter section is optional; if it is empty it can be omitted entirely
filter {}
output {}

The typical “hello world” for Logstash is to read from “stdin” and write to “stdout”. However, we are going to start with something different. Since logs are usually dumped to files, our first basic pipeline will read lines from a file as they are written and create entries in another file. Create a “.conf” file and place it in the “conf.d” folder:


cat /etc/logstash/conf.d/logstash-simple.conf

# "Hello World" pipeline example
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
  }
}
output {
  file {
    path => "/tmp/logstash-test.txt"
  }
}

Now we are ready to start the Logstash service. If there are any errors they will show up here:

service logstash start
service logstash status -l


Let's test it. I will write some text to the input file and then examine the output file. It should have entries for the changes I made.
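Something along these lines, using the paths from the pipeline above:

# Append a couple of lines to the watched input file
echo "hello logstash" >> /tmp/access_log
echo "second test line" >> /tmp/access_log
# After a few seconds the output file should contain the corresponding events,
# written by the file output plugin as one JSON document per line
cat /tmp/logstash-test.txt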


When dealing with files, make sure the right permissions are in place. As you can see below, in my implementation Logstash is running as the user “logstash”, so I need to make sure that user has access to the input and output files.
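For example, using the files from the sample pipeline above:

# Confirm which user the Logstash service is running as
ps -ef | grep -v grep | grep logstash
# Make sure that user can read the input file and write the output file
touch /tmp/access_log /tmp/logstash-test.txt
chown logstash:logstash /tmp/access_log /tmp/logstash-test.txt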

You can get an idea of what's going on in the background by checking the Logstash log file. If there are any errors you will see them here too:

tail /var/log/logstash/logstash-plain.log


I am going to stop it for now:

service logstash stop


I agree! That was not very exciting, but it offers me the perfect segue to introduce something very important: plugins. Plugins provide the functionality for each of the 3 stages. Some plugins come preinstalled. You can check which plugins are installed like this:

cd /usr/share/logstash
bin/logstash-plugin list

Wow! On my system the command returned 109 plugins. That's a lot of plugins! Judging by their names, there seem to be 4 types of plugins:
  • logstash-codec-*
  • logstash-filter-*
  • logstash-input-*
  • logstash-output-*
I can see two “file” plugins in the list, an “input” and an “output” plugin. That explains why our “hello world” worked.
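You can confirm that quickly with a grep:

# Filter the plugin list for the "file" plugins; the output should include at least:
#   logstash-input-file
#   logstash-output-file
bin/logstash-plugin list | grep file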


To see what plugins are available, check the Logstash documentation. That page lists only the “input” plugins; there is a link in the right pane to see plugins of the other types.

In the next section we will see how to use these Logstash plugins to connect PowerStore to a Syslog server.

3 - Sending PowerStore logs to a Syslog server

Now that you are familiar with the PowerStore REST API and with how Logstash works, let's see how you could send PowerStore logs to Syslog. This is a two-part problem: extracting and sending.

3.1 - Extracting the logs from PowerStore

So, going back to our original problem, we know all the PowerStore log information is accessible via the REST API.
  • Can we retrieve the data with Logstash?
Let me rephrase the question:
  • Is there an “input plugin” for Logstash that allows us to query a REST API?
The answer is yes. It is called “http_poller” and it has some very nice options. However, after reading a bit further it looks like it might give me some problems with self-signed certificates. If you are not using self-signed certificates you could use this plugin (see the sketch below). In my environment I am using them, so I need an alternative.
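For reference, a minimal http_poller sketch could look roughly like the block below. I have not tested it against my array (because of the certificate issue), so treat the exact settings as an assumption and check the plugin documentation:

input {
  http_poller {
    # Poll the audit_event collection once per hour
    urls => {
      powerstore_audit => "https://10.1.1.1/api/rest/audit_event"
    }
    user => "admin"
    password => "P@sw0rdZ"
    schedule => { cron => "0 * * * * UTC" }
    codec => "json"
  }
}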

We also have the “exec” plugin. This plugin captures the output of a shell command as an event. In the Linux shell we have this great tool called cURL that allows us, amongst other things, to interact with web servers. More importantly, it gives us the flexibility that we need in this case: we can ignore self-signed certificates with the “-k” parameter.

For example, we can get the entries in the “audit_event” log like this (please replace 10.1.1.1 with the actual IP address of your PowerStore):

curl -k -u admin:P@sw0rdZ --request GET https://10.1.1.1/api/rest/audit_event

The “-u” parameter is used to specify the username and password, separated by “:”. Remember this user must have enough rights to access the log in question.

If you cannot afford to leave the username and password in clear text like that, you can instead use “--header” like this:

curl -k --header 'Authorization: Basic YWRtaW46UEBzdzByZFo=' --request GET https://10.1.1.1/api/rest/audit_event

The string that follows the “--header” parameter is the Base64 encoding of the “username:password” string. You can create it with an online “basic authentication header generator” or with another tool like Postman: put the username and password in the “Authorization” tab.


And then in the “Headers” tab you can see the resulting string
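If you prefer the command line, the same string can be generated with the “base64” utility:

# Encode "username:password"; the -n avoids encoding a trailing newline
echo -n 'admin:P@sw0rdZ' | base64
# YWRtaW46UEBzdzByZFo=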


The “--request” parameter specifies the GET method, and the URL points to the “audit_event” resource. As we covered in the PowerStore REST API best practices article, the API tries to be as efficient as possible and by default only returns the “id” of each object, so we need to be explicit if we want more information (or use the “*” wildcard). Additionally, it would be good to sort the entries so that the newest appear first.

When we put all of this into a pipeline configuration file, it looks like this:

input {
  exec {
    command => "curl -k --header 'Authorization: Basic YWRtaW46UEBzdzByZFo=' --request GET 'https://10.1.1.1/api/rest/audit_event?select=id,type,resource_type,resource_action,message_code,message_arguments,timestamp&order=timestamp.desc'"
    schedule => "0 * * * *"
  }
}
output {
  file {
      path => "/tmp/logstash-test.txt"
  }
}

As you can see, the “exec” plugin includes a very handy cron-like scheduler. In this case Logstash will retrieve the 100 (by default) most recent entries in the audit_event log every hour. Feel free to adjust the URL to suit your needs based on the PowerStore REST API best practices.

3.2 - Sending the logs to Syslog

If you have been following along, at this stage I'm sure you have guessed where we are going. Logstash has a lot of output plugins, so …
  • Is there an output plugin to send logs to Syslog?
Of course there is, and unsurprisingly it is called the “logstash-output-syslog” plugin. If it is not installed already you can install it like this:

bin/logstash-plugin list
bin/logstash-plugin install logstash-output-syslog

We can see in this plugin's online guide that it supports a few settings. At a minimum you will have to specify the IP, port and protocol of your syslog server. In my case I have set up a syslog server in the same VM and it is listening on UDP port 514, so the resulting pipeline configuration file looks like this:

input {
  exec {
    command => "curl -k --header 'Authorization: Basic YWRtaW46UEBzdzByZFo=' --request GET 'https://10.1.1.1/api/rest/audit_event?select=id,type,resource_type,resource_action,message_code,message_arguments,timestamp&order=timestamp.desc'"
    schedule => "0 * * * *"
  }
}
output {
  syslog {
    host => "127.0.0.1"
    port => 514
    protocol => "udp"
    rfc => "rfc3164"
  }
}

There you have it. Now you only have to place the configuration file in the “/etc/logstash/conf.d/” directory and start the service as we did before, and you will see the PowerStore logs coming in.
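To confirm that the messages are actually arriving, you can check on the syslog server side. In my case, with rsyslog on the same CentOS 7 VM listening on UDP 514, something like this works (file names may differ in your environment):

# Check that the syslog daemon is listening on UDP 514
ss -ulnp | grep 514
# Watch the incoming PowerStore entries as rsyslog writes them out
tail -f /var/log/messages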

Just as a quick disclaimer, this is not an official solution and it is to be used at your own risk, but we live in a world where open source and community-developed solutions are a way of accelerating the delivery of value. I hope you find this write-up useful and that you can customize it to fit your environment.

3.3 – Additional considerations

It is a best practice in Logstash to run only the pipelines we need, so if you have been following along, feel free to remove the hello world example from the “conf.d” directory.
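For example, using the file name from section 2:

# Remove the "hello world" pipeline and restart the service so it is no longer loaded
rm /etc/logstash/conf.d/logstash-simple.conf
service logstash restart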

Resending duplicate log entries shouldn't be a problem for most systems, as they can determine that an entry has been seen before and store only one copy. Furthermore, very often the main objective for organizations is to make sure they don't miss any entries, so with that in mind it is better to have duplicate entries than missing entries.

However, what if we wanted to minimize how many duplicate entries are sent? For this purpose we could use the “timestamp” field in conjunction with the advanced search capabilities the PowerStore REST API offers, as follows:
  • Firstly, we could wrap the cURL command into a shell script
  • The shell script could start by calculating the previous interval's timestamp in the format that the REST API expects
  • We then send a request to the API to retrieve entries with a timestamp greater than the previous interval's
It could look like this if we want to retrieve logs at 5-minute intervals:

cat /tmp/getev5mins.sh

TSTART=`date -u -d 'now -5 minutes' +"%Y-%m-%dT%H:%M:%S.%3N"`
URL="https://10.1.1.1/api/rest/audit_event?select=id,type,resource_type,resource_action,message_code,message_arguments,timestamp&order=timestamp.desc&timestamp=gt.${TSTART}"
curl -k --location --header 'Authorization: Basic YWRtaW46UEBzdzByZFo=' --request GET "$URL"
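Before handing the script over to Logstash, it is worth running it manually to confirm it returns the expected JSON:

# Run the script by hand; it should print the recent audit events as a JSON document
sh /tmp/getev5mins.sh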

Make sure file permissions and ownership allow the "logstash" user to run the shell script:

chown logstash:logstash /tmp/getev5mins.sh
chmod 755 /tmp/getev5mins.sh

Finally, we need to call our shell script from the pipeline configuration file using the "exec" plugin:

input {
  exec {
    command => "sh -c /tmp/getev5mins.sh"
    schedule => "*/5 * * * *"
  }
}
output {
  syslog {
    host => "127.0.0.1"
    port => 514
    protocol => "udp"
    rfc => "rfc3164"
  }
}

If you want to use this method but also want to minimize the chances of missing log entries, the "TSTART" definition could be adjusted to calculate a timestamp further back in time, e.g. "now -10 minutes".
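For example, to look back 10 minutes while still polling every 5 minutes:

# Overlap the polling windows to reduce the chance of missing entries
TSTART=`date -u -d 'now -10 minutes' +"%Y-%m-%dT%H:%M:%S.%3N"`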

Again, take the example and modify it to suit your needs, at your own risk. Ultimately, even if the device is the one sending the Syslog messages, some messages might get lost in transit, so there is no perfect solution. One thing to bear in mind is that the logs are not deleted from the array when you extract them via the REST API, so you can always go back and get log entries that you are missing by, for example, requesting a specific "id", or even collecting the whole thing from scratch if you need to.
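As a hypothetical example (MISSING_ENTRY_ID is just a placeholder for an id you know you are missing), retrieving a single entry follows the same filter syntax used throughout this article:

# Request one specific audit_event entry by its "id"
curl -k --header 'Authorization: Basic YWRtaW46UEBzdzByZFo=' --request GET \
  "https://10.1.1.1/api/rest/audit_event?select=*&id=eq.MISSING_ENTRY_ID"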

