Linux SysAdmin & DevOps

Jasmin SMS Gateway - logstash grok filter

Jasmin SMS Gateway - We use ELK as a centralized logging solution. That means we have an ElasticSearch cluster, a LogStash Cluster, Kibana and Grafana.

I was looking for a GROK filter so we can export our Jasmin SMS gateway logs to Grafana and then be able to search based on different patterns through all the logs. Unfortunatelly I was not able to find a real working grok filter for jasmin. Jasmin SMS Gateway logs are basically in the standard linux syslog format and after a few failed attempts of trying to convert them in JSON format I decided that I should use them in the syslog format and find another way to read and import them in Grafana. The big issue with the logs is that some of Jasmin SMS DEBUG logs are multiline so they have to be interpreted by Logstash using the multiline codec/filter.

1) Install logstash multiline filter

/usr/share/logstash/bin/logstash-plugin install logstash-filter-multiline

Be sure that you have java-openjdk and java-openjdk-devel installed (since the devel package contains the javac binary which is required by logstash to install the multiline filter)!

2) Install Filebeat from the ElasticSearch repo (I won’t detail this step since it’s pretty straightforward and you can find plenty of information on the elasticsearch web page). As soon as filebeat is installed, please configure it accordingly to forward the logs to your logstash server/cluster. That’s pretty much the default filebeat config, adapted for our environment. Here is the content of my /etc/filebeat/filebeat.yml file:

#=========================== Filebeat prospectors =============================
filebeat.prospectors:

- input_type: log
  paths: ["/var/log/jasmin/messages.log"]
  tags: ["jasmin", "sms"]
  document_type: log

  fields:
    env: my-cluster
  fields_under_root: true
  ignore_older: 24h

  harvester_buffer_size: 16384
  max_bytes: 10485760
  
  processors:
  - drop_fields:
    fields: ['beat']

#----------------------------- Logstash output --------------------------------
output.logstash:
 enabled: true
 worker: 1
 hosts: ["server1-ip:5044", "server2-ip:5044"]
 compression_level: 3
 loadbalance: false
 pipelining: 0
 index: jasmin

#================================ Logging =====================================
logging.level: debug
logging.to_files: true
logging.to_syslog: false
logging.files:
 path: /var/log/beats
 name: filebeat.log
 keepfiles: 7

As you can see, I am setting the tag jasmin when sending the logs, just to be easier later when creating the logstash filter. Be sure to enable filebeat and restart the service after changing the configuration!

3) Create an elasticearch index template for jasmin logs and save it as jasmin-template.json:

{
  "mappings": {
    "_default_": {
      "_all": {
        "norms": false
      },
      "_meta": {
        "version": "5.2.2"
      },
      "dynamic_templates": [
        {
          "strings_as_keyword": {
            "mapping": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "match_mapping_type": "string"
          }
        }
      ],
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "beat": {
          "properties": {
            "hostname": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "input_type": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "message": {
          "norms": false,
          "type": "text"
        },
        "meta": {
          "properties": {
            "cloud": {
              "properties": {
                "availability_zone": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "instance_id": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "machine_type": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "project_id": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "provider": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "region": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            }
          }
        },
        "offset": {
          "type": "long"
        },
        "source": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "tags": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "type": {
          "ignore_above": 1024,
          "type": "keyword"
        }
      }
    }
  },
  "order": 0,
  "settings": {
    "index.mapping.total_fields.limit": 10000,
    "index.refresh_interval": "5s",
    "index.codec": "best_compression",
    "index.number_of_replicas": "3"
  },
  "template": "jasmin-*"
}

4) Load the template into elasticsearch:

curl -XPUT 'http://elastic-search-server-ip:9200/_template/jasmin' -d@/path-to/jasmin-template.json

So now you have filebeat configured to send the jasmin sms gateway’s logs to logstash to an elasticsearch index called jasmin.

5) Create the grok filter for jasmin on the logstash server. So on the logstash server, via a ssh console, go to /etc/logstash/ and create a patterns file (which contains grok patterns). In my case, I have created a folder grok inside /etc/logstash and I named the file as patterns. Here is the content of the /etc/logstash/grok/patterns file:

USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
INT (?:[+-]?(?:[0-9]+))
BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
NUMBER (?:%{BASE10NUM})
BASE16NUM (?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))
BASE16FLOAT \b(?<![0-9A-Fa-f.])(?:[+-]?(?:0x)?(?:(?:[0-9A-Fa-f]+(?:\.[0-9A-Fa-f]*)?)|(?:\.[0-9A-Fa-f]+)))\b

POSINT \b(?:[1-9][0-9]*)\b
NONNEGINT \b(?:[0-9]+)\b
WORD \b\w+\b
NOTSPACE \S+
SPACE \s*
DATA .*?
GREEDYDATA .*
QUOTEDSTRING (?>(?<!\\)(?>"(?>\\.|[^\\"]+)+"|""|(?>'(?>\\.|[^\\']+)+')|''|(?>`(?>\\.|[^\\`]+)+`)|``))
UUID [A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}

# Networking
MAC (?:%{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC})
CISCOMAC (?:(?:[A-Fa-f0-9]{4}\.){2}[A-Fa-f0-9]{4})
WINDOWSMAC (?:(?:[A-Fa-f0-9]{2}-){5}[A-Fa-f0-9]{2})
COMMONMAC (?:(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2})
IPV6 ((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?
IPV4 (?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))(?![0-9])
IP (?:%{IPV6}|%{IPV4})
HOSTNAME \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
HOST %{HOSTNAME}
IPORHOST (?:%{HOSTNAME}|%{IP})
HOSTPORT %{IPORHOST}:%{POSINT}

# paths
PATH (?:%{UNIXPATH}|%{WINPATH})
UNIXPATH (?>/(?>[\w_%!$@:.,-]+|\\.)*)+
TTY (?:/dev/(pts|tty([pq])?)(\w+)?/?(?:[0-9]+))
WINPATH (?>[A-Za-z]+:|\\)(?:\\[^\\?*]*)+
URIPROTO [A-Za-z]+(\+[A-Za-z+]+)?
URIHOST %{IPORHOST}(?::%{POSINT:port})?
# uripath comes loosely from RFC1738, but mostly from what Firefox
# doesn't turn into %XX
URIPATH (?:/[A-Za-z0-9$.+!*'(){},~:;=@#%_\-]*)+
#URIPARAM \?(?:[A-Za-z0-9]+(?:=(?:[^&]*))?(?:&(?:[A-Za-z0-9]+(?:=(?:[^&]*))?)?)*)?
URIPARAM \?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]]*
URIPATHPARAM %{URIPATH}(?:%{URIPARAM})?
URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?

# Months: January, Feb, 3, 03, 12, December
MONTH \b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b
MONTHNUM (?:0?[1-9]|1[0-2])
MONTHNUM2 (?:0[1-9]|1[0-2])
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])

# Days: Monday, Tue, Thu, etc...
DAY (?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)

# Years?
YEAR (?>\d\d){1,2}
HOUR (?:2[0123]|[01]?[0-9])
MINUTE (?:[0-5][0-9])
# '60' is a leap second in most time standards and thus is valid.
SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
# datestamp is YYYY/MM/DD-HH:MM:SS.UUUU (or something like it)
DATE_US %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND (?:%{SECOND}|60)
TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
DATE %{DATE_US}|%{DATE_EU}
DATESTAMP %{DATE}[- ]%{TIME}
TZ (?:[PMCE][SD]T|UTC)
DATESTAMP_RFC822 %{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{TZ}
DATESTAMP_RFC2822 %{DAY}, %{MONTHDAY} %{MONTH} %{YEAR} %{TIME} %{ISO8601_TIMEZONE}
DATESTAMP_OTHER %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{TZ} %{YEAR}
DATESTAMP_EVENTLOG %{YEAR}%{MONTHNUM2}%{MONTHDAY}%{HOUR}%{MINUTE}%{SECOND}

# Syslog Dates: Month Day HH:MM:SS
SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
PROG (?:[\w._/%-]+)
SYSLOGPROG %{PROG:program}(?:\[%{POSINT:pid}\])?
SYSLOGHOST %{IPORHOST}
SYSLOGFACILITY <%{NONNEGINT:facility}.%{NONNEGINT:priority}>
HTTPDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}

# Shortcuts
QS %{QUOTEDSTRING}

# Log formats
SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:
COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}

# Log Levels
LOGLEVEL ([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)

6) Go to /etc/logstash/conf.d and be sure you have a 02-beats-input.conf file with this content:

input {
  beats {
    port => 5044
  }
}

7) Create jasmin logstash filter file 07-jasmin-filter.conf with this content:

filter {
  if "jasmin" in [tags] {

	# all lines that does not start with %{TIMESTAMP} or ' ' + %{TIMESTAMP} belong to the previous event
	multiline {
		pattern => "(([\s]+)20[0-9]{2}-)|20[0-9]{2}-"
		negate => true
		what => "previous"
	}	
 
	# apply to logs with jasmin tag
	grok {
			patterns_dir => ["/etc/logstash/grok"]
			match => [ "message", "%{DATESTAMP:Timestamp} %{LOGLEVEL:Level}    %{BASE10NUM:Pid} %{GREEDYDATA:Message}" ]
    }

	# something wrong occurred !!! 
		if "_grokparsefailure" in [tags] {
			grok {
				 patterns_dir => "/etc/logstash/grok"
				 match=>[ "message","(?<content>(.|\r|\n)*)" ]
				 add_tag => "jasmin-grok_error"
	        }
		}
	}
}

First file takes input from filebeat, second file takes logs from jasmin (both single lines and multilines, only where the tag is jasmin).

8) Restart logstash

systemctl restart logstash

In a future post, I’ll explain how to create a log browser in Grafana so you can explore your freshly exported Jasmin logs.