Keep Looking Up...


"That’s the secret to life," as Snoopy says. And in Splunk, lookup tables are the secret to data enrichment.

 

Lookup Basics

For those of you who may be new to Splunk, lookups are tables that enrich your data by adding fields to your events. The added fields can come from a CSV file or from a Python script (an external lookup). CSV lookup files are located in the lookups directory in the app ($SPLUNK_HOME/etc/apps/<appname>/lookups).

 

Let’s take web server logs as an example. Say you want an HTTP status code of 202 to appear as “Success” in a new field called “http_status_description”. The CSV file for this lookup table would look something like this:

 

http_status_code,http_status_description

202,Success

 

The following stanza would be added to your transforms.conf file:

 

[http_status_description_lookup]

filename = http_status_description.csv

 

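Finally, a LOOKUP statement would be added to the props.conf file to run the lookup automatically at search time. The stanza name must match the sourcetype (or a host:: or source:: pattern) of your web server events; [http_status_description] stands in for it here. The statement refers to the transforms.conf stanza name, not the CSV file name, and it assumes your events already carry a field named http_status_code (if the event field has a different name, map it with AS):

[http_status_description]

LOOKUP-http = http_status_description_lookup http_status_code OUTPUT http_status_description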
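
To make the enrichment step concrete, here is a minimal Python sketch of what the automatic lookup does conceptually: it matches the event's http_status_code against the CSV and adds http_status_description when a row matches. This only illustrates the idea and is not how Splunk implements lookups; the function names and the sample "uri" field are made up for the example.

```python
import csv
import io

# Same content as the http_status_description.csv example above.
LOOKUP_CSV = "http_status_code,http_status_description\n202,Success\n"


def load_lookup(text):
    """Map each http_status_code to its http_status_description."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["http_status_code"]: row["http_status_description"] for row in reader}


def enrich(event, table):
    """Return a copy of the event, adding the description when the code matches."""
    enriched = dict(event)
    description = table.get(str(event.get("http_status_code")))
    if description is not None:
        enriched["http_status_description"] = description
    return enriched


table = load_lookup(LOOKUP_CSV)

# "uri" is a made-up sample field, not part of the lookup.
print(enrich({"http_status_code": "202", "uri": "/upload"}, table))
print(enrich({"http_status_code": "404", "uri": "/missing"}, table))
```

An event whose code is not in the table (404 here) passes through unchanged, which mirrors how a lookup with no matching row adds nothing to the event.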

 

Temporal Lookups

But what if you need a time-based lookup? For example, say you have a dashboard with a form that lets employees submit data on a daily basis. Each submission is saved to a lookup table along with a timestamp. Your lookup table could look something like:

 

ip,timestamp,data

95.177.23.172,10/3/2014 08:30:45,network_security

56.48.75.240,10/3/2014 09:14:32,operations

39.153.79.50,10/3/2014 10:02:53,network_security

231.238.251.138,10/3/2014 12:24:17,sales

 

Therefore, if you want to see all data input for the past 24 hours, 90 days, etc., you can search your lookup for that time range. The search will return every data input submitted within that range.

 

To do this, add the following stanza to your transforms.conf:

 

[Employee_Input]

filename = employee_input_data.csv

time_field = timestamp

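time_format = %d/%m/%Y %H:%M:%S

 

*time_format takes a strptime format string; if you omit it, the timestamps are expected in epoch time. The sample timestamps use a four-digit year, so the format needs %Y rather than %y. Note that %d/%m reads 10/3/2014 as 10 March 2014; if your timestamps are month-first, use %m/%d instead.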
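
If you want to sanity-check a format string before putting it in transforms.conf, a quick test against one of your sample timestamps helps. The sketch below uses Python's datetime.strptime as a stand-in; Splunk's own strptime handling may differ in details, so treat it as a rough check only. The parses helper is a made-up name for this example.

```python
from datetime import datetime

SAMPLE = "10/3/2014 08:30:45"  # first timestamp from the lookup table above


def parses(value, fmt):
    """Return the parsed datetime, or None if the value does not fit the format."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


# Day-first with a four-digit year: read as 10 March 2014.
print(parses(SAMPLE, "%d/%m/%Y %H:%M:%S"))

# Month-first with a four-digit year: read as 3 October 2014.
print(parses(SAMPLE, "%m/%d/%Y %H:%M:%S"))

# Two-digit year (%y) does not fit "2014", so this yields None.
print(parses(SAMPLE, "%d/%m/%y %H:%M:%S"))
```

Both of the first two formats parse without error, which is why it is worth confirming whether your data is day-first or month-first rather than relying on the parse succeeding.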

 

Additionally, if you want to be more lenient with time frames, you can set max_offset_secs. This property is the maximum number of seconds that an event's timestamp can be later than a lookup entry's timestamp for the two to still match. max_offset_secs defaults to 2000000000 (two billion). There is also a min_offset_secs, the minimum number of seconds by which the event must be later than the lookup entry, which defaults to 0.

 

For example, the setting:

max_offset_secs = 120

 

would let an event match a lookup entry that was timestamped up to 2 minutes before the event.

 

If your setting is:

max_offset_secs = 0

 

an event would match only a lookup entry whose timestamp is exactly the same as the event's.
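
The offset window can be easier to see in code. Below is a rough Python sketch of the matching rule, under a few assumptions that are mine rather than Splunk's documented internals: the lookup is keyed on ip, timestamps are day-first with a four-digit year, and when several entries qualify the most recent one wins. The temporal_match function is hypothetical and exists only for this illustration.

```python
from datetime import datetime

TIME_FORMAT = "%d/%m/%Y %H:%M:%S"  # assumed day-first, four-digit year

# Two rows from the sample lookup table above.
ROWS = [
    ("95.177.23.172", "10/3/2014 08:30:45", "network_security"),
    ("56.48.75.240", "10/3/2014 09:14:32", "operations"),
]


def temporal_match(event_time, ip, rows, min_offset_secs=0, max_offset_secs=2000000000):
    """Return the data value of the latest row for this ip whose timestamp is
    between min_offset_secs and max_offset_secs before the event, else None."""
    best = None
    for row_ip, stamp, data in rows:
        if row_ip != ip:
            continue
        row_time = datetime.strptime(stamp, TIME_FORMAT)
        # How many seconds the event is later than the lookup entry.
        offset = (event_time - row_time).total_seconds()
        if min_offset_secs <= offset <= max_offset_secs:
            # Assumption: prefer the most recent qualifying entry.
            if best is None or row_time > best[0]:
                best = (row_time, data)
    return best[1] if best else None


# A made-up event 75 seconds after the first row's timestamp.
event_time = datetime(2014, 3, 10, 8, 32, 0)

print(temporal_match(event_time, "95.177.23.172", ROWS, max_offset_secs=120))
print(temporal_match(event_time, "95.177.23.172", ROWS, max_offset_secs=0))
```

With a 120-second window the event picks up the network_security entry; with a window of 0 the 75-second gap is too large and nothing matches.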

 

For more information on lookup tables and how to set them up, visit http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf and http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Aboutlookupsandfieldactions

 

Happy Splunking!
