Splunk SDK for Python: Getting Data In


Data is a pivotal part of a Splunk Enterprise deployment. Every configuration and enhancement we make is centered on a particular dataset. As a result, Splunk provides several options for getting data into Splunk Enterprise so that it can be turned into decision-making information. The most common are universal forwarders (UF), syslog, scripted inputs, and modular inputs. For this post, I’m going to focus on getting data from a remote interface into Splunk over HTTP using the Splunk SDK for Python. This post assumes some familiarity with Python.

Splunk SDK for Python

The Splunk SDK for Python allows developers to interact with Splunk using Python. With it, Splunk Enterprise customers can create modular inputs for Splunk apps, develop applications that interact with Splunk, and integrate Splunk with applications that already exist in their environment. With the modules provided by the SDK, you can run searches, create modular inputs, create and manage indexes, create your own search commands, display search results, create and edit roles, and index data from remote interfaces, all with very few lines of code (this is demonstrated later in this post). To learn more about the Splunk SDK modules and what you can do with the Splunk SDK for Python, see: http://dev.splunk.com/python

Splunk SDK for Python Example: Getting data in from remote interfaces

For this tutorial, I decided to use data from Locu, since it has a friendly, easy-to-use API that only requires developers to sign up for a developer account and obtain an API key to access its data. Locu maintains local business data and helps those businesses be found by customers and potential customers everywhere. Here, I’m searching for data on businesses in the New York and Philadelphia areas. See the link at the end of this post for more information about creating a Locu developer account.

Getting the data:

Before we can send data into Splunk, we must first obtain it from Locu by querying the Locu API with a GET request. Here I’m using urllib2 without proxy credentials. If your HTTP requests must go through a proxy that requires authentication, add your proxy credentials to the proxy handler, or else you may encounter a connection refused error.
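As a rough illustration of the GET request described above, here is a minimal sketch. It is written for Python 3, where urllib.request is the successor to urllib2. The endpoint URL, the api_key and locality parameter names, and the function names build_search_url, build_opener, and fetch_venues are all assumptions made for this example, so check the Locu API documentation for the actual values before relying on it.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: hypothetical endpoint; confirm the real URL in the Locu API docs.
LOCU_SEARCH_URL = "https://api.locu.com/v1_0/venue/search/"


def build_search_url(api_key, locality, base_url=LOCU_SEARCH_URL):
    """Build the GET URL. The parameter names here are assumed, not confirmed."""
    query = urllib.parse.urlencode({"api_key": api_key, "locality": locality})
    return base_url + "?" + query


def build_opener(proxy_url=None):
    """Return an opener that uses proxy_url, or no proxy at all when it is None."""
    if proxy_url:
        # Credentials can be embedded, e.g. "http://user:password@proxy.example.com:8080"
        handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    else:
        # An empty dict disables any automatically detected proxy settings.
        handler = urllib.request.ProxyHandler({})
    return urllib.request.build_opener(handler)


def fetch_venues(api_key, locality, proxy_url=None, timeout=30):
    """Send the GET request and parse the JSON response body into Python objects."""
    opener = build_opener(proxy_url)
    url = build_search_url(api_key, locality)
    with opener.open(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Under these assumptions, a call such as fetch_venues("YOUR_API_KEY", "New York") would return the parsed JSON response. Passing a proxy_url switches the same function to go through an authenticated proxy.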

The result of calling this function should be a JSON-formatted data set that includes 25 search results: