Best Practices for your Splunk Deployment: Indexer Performance
Function1 is a Professional Services Partner to Splunk. We have gathered a wealth of experience conducting dozens of consulting engagements in conjunction with Splunk PS, and now we want to share some of our Best Practices with you.
A Splunk deployment most commonly starts as a utility that your Systems Administrator uses to find a needle-in-a-haystack issue, and grows into an enterprise-scale reporting tool used by the leaders of your organization. As an enterprise platform, Splunk typically holds a prominent place in the decision-making behind your Business Intelligence and Operational Intelligence processes. As the deployment gains traction within your enterprise, questions often arise around Indexer and Search Head performance, high availability, data storage, retention policies, security, and general taxonomy. Over the course of the next several weeks, I will cover all of these topics in detail.
Today, we will focus on Indexer performance. There are two principles that you must consider with regard to Indexer performance before researching a workaround. These two principles are:
- Indexer Scalability
- Timestamp Recognition
On the topic of Indexer Scalability, we have often found that customers have more than one Splunk deployment within their organization. To get started, create an inventory of your deployment. Find out how many silos of Splunk installs you have in your organization, whether or not they are licensed, the total volume of data indexed per day, and the total number of Indexes across all the deployments. In the case of multiple licenses, you will be able to set up a License Master and aggregate your license keys into that central repository.
While creating the inventory, pay close attention to the names and locations of your Indexes. The easiest way to do this is to look at /opt/splunk/var/lib/splunk on each Indexer. Likewise, if you have already configured your Enterprise Search Head, a few quick clicks in Manager will show you the same information. Be sure to note which host each Index is on and where on the file system it lives.
The best practice is to have all your Indexes on all your Indexers. If your Indexes do not have a home on all your Indexers, we strongly recommend making that change for future scalability; this is also a great way to drive down the total cost of expansion. Conceptually, Indexes are orthogonal in that they do not overlap their data between Indexers. By design, Splunk’s parallelization allows the Indexers (and Search Heads) to scale linearly across CPU Cores. This means that you can nearly halve the time from search request to search response by doubling the number of Indexers in your environment and spraying your data across all of them. We say nearly halve because there is some overhead associated with other processes running on the host OS.
Most often, we see customers who have set up multiple aliases that point to sets of specific Indexers, with a Search Head configured in front of each set to give users one place to search across all their data.
If this sounds familiar, you may have noticed that this isn’t the most efficient use of your hardware: one server may be spiking with requests while the others in the pool see significantly lower usage. This is almost always due to your data not being sprayed across all available Indexers.
While the architecture of your Indexers and Search Heads may or may not be optimal, there are a couple of performance-gaining parameters that you can alter in your various properties (props.conf) files. By far the most significant gain is achieved by fine-tuning your timestamp recognition. As you know, Splunk depends on the timestamp of your data to order the events from all your systems chronologically. In an effort to speed up your initial time to deployment, Splunk’s out-of-the-box time processor handles the vast majority of timestamp extraction for you automatically. It performs these extractions using a very extensive pre-defined set of timestamp patterns based on common date and time formats. While this is definitely a “cool factor” for using Splunk in the short term, it is also a performance hurdle that the Indexer must clear for each incoming event. You can turn off the time processor by setting ‘DATETIME_CONFIG’ to ‘CURRENT’ or to ‘NONE’.
- By setting this value to CURRENT, Splunk will use the Indexer system time for the event.
- By setting this value to NONE, Splunk will use the input system time for the event.
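As a rough sketch, the two settings above might look like the following in props.conf. The sourcetype names in the stanza headers are hypothetical placeholders, not names from any real deployment. The third stanza is our own illustrative assumption about what "fine-tuning" could look like when you still need the timestamp from the event: it uses the standard TIME_PREFIX, TIME_FORMAT, and MAX_TIMESTAMP_LOOKAHEAD settings to tell Splunk exactly where the timestamp sits, and it assumes every event begins with a timestamp shaped like 2013-04-02 14:05:33.

```ini
# props.conf (sketch only; all sourcetype names below are hypothetical)

# Skip timestamp extraction; stamp each event with the Indexer's system time.
[my_app_logs]
DATETIME_CONFIG = CURRENT

# Skip timestamp extraction; use the time supplied by the input.
[my_batch_feed]
DATETIME_CONFIG = NONE

# Keep the event's own timestamp, but spell out where and how to read it.
# Assumes events start with a timestamp like: 2013-04-02 14:05:33
[my_web_logs]
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
```

The lookahead of 19 matches the character length of the assumed timestamp format, so Splunk does not scan further into the event. Adjust all three values to match your actual data before using anything like this.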
Timestamp recognition is not the only configurable parameter, though. If you are comfortable writing your own regular expressions, you can also disable the automatic key-value extraction (also known as field discovery). This parameter is called KV_MODE in your props.conf, and it is applied when events are searched. Similar to the timestamp setting, setting this mode to ‘none’ means Splunk will not spend CPU cycles trying to match expressions from its extensive library and will rely solely on the definitions provided in the application’s respective props.conf.
We understand that managing these configurations can be a daunting task, and we strongly encourage using Splunk’s Deployment Server for Splunk-to-Splunk configurations as well as Splunk application distribution. We would be happy to help get you back on track, whether you have a quick question or need to review your deployment architecture. Shoot us a line (firstname.lastname@example.org); we would love to chat with you!