Are you a candidate? Complete list of Splunk interview questions 👇
- Why is Splunk used for analyzing machine data?
- Name the common port numbers used by Splunk.
- What are the components of Splunk?
- Which is the latest Splunk version in use?
- What is Splunk Indexer? What are the stages of Splunk Indexing?
- What is a Splunk Forwarder? What are the types of Splunk Forwarders?
- Can you name a few most important configuration files in Splunk?
- What are the types of Splunk Licenses?
- What is Splunk App?
- Where is Splunk Default Configuration stored?
- What are the features not available in Splunk Free?
- What happens if the License Master is unreachable?
- What is Summary Index in Splunk?
- What is Splunk DB Connect?
- Name the types of search modes supported in Splunk.
- What are the different types of Splunk dashboards?
- Explain Stats vs Transaction commands.
- How to troubleshoot Splunk performance issues?
- What are Buckets? Explain Splunk Bucket Lifecycle.
- What is the difference between stats and eventstats commands?
- Who are the top direct competitors to Splunk?
- What do Splunk Licenses specify?
- How does Splunk determine 1 day, from a licensing perspective?
- How are Forwarder Licenses purchased?
- What is the command for restarting Splunk web server?
- What is the command for restarting Splunk Daemon?
- What is the command used to check the running Splunk processes on Unix/Linux?
- What is the command used for enabling Splunk to boot start?
- How to disable Splunk boot-start?
- What is Source Type in Splunk?
- How to reset Splunk Admin password?
- How to disable Splunk Launch Message?
- How to clear Splunk Search History?
- What is Btool?/How will you troubleshoot Splunk configuration files?
- What is the difference between Splunk App and Splunk Add-on?
- What is .conf files precedence in Splunk?
- What is Fishbucket? What is Fishbucket Index?
- How do I exclude some events from being indexed by Splunk?
- How to set the default search time in Splunk 6?
- What is Dispatch Directory?
- What is the difference between Search Head Pooling and Search Head Clustering?
- If I want to add folder access logs from a windows machine to Splunk, how do I do it?
- How would you handle/troubleshoot Splunk License Violation Warning?
- What is MapReduce algorithm?
- What is the difference between Splunk SDK and Splunk Framework?
Splunk is used for analyzing machine data because:
- It offers business insights – Splunk understands the patterns hidden within the data and turns it into real-time business insights that can be used to make informed business decisions.
- It provides operational visibility – Splunk leverages machine data to get end-to-end visibility into company operations and then breaks it down across the infrastructure.
- It facilitates proactive monitoring – Splunk uses machine data to monitor systems in real-time to identify system issues and vulnerabilities (external/internal breaches and attacks).
The common port numbers for Splunk are:
- Splunk Web Port: 8000
- Splunk Management Port: 8089
- Splunk Network port: 514
- Splunk Index Replication Port: 8080
- Splunk Indexing Port: 9997
- KV store: 8191
Below are the components of Splunk:
- Search Head: Provides the GUI for searching.
- Indexer: Indexes the machine data.
- Forwarder: Forwards logs to the Indexer.
- Deployment Server: Manages Splunk components in a distributed environment.
Splunk Indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are:
- Indexing incoming data
- Searching the indexed data
There are two types of Splunk Forwarders as below:
- Universal Forwarder (UF): The Splunk agent installed on a non-Splunk system to gather data locally; it can’t parse or index data.
- Heavyweight Forwarder (HWF): A full instance of Splunk with advanced functionalities. It generally works as a remote collector, intermediate forwarder, and possible data filter. Because it parses data, it uses more resources than a universal forwarder, so it is generally not recommended for installation on production systems.
Some of the most important configuration files in Splunk are:
- props.conf
- indexes.conf
- inputs.conf
- transforms.conf
- server.conf
The types of Splunk licenses are:
- Enterprise license
- Free license
- Forwarder license
- Beta license
- Licenses for search heads (for distributed search)
- Licenses for cluster members (for index replication)
Splunk app is a container/directory of configurations, searches, dashboards, etc. in Splunk.
Splunk Free does not include the features below:
- Authentication and scheduled searches/alerting
- Distributed search
- Forwarding in TCP/HTTP (to non-Splunk)
- Deployment management
If the license master is not available, the license slave will start a 72-hour timer. If the master is still unreachable when the timer expires, search will be blocked on the license slave (though indexing continues). Users will not be able to search for data on that slave until it can reach the license master again.
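A summary index stores the results of scheduled reports, so that later searches can run against this smaller, precomputed data set instead of the raw events. The default summary index is named ‘summary’ (the index that Splunk Enterprise uses if we do not indicate another one).
If we plan to run a variety of summary index reports, we may need to create additional summary indexes.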
Splunk DB Connect is a generic SQL database plugin for Splunk that allows us to easily integrate database information with Splunk queries and reports.
Splunk supports three search modes, namely:
- Fast mode
- Smart mode
- Verbose mode
There are three different kinds of Splunk dashboards:
- Real-time dashboards.
- Dynamic form-based dashboards.
- Dashboards for scheduled reports.
The transaction command is the most useful in two specific cases:
- When the unique ID (from one or more fields) alone is not sufficient to discriminate between two transactions. This is the case when the identifier is reused, for example, web sessions identified by a cookie/client IP. In this case, the time span or pauses are also used to segment the data into transactions. In other cases where an identifier is reused, say in DHCP logs, a particular message identifies the beginning or end of a transaction.
- When it is desirable to see the raw text of events combined rather than an analysis of the constituent fields of the events.
In other cases, it’s usually better to use stats.
- As the performance of the stats command is higher, it can be used especially in a distributed search environment
- If there is a unique ID, the stats command can be used
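The difference between the two approaches can be illustrated outside Splunk. The following is a minimal Python sketch, not Splunk's implementation: the event data, the function names, and the `max_pause` parameter are all hypothetical. It shows why grouping by ID alone merges two sessions when the ID is reused, while adding a pause threshold separates them.

```python
from collections import defaultdict

# Hypothetical events: (timestamp in seconds, session id). The id "abc" is
# reused after a long pause, so the id alone cannot separate the two sessions.
EVENTS = [(0, "abc"), (10, "abc"), (20, "xyz"), (4000, "abc"), (4010, "abc")]


def stats_count_by_id(events):
    """stats-like grouping: one row per id, regardless of timing."""
    counts = defaultdict(int)
    for _, session_id in events:
        counts[session_id] += 1
    return dict(counts)


def transactions_by_id(events, max_pause):
    """transaction-like grouping: same id, split when the gap exceeds max_pause."""
    open_groups = {}  # id -> timestamps of the currently open transaction
    finished = []
    for ts, session_id in sorted(events):
        group = open_groups.get(session_id)
        if group is not None and ts - group[-1] > max_pause:
            # The pause is too long: close the old transaction, start a new one.
            finished.append((session_id, group))
            group = None
        if group is None:
            group = []
            open_groups[session_id] = group
        group.append(ts)
    finished.extend(open_groups.items())
    # Order the transactions by their first timestamp.
    return sorted(finished, key=lambda item: item[1][0])
```

With a 30-minute pause (`max_pause=1800`), the reused id "abc" yields two separate transactions, whereas the stats-like count reports a single row of four events for it.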
The answer to this question would be very wide, but mostly an interviewer would be looking for the following keywords:
- Check splunkd.log for errors.
- Check server performance issues, i.e., CPU, memory usage, disk I/O, etc.
- Install the SOS (Splunk on Splunk) app and check for warnings and errors in its dashboard
- Check the number of saved searches currently running and their consumption of system resources
- Install and enable Firebug, a Firefox extension. Log into Splunk (using Firefox) and open Firebug’s panels. Then, switch to the ‘Net’ panel (we will have to enable it). The Net panel will show us the HTTP requests and responses, along with the time spent in each. This will give us a lot of information quickly such as which requests are hanging Splunk, which requests are blameless, etc.
Splunk places indexed data in directories, called ‘buckets.’ Each bucket is physically a directory containing events from a certain period.
A bucket moves through several stages as it ages. Below are the various stages it goes through:
Hot: A hot bucket contains newly indexed data. It is open for writing. There can be one or more hot buckets for each index.
Warm: A warm bucket consists of data rolled out from a hot bucket. There are many warm buckets.
Cold: A cold bucket has data that is rolled out from a warm bucket. There are many cold buckets.
Frozen: A frozen bucket is comprised of data rolled out from a cold bucket. The indexer deletes frozen data by default, but we can archive it. Archived data can later be thawed (data in a frozen bucket is not searchable).
By default, the buckets are located in:
We should see the hot-db there, and any warm buckets we have. By default, Splunk sets the bucket size to 10 GB for 64-bit systems and 750 MB on 32-bit systems.
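The lifecycle described above can be modeled as a simple ordered progression. This is only an illustrative Python sketch with hypothetical names; it assumes that a thawed bucket (restored from an archive) is searchable again, and it does not reflect how the indexer decides when to roll a bucket.

```python
# Stage order as described above. "thawed" is not part of the normal
# progression; it is reached only by restoring archived (frozen) data.
STAGES = ["hot", "warm", "cold", "frozen"]
SEARCHABLE = {"hot", "warm", "cold", "thawed"}  # assumption: thawed is searchable


def next_stage(stage):
    """Return the stage a bucket rolls to next, or None once it is frozen."""
    index = STAGES.index(stage)
    return STAGES[index + 1] if index + 1 < len(STAGES) else None


def is_searchable(stage):
    """Frozen data is not searchable; every other stage in this model is."""
    return stage in SEARCHABLE
```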
- The stats command generates summary statistics of all the existing fields in the search results and saves them as values in new fields.
- Eventstats is similar to the stats command, except that the aggregation results are added inline to each event and only if the aggregation is pertinent to that event. The eventstats command computes the requested statistics, like stats does, but adds them to the original events instead of replacing them.
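To make the contrast concrete, here is a small Python sketch of the two behaviors. It is an analogy under stated assumptions, not Splunk code: the sample events and the function names (`stats_avg`, `eventstats_avg`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical search results: one dict per event.
EVENTS = [
    {"host": "web1", "bytes": 100},
    {"host": "web1", "bytes": 300},
    {"host": "web2", "bytes": 50},
]


def average_by(events, field, by):
    """Compute the average of `field` for each distinct value of `by`."""
    totals, counts = defaultdict(float), defaultdict(int)
    for event in events:
        totals[event[by]] += event[field]
        counts[event[by]] += 1
    return {key: totals[key] / counts[key] for key in totals}


def stats_avg(events, field, by):
    """stats-like: one summary row per group; the original events are gone."""
    averages = average_by(events, field, by)
    return [{by: key, f"avg_{field}": avg} for key, avg in averages.items()]


def eventstats_avg(events, field, by):
    """eventstats-like: every event is kept and gains its group's average."""
    averages = average_by(events, field, by)
    return [{**event, f"avg_{field}": averages[event[by]]} for event in events]
```

The stats-like function returns two rows (one per host), while the eventstats-like function returns all three events, each annotated with the average for its host.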
Logstash, Loggly, LogLogic, Sumo Logic, etc. are some of the top direct competitors to Splunk.
Splunk licenses specify how much data we can index per calendar day.
In terms of licensing, for Splunk, 1 day is from midnight to midnight on the clock of the license master.
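As a rough illustration of the midnight-to-midnight rule, the Python sketch below sums indexed volume per calendar day and flags days that exceed a quota. The record format, function names, and quota value are hypothetical, and the timestamps are assumed to already be expressed in the license master's clock.

```python
from collections import defaultdict
from datetime import datetime


def daily_volume(records):
    """Sum indexed bytes per calendar day (midnight to midnight)."""
    totals = defaultdict(int)
    for indexed_at, size_bytes in records:
        totals[indexed_at.date()] += size_bytes
    return dict(totals)


def days_over_quota(records, quota_bytes):
    """Return the calendar days whose total volume exceeds the quota."""
    volumes = daily_volume(records)
    return sorted(day for day, total in volumes.items() if total > quota_bytes)
```

Two events indexed two minutes apart, at 23:59 and 00:01, count toward different days in this model, because the boundary is the clock's midnight rather than a rolling 24-hour window.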
Forwarder licenses are included with Splunk, so there is no need to purchase them separately.
We can restart Splunk web server by using the following command:
splunk restart splunkweb
Splunk Daemon can be restarted with the below command:
splunk restart splunkd
If we want to check the running Splunk Enterprise processes on Unix/Linux, we can make use of the following command:
ps aux | grep splunk
To boot start Splunk, we have to use the following command:
$SPLUNK_HOME/bin/splunk enable boot-start
In order to disable Splunk boot-start, we can use the following:
$SPLUNK_HOME/bin/splunk disable boot-start
Source type is Splunk’s way of identifying data.
Resetting Splunk Admin password depends on the version of Splunk. If we are using Splunk 7.1 and above, then we have to follow the below steps:
- First, we have to stop our Splunk Enterprise
- Now, we need to find the ‘passwd’ file and rename it to ‘passwd.bk’
- Then, we have to create a file named ‘user-seed.conf’ in the below directory: $SPLUNK_HOME/etc/system/local/
In the file, we will have to add the following settings (here, in the place of ‘NEW_PASSWORD’, we will add our own new password):
[user_info]
USERNAME = admin
PASSWORD = NEW_PASSWORD
After that, we can just restart the Splunk Enterprise and use the new password to log in.
Now, if we are using the versions prior to 7.1, we will follow the below steps:
- First, stop the Splunk Enterprise
- Find the passwd file and rename it to ‘passwd.bk’
- Start Splunk Enterprise and log in using the default credentials of admin/changeme
- Here, when asked to enter a new password for our admin account, we will follow the instructions
Note: In case we have created other users earlier and know their login details, copy and paste their credentials from the passwd.bk file into the passwd file and restart Splunk.
Set the value OFFENSIVE=Less in splunk-launch.conf.
We can clear Splunk search history by deleting the following file from Splunk server:
Splunk Btool is a command-line tool that helps us troubleshoot configuration file issues or just see what values are being used by our Splunk Enterprise installation in the existing environment.
Both contain preconfigured configurations, reports, etc., but a Splunk add-on does not have a visual app. A Splunk app, on the other hand, has a preconfigured visual app.
File precedence is as follows:
System local directory — highest priority
App local directories
App default directories
System default directory — lowest priority
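The precedence order above amounts to a layered merge in which higher-priority directories override lower ones. The following Python sketch illustrates that idea only; the layer names and the `effective_settings` function are hypothetical, and it ignores details such as ordering among multiple apps.

```python
# Layers listed from lowest to highest priority, following the order above.
PRECEDENCE = ["system_default", "app_default", "app_local", "system_local"]


def effective_settings(layers):
    """Merge one stanza's settings; higher-priority layers override lower ones.

    Returns the merged settings and, for each key, the layer that supplied it.
    """
    merged, origin = {}, {}
    for layer in PRECEDENCE:
        for key, value in layers.get(layer, {}).items():
            merged[key] = value
            origin[key] = layer
    return merged, origin
```

Tracking the origin of each value alongside the merged result is useful when troubleshooting, since it shows which directory actually won for a given setting.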
Fishbucket is a directory or index at the default location: /opt/splunk/var/lib/splunk
It contains seek pointers and CRCs for the files we are indexing, so ‘splunkd’ can tell whether it has already read them. We can access it through the GUI by searching for:
We might not want to index all our events in a Splunk instance. In that case, how do we keep those events out of Splunk? An example is the debug messages produced during an application development cycle. We can exclude such debug messages by routing those events to the null queue. This routing is configured in transforms.conf (referenced from props.conf) on the instance that parses the data, that is, a heavyweight forwarder or an indexer, since a universal forwarder cannot parse data.
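Conceptually, null-queue routing is a regex match that decides whether an event continues to indexing or is discarded. The Python sketch below models only that decision; the pattern, the sample events, and the function names are hypothetical, and this is not how Splunk is configured in practice (that is done in the .conf files).

```python
import re

# Hypothetical pattern: treat any event containing the word DEBUG as unwanted.
DROP_PATTERN = re.compile(r"\bDEBUG\b")


def route(event):
    """Return the queue an event would be sent to in this simplified model."""
    return "nullQueue" if DROP_PATTERN.search(event) else "indexQueue"


def filter_events(events):
    """Keep only the events that are not routed to the null queue."""
    return [event for event in events if route(event) == "indexQueue"]
```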
To do this in Splunk Enterprise 6.0, we have to use ‘ui-prefs.conf’. If we set the value in the following directory, all our users would see it as the default setting:
$SPLUNK_HOME/etc/system/local
For example, if our $SPLUNK_HOME/etc/system/local/ui-prefs.conf file includes:
[search]
dispatch.earliest_time = @d
dispatch.latest_time = now
then the default time range that all users will see in the search app will be today.
$SPLUNK_HOME/var/run/splunk/dispatch contains a directory for each search that is running or has completed. For example, a directory named 1434308943.358 will contain a CSV file of its search results, a search.log with details about the search execution, and other files. Using the defaults (which we can override in limits.conf), these directories will be deleted 10 minutes after the search completes, unless the user saves the search results, in which case the results will be deleted after 7 days.
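The retention rule just described can be expressed as a small time-to-live check. This Python sketch only restates the defaults quoted above; the constant and function names are hypothetical, and times are plain seconds for simplicity.

```python
DEFAULT_TTL_SECONDS = 10 * 60         # 10 minutes after the search completes
SAVED_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days when the user saves the results


def is_expired(finished_at, now, saved=False):
    """Return True if a dispatch directory is past its time to live."""
    ttl = SAVED_TTL_SECONDS if saved else DEFAULT_TTL_SECONDS
    return now - finished_at > ttl
```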
Both are features provided by Splunk for the high availability of the Splunk search head in case any search head goes down. However, search head clustering is the newer feature, and search head pooling is expected to be removed in upcoming versions.
The search head cluster is managed by a captain, which coordinates the other cluster members. The search head cluster is more reliable and efficient than search head pooling.
Below are the steps to add folder access logs to Splunk:
- Enable Object Access Audit through group policy on the Windows machine on which the folder is located
- Enable auditing on the specific folder for which we want to monitor logs
- Install Splunk universal forwarder on the Windows machine
- Configure universal forwarder to send security logs to Splunk indexer
A license violation warning means that Splunk has indexed more data than our purchased license quota allows. We have to identify which index/source type has recently received more data than its usual daily volume. First, we check the available quota for each pool on the Splunk license master and identify the pool in which the violation occurred. Once we know the pool, we identify the top source type that is receiving more data than usual. Once the source type is identified, we find the source machine that is sending the large number of logs, determine the root cause, and troubleshoot it accordingly.
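The drill-down above (pool, then source type, then source machine) can be sketched as repeated "find the top contributor" steps. The Python below is an illustration with a hypothetical record format (`pool`, `sourcetype`, `host`, `bytes`) and hypothetical function names; in practice the figures would come from Splunk's own license usage data.

```python
from collections import Counter


def top_contributor(records, field):
    """Return the (value, total_bytes) pair with the largest volume for a field."""
    totals = Counter()
    for record in records:
        totals[record[field]] += record["bytes"]
    return totals.most_common(1)[0]


def drill_down(records):
    """Narrow from the heaviest pool to its heaviest source type and host."""
    pool, _ = top_contributor(records, "pool")
    in_pool = [r for r in records if r["pool"] == pool]
    sourcetype, _ = top_contributor(in_pool, "sourcetype")
    in_sourcetype = [r for r in in_pool if r["sourcetype"] == sourcetype]
    host, _ = top_contributor(in_sourcetype, "host")
    return pool, sourcetype, host
```

This sketch ranks by absolute volume for simplicity; comparing each figure against its usual daily baseline, as the text describes, would be the more faithful check.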
MapReduce algorithm is the secret behind Splunk’s faster data searching. It’s an algorithm typically used for batch-based large-scale parallelization. It’s inspired by functional programming’s map() and reduce() functions.
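A toy example of the map() and reduce() idea, written in Python: each chunk of log lines is counted independently (the map step), and the partial counts are then combined (the reduce step). This is a generic illustration, not Splunk's internal implementation; the log format (status code as the last token) and the function names are assumptions.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """Map step: count status codes within one chunk of log lines."""
    return Counter(line.split()[-1] for line in lines)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_statuses(chunks):
    """Apply the map step to every chunk, then fold the partial results."""
    return reduce(reduce_counts, map(map_chunk, chunks), Counter())
```

Because each chunk is processed independently in the map step, that step can be spread across many machines, which is what makes the pattern suitable for large-scale parallelization.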
Splunk SDKs are designed to allow us to develop applications from scratch and they do not require Splunk Web or any components from the Splunk App Framework. These are separately licensed from Splunk and do not alter the Splunk Software.
Splunk App Framework resides within the Splunk web server and permits us to customize the Splunk Web UI that comes with the product and develop Splunk apps using the Splunk web server. It is an important part of the features and functionalities of Splunk, and it does not license users to modify anything in Splunk itself.