Mastering Splunk
上QQ阅读APP看书,第一时间看更新

Tables, charts, and fields

After reading search, your command pipeline determines which search result's tab (or tabs) will get populated. We know that if you are concentrating on retrieving events, your results will be returned in the Events tab, while event transformations will be visible in the Statistics and Visualization tabs.

In this chapter, we will cover the transformation of event data, and therefore, the Statistics and Visualization tabs.

Splunking into tables

Splunking your search results into a table might be the easiest and most straightforward method of transforming your search results into a more readable form. Rather than looking at raw event data, you can use Splunk commands to reduce the noise of the raw events into the Splunk Statistics tab, presented as a table in the tab.

You can utilize Splunk's fields command to improve the level of readability of the Statistics tab by keeping or removing a field (or multiple fields) from your Splunk search results:

  • Use + to keep only the fields that match one of the fields in the (fields) list
  • Use to remove the field(s) that matches the (fields) list

It's common practice to be specific in what you want your Splunk search results to return. The fields command allows you to do this. Splunk's table command is (somewhat) similar to the fields command (discussed later in this chapter). The table command enables you to specify (limit) the fields that you want to keep in your results (in your table). However, keep in mind that Splunk requires certain internal fields to be present in a search to perform some commands (such as the chart command), and the table command (by default) might pull these fields out of the search results. As a rule, the best approach for limiting results is to use the fields command (because it always retains all the internal fields).

The table command

The table command is simply the command "table" and a (required) "field list." A table is created using only the fields you named. Wildcards can be used as part of the field list. Columns are displayed in the same order in which the fields are specified.

Note

Please note the following cases:

  • Column headers = field names
  • Rows = field values
  • Each row = 1 event

The following example uses the table command to create a three-column table, date_year, date_month, and date_wday, as shown in the following screenshot:

The result looks like the following screenshot:

Search results

Splunk's table command does not allow you to rename the Splunk fields. You can only rename the fields that you specify and want to show in your results table. You need to use the rename command if you want to rename a field.

The Splunk rename command

You can use Splunk's rename command to rename a specific field or multiple fields. With this command, you can give your fields more meaningful names, such as month instead of date_month. To rename multiple fields, you can use wildcards. If you want to use a phrase (if there are spaces in your new field name), you need to wrap the phrase within quotes. The syntax is simple, as follows:

rename old-field-name as new-field-name

To rename a field to a text phrase, you can use quotes as shown in the following syntax:

... | rename SESSIONID AS "The Session"

You can also use wildcards to rename multiple fields:

... | rename *ip AS IPaddress_*

In the following example, I've used the rename command to rename all the three fields to what I want and then I've used those names in my table command:

The rename command

The results of the search, using the rename command, look like the following screenshot:

Search result of the rename command

Another example of using the Splunk table command to transform your search results is explained here. In this case, the Splunk server has indexed a raw CSV file exported from a Cognos TM1 model. Because there are no headings in the file, Splunk has interpreted the data as field names. In addition, Splunk interpreted each record's forecast amount as a string. I've utilized Splunk's rename command to rename the fields with names that are more meaningful, such as:

  • May as Month
  • Actual as Version
  • FY 2012 as Year
  • Many others

In addition, I've used Splunk's eval command to create a rounded forecast amount:

Eval = RFCST= round(FCST)

Finally, I used the table command to present my search results in a more readable fashion:

sourcetype=csv 2014 "Current Forecast" "Direct"  "513500" |  rename May as "Month" Actual as "Version" "FY 2012" as Year 650693NLR001 as "Business Unit" 100000 as "FCST" "09997_Eliminations Co 2" as "Account" "451200" as "Activity" | eval RFCST= round(FCST) | Table Month, "Business Unit", Activity, Account, RFCST, FCST

After running the (preceding) Splunk search pipeline, the following results are obtained:

Search results of the table command

Limits

As you might know by now, Splunk uses configuration (or conf) files to allow you to override its default attributes, parameters, and thresholds. The limits.conf file contains possible attribute-value pairs for configuring limits for search commands. Note that there is a limits.conf file at the following location:

$SPLUNK_HOME/etc/system/default/.

Note

Note that the changes to the limits.conf file should be made to the file located in your Splunk local directory, not in the Splunk home directory.

In a default installation, the Splunk table command will shorten the total number of results returned if the truncate_report parameter in the Splunk configuration file, limits.conf, is set to 1.

Fields

When Splunk indexes your data, it labels each event with a number of fields. These fields become a part of the index's event data and are returned as part of the search results. Splunk also adds some data a number of default fields that serve particular purposes within Splunk's internal processing. The following are some of Splunk's default fields along with their purposes (you can refer to the product's documentation for a complete list):

  • index: This identifies the index in which the event is located
  • linecount: This describes the number of lines that the event contains

Once the data has been indexed, you can use these default fields in your Splunk searches. If you don't need them, you might want to consider removing them from your search results to improve performance and possibly the readability of your results. You can use the Splunk fields command to tell Splunk to keep or remove a field (or fields) from your search results.

Note

Keep in mind, though, that some default fields might be needed by Splunk internally based on your search pipeline. For example, most statistical commands require the default _time field.

The fields command is simple:

fields [+|-] <field-list>

The field list is a comma-delimited list of fields to keep (+) or remove (-) a field and can include wildcards. A leading + sign will keep the field list, while - will remove the fields listed. Note that if you do not include + or -, Splunk assumes the value to be +.

An example of the fields command

Consider the following code, which we used earlier to present the search results:

sourcetype=csv 2014 "Current Forecast" "Direct"  "513500" | rename May as "Month" Actual as "Version" "FY 2012" as Year 650693NLR001 as "Business Unit" 100000 as "FCST" "09997_Eliminations Co 2" as "Account" "451200" as "Activity" | eval RFCST= round(FCST) | Table Month, "Business Unit", Activity, Account, RFCST, FCST

The result that we obtained using the preceding code is shown as follows:

We'll now take a look at the same code (used previously), using the fields command:

sourcetype=csv 2014 "Current Forecast" "Direct"  "513500" | fields - punct |  rename May as "Month" Actual as "Version" "FY 2012" as Year 650693NLR001 as "Business Unit" 100000 as "FCST" "09997_Eliminations Co 2" as "Account" "451200" as "Activity" | eval RFCST= round(FCST) | Table Month, "Business Unit", Activity, Account, RFCST, FCST

The result obtained (using the fields command to remove the field named punct) is as follows:

Search result of the fields command

Returning search results as charts

We've covered the Events and Statistics tabs until this point, so now we will take a look at the Visualizations tab.

Basically, Splunk delivers the simple "list of events" visualization as the standard search result option. In addition, other options (covered in this chapter) include tables and charts such as column, line, area, and pie chart (which are displayed on the Splunk Visualizations tab).

Splunk's chart command is a reporting command that returns your search results in a data structure (described in a tabular output) that supports visualization such as a chart.

The chart command

The chart command is a bit more complex than the Splunk table command. It has both required and optional arguments. Charted fields are converted automatically to numerical quantities as required by Splunk. With the chart command (as opposed to the somewhat similar timechart command that always generates a _time x-axis as is discussed later in this chapter), you are able to set your own x-axis for your chart visualization.

The required arguments for the chart command are aggregator, sparkline-agg-term, and eval-expression, which are explained as follows (note that if you don't use sparklines in your visualization, sparkline-agg-term is not required.):

  • aggregator: This argument specifies an aggregator or function
  • sparkline-agg-term: This argument is the sparkline (sparklines are discussed later in this chapter) specifier
  • eval-expression: This argument is a combination of literals, fields, operators, and functions that represent the value of your destination field

A simple example of the chart command is shown as follows:

sourcetype=csv "Current Forecast" "Direct"  "513500" | rename 100000 as "FCST", "FY 2012" as "Year"| eval RFCST= round(FCST) | chart avg(RFCST) by Year

In this example (using a Cognos TM1 exported CSV file as the source), I use a common Splunk statistics function, avg, as the aggregator and specify the x-axis of the chart as year using by (the over command will work here as well). I've also created a value named FCST using the rename command, which I then use as eval-expression of this search. I don't need sparklines in this visualization, so there is no sparkline-agg-term used in the command.

The search command shown in Splunk Web is as follows:

The search command

The result obtained by running the previous search command is as follows:

Result of the search command

The split-by fields

When using Splunk's chart command, you have the ability to designate a "split-by field." This means that your Splunk search output will be a table where each column represents a distinct value of the split-by field, as shown here:

sourcetype=csv "2014" "Current Forecast" "Direct"  "513500" |  rename 100000 as "FCST", "May" as "Month" | eval RFCST= round(FCST) |  sort by Month | chart sum(FCST) by FCST, Month

In the preceding example, we have chart sum(FCST) by FCST, Month; so, the first field after by FCST ends up being represented as one-field-per-row (Splunk refers to this as group by). The second field after by Month ends up being represented as one-field-per-column (this is the split-by field). The resulting visualization is different, as shown here:

The where clause

You can think of the Splunk where clause as being similar to the where clause in a SQL query. The "where" specifies the criteria for including (or excluding) particular data within a Splunk search pipeline. For example, consider the following search:

sourcetype=csv "2014"  "Direct"  "513500" ("Current Forecast" OR "Budget")  | rename 100000 as "FCST", "May" as "Month", Actual as "Version" | eval RFCST= round(FCST) | chart var(FCST) over Month by Version

The preceding search generates the following output:

The previous code can be changed as follows using the where clause:

sourcetype=csv "2014"  "Direct"  "513500" ("Current Forecast" OR "Budget")  | rename 100000 as "FCST", "May" as "Month", Actual as "Version" | eval RFCST= round(FCST) |  where FCST > 99999 | chart var(FCST) over Month by Version 

The given code will generate the following output:

More visualization examples

In the following example, we're interested in events returned by a Cognos TM1 transaction log with the mention of the TM1 control dimension named }clients. We want to see this information visualized by the hour, over weekdays, and then month:

tm1* }Clients| chart count(date_hour) over date_wday by date_month  | sort by date_wday

The chart obtained after running this code is as follows:

This example visualizes the earliest hour by week day when a Cognos TM1 "Error" occurred, using the earliest command, as shown here:

tm1* "Error" | chart earliest(date_hour) over date_wday

This command generates the following output:

In the next example, we will visualize the median of the FCST value by month for each version of the data (actual, budget, current, and prior forecast) by using the median command (along with over and by):

sourcetype=csv "2014"  "Direct"  "513500" |  rename 100000 as "FCST", "May" as "Month", Actual as "Version" | eval RFCST= round(FCST) |  chart Median(FCST) over Month by Version

The preceding search command generates the following output:

In the following example, we visualize the sample variance of the FCST value by month for the versions of the data Budget and Current Forecast by using the var command (and over and by):

sourcetype=csv "2014"  "Direct"  "513500" ("Current Forecast" OR "Budget")  | rename 100000 as "FCST", "May" as "Month", Actual as "Version" | eval RFCST= round(FCST) |  chart var(FCST) over Month by Version

Some additional functions

When using the chart command, there is a list of powerful functions that you should be aware of.

These include avg, C or Count, dc or distinct_count, earliest, estdc, estdc_error, First, Last, latest, List, max, Median, Min, Mode, Range, Stdev, Stdevp, sum, sumsq, Values, Var, and varp.

You can refer to the product documentation for the purpose and syntax of each of these commands.