

I find this hard to explain even though it's an extremely simple concept. It would be nice to get some feedback, since I think we want to productize the idea but we are not clear on what makes sense.

If I have a search/report that I want to run faster, I will save that search and have Splunk run it over a small timeframe (5, 15, 30, or 60 minutes), taking the results of that search/report and feeding them back into an index I create to hold cached results.

For example, suppose I like to run nightly reports where I show "top users by bandwidth". It's easy enough to run the report every night, but suppose there are times during the day when I want incrementals, or I want to look at last week, or perhaps get dailies over a month. Every time I run the search/report I need to search and recalculate "top users by bandwidth", which over billions of events can take time 😉 Instead, I'll just save the search/report and have Splunk run it every 15 minutes, with the results being sent to a "cache" index. This way, if I ever want to do an ad hoc search on "top users" or run "weekly reports by day", all the data is precalculated. Think of this as creating "logs" that are the output of a search/report and then having Splunk index those "logs". To get fast results, you then search/report on the summarized cached data.

If it's not obvious why this is faster, suppose you are indexing 500M events a day and 100M of those have bandwidth data. To report on "top bandwidth by users" I need to run a search to get the 100M events, then run the report across all 100M. If instead I were running that same search/report in the background over each hour interval, then saving the data back into Splunk, I would reduce the data I'm operating on from 100M rows down to 12,000 (24 * 500, assuming I keep the top 500 each hour). Searches/reports on the latter dataset are sub-second, versus the few minutes it would take to run across the 100M. Make sense? It's really simple but odd to explain.

To set this up:

1. Grab the reportcache search script from here and put it in your SPLUNK_HOME/etc/searchscripts directory. No restart is needed – you can now cache any search/report data.

2. Add a cache index – either add the index definition to your etc/bundles/local/indexes.conf or create a new bundle and add it to that bundle's conf. You will need to restart Splunk after adding the index.

3. I recommend that you first test reportcache by having it output to a file that you scan to make sure things look right. A simple candidate is the following report against the internal index, which shows queue sizes by queue name:

   index=_internal metrics group=queue | timechart avg(current_size) by name

4. Once you have a search you want to cache, add the "reportcache index=cache path=/tmp file=testcache.log notimestamp" command to the end. This assumes you have made an index named "cache". The index attribute is required, and you should not use your default index unless you know what you're doing.
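The actual index definition snippet did not survive this copy of the post. As a rough, hypothetical reconstruction only: Splunk indexes are defined in indexes.conf, and a minimal stanza for a "cache" index might look like the following. The attribute names and paths here follow current indexes.conf conventions and may differ for the bundle-based Splunk version this post targets, so check the indexes.conf spec for your release.

```
# Hypothetical sketch of an indexes.conf stanza for the "cache" index.
# Paths/attributes follow current Splunk conventions and are illustrative.
[cache]
homePath   = $SPLUNK_DB/cache/db
coldPath   = $SPLUNK_DB/cache/colddb
thawedPath = $SPLUNK_DB/cache/thaweddb
```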

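To make the data-reduction argument above concrete, here is a small Python sketch (illustrative only – none of this is Splunk code, and all names and numbers are made up): it builds fake bandwidth events, pre-aggregates them into hourly "top 500" summary rows the way a scheduled summary search would, and shows how few rows a report over the summaries has to touch.

```python
import random
from collections import Counter

# Toy stand-in for the scenario in the post: raw "bandwidth" events
# versus hourly pre-aggregated summary rows fed back into a cache index.
random.seed(0)
events = [(f"user{random.randrange(50)}", random.randrange(1, 10_000))
          for _ in range(96_000)]  # small stand-in for the 100M events

def totals(evts):
    """Sum bytes per user (what 'top users by bandwidth' aggregates)."""
    c = Counter()
    for user, nbytes in evts:
        c[user] += nbytes
    return c

# "Summary index" approach: a scheduled search runs once per hour and
# keeps only its top 500 rows; those rows are what reportcache would
# feed back into the "cache" index for later reports to read.
HOURS, TOP_N = 24, 500
per_hour = len(events) // HOURS
summaries = []
for h in range(HOURS):
    chunk = events[h * per_hour:(h + 1) * per_hour]
    summaries.extend(totals(chunk).most_common(TOP_N))

# A daily report over the summaries scans far fewer rows than the raw
# events, which is where the speedup comes from.
print(len(events), "raw rows vs", len(summaries), "summary rows")
print(totals(summaries).most_common(5))  # daily top 5 from summaries
```

Note that top-N summaries are lossy in general: a user who never cracks an hour's top 500 is dropped from the summaries. In this toy there are only 50 distinct users, so nothing is lost and the summary-based daily totals match the raw ones exactly.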