Time is one of the most critical details included in any logs, whether they are security related or not. This is doubly important for Splunk, as it can affect whether alerts trigger, whether analysis is performed correctly for those alerts, and even whether the logs are stored and retained properly.
Time issues often go unnoticed: it's natural for people to organize thoughts and events around time, assuming the time is correct and that if anything is wrong, it must be somewhere else. And sometimes time-based issues are misunderstood or poorly communicated, as any sci-fi fan who has dealt with time travel can tell you.
Today I’m going to go over how Splunk processes time, how to find problems with time, and how to fix some common (and not-so-common) issues we’ve seen before. This will help you identify gaps in your alerting and visibility that you weren’t aware of, and give you better understanding of some of your log sources as well.
Splunk excels at historical searches looking back in time, and it generates alerts on a near-real-time basis rather than the real-time correlation traditional SIEMs use. For example, you can design an alert that runs once an hour and looks over the last 70 minutes, or one that runs every minute and looks at the last 2 minutes. However, if your timestamps are wrong, events can easily fall outside the range you selected.
Splunk also stores events and handles retention in buckets organized by event time, so if you have events with timestamp parsing issues, you could lose data before you intend to, or get stuck holding data for decades because Splunk thinks the events are from the future.
When responding to security alerts, the goal is to gain a full understanding of the chain of events leading up to compromise and to identify whether the infection has spread from there. If some of the log sources have the wrong timestamp, it'd be easy to misinterpret the chain of events – that is, if you even saw all the relevant logs within the time range you chose.
In Splunk, the primary internal time field is “_time”. It is stored in metadata in GMT and converted to each user’s time zone preference at search time. When integrating log sources, most will have TAs (technology add-ons) whose props.conf identifies which portion of the log is the event timestamp that should be used as “_time”. The props.conf will either specify a prefix that precedes the timestamp or, for JSON or key-value extractions, specify the field directly. You can even specify a time zone in the props.conf if you really need to, but we’ll talk more about that later.
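As a sketch, a typical timestamp stanza in props.conf might look like the following. The sourcetype name and the regex are hypothetical; check your TA's actual settings for real data.

```ini
# Hypothetical sourcetype; values are illustrative only
[example:firewall]
# Regex matching the text immediately before the event timestamp
TIME_PREFIX = devtime=
# strptime-style format of the timestamp itself
TIME_FORMAT = %Y-%m-%d %H:%M:%S %z
# Limit how far past TIME_PREFIX Splunk scans for a timestamp
MAX_TIMESTAMP_LOOKAHEAD = 25
```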
Splunk uses a timestamp processor to interpret the timestamp. If nothing in props.conf tells Splunk where the timestamp is, the timestamp processor will try to find the first thing in the event that looks like a timestamp. You can look at the timestamp processor at $SPLUNK_HOME/etc/datetime.xml. With a notable exception in 2019 (when Splunk shipped an updated datetime.xml to handle two-digit years beyond 2019), this isn’t something you should ever need to edit.
The timestamp processor uses regexes to identify timestamps, and having it ‘guess’ what in the log is the timestamp is asking for trouble, since epoch timestamps are just 10-digit strings, and sometimes the first thing that looks like a timestamp is not the event time. The timestamp processor also helps identify what time zone the log is in, assuming the log source includes a GMT offset or some other form of time zone info.
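To see why guessing is risky, consider that any 10-digit number decodes to a plausible epoch timestamp. A quick Python illustration (the record ID here is made up):

```python
from datetime import datetime, timezone

# A hypothetical 10-digit record ID that is not a timestamp at all...
record_id = 1712345678

# ...but read as epoch seconds, it decodes to a perfectly plausible
# recent date, so a naive "does this look like a time?" check accepts it
as_time = datetime.fromtimestamp(record_id, tz=timezone.utc)
print(as_time.isoformat())
```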
So what happens if Splunk looks at a log but doesn’t see anything resembling a time? Splunk is built on _time; it has to be set to something. If no timestamp is found, Splunk will use the time from the Splunk server that received the log, which is also stored as _indextime for all logs. If there is a timestamp but no time zone, Splunk will treat it as GMT. As you can imagine, this can cause issues, so let’s look at how to find and fix them.
Luckily, it’s pretty easy to find out whether there are _time issues in Splunk. If you want to know whether any of the timestamps in your logs are being parsed wrong, just compare the _time (event time) to the _indextime (Splunk receipt time).
Both _time and _indextime are normalized to GMT and stored in metadata by Splunk. They will always exist, so they are reliable in every environment. Just keep in mind when comparing them that some lag is normal: _indextime will always trail _time slightly due to transport and indexing latency.
You can use eval and stats to look at things in a variety of different ways. Here is an example of a search I ran to check some firewalls for issues:
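A sketch of such a search (the index and sourcetype names are hypothetical; swap in your own):

```spl
index=network sourcetype=cisco:asa earliest=-24h
| eval lag_seconds = _indextime - _time
| stats count avg(lag_seconds) AS avg_lag max(lag_seconds) AS max_lag min(lag_seconds) AS min_lag BY host
| sort - max_lag
```

A large positive lag means events are arriving with old timestamps; a negative lag means Splunk thinks events are from the future.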
In our case, some of the firewalls were off by quite a lot. To help you understand how timestamps get messed up, let’s go over some types of issues we’ve seen before:
In distributed environments, there are plenty of cases where log sources aggregate and forward their logs before they ever reach Splunk. Sometimes they end up in Splunk with double syslog headers: one from the original log source and another from the syslog aggregator/forwarder. These are normally fairly minor _time discrepancies, so many teams ignore them in favor of larger priorities. To fix these within Splunk, you typically need to update the props.conf to account for the extra header, either by modifying the regex used to extract the log, or by adding a TIME_PREFIX that matches everything before the true timestamp – even if that means matching past the first timestamp.
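For instance, a stanza like the following could skip past a forwarder-added header so the original device timestamp is used. The sourcetype and the regex are hypothetical; tailor the pattern to your actual header.

```ini
# Hypothetical sourcetype with a doubled syslog header
[example:syslog_doubled]
# Match the entire forwarder-added header (timestamp + hostname),
# so the timestamp processor starts at the original device timestamp
TIME_PREFIX = ^\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s\S+\s
TIME_FORMAT = %b %d %H:%M:%S
```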
If you can configure the syslog forwarder not to add the second header, that’s the best fix, applied outside of Splunk.
Some log sources have multiple timestamps by design, and these can differ by minutes, hours, days, or even decades. For example, antivirus logs might tell you when a file was downloaded, when it was detected as malicious, and when it was removed from the device. They might also tell you when the product last updated its malware database, when the specific IOC was added, and so on. If not told otherwise, Splunk will pick whichever timestamp comes first. Sometimes that might even be a 10/16-digit ID number at the beginning of the log that gets picked up as an epoch time.
The solution, again, is in the props.conf. TIME_PREFIX lets you tell Splunk what precedes the timestamp you want to use; if the data is JSON or uses indexed key-value extraction, you can instead name the timestamp field directly (TIMESTAMP_FIELDS). In both cases, you can set TIME_FORMAT to the exact format of the timestamp.
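A sketch for the antivirus example above (the sourcetype, field label, and format are hypothetical):

```ini
# Hypothetical AV sourcetype with several timestamps per event;
# anchor on the label that precedes the detection time
[example:antivirus]
TIME_PREFIX = detection_time=
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z
# Don't let the processor wander off into later timestamps
MAX_TIMESTAMP_LOOKAHEAD = 30
```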
One of the most easily identified issues is a time zone issue. You’ll see differences between _time and _indextime that are the size of common GMT offsets, such as 4 hours for US Eastern time or 7 hours for US Pacific time. It won’t always be a whole number – remember that latency and half-hour-offset time zones exist – but if the difference is really close to a known offset, time zones are the issue.
This happens when Splunk incorrectly identifies the time zone of the logs – either because there isn’t a time zone in the logs, or there is an incorrect one, or Splunk doesn’t understand the format the timestamp is in. As mentioned before, this means Splunk defaults to assuming GMT/UTC.
If the timestamp is in the wrong format, you can configure the TIME_FORMAT in the props.conf for Splunk to understand it. If the log source has the wrong time zone, you’ll need to fix that on the log source side.
Most vendors either have timestamps formatted with time zones by default or allow you to configure them that way. However, not all do – two notable exceptions are Cisco ASA and Palo Alto firewall logs.
If you can’t set a time zone in the timestamp in the logs, your best bet is to make sure all those log sources are set to GMT/UTC. Even the documentation from Palo Alto and Cisco recommends keeping their firewalls on GMT.
It’s important to note that Splunk CAN adjust time zones to compensate – as mentioned before, you can set a time zone in props.conf with the TZ setting. However, this gets messy quickly and is not normally recommended.
For example, you could set up a props.conf on a forwarder to handle Hong Kong ASA logs sent in GMT+8, but you’d need another to handle your Tampa logs coming in at GMT-4, and every other data center would need its own to cover its local time. Managing all those configurations to ensure things come in correctly adds a lot of extra work for little to no gain. Not to mention a good chunk may break when daylight saving time kicks in, or when you open a small office that sends to an existing data center but sits in a different time zone.
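As a sketch, those per-site overrides might look like this (the host patterns are hypothetical):

```ini
# Hypothetical per-site TZ overrides; every new site needs another stanza,
# and a mislabeled host silently gets the wrong offset
[host::asa-hongkong-*]
TZ = Asia/Hong_Kong

[host::asa-tampa-*]
TZ = America/New_York
```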
Now, this can be manageable if your business is all in one time zone, or you want all devices on one time zone. However, if you need to unify under one time zone anyway, GMT/UTC is often the best one for consistency, instead of the local time of your main office.
If you are seeing time zone issues like these, it means that time settings across your organization are inconsistent. This will require ensuring the team or individuals managing the firewalls update their settings to conform to a uniform standard – and GMT/UTC is typically the universal standard.
Of all the issues in the article, this is both the most confusing and oftentimes the most impactful. Thankfully, it’s relatively rare, and understanding it will help you better understand how your own log sources work.
We first noticed this issue when investigating some false negatives for a customer. Proofpoint had alerted the customer that some potentially malicious activity had occurred; however, our alerts designed to detect the same conditions didn’t trigger. While it was originally thought to be an issue with the wrong timestamp being parsed, the real issue was much more complicated.
This isn’t limited to Proofpoint TAP logs – for example, the Microsoft Office 365 Reporting Add-on also mentions that it “May delay trace logs up to 24 hours”. But Proofpoint TAP is a great use case to dig into.
Proofpoint TAP is an advanced email security tool. Like many such products, it analyzes emails for potential malicious attachments, and analyzes URLs included in emails and attachments. If it identifies the email as malicious, it will block the email. URLs are rewritten to go through Proofpoint – when clicked, Proofpoint will block the link if the URL is detected as malicious. All of this is logged, as is all allowed activity.
No security device is perfect, however, and sometimes Proofpoint allows malicious emails through or allows someone to click on a malicious link. However, Proofpoint will also keep track of what URL clicks and emails were allowed. If it later receives new intel that the URL or email it previously allowed was malicious, it will send a new log – telling you something previously allowed has been re-identified as potentially malicious.
Sometimes the update is within minutes, but sometimes it’s hours, days, or in rare cases weeks later that the threat is identified. Regardless of when the update happened, the _time of the original email/URL click stays the same – after all, that’s when the user clicked the URL, or got that malicious attachment.
So how do you handle cases where Splunk receives a log now that describes activity from last week? Especially one you likely want to alert on, since potentially malicious activity occurred.
Well, let’s first discuss what NOT to do. DO NOT try to treat this as a parsing issue and remap the _time to some other timestamp in the logs. For starters, the updated timestamp isn’t going to be in every log so it’d make parsing inconsistent. But more importantly, when the alert does fire, the investigation/analysis would go poorly since you’d likely be looking in the wrong time window. If I clicked a phishing URL two weeks ago, and you get an update today saying it was malicious, you need to be looking at my web traffic two weeks ago, not what I did today, to understand the situation. Having _time correct matters for more than just alerting – it matters for analyzing and context.
We can’t ignore the discrepancy: we want to make sure we are getting alerts and analyzing them correctly, but we don’t want huge searches looking over several weeks running constantly. Two things need to be done to correct the issue. First, adjust alerts to search on both _time and _indextime. Second, make sure those responding to the incident understand how _time and _indextime work.
Both _time and _indextime can be used to filter searches. Although _indextime doesn’t have a handy drop-down in the UI, filtering on it is just as potent as filtering on _time for creating efficient searches.
See the below search:
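The search looked roughly like this (the index, sourcetype, and field names such as 'user' and 'url' are hypothetical; 'clickTime' is the Proofpoint TAP field for when the URL was clicked):

```spl
index=email sourcetype=proofpoint:tap:clicks earliest=-90d@d _index_earliest=-3d@d
| eval click_time=strftime(_time, "%F %T"), index_time=strftime(_indextime, "%F %T")
| eval delay_hours=round((_indextime - _time) / 3600, 1)
| table click_time index_time delay_hours user url
```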
As you can see, the search is set up to look over the last 90 days’ worth of traffic, but it also uses ‘_index_earliest=-3d@d’. This tells Splunk to look at events indexed in the last three days whose event timestamps fall within the last 90 days. I went back 3 days just to show examples where the ‘clickTime’ of an event was drastically different from the indextime. For an alert, setting ‘_index_earliest=-70m’ and running it every 60 minutes covers all gaps.
Now, you may think running a 90-day search every hour is inefficient and would take forever to process, and normally you’d be right. However, filtering on ‘_index_earliest’ is nearly identical in efficiency to selecting the same range with _time. You could search ‘_index_earliest=-70m’ over ‘All Time’, and it would finish in nearly the same time as just searching the last 70 minutes. The example search above took only a few seconds to run, since Splunk is smart enough to know it doesn’t need to look at events that weren’t indexed within the last three days.
At ReliaQuest, we understand that ensuring Splunk is accurately keeping track of the _time of your logs is a critical part of making sure you have proper visibility into your environment. That’s why checking and remediating time-based issues is a consistent part of our deep dive health checks that ReliaQuest GreyMatter customers receive. Between how Splunk stores data to how Splunk searches data, even to how people interpret data when looking at Splunk, everything is based on understanding the flow of time.
Hopefully, this article has given you insight into why time is such a critical component, how to detect issues, and how to resolve them when they come up. While the types of errors covered aren’t an exhaustive list, they should give you a starting point for addressing time-based concerns at various points in the pipeline. From log source settings, to parsing updates, to search configuration changes, each proposed fix can be applied to a wide range of issues and helps highlight Splunk’s flexibility.