BLUF: Bad data leads to bad decisions. Here are some situations where you should look out for bad data, along with the impact it has.
In today’s data-driven world, organizations are heavily reliant on data to make informed decisions and drive business growth. Data quality plays a critical role in ensuring the information used for analysis and decision-making is accurate, reliable, and relevant.
Together, let’s dive into data quality and explore its dimensions, discuss the impact of bad data, and strategize solutions for improvement. By the end, you will have a clear understanding of how to distinguish good data from bad and the importance of embracing data quality for success.
Not All Data is Created Equal
Good data is crucial for organizations as it forms the foundation for effective decision-making and successful business operations. Investing in strategies to ensure good data quality is not just an option, but a necessity for any organization.
Poor Data Quality Impacts Business Success
When inaccurate, incomplete, or inconsistent data infiltrates an organization’s decision-making processes, it can lead to misleading insights, wrong conclusions, and misguided actions.
A relevant SOC example:
A security tool is upgraded, and the new log format no longer matches the Security Information and Event Management (SIEM) logic. When this happens, alerts aren’t triggered and security analysts are unaware, leaving the compromise to go unnoticed for… days? weeks? The result: the intruder has already released sensitive customer data and ultimately damaged the organization’s brand.
To avoid this scenario, we review data for Accuracy, Reliability, and Relevance. We can ensure accuracy by monitoring the shape and schema of data with Cribl to confirm that the expected fields and format are present. Cribl gives us reliability by monitoring the flow of traffic to detect a drop in logs, or by using the Health Check to confirm a load balancer is still operational. Relevance is always the hardest; Cribl gives us the ability to tag data with a relevance classification so that it goes to the right place.
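To make the schema check concrete, here is a minimal sketch in plain Python. It is not Cribl configuration and not how any particular product implements it; the field names and the "upgraded" log format are hypothetical, chosen only to mirror the SOC example above.

```python
# Hypothetical required fields; in practice these would come from the SIEM's parsing rules.
REQUIRED_FIELDS = {"timestamp": str, "src_ip": str, "action": str}


def schema_problems(event: dict) -> list[str]:
    """Return a list of human-readable schema problems for one parsed event."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"wrong type for {field}: {type(event[field]).__name__}")
    return problems


# An event in the format the SIEM logic expects.
old_format = {"timestamp": "2024-01-01T00:00:00Z", "src_ip": "10.0.0.5", "action": "deny"}
# The same event after a hypothetical tool upgrade renamed two fields.
new_format = {"ts": 1704067200, "source": "10.0.0.5", "action": "deny"}

print(schema_problems(old_format))  # []
print(schema_problems(new_format))  # ['missing field: timestamp', 'missing field: src_ip']
```

A check like this, run on a sample of incoming events, would surface the format change on the day of the upgrade instead of after weeks of silent alert failures.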
The Three Dimensions of Data Quality
To effectively assess and improve data quality, it is essential to understand the dimensions defining it. Ask yourself: is this data accurate, reliable, and relevant? If the answer to any of the three isn’t “yes,” your analysis will be skewed, with a negative impact on decision-making.
Let’s look at example scenarios for these dimensions to better understand them in context.
ACCURACY
The correctness and precision of data. Accurate data is error-free, consistently formatted, and appropriately represents its value.
Failure Example: Data timestamps aren’t normalized to UTC by default.
Consequence: Event times aren’t comparable across time zones, which skews analysis.
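As a rough illustration of why normalization matters, the sketch below converts two timestamps to UTC before comparing them. The events and offsets are made up, and the sketch assumes each source writes ISO 8601 timestamps with an explicit UTC offset.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # A timestamp without an offset cannot be placed on a shared timeline; refuse to guess.
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return parsed.astimezone(timezone.utc)


# Two hypothetical events logged by sources in different time zones.
firewall_event = to_utc("2024-03-01T09:15:00-05:00")  # source at UTC-5
endpoint_event = to_utc("2024-03-01T15:10:00+01:00")  # source at UTC+1

# Local clock times are 09:15 and 15:10, but in UTC the endpoint
# event (14:10) and the firewall event (14:15) are only five minutes apart.
print(endpoint_event < firewall_event)  # True
```

Compared as raw local clock times, these two events look almost six hours apart and in the opposite order, which is exactly the kind of skew that breaks correlation rules.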
RELIABILITY
Confidence in data arriving at the intended destination, on time, and in the proper format.
Failure Example: Over a four-day weekend, the syslog server received a larger-than-normal burst of firewall data. It is now out of space and no longer sending firewall data to the SIEM.
Consequence: The SOC has received no alerts or correlations from firewall data (or any other syslog data) for a period of time. And because it’s syslog, all of that data is lost and unrecoverable.
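One way to catch this kind of silence is to compare each source's latest event count against its own recent baseline. The following is a minimal sketch under assumed inputs: the hourly counts, source names, and the 10% threshold are all illustrative, not recommendations.

```python
def sources_gone_quiet(counts: dict[str, list[int]], drop_ratio: float = 0.1) -> list[str]:
    """Return sources whose latest count fell below drop_ratio of their earlier average."""
    quiet = []
    for source, series in counts.items():
        if len(series) < 2:
            continue  # not enough history to form a baseline
        *history, latest = series
        baseline = sum(history) / len(history)
        if baseline > 0 and latest < baseline * drop_ratio:
            quiet.append(source)
    return quiet


# Hypothetical events-per-hour counts, oldest first.
hourly_counts = {
    "firewall": [1200, 1350, 9800, 0],  # burst, then silence
    "endpoint": [400, 420, 390, 410],   # steady
}

print(sources_gone_quiet(hourly_counts))  # ['firewall']
```

Alerting on the absence of data is easy to overlook, but here it would have flagged the firewall feed within an hour rather than after the long weekend.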
RELEVANCE
The usefulness and applicability of data to the specific context or purpose. Relevant data is aligned with the objectives and requirements of the analysis or decision-making process.
Failure Example: A developer deploys a new application that writes logs to a security-monitored location. These logs are formatted similarly to other security logs, but they are operational logs, and they begin to feed additional information into the wrong data models.
Consequence: This not only consumes the SOC’s licensed ingest capacity, but also pollutes the analytics store, slowing investigations with longer query times and returning incorrect results.
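A relevance classification can be as simple as tagging each event by where it came from rather than by what it looks like. The sketch below assumes events carry an "app" field and uses a hypothetical allowlist and made-up destination names; it illustrates the idea only and is not any product's routing syntax.

```python
# Hypothetical allowlist of applications whose logs belong in security data models.
SECURITY_SOURCES = {"firewall", "edr", "vpn"}


def tag_relevance(event: dict) -> dict:
    """Return a copy of the event with a 'relevance' tag based on its source application."""
    relevance = "security" if event.get("app") in SECURITY_SOURCES else "operational"
    return {**event, "relevance": relevance}


def route(event: dict) -> str:
    """Pick a destination from the relevance tag (destination names are illustrative)."""
    return "siem" if event["relevance"] == "security" else "ops_archive"


events = [
    {"app": "firewall", "msg": "deny tcp 10.0.0.5:443"},
    {"app": "billing-api", "msg": "deny request: rate limit"},  # similar format, operational content
]

for event in events:
    tagged = tag_relevance(event)
    print(tagged["app"], "->", route(tagged))
# firewall -> siem
# billing-api -> ops_archive
```

Because the decision rests on the source and not the log format, a new application that happens to write security-looking lines stays out of the security data models until someone deliberately adds it to the allowlist.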
By assessing data quality against these three dimensions (accuracy, reliability, and relevance), organizations gain a comprehensive understanding of the strengths and weaknesses of their data, enabling targeted efforts for improvement.
SOI Solutions Can Help
In our age of digital transformation, data is the backbone of every organization. A slight discrepancy can lead to faulty decisions, impacting the overall growth of your company. With the support of SOI Solutions, you will have confidence in your data again. Our comprehensive services not only improve the quality of your data but also enhance its usability, paving the way for informed decision-making and successful business strategies.
Up Next! Read Part 2 of the series: Four Tactical Solutions for Achieving High Quality Data