Today, data is used for everything from business intelligence to marketing campaigns to new product development. High-quality data can tremendously improve the performance of all these endeavors, and more. However, when you start with poor-quality data, you will almost always end up with undesirable results. As the old adage goes, "Garbage in, garbage out."
Database administrators have lived by that motto for decades, but now that there are virtually limitless reservoirs of data at our disposal, it seems we've forgotten. Businesses seem to believe that more data somehow makes up for low-quality data. Not so.
Here's why data quality is still important and how you can ensure your data meets the highest data quality standards.
The Most Common Data Quality Issues
Data quality issues can creep in as early as the entry process, but can also occur when data is improperly formatted or when poor processes or systems introduce duplicates and other errors into the data sets.
Bad data can include:
- Data that is out of date
- Duplicate data
- Poorly formatted data
- Errors from manual data entry
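As a rough illustration, the categories above can be turned into simple automated checks. The following is a minimal sketch in Python, not a complete validation tool: the sample records, the field names (name, email, phone, updated), the two-year staleness window, and the simplified email pattern are all assumptions made for this example. Manual entry errors rarely have a dedicated test, so here they surface through the duplicate and format checks.

```python
import re
from datetime import date

# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100", "updated": date(2024, 5, 1)},
    {"name": "Ann Lee", "email": "ANN@example.com ", "phone": "555-0100", "updated": date(2024, 5, 1)},
    {"name": "Bob Ray", "email": "bob(at)example", "phone": "5550101", "updated": date(2019, 1, 15)},
]

# Deliberately simple pattern: something@something.something, no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_issues(records, today, max_age_days=730):
    """Return a list of (record index, issue label) pairs."""
    issues = []
    seen_emails = set()
    for i, rec in enumerate(records):
        # Out of date: not updated within the allowed window.
        if (today - rec["updated"]).days > max_age_days:
            issues.append((i, "out_of_date"))
        # Duplicate: same email once whitespace and letter case are ignored.
        key = rec["email"].strip().lower()
        if key in seen_emails:
            issues.append((i, "duplicate"))
        seen_emails.add(key)
        # Poor format: the raw value does not look like an email address.
        if not EMAIL_PATTERN.match(rec["email"]):
            issues.append((i, "bad_format"))
    return issues


issues = find_issues(records, today=date(2025, 1, 1))
for index, label in issues:
    print(index, label)
```

Passing today in as a parameter, rather than reading the system clock inside the function, keeps the check repeatable when you test it.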
The Costs of Poor Quality Data
How do these things happen? Well, most databases, especially those that include consumer information like names, email addresses and phone numbers, begin to deteriorate as soon as they are assembled. People change jobs, move, get married, and swap Internet providers, and each of these events changes their contact information.
Errors also get introduced when data is entered manually, either by customers or by your own employees. Data can be accidentally entered twice, or deliberately entered with errors. For example, marketers and salespeople continuously battle fake contact information entered by prospects who want to remain "unreachable" when filling out lead-generation forms.
Bad data costs you in many different ways. Most obviously, when 10, 25 or even 40 percent of a large database is erroneous, a matching share of what you spend to store it is wasted. This is true whether you operate an in-house data center or contract with a cloud service provider. Every backup also consumes more storage and costs more than it should. Plus, any decisions, marketing campaigns, or product development you base on the data will be skewed, because your results are thrown off by the bad records.
How to Achieve Better Quality Data
Now that you understand the cost of bad data and how it gets into your databases, it's time to figure out how to improve the quality.
Here are three steps to start addressing the data quality issues your organization is facing:
1. Assess your data
Before you begin correcting your data quality issues, you need to gain some visibility into how clean your data is right now. It is important to identify your business's core assets and determine what issues are slowing your employees down. For example, if you were an oil and gas company, your core asset would be your wells. With your core asset in mind, you can start identifying the data quality checks that will surface the bad data in your systems.
Executing the data quality checks will gather the data you need to start correcting the issues. Make sure you store the results in a central location so your team can quickly and effectively correct the issues.
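To make this step concrete, here is a minimal sketch of running a set of named checks and writing the results to one shared table. It follows the oil and gas example, but everything specific in it is an assumption for illustration: the well records, the field names (well_id, latitude, status), the two checks, and the check_results table. An in-memory SQLite database stands in for whatever central store your team actually uses.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical well records; field names are assumptions for this sketch.
wells = [
    {"well_id": "W-001", "latitude": 31.9, "status": "active"},
    {"well_id": "W-002", "latitude": None, "status": "active"},
    {"well_id": "W-002", "latitude": 32.4, "status": "unknown"},
]


def missing_latitude(rows):
    """Return the IDs of wells that have no latitude recorded."""
    return [r["well_id"] for r in rows if r["latitude"] is None]


def duplicate_well_id(rows):
    """Return the IDs that appear more than once (one entry per repeat)."""
    seen, dupes = set(), []
    for r in rows:
        if r["well_id"] in seen:
            dupes.append(r["well_id"])
        seen.add(r["well_id"])
    return dupes


CHECKS = {
    "missing_latitude": missing_latitude,
    "duplicate_well_id": duplicate_well_id,
}


def run_checks(conn, rows, checks):
    """Run every check and store one row per failing record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS check_results "
        "(run_at TEXT, check_name TEXT, record_id TEXT)"
    )
    run_at = datetime.now(timezone.utc).isoformat()
    for name, check in checks.items():
        for record_id in check(rows):
            conn.execute(
                "INSERT INTO check_results VALUES (?, ?, ?)",
                (run_at, name, record_id),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a shared, central database
run_checks(conn, wells, CHECKS)
summary = conn.execute(
    "SELECT check_name, COUNT(*) FROM check_results "
    "GROUP BY check_name ORDER BY check_name"
).fetchall()
print(summary)
```

Storing one row per failing record, stamped with the run time, lets the team see both what needs fixing and how the counts change from one run to the next.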
2. Correct existing data quality issues
Having identified the data quality issues that exist in your data, you can start correcting them. It's important to work with individuals who understand how the data is stored in your applications, as well as how it is used by your business. This ensures that any corrective actions can be communicated to the rest of the organization.
After the clean-up process is complete, you will want to run your data quality checks again to ensure everything is in order.
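One simple way to confirm the clean-up worked is to compare issue counts per check from before and after. The sketch below assumes you already have those counts as dictionaries; the check names and numbers are made up for illustration and are not real results.

```python
def compare_runs(before, after):
    """Return {check_name: (before_count, after_count)} for checks still failing."""
    unresolved = {}
    for name, after_count in after.items():
        if after_count > 0:
            unresolved[name] = (before.get(name, 0), after_count)
    return unresolved


# Hypothetical counts from two runs of the same checks.
before = {"duplicate_well_id": 12, "missing_latitude": 40}
after = {"duplicate_well_id": 0, "missing_latitude": 3}

unresolved = compare_runs(before, after)
for name, (old, new) in unresolved.items():
    print(f"{name}: {old} -> {new} issues remaining")
```

An empty result means every check passed on the second run; anything left over tells you where the clean-up is incomplete.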
3. Set up ongoing monitoring
One of the most common questions I receive when correcting data quality issues is, "How will we know if it happens again?" Asking this makes sense, because the process we just performed is not expected to continue indefinitely. In fact, these types of data quality clean-up processes often take place inside the scope of a one-time project.
So how will you know if the problem happens again? Well, you repeat the assessment process over and over again. Most organizations don't want to dedicate their existing resources to this repetitive task, and let's face it, humans aren't the best at it either. In order to accomplish this goal, you need to have a system do it for you.
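As a sketch of what such a system might do on each run, the function below executes every check and raises an alert when the number of issues exceeds an agreed limit. The check, the sample contacts, and the threshold values are assumptions for illustration; in practice a scheduler of your choosing would call something like this at a regular interval and route the alerts to the people responsible for the data.

```python
def monitor_once(rows, checks, thresholds):
    """Run each check once and return alert messages for checks over their limit."""
    alerts = []
    for name, check in checks.items():
        count = len(check(rows))
        limit = thresholds.get(name, 0)  # no configured limit means zero tolerance
        if count > limit:
            alerts.append(f"{name}: {count} issues (limit {limit})")
    return alerts


def missing_email(rows):
    """Return the records that have no email value."""
    return [r for r in rows if not r.get("email")]


# Hypothetical contact records and tolerance levels.
contacts = [{"email": "ann@example.com"}, {"email": ""}, {"email": None}]
checks = {"missing_email": missing_email}
thresholds = {"missing_email": 1}

alerts = monitor_once(contacts, checks, thresholds)
for message in alerts:
    print(message)
```

Using thresholds rather than alerting on every single issue keeps the noise down, so a notification means the data has drifted further than the business has agreed to tolerate.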