How good is your data?
At a time when major advances in machine learning (ML), artificial intelligence (AI), and internet of things (IoT) are being promoted in the industry, many manufacturers continue to struggle to incorporate advanced analytics into their business processes. Critical to the success of any data analytics initiative is having clean data. The data must be accurately labeled, free of duplicate records, and blended to generate the correct results. These are key to becoming a truly data-driven enterprise.
Adding to the challenge are the rapidly increasing volume and variety of data coming from business applications, sensors, third-party sources, and e-commerce transactions. Together, these contribute to the creation of bad data. To compensate, data analysts and data scientists must cleanse the data before incorporating it into their analytics dashboards and models, which is a time-consuming and expensive process. Most industry estimates show that, on average, it costs $10 per record to clean up the bad data and $100 if you do nothing. Moreover, the ramifications of doing nothing will continue to grow. Business users waste time dealing with bad data, data scientists spend an excessive amount of time cleaning up the data, and IT must invest in developing processes to keep various systems that are not integrated consistent with one another.
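To make the per-record figures quoted above concrete, here is a minimal sketch of the arithmetic. The $10 and $100 values come from the estimate cited in the text; the record count and the share of bad records are hypothetical numbers chosen only for illustration.

```python
# Per-record cost figures quoted in the article (US dollars).
COST_TO_CLEAN = 10
COST_OF_DOING_NOTHING = 100


def bad_data_cost(total_records: int, bad_rate: float) -> dict:
    """Estimate the cost of cleaning bad records versus ignoring them."""
    bad_records = round(total_records * bad_rate)
    return {
        "bad_records": bad_records,
        "clean_up": bad_records * COST_TO_CLEAN,
        "do_nothing": bad_records * COST_OF_DOING_NOTHING,
    }


# Hypothetical inputs: 50,000 records, 2% of which are bad.
estimate = bad_data_cost(total_records=50_000, bad_rate=0.02)
print(estimate)
```

Under these assumed inputs, the gap between the two totals is what a cleanup effort would need to stay under to pay for itself.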
A classic use case
Take, for example, a manufacturing company that wants to create a quality dashboard as an integral part of its continuous improvement initiatives. An important first step is to consider the quantity of data needed to get a complete and accurate view of the critical quality metrics (data points) the company has at its disposal. Most companies create and collect large quantities of process data but typically use it for tracking purposes only. To get a complete and accurate picture of the “quality health” of the company, the analytics dashboard needs to include scrap data, warranty claim data, inbound inspection data, rework data, and data from other sources.
The simple answer to the question of where a quality dashboard should get its data is the enterprise resource planning (ERP) system. Unfortunately, in most cases that answer is not even close to correct. Much of the detailed data from the shop floor is not captured in the ERP system or even stored in the enterprise data warehouse (EDW). Instead, this data is processed and stored in multiple, fragmented systems.
With so many data points, companies need to take a repeatable, proactive approach to data accuracy and related issues in order to build trust in the information, which in turn leads to increased adoption. Production, quality control, and demand planning are among the many functions that manufacturers can enhance through improved data quality. Having trusted and complete data improves visibility into manufacturing processes and thereby reduces or eliminates engineering flaws, manufacturing overruns and underruns, product defects, and other problems related to quality.
In this scenario, the opportunity is to invest in a system and the skill sets that will allow the company to quickly and easily prep, blend, and cleanse its existing process data with the data in its ERP and EDW systems. The combined data can then be analyzed more easily to spot patterns and draw actionable insights.
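As a rough illustration of what "prep, blend, and cleanse" can mean in practice, the sketch below normalizes part numbers in shop-floor scrap records, drops duplicates, and joins the result to an ERP part list. All field names, part numbers, and sample records are hypothetical, and a real pipeline would need rules specific to each source system.

```python
def normalize_part_number(raw: str) -> str:
    """Prep: trim whitespace and upper-case so ' ab-100' matches 'AB-100'."""
    return raw.strip().upper()


def cleanse(records):
    """Cleanse: normalize part numbers and drop repeated records, keeping the first."""
    seen = set()
    cleaned = []
    for rec in records:
        part = normalize_part_number(rec["part"])
        key = (part, rec["date"], rec["scrap_qty"])
        if key in seen:
            continue  # same part, date, and quantity already recorded
        seen.add(key)
        cleaned.append({**rec, "part": part})
    return cleaned


def blend(scrap_records, erp_parts):
    """Blend: attach the ERP description and flag records with no ERP match."""
    blended = []
    for rec in scrap_records:
        erp = erp_parts.get(rec["part"])
        blended.append({
            **rec,
            "description": erp["description"] if erp else None,
            "in_erp": erp is not None,
        })
    return blended


# Hypothetical shop-floor scrap records with inconsistent formatting.
shop_floor = [
    {"part": " ab-100", "date": "2024-01-05", "scrap_qty": 3},
    {"part": "AB-100", "date": "2024-01-05", "scrap_qty": 3},  # duplicate once normalized
    {"part": "cd-200 ", "date": "2024-01-05", "scrap_qty": 1},
    {"part": "ZZ-999", "date": "2024-01-06", "scrap_qty": 2},  # missing from the ERP list
]

# Hypothetical ERP part master keyed by part number.
erp_parts = {
    "AB-100": {"description": "Bracket"},
    "CD-200": {"description": "Housing"},
}

result = blend(cleanse(shop_floor), erp_parts)
for row in result:
    print(row)
```

Records flagged with `in_erp` set to `False` are the ones an analyst would want to investigate before they reach a dashboard.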
The Naveego Complete Data Accuracy Platform
Having clean data is at the core of every manufacturing company’s continuous-improvement programs. A crucial step to maintaining data quality over time is investing in the right technology.
Naveego is a cloud-first data accuracy management service that identifies and continually monitors data problems at the source systems before that bad data creates inaccuracies in reports, dashboards, or other decision-making tools. In addition, our Master Data Management solution compares data in all systems and delivers one version of the truth by ensuring that changes to data in one system are reflected in all other systems that also contain that same data.
The right data quality management strategy will not only help you make better, more informed business decisions but will also maximize the success of your current and future business initiatives.
From a healthcare system perspective, the use and sources of data are even broader and more complex than the number of EHR systems that are in play. Throw in some claims data, financial data, user-generated data, labor/HR data, and patient-satisfaction data as a start. Harmonization of all that data is way beyond the core capabilities of the EHR vendors. They should stop presenting themselves as the data hub to their customers. Healthcare needs to invest in an enterprise data layer outside the EHR.