We have all heard horror stories about poor data quality: companies with millions of records that list "(000)000-0000" as the customer contact number or "99/99/99" as the date of purchase, 12 different gender values, and shipping addresses with no state information. The cost of poor data quality to enterprises and organizations is real. For example, the US Postal Service estimated that it spent $1.5 billion processing undeliverable mail in 2013 because of bad data. Poor data quality has many sources, but they fall into five categories: data entry, data processing, data integration, data conversion, and data that goes stale over time.
What can you do to make sure that your data is consistently of high quality? The challenge is threefold: collect and source data that is relevant to the business; manage and govern that data in a meaningful and sustainable way, so that key master data has quality golden records; and analyze the high-quality data to accomplish stated business objectives. Here is the six-step data quality framework that we use, based on best practices from data quality experts and practitioners.
Step 1 – Definition
Define business goals for data quality improvement along with data owners and impacted business processes.
Examples for customer data
Goal: Ensure all customer records are unique, carry accurate information (e.g., address and phone numbers), and are consistent across multiple systems.
Data owner: Sales Vice President
Stakeholders: Finance, Marketing and Production
Impacted business processes: Order entry, Invoicing, Fulfilment.
Data rules: Rule 1 – The combination of customer name and address should be unique. Rule 2 – All addresses should be verified against an approved address reference database.
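Rules like these are easier to assess later if they are written down as executable checks. The following is a minimal sketch, not a prescribed implementation: the record layout (dictionaries with "name" and "address" keys), the function names, and the in-memory reference list standing in for an approved address reference database are all assumptions made for illustration.

```python
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(text.lower().split())


def rule1_duplicates(records):
    """Rule 1: return the normalized (name, address) keys that occur more than once."""
    counts = Counter(
        (normalize(r["name"]), normalize(r["address"])) for r in records
    )
    return [key for key, n in counts.items() if n > 1]


def rule2_unverified(records, reference_addresses):
    """Rule 2: return records whose address is not in the approved reference set."""
    approved = {normalize(a) for a in reference_addresses}
    return [r for r in records if normalize(r["address"]) not in approved]


# Hypothetical sample data.
customers = [
    {"name": "Ann Lee", "address": "12 Oak St, Austin, TX"},
    {"name": "ann lee", "address": "12  Oak St, Austin, TX"},
    {"name": "Bob Ray", "address": "9 Elm Rd"},
]
reference = ["12 Oak St, Austin, TX"]

duplicates = rule1_duplicates(customers)
unverified = rule2_unverified(customers, reference)
print("Rule 1 violations:", duplicates)
print("Rule 2 violations:", [r["name"] for r in unverified])
```

The normalization here is deliberately simple. Real address verification usually relies on a dedicated reference service and would replace the set lookup in this sketch.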
Step 2 – Assessment
Assess existing data against the rules specified in step 1. Assess the data along multiple dimensions, such as accuracy of key attributes, completeness of all required attributes, consistency of attributes across multiple data sets, and timeliness of data. Depending on the volume and variety of data and the scope of the data quality project in each enterprise, we might perform a qualitative and/or quantitative assessment using profiling tools. Existing policies (e.g., data access, data security, adherence to specific industry standards or guidelines) should be examined in this step as well.
Examples
Assess the percentage of customer records that are unique (by name and address together); assess the percentage of non-null values in key attributes.
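Both example metrics can be computed with a few lines of code. The sketch below is illustrative only: the field names, the sample records, and the decision to count placeholder strings such as "(000)000-0000" as missing are assumptions, and uniqueness is defined here as the share of records whose key occurs exactly once, using exact string matching.

```python
from collections import Counter

# Placeholder strings treated as missing (an assumption; extend per data set).
PLACEHOLDERS = {"", "(000)000-0000", "99/99/99"}


def completeness_pct(records, field):
    """Percent of records whose field holds a real (non-null, non-placeholder) value."""
    if not records:
        return 0.0
    filled = sum(
        1
        for r in records
        if r.get(field) is not None and str(r[field]).strip() not in PLACEHOLDERS
    )
    return 100.0 * filled / len(records)


def uniqueness_pct(records, fields):
    """Percent of records whose combined key (the given fields) occurs exactly once."""
    if not records:
        return 0.0
    keys = [tuple(r.get(f) for f in fields) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return 100.0 * unique / len(records)


# Hypothetical sample data.
customers = [
    {"name": "Ann Lee", "address": "12 Oak St", "phone": "(512)555-0101"},
    {"name": "Ann Lee", "address": "12 Oak St", "phone": "(000)000-0000"},
    {"name": "Bob Ray", "address": "9 Elm Rd", "phone": None},
    {"name": "Cy Dow", "address": "3 Fir Ln", "phone": "(512)555-0199"},
]

unique_pct = uniqueness_pct(customers, ["name", "address"])
phone_pct = completeness_pct(customers, "phone")
print(f"Unique by name and address: {unique_pct:.1f}%")  # 50.0%
print(f"Phone completeness: {phone_pct:.1f}%")  # 50.0%
```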
Step 3 – Analysis
Analyze the assessment results obtained in step 2 on multiple fronts. One area to analyze is the gap between data quality business goals and current data. Another area to analyze is the root cause for inferior data quality.
Examples
Why is the percentage of inaccurate customer addresses higher than the business goal allows? Is weak data validation in the order entry application a root cause?
What is the root cause for inconsistent customer names between order entry system and financial system?
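A useful starting point for the second question is a list of exactly which records disagree. The sketch below compares customer names between two systems by a shared customer ID. The dictionaries, the IDs, and the case-insensitive comparison are assumptions for illustration; real systems would be queried rather than hard-coded.

```python
def name_mismatches(order_entry, finance):
    """Return {customer_id: (order_entry_name, finance_name)} for ids present in
    both systems whose names differ after trimming and lowercasing."""
    mismatches = {}
    for cid in order_entry.keys() & finance.keys():
        a, b = order_entry[cid], finance[cid]
        if a.strip().lower() != b.strip().lower():
            mismatches[cid] = (a, b)
    return mismatches


# Hypothetical extracts keyed by customer id.
order_entry_names = {101: "Acme Corp", 102: "Globex Inc", 103: "Initech"}
finance_names = {101: "ACME Corp", 102: "Globex Incorporated", 104: "Umbrella"}

mismatches = name_mismatches(order_entry_names, finance_names)
shared = order_entry_names.keys() & finance_names.keys()
consistency_pct = 100.0 * (len(shared) - len(mismatches)) / len(shared)

print("Mismatched names:", mismatches)
print(f"Consistency across shared customers: {consistency_pct:.1f}%")  # 50.0%
```

Records that exist in only one system (103 and 104 here) are ignored by this check; they point to a separate integration gap worth analyzing on its own.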
Step 4 – Improvement
Design and develop improvement plans based on the prior analysis. The plans should account for the time frames, resources, and costs involved.
Examples
All applications that modify addresses must validate them against the selected address reference database; customer names can only be modified via the order entry application; the intended changes to systems will take six months to implement and require XYZ resources and money.
Step 5 – Implementation
Implement the solutions determined in step 4. Cover both the technical changes and any related business process changes. Implement a comprehensive change management plan to ensure that all stakeholders are appropriately trained.
Step 6 – Control
Verify at periodic intervals that the data is consistent with the business goals and data rules specified in step 1. Communicate data quality metrics and current status to all stakeholders regularly so that data quality discipline is maintained across the organization.
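The periodic verification can be as simple as comparing the latest measured metrics with the targets set in step 1 and reporting a status for each. The sketch below assumes that metrics are percentages where higher is better; the metric names and target values are hypothetical.

```python
def control_report(metrics, targets):
    """Return (metric, measured, target, status) rows, sorted by metric name."""
    rows = []
    for name in sorted(targets):
        measured = metrics.get(name)
        if measured is None:
            status = "NOT MEASURED"
        elif measured >= targets[name]:
            status = "OK"
        else:
            status = "BELOW TARGET"
        rows.append((name, measured, targets[name], status))
    return rows


# Hypothetical targets from step 1 and measurements from the latest assessment run.
targets = {
    "address_accuracy_pct": 98.0,
    "name_uniqueness_pct": 99.0,
    "phone_completeness_pct": 95.0,
}
metrics = {"address_accuracy_pct": 91.5, "name_uniqueness_pct": 99.2}

report = control_report(metrics, targets)
for name, measured, target, status in report:
    print(f"{name}: measured={measured} target={target} status={status}")
```

Flagging a metric as not measured, rather than silently skipping it, keeps gaps in the monitoring itself visible to stakeholders.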
Data quality is not a one-time project but a continuous process, and it requires the entire organization to be data-driven and data-focused. With appropriate focus from the top, data quality management can pay rich dividends for organizations.
Article sources
Digital Transformation Pro (www.digitaltransformationpro.com)