Using Metrics To Assert A Business Case For Data Quality
By Ed Wrazen

Cost of Poor Data Quality Impacts Every Organisation

Money and resources wasted, sales missed, extra costs incurred. Recent research by industry analyst firm Gartner shows the shocking price that companies are paying because of poor quality data. And it adds up to a staggering $8.2 million annually.

That’s the average loss that the 140 companies surveyed put on it. But 22 percent of them thought it was closer to $20 million, and 4 percent even put the figure as high as $100 million. [1]

It’s a sobering thought. But do business managers really appreciate the scale of the problem? Not according to a recent report from The Data Warehousing Institute (TDWI). More than 80 percent of the business managers surveyed by TDWI believed that their business data was just fine. Yet half of their own technical people took a very different view from their executives. [2]

The truth is that few business managers appreciate the extent to which data quality issues impact their company, since typically, no quantified measurements are made. This means proposals for data quality improvement projects fall on deaf ears. To have any chance of budget approval, data quality project proposals need an assertive business case. They need the backing of metrics that communicate evidence of a real problem that business managers readily understand; a problem that poses a risk to the business and to the key performance indicators (KPIs) by which its success is measured.

Data Quality Metrics – What Are They?

By developing a programme of data quality metrics and measuring and reporting regularly, organisations can build increased awareness of what data quality means for the business. 

Metrics can help demonstrate what risks/issues might be presented by any decline in data quality levels, and what opportunities might be gained by investing in improvement. 

Metrics also support objective judgement and reduce the influence of assumptions, politics, emotions and vested interests.

But there’s no point in measuring and reporting on all of an organisation’s data, or every aspect of that data. Be selective. A metric showing that nine percent of customer records in a marketing database lack a middle name is likely of little consequence to KPIs. But if five percent are missing a postal code, that could be of some importance: if one million mailings are sent annually, 50,000 would be returned. And if the metric referred to a billing database, it could make a strong business case indeed for a data quality project.

Invoices worth millions of dollars might not be reaching customers, delaying or even threatening receipt of revenue.
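The arithmetic behind the postal code example can be sketched in a few lines of Python. The function name `expected_returns` is hypothetical; only the five percent rate and the one million mailings come from the example above, and the sketch assumes every item sent to a record without a postal code is returned.

```python
def expected_returns(mailings_per_year: int, missing_percent: int) -> int:
    """Estimate returned mail, assuming each record lacking a postal code
    produces one returned item (an assumption made for illustration)."""
    return mailings_per_year * missing_percent // 100


returned = expected_returns(1_000_000, 5)
print(returned)  # 50000
```

Multiplying the result by a known cost per mail piece, or by an average invoice value in the billing case, turns the metric into the kind of figure a business manager can weigh against a KPI.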

Where to measure: Key Processes

For most organisations, business KPIs and the executive decisions aligned with them will most likely relate to cost, revenue, profitability, procurement, logistics, products, customers, suppliers and other important assets. 

Identifying the processes supporting these KPIs, the data required for them to operate effectively and the quality of that data enables organisations to determine the impact of poor quality in tangible terms.

They are then much better placed to gain business understanding and support for building the business case for data quality.

For example, you might establish that:

  • Revenue is typically 40 percent repeat sales, and half of repeat sales come through contact made by customer service representatives. Good contact data is vital to them. They cannot make follow-ups where customer contact information is missing or wrong.
  • With direct print and mailing costs being significant and increasing, and the CEO keen to show ‘green’ achievements in the annual report, the current 100,000 plus pieces of mail returned per annum must be cut down. Customer address data accuracy is the key, and would appear to be an issue.
  • In the last six months, eight percent of customers who purchased online had to wait longer for delivery than expected, despite the products being in stock. Product codes in the order system are inconsistent with the stock system, requiring manual inspection and resolution.
  • Senior management is considering rationalising product lines. But the sales ledger reports show discrepancies with the marketing department’s business intelligence system. Management cannot trust the cost/sales figures, delaying decisions and incurring unnecessary costs.

What To Measure: Data Quality Dimensions

Having identified which data to produce metrics on, the next step is to define which of the many aspects of its quality to measure. These dimensions might include:

  • Structure: Is the data in the right format for it to be usable?
  • Conformity: Does it conform to critical rules?
  • Accuracy: Does it reflect the real world?
  • Completeness: Is business-required information present?
  • Timeliness: Is it sufficiently current?
  • Uniqueness: Are duplicate records creating confusion?
  • Consistency: Is the data the same, regardless of where it resides?
  • Relevance: Is it useful to the business in its pursuit of objectives?
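As a rough illustration, two of these dimensions, completeness and uniqueness, might be scored as follows. This is a minimal sketch rather than a prescribed method: the records, field names and function names are all invented for the example, and each function assumes a non-empty list of records.

```python
from collections import Counter

# Hypothetical customer records, invented purely for illustration.
customers = [
    {"id": 1, "name": "A. Smith", "postal_code": "SW1A 1AA"},
    {"id": 2, "name": "B. Jones", "postal_code": ""},
    {"id": 3, "name": "A. Smith", "postal_code": "SW1A 1AA"},
    {"id": 4, "name": "C. Patel", "postal_code": None},
]


def completeness(records, field):
    """Share of records where the field is present and not blank."""
    filled = sum(1 for r in records if (r.get(field) or "").strip())
    return filled / len(records)


def uniqueness(records, key_fields):
    """Share of records whose key combination occurs exactly once."""
    keys = [tuple(r.get(f) for f in key_fields) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(records)


print(completeness(customers, "postal_code"))          # 0.5
print(uniqueness(customers, ["name", "postal_code"]))  # 0.5
```

Which fields make up a duplicate key, and what counts as blank, are business decisions rather than technical ones, which is why the definitions need an owner.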

Defining which data quality dimensions are important, prioritising them and producing data quality metrics that are meaningful for business owners is typically the job of one or more ‘data stewards’: individuals who understand the key business processes, the role of data in those processes and the intricacies of what makes ‘good data.’

How To Measure: Using Data Quality Rules

Having determined the data to be measured and by which of its dimensions, it is then possible to build a set of data quality rules against which to profile the data and compute compliance metrics. 

For example, if repeat sales from customer service representatives (CSRs) are key, and customer contact information must be present and accurate for the CSRs to make a follow-up/sales call, then perhaps ‘completeness’ of fields at original purchase is important, together with ‘accuracy’ and ‘structure’ of telephone numbers. 

If customers are waiting too long for the delivery of goods that would appear to have been in stock at time of order, then perhaps there is a disparity between product codes in the order processing system and the warehouse stock and dispatch system, so data ‘consistency’ should be measured.
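The two scenarios above can be expressed as simple rules with a compliance score for each. The sketch below is illustrative only: the telephone format, the sample values and the function names are assumptions, not rules taken from any particular system or product.

```python
import re

# Hypothetical structure rule: an optional "+" followed by 7 to 15 digits,
# checked after spaces and hyphens have been removed.
PHONE_PATTERN = re.compile(r"\+?\d{7,15}")


def phone_is_well_formed(phone):
    """Structure rule for a telephone number (assumed format, see above)."""
    if not phone:
        return False
    return PHONE_PATTERN.fullmatch(re.sub(r"[ \-]", "", phone)) is not None


def compliance(values, rule):
    """Fraction of values that pass a rule; expects a non-empty list."""
    return sum(1 for v in values if rule(v)) / len(values)


# Invented sample data for the CSR follow-up scenario.
phones = ["+44 20 7946 0000", "020-7946-0001", "n/a", ""]
print(compliance(phones, phone_is_well_formed))  # 0.5

# Invented sample data for the order/stock consistency scenario.
order_codes = ["AB-100", "AB-101", "ab100", "ZX-900"]
stock_codes = {"AB-100", "AB-101", "ZX-900"}
print(compliance(order_codes, lambda code: code in stock_codes))  # 0.75
```

Keeping each rule as a small, named check makes it easier to report compliance per rule and to explain a failing score to the business owner concerned.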

In producing metrics, it’s probably best to be very focused at first, concentrating on just a few areas where data appears critical to business performance. 

It’s also better initially to produce just a small number of metrics on important characteristics that have real meaning to business managers in their roles and responsibilities. Indicating and proving that the business really cannot be totally confident in the data it relies on for certain important processes, decisions or compliance reports should quickly justify further investigation.

Success in one area can then be used as a reference to help communicate the value that could be won from metrics in other areas of the organisation.

Making Data Quality Metrics Relevant

Once the use of data quality metrics becomes accepted and management backing is secured, then the practice could extend to cover wider data sources and report in more detail. It’s then important to define who requires what level of reporting.

Management may desire a high-level overview, so a data quality dashboard will be helpful. The dashboard should be able to group and aggregate scores into meaningful metrics associating data quality with key business processes or functions. It should also provide drill-down to multiple levels of detail and permit custom reporting, allowing business owners, data stewards and data quality analysts to visualise data quality metrics and trends appropriate to their needs.
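The grouping and aggregation behind such a dashboard might look something like this. It is a sketch under stated assumptions: the process names, rule names and scores are invented, and a plain average is used where a real dashboard might weight rules by business importance.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rule results: (business process, rule, compliance in percent).
rule_scores = [
    ("Customer follow-up", "phone completeness", 90),
    ("Customer follow-up", "phone structure", 80),
    ("Order fulfilment", "product code consistency", 75),
]


def dashboard_summary(scores):
    """Average the rule-level scores for each business process."""
    by_process = defaultdict(list)
    for process, _rule, score in scores:
        by_process[process].append(score)
    return {process: mean(values) for process, values in by_process.items()}


def drill_down(scores, process):
    """Return the individual rule scores behind one process-level figure."""
    return {rule: score for proc, rule, score in scores if proc == process}


summary = dashboard_summary(rule_scores)
print(summary["Customer follow-up"])  # 85
print(drill_down(rule_scores, "Order fulfilment"))
```

The summary serves the management overview, while the drill-down gives data stewards and analysts the rule-level detail behind each figure.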

A facility to monitor and measure data quality over time is fundamental too. Only then is it possible to prove that investments in data quality are making a difference.
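One simple way to show change over time is to compare each measurement with the one before it. The periods and scores below are invented for illustration, and the function name `trend` is hypothetical.

```python
# Hypothetical compliance scores (percent) for one rule over three periods.
history = [("Period 1", 82.0), ("Period 2", 84.5), ("Period 3", 88.0)]


def trend(scores):
    """Change in percentage points between consecutive measurements."""
    return [
        (later_label, round(later - earlier, 1))
        for (_, earlier), (later_label, later) in zip(scores, scores[1:])
    ]


print(trend(history))  # [('Period 2', 2.5), ('Period 3', 3.5)]
```

A steady run of positive changes after an improvement project is the sort of evidence that helps justify the next round of investment.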

Data quality metrics, when aligned to business KPIs, have the power to increase business user awareness, understanding and support for data quality investments. Using data quality metrics to identify the real issues that impact cost, revenue, profit or other important business metrics will most certainly help drive the business case for data quality improvement.

Ed Wrazen

Ed Wrazen is Vice President Product Marketing with Harte-Hanks Trillium Software, a leading provider of total data quality solutions. Working with customers, partners and industry analysts, Ed is responsible for product planning and release co-ordination. 

Ed has over 25 years’ experience working on IT systems, having started his career as a computer programmer on retail banking and travel reservation systems. He has been heavily involved in database and data management technologies as a product developer, consultant and lecturer, specialising in data architecture, performance design, data integration and data quality. Ed is a regular speaker at industry events worldwide and an author on topics relating to data management.
