When we talk about data quality dimensions, we often speak in terms of metrics. We may say that a particular attribute is 88.2% complete or that a particular key has 3% duplication.
In a customer database of 3 million records, we may decide that 150 incorrect records represents good quality, so the data is passed to the business as “fit for purpose”.
In a consumer context, these issues, although statistically small, don’t necessarily add up to small impacts. The reason is that whilst we, as data quality professionals, may be comfortable talking about individual data quality metrics, what really matters to consumers of our data is trust.
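To make those metrics concrete, here is a minimal sketch of how figures like these might be calculated. The function names and the sample values are hypothetical and purely illustrative; only the 150-in-3-million figure comes from the example above, and real profiling tools define these measures in their own ways.

```python
from collections import Counter


def completeness(values):
    """Share of values that are populated (not None and not blank)."""
    populated = sum(1 for v in values if v is not None and str(v).strip() != "")
    return populated / len(values)


def duplication_rate(keys):
    """Share of records whose key repeats a key already seen."""
    counts = Counter(keys)
    redundant = sum(n - 1 for n in counts.values())
    return redundant / len(keys)


def error_rate(incorrect, total):
    """Share of records known to be incorrect."""
    return incorrect / total


# Hypothetical sample data, invented for illustration only.
emails = ["a@example.com", None, "c@example.com", "", "e@example.com"]
customer_ids = [101, 102, 102, 103, 104]

print(f"completeness: {completeness(emails):.1%}")           # 3 of 5 populated
print(f"duplication:  {duplication_rate(customer_ids):.1%}")  # 1 of 5 redundant
print(f"error rate:   {error_rate(150, 3_000_000):.3%}")      # 150 in 3 million
```

On the customer database figures, the error rate works out at 0.005%, which is exactly the kind of number that looks reassuring on a dashboard.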
I like to explain the notion of trust as the pointy end of a pyramid. Trust is what you’re left with at the top of the pyramid once all the other dimensions have been accounted for.
- Incomplete data? It won’t be trusted. What am I missing?
- Duplicated data? It won’t be trusted. Which one should I use?
- Poorly formatted data? It won’t be trusted. Have I understood this correctly?
I was recently searching for details of a company that we’re about to feature in an interview here on Data Quality Pro. A well-known online provider of company research came up in my Google results, so I took a look at what they could tell me about the company in question.
At first I was impressed. It even listed sales data for the company and a list of executives. Then I realised that some of the people listed were in my own network on LinkedIn so I checked them out.
What I found was quite alarming.
The person listed as the CEO had left the company two years earlier, yet the site still displayed a current email address for them.
So I began to look at other companies that I knew closer to home. I began to find a catalogue of data quality issues such as:
- Duplicated company records that left me with no idea of which one was accurate.
- Incomplete executive titles that gave me no clue as to their real role.
- Project managers listed in the executive section.
I soon had a healthy list of data quality issues, all of which led to one clear outcome.
I did not trust this information provider.
From their perspective, the data quality issues could have been measured as minuscule. If they held details of 250,000,000 companies worldwide and I happened to stumble across a few incorrect values, then some would say their overall data quality was well beyond Six Sigma levels and definitely above the industry average.
As a consumer, of course, these statistics mean nothing to me. My trust is gone, and that is often terminal.
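The arithmetic behind that provider-side view can be sketched as follows. This is a rough illustration only: it uses the commonly quoted Six Sigma benchmark of 3.4 defects per million opportunities and assumes, for simplicity, that each record counts as a single opportunity for a defect. The function names are hypothetical.

```python
# Commonly quoted Six Sigma benchmark: 3.4 defects per million opportunities.
SIX_SIGMA_DPMO = 3.4


def defects_per_million(defects, records):
    """Defect rate scaled to one million records (one opportunity per record, by assumption)."""
    return defects * 1_000_000 / records


def max_defects_at_six_sigma(records):
    """Number of defective records a data set could hold and still meet the benchmark."""
    return records * SIX_SIGMA_DPMO / 1_000_000


# A provider holding 250,000,000 company records.
print(f"{max_defects_at_six_sigma(250_000_000):,.0f} defective records allowed")

# The earlier customer database: 150 incorrect records in 3 million.
print(f"{defects_per_million(150, 3_000_000):.1f} defects per million")
```

Under those assumptions, a provider of that size could hold several hundred bad records and still report Six Sigma quality, which is precisely why the metric says so little about the experience of the one consumer who finds them.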
I won’t pass this off as a statistical anomaly. I simply won’t use them and I’m likely to spread the word.
Data quality trust is binary. It’s on or off. There are no shades of grey. You either trust data or you don’t. There is a tipping point and it’s extremely difficult to come back from the abyss of poor trust because it’s an emotional decision that’s been made.
Within organisations, poor trust is why we see islands of hidden data stores: accounting don’t trust the new P&L system, so they keep their own. It’s why reports take weeks to produce: executives don’t trust the sources and want endless checks.
Externally, amongst consumers, trust has a far greater impact.
I spoke with one very knowledgeable data quality professional in the retail sector who summed it up for me yesterday.
“When consumers can’t see an image of the product they wish to buy online, even if it’s an everyday item, trust is eliminated and conversion rates plummet”
Equally important, trust in data is viral. If we don’t trust a piece of information, it’s not a 1-to-1 transaction. We tell others and they share the bad news. In the case of consumer data, a single piece of information may also be viewed by thousands of people, so it’s not one instance of defective data; the defect is magnified with every view.
Where your data is consumer-facing and has the capacity to be seen by thousands of people, think beyond simple stats and metrics. Trust is everything, and losing it is the quickest route to customer churn and bottom-line loss.