How to Create a Data Quality Firewall and Data Quality SLA
Data Quality Essentials: How to Create a Data Quality Firewall and Data Quality SLA
How do you prevent poor levels of supplier data quality impacting your organisation?
The answer is through the creation of a robust Data Quality Firewall and ongoing Data Quality SLA process. This post provides a blueprint for building your own data quality firewall and ensuring your data quality SLA delivers certified, high-quality data throughout your organisation.
The Data Quality Essentials series identifies widely accepted data quality best-practices that your organisation should be adopting in an effort to improve and maintain data quality.
Published: 18th November 2011
Editorial Categories: [TB.3] Data Quality Rules and Requirements, [TB.5] Data Quality Improvement and Control
First, a short story.
Back in 2005 I was consulting at a small utilities company, running a data quality assessment project prior to a data migration. I got everyone in a room and started a simple information chain workshop, sticking a load of Post-its on the whiteboard to map the people, process and data chains that mattered to this team.
We talked about the issues in the data. People were generally happy, but one person seemed a little flat, so I asked what they did. It was their role, every month, to take a vital piece of data from a major supplier (also their primary partner) into the business.
Each month they laboriously sat at a computer fixing defects in the data by hand. The same types of defect were fixed every single month.
Now, it’s reassuring that they had some kind of firewall, but there were obvious flaws:
- The company refused to push back on the supplier and demand a data quality Service Level Agreement (SLA). Sometimes this is tough to do, but in this case they really hadn’t exhausted the options for push-back
- This person had huge amounts of undocumented knowledge in his head and was demoralised: a high employee churn risk, and obviously a major challenge to the company if he quit
- Doing stuff over and over again in the name of data quality is just bad practice
- The lead times for clean-up meant major hassles downstream
So what were they doing right? Well, you could say that…
- Someone was assigned to this “firewall”. Okay, it was a screwed-up process, but at least there was ownership
- They were diligent; nothing bad really got through, according to downstream users
- They were trapping the data at source; nothing sneaked past this guy until it was ready
The fact is that many organisations get the data supplier relationship wrong by:
- Applying blanket trust and assuming the data will be sound
- Applying manual hacks and tweaks downstream
- Applying a dedicated person or team (costly) at the point of entry
It’s incredible how many companies I meet that accept data from outside the corporate boundary and assume it is correct. Don’t make this assumption. Take a proactive stance, using these steps as an outline guide.
Simple Roadmap for a Data Quality Firewall and Data Quality SLA
- Map all the information sources flowing into your organisation (we’re talking a Post-it session here, not a £20,000 data modelling tool; and bring donuts, that helps)
- Where information comes from an external source, identify whether the following exist:
  - Formal data specifications outlining things like field formats, frequency of delivery, expected values and permitted ranges
  - Escalation procedures or standard operating procedures for when defects are found
- No documentation or process exists? Then create one, listing the elements above but also adding:
  - A simple information chain diagram explaining where this information comes in and what it feeds
  - Names and contact details for everyone involved with this data, on both the supplier and consumer side
- Present this to the relevant stakeholder, explaining your desire to improve this inbound information source with a view to:
  - Preventing embarrassment to the stakeholder when bad data impacts other business units (I would probably open with this)
  - Realising the cost savings of building a process that eliminates endless amounts of scrap and rework activity (costs based on past incidents are ideal here)
  - Increasing the overall perception of the stakeholder’s team as a highly professional, innovative resource (they will like this)
  - Reducing the lead times and improving various other metrics the stakeholder will be held accountable for
- Get them to sign off that they are ultimately responsible for this process, and that you will provide them with regular reports on how the process is working and the value that it (and they) are bringing to the business
- Document all known issues with this inbound information source; surveys and casual conversations with data workers, DBAs, app designers and anyone else who touches this data downstream are the way forward here
- Profile the data using one of the many free tools now available
- Rigorously document the data quality rules you feel the data should be managed against, e.g.
  - Timeliness – the data must arrive between 9am and 10am every weekday morning
  - Completeness – the customer name and address fields must be populated, there must be a valid order number etc.
  - Formatting – the order number must be in the format NN-LLLL-NN, with no exceptions
  - Overloading – each record must have only one entry in the part code field; there must not be multiple part codes in the same field
  - etc.
- Convert these data quality rules into a live monitoring process, e.g.
  - Use one of the many free data quality and data integration tools available
  - Use standard scripts in SQL or Unix, or whatever your data processing platform uses
- Implement the process and start to track the issues discovered
- Run this process for at least a month, discovering issues and fixing them manually
- Create a comprehensive file of all the issues found and the impacts they’re having in your organisation
- Approach the data supplier, outlining your findings, the innovations you have made and the issues you are picking up
- If no SLA or formal definitions and agreements around data quality exist, work with your stakeholder and the supplier to get something in place
- Agree to share your technology and approach with the supplier so that they can improve their data quality (chances are it’s being supplied to other companies too)
- Work together to iteratively create the most robust data quality firewall and SLA process possible
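As a rough illustration of the "standard scripts" route in the roadmap, the four example rules (timeliness, completeness, formatting, overloading) might be expressed in Python along the following lines. This is only a sketch under stated assumptions: the field names (customer_name, address, order_number, part_code), the reading of NN-LLLL-NN as two digits, four letters, two digits, and the list of separators that signal an overloaded part code are all hypothetical and would need to match your own data specification.

```python
import re
from datetime import datetime, time

# Assumption: "NN-LLLL-NN" means two digits, four letters, two digits.
ORDER_NUMBER_PATTERN = re.compile(r"\d{2}-[A-Za-z]{4}-\d{2}")

# Assumption: several part codes crammed into one field would be
# separated by a comma, semicolon, slash, pipe or whitespace.
PART_CODE_SEPARATORS = re.compile(r"[,;/|\s]")

# Hypothetical names for the fields that must always be populated.
REQUIRED_FIELDS = ("customer_name", "address", "order_number")


def check_timeliness(arrived_at):
    """Timeliness: the delivery must arrive 09:00-10:00 on a weekday."""
    issues = []
    if arrived_at.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        issues.append("timeliness: delivery arrived at a weekend")
    elif not time(9, 0) <= arrived_at.time() <= time(10, 0):
        issues.append("timeliness: delivery arrived outside 09:00-10:00")
    return issues


def check_record(record):
    """Apply the completeness, formatting and overloading rules to one record.

    `record` is a dict of field name to string value; the return value is a
    list of issue descriptions (an empty list means the record passed).
    """
    issues = []

    # Completeness: required fields must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not (record.get(field) or "").strip():
            issues.append(f"completeness: {field} is empty")

    # Formatting: the whole order number must match the agreed pattern.
    order_number = (record.get("order_number") or "").strip()
    if order_number and not ORDER_NUMBER_PATTERN.fullmatch(order_number):
        issues.append("formatting: order_number is not NN-LLLL-NN")

    # Overloading: a separator inside the part code suggests several values.
    part_code = (record.get("part_code") or "").strip()
    if PART_CODE_SEPARATORS.search(part_code):
        issues.append("overloading: part_code holds more than one value")

    return issues
```

Calling check_timeliness once per delivery and check_record once per row, and appending the results to a log, gives you the raw material for the issue file described in the later steps.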
Okay, so the details may vary depending on your particular situation, but I’ve used variations on this in the past and it works. In several cases the data supplier had no idea their data was defective because no-one had raised it. In my story above, no-one had pushed back on the supplier, so the manual fire-fighting had continued for many months.
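When you do raise issues with a supplier, a simple per-rule summary of the month's issue file tends to land better than a pile of raw defects. The sketch below is one hypothetical way to produce it in Python; the shape of the issue log (a list of dicts with a "rule" key) and the function name summarise_issues are assumptions made for illustration, not part of any particular tool.

```python
from collections import Counter


def summarise_issues(issue_log, records_checked):
    """Count issues per rule and express each as a percentage of records checked.

    `issue_log` is assumed to be a list of dicts, each with a "rule" key
    naming the data quality rule that was broken.
    """
    if records_checked <= 0:
        raise ValueError("records_checked must be a positive number")

    counts = Counter(entry["rule"] for entry in issue_log)
    summary = {}
    # most_common() lists the worst-offending rules first.
    for rule, count in counts.most_common():
        summary[rule] = {
            "issues": count,
            "percent_of_records": round(100 * count / records_checked, 1),
        }
    return summary
```

The percentages give you a baseline to put into the SLA discussion, and rerunning the summary each month shows whether the supplier's fixes are working.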
If you have implemented something similar then I would love to hear about it and perhaps feature your story on the site.
Useful Resources Related to this Feature:
DQ and Business Rules Explained with Ronald G. Ross (04/11/2011)
What is a business rule? Why are they so important to data quality management? How can we use business rules to focus our data quality efforts?
In this interview, Ronald G. Ross, the "father of business rules", provides the answers.
Ronald serves as Executive Editor of www.BRCommunity.com and its flagship publication, Business Rules Journal, and as Co-Chair of the Business Rule Forum Conference.
Ronald is also author of "Business Rule Concepts: Getting to the Point of Knowledge", which has recently been updated to a 3rd edition and is widely regarded as the definitive handbook on this topic.
(Note: There is a summary of key points at the end of this interview)
Data Quality Rules Process for Data Migration (07/10/2011)
Data migration projects have an extremely high failure rate and one of the principal causes is a lack of effective data quality management across all aspects of the project.
To help members reverse the trend of project failure we have enlisted the support of John Morris, author of Practical Data Migration and creator of the PDMv2 Data Migration Methodology, the only industry standard methodology for data migration currently available. John also provides great advice via his BCS Blog.
In this interview, another installment in the series, John Morris provides detailed insight into the way data quality rules are managed within his unique Practical Data Migration methodology.
Data Quality-Centric Data Migration, John Morris-2 (07/10/2011)
It's common knowledge that data migration projects can be severely impacted by poor data quality but how do we structure our projects so that data quality is at the very heart of our approach?
In part 2 of this 2 part series, data migration expert practitioner, coach and author, John Morris, outlines the various roles and responsibilities that are required in a data quality-centric data migration.
John Morris is the recognised leader in data migration best-practice having authored "Practical Data Migration" (a BCS publication) and created the PDM V2 data migration methodology / certification scheme. John is also event organiser of Data Migration Matters, an annual UK event dedicated to data migration and is the managing director of Iergo, the specialist data migration coaching and consulting company. John regularly blogs on all things data migration at: Johny's Data Migration Blog.
Data Quality-Centric Data Migration, John Morris-1 (07/10/2011)
It's common knowledge that data migration projects can be severely impacted by poor data quality but how do we structure our projects so that data quality is at the very heart of our approach?
In part 1 of this 2 part series, data migration expert practitioner, coach and author, John Morris, presents a strategy for managing data quality as part of your data migration project.
John Morris is the recognised leader in data migration best-practice having authored "Practical Data Migration" (a BCS publication) and created the PDM V2 data migration methodology / certification scheme. John is also event organiser of Data Migration Matters, an annual UK event dedicated to data migration and is the managing director of Iergo, the specialist data migration coaching and consulting company. John regularly blogs on all things data migration at: Johny's Data Migration Blog.
Data Quality Rules: General Attribute Dependencies (07/10/2011)
In this guest post, regular expert panelist and featured contributor Arkady Maydanchik provides yet another detailed extract from his excellent book: "Data Quality Assessment".
The 6th in a series, this post defines the various attribute dependency data quality rules and provides a strategy for discovering them.
Arkady is also the co-founder and Managing Director at eLearningCurve LLC, a provider of on-demand technical training via the Internet. Various curricula, including data quality, are available, see here for more details.
Data Quality Rules: Rules for Historical Data (07/10/2011)
In this tutorial, Arkady Maydanchik provides an introduction to Data Quality Rules for Historical Data. This is part 3 in a series exploring the different categories of data quality rules as covered in the "Data Quality Assessment" book.
For other tutorials in the series refer to the drop down menu above.
This tutorial first appeared in the Information and Data Quality Newsletter of IAIDQ, September 2008.
Data Quality Rules: Attribute Domain Constraints (07/10/2011)
by Arkady Maydanchik
In this tutorial, Arkady Maydanchik provides an overview of the 5 categories of data quality rules as covered in his Data Quality Assessment book. He discusses in greater detail the role of Attribute Domain constraints. This article first appeared in the Information and Data Quality Newsletter of IAIDQ, July 2007.
Data Quality Rules: Integrity Constraints (07/10/2011)
In this tutorial, Arkady Maydanchik provides an introduction to Relational Integrity Constraints. This is part 2 in a series exploring the different categories of data quality rules as covered in his Data Quality Assessment book.
This article first appeared in the Information and Data Quality Newsletter of IAIDQ, September 2007.
Data Quality Rules: State-Dependent Objects (07/10/2011)
In this article, Arkady Maydanchik of Data Quality Group provides another detailed excerpt from his book "Data Quality Assessment".
This tutorial focuses on how to create a state dependent data quality rules process for your data quality management programme. This is an essential and often overlooked aspect of data quality improvement initiatives.
Note: There are other tutorials in this series on data quality rules, see the useful links section in this post for more details.
Identifying Duplicate Customers by Jim Harris-1 (18/11/2011)
How do you create a strategy for identifying duplicate customers? What techniques must you draw on?
In this feature by expert panelist and creator of the extremely popular OCDQ Blog, Jim Harris, we learn more about the challenges of managing duplicate records and why a robust methodology, not technology alone, should form the solution.
Identifying Duplicate Customers by Jim Harris-2 (18/11/2011)
How do you create a strategy for identifying duplicate customers? What techniques must you draw on?
In this series by expert panelist and creator of the extremely popular OCDQ Blog, Jim Harris, we learn more about the challenges of managing duplicate records and why a robust methodology, not technology alone, should form the solution.
In this article Jim provides some fictional examples to highlight the need for a detailed, interrogative data analysis approach for duplicate customer resolution.
Identifying Duplicate Customers by Jim Harris-3 (18/11/2011)
How do you create a strategy for identifying duplicate customers? What techniques must you draw on?
In this series by expert panelist and creator of the extremely popular OCDQ Blog, Jim Harris, we learn more about the challenges of managing duplicate records and why a robust methodology, not technology alone, should form the solution.
In this article Jim provides some fictional examples to highlight the need for a detailed, interrogative data analysis approach for duplicate customer resolution.
Data Matching Better Practice Guidelines, Wayne Co (18/11/2011)
This report features a best practice document issued to us by Data Quality Pro member Wayne Colless with the permission for publication granted by the Australian Attorney-General’s Department.
The full title of the guide is:
'Improving the Integrity of Identity Data - Data Matching Better Practice Guidelines, 2009'
Wayne is an adviser on identity security matters in the Identity Security Branch, National Security Resilience Policy Division with the Attorney-General’s Department in Canberra, Australia.
The document is an advisory guide for the Australian Government but has considerable relevance to any public or private organization looking to improve their data matching approach.
The Freakish World of Root Cause Analysis (18/11/2011)
In this post, Daragh O Brien of Castlebridge Associates provides a personal account of how conventional wisdom can interfere with understanding the root cause of an information quality problem.
Daragh closes out the post with some recommendations for ensuring stakeholders adopt a more pragmatic and sensible approach to root-cause analysis.
Note: This article appears courtesy of the IAIDQ and their excellent publications portal.
Data Profiling Tutorial - 1 (with Free Software) (18/11/2011)
The aim of this tutorial series is to set a data quality challenge that is typical in modern business. By giving you a detailed workbook and all the technology required, you will be able to deliver a complete data profiling assessment with management recommendations.
The Data Quality Challenge
WonderMart are a (fictional) retail chain that sell household grocery products.
The Head of Beverage Marketing for WonderMart receives a regular sales performance report of the Ground Coffee business division straight to her reporting dashboard.
Historically, she received a basic two-page printed report based on simple metrics.
However, since the introduction of the new business intelligence reporting tool that now takes data from a more detailed data mart she has begun to notice anomalies in the data.
7 Simple Ways to Improve Data Entry Data Quality (18/11/2011)
One of the single biggest causes of data defects in any organisation is poor quality information entered at the start of the information chain via data entry interfaces.
This post provides some simple, practical and cheap techniques to dramatically improve the data quality of human entered information.