ISO 8000 Certification Options with Project Leader Peter Benson
The ISO 8000 Data Quality standard continues to grow in popularity.
To help our readers understand how to get started, Peter Benson of ECCMA was recently interviewed about a range of topics relating to the ISO 8000 data quality standard, of which he is the project leader.
DATA QUALITY PRO: Good afternoon everyone. It’s the top of the hour and another Data Quality Pro Webinar. Today we are delighted to be joined by Peter Benson and we’re going to be discussing ISO 8000, the master data quality standard.
ISO 8000 is one of our most popular search terms on the website and there’s obviously a lot of interest around what ISO 8000 can do.
So, we just wanted to speak to Peter and dig a little bit deeper into some of the various levels of certification and try and understand a little bit more about the benefits to organizations and professionals alike.
Perhaps we should start with a brief overview of your role on the ISO 8000 project.
PETER BENSON: Yes. Thank you. I’m actually the project leader for ISO 8000.
The way ISO works is through a series of technical committees. The technical committee that this comes under is TC 184, which deals with automation systems, and Subcommittee 4, SC 4, deals with data. Within that subcommittee, a project was submitted for new standards for data quality.
It was given the name ISO 8000 and I’m the project leader of the group that develops the standard, and we have an editor, who is Dr. Gerald Radack.
So the way ISO works, my job as the project leader, is, to make sure (inaudible) we keep on track, make sure that the standards actually are developed on time, they go through their voting process, I have to review and sign off on the standards of course.
I’m also the initiator of the standards which (inaudible) happens, you know, actually myself, and the standard was funded originally by the U.S. Defense Department and that’s (inaudible) some communication of, you know, why they’re interested in standards. But my role inside ISO is to make sure that we progress in an orderly fashion with the development and meet the requirements of the user community.
DATA QUALITY PRO: OK, so, we’re gonna try and keep this session to about 30 minutes today and we — we’re gonna have about roughly 20 minutes sort of discussions with Peter and — and then we can have sort of, hopefully about 10 minutes Q & A with anybody on the call, today.
If you want to put questions to us during the call, just, there’s — there’s a questions tab there and a chat tab as well, that you can type your question into either one of those panes and I’ll be watching that throughout the session and we’ll put those questions to Peter at the end of the session.
OK, I — I guess, one of the first things we should start with because obviously there may be people on the call today who are — who are, you know, they may have seen some of the announcements about the webinar but don’t fully understand what ISO 8000 is — is really about from a higher level.
Is it — is it possible to give like a brief overview of a — ISO 8000 in particular and what are the goals for the standard, what are you aiming to achieve with the standard?
PETER BENSON: Yes, yes. You know, standards by their very nature, the purpose of standards is so that people can claim compliance with the standard.
So, when you look at a standard, it has certain compliance clauses and it’s, you know, why would we do that?
The reason is, again, the motivation for the standard was driven by, buyers, companies looking for data, looking for quality data, looking to get better, in this case master data, that are descriptions of organizations, individuals, materials, whatever they are buying and typically what they were getting back, from the data provider, was, badly formatted, there was no metadata, it was, you know, you had to try and figure out and what was going on.
So, the whole purpose of a standard is something where people can claim compliance with it.
So ISO 8000 started — the 100 Series is master data, with the basic requirements to — to allow (inaudible) this master data is quality master data.
What are the characteristics that make this data quality data, as opposed to other data that is not quality data?
And, of course what you’re looking for, is, that people start to claim that, you know, their software, their company, their services are ISO 8000 compliant and that’s very, very, important.
One of the things that came out of ISO 8000, we’ll talk a little bit more about it, is, it reaches to the fundamental issue of portable data.
The committee that we work with in ISO, their whole reason for being, is, to make sure that data is interchangeable between different applications. So data portability is a key to it and of course quality data is by definition, portable data.
DATA QUALITY PRO: Fantastic and I — I guess, just so people can frame it in their minds, can you just walk us through a typical scenario, like — like a commercial scenario, I guess, explaining the various roles or, you know, data producer, data consumer, so people can kind of see the benefits of — of how they would apply, the ISO 8000 standard.
PETER BENSON: Oh, yes. In fact, let me give you some examples which are pretty recent ones but which I think we’ll all relate to. At the end of it, what you’re gonna find when you look at ISO 8000 is that quality is the same issue as within 9001: the definition of quality is “meets requirements.”
If the data you provide meets my requirements, it is quality data. So what do we mean by requirements for data, or requests for data?
Every time you go into a web screen and you type in your username and password, you are providing data in compliance with a data requirement.
So, as you type in your username and password, if it works, obviously it was quality data you put in, because it worked. It met the requirements.
Think of it more widely than that: every form that you fill in is a request for data.
Your ability to comply with that request is dependent on the quality of the query that you were given. You’re trying to fill in your tax form, or one of these forms we see, and you can’t really, you know: “I’m not really sure what they want to know.”
The reason is because the metadata, the label they put, was not really very good quality.
So the — the exchange of quality data allows and defines the fact that, I’m going to ask you for data. My — my request for data is ISO 8000 compliant and therefore your reply to me, is gonna be ISO 8000 compliant.
Now, it’s used today extensively in the master data management, for example, in the ERP Material Masters, Vendor Masters, Customer Masters, how do I evaluate the quality of that data?
Well, I have to measure the data I have, against the requirement for data and ISO 8000 tells us how to do that and of course that kicks off the next one.
Well, if I don’t have the data that meets my requirements, how do I go and get it?
Well, how do I go and get it? Really is, how, do I go and ask for it?
So ISO 8000 sets out the basic principles of, what is quality data? What is not quality data?
It focuses on the issue that it must, of course, you have to deal with syntax, semantic encoding and it deals with, you know, meets requirements or what is a requirement? How do I meet it?
So, ISO 8000 at the high level is designed to make it easier for companies to ask their data providers for data and that’s one of the purposes of standards.
Is, I can simply say to a one of my suppliers, I need you to send me data to allow me to manage what I bought from you and the data must be ISO 8000 compliant. That’s the purpose of the standard.
It also allows my suppliers to declare that their — the data they have sent me is ISO 8000 compliant.
So, the — the purpose of the standard is to differentiate those who understand the concept of master data quality, from those who do not.
So, as a company or as an individual, if I’m looking for a job as a master data manager, the fact that I have an ISO 8000 certification says I actually understand the fundamentals of how to define and measure master data quality.
DATA QUALITY PRO: In the ISO 8000 standard, you talk about master data. Does it only apply to plants and equipment type data or can you use it across other master data types as well?
PETER BENSON: No. That — that’s where it started. In fact, I would say the biggest push at the moment is in the pharmaceutical industries, healthcare but also in finance and other mortgage companies.
Banks have realized that one of the reasons for their recent, let’s not call it a meltdown because they didn’t quite get that far but pretty close, was that the data was of such poor quality. They just didn’t have the right data, and the reason they didn’t have the right data is that they didn’t ask for it correctly.
So they’re going through a whole exercise at the moment. When I say they, I mean quite a large number of banks and, here in the U.S., a lot of mortgage companies, who are looking at how they didn’t ask for the right data: “we don’t even know if (inaudible) we have was correct or accurate; how do we measure that?”
So if you look at ISO 8000, the first part deals with the foundation of quality, which is syntax. There must be a syntax.
If you send me data and there is no syntax, my computer is not gonna be able to read it.
Well, that’s trivial. We all accept that, it seems to work.
Then there must be semantic encoding. If you’re going to send me data, send me a spreadsheet for example, that’s fine, but the labels on the columns and the rows of the spreadsheet must be unambiguous, and I’ll give an example.
A couple of years ago, we were talking about ISO 8000 to Homeland Defense. They said, “Well, you know, in our passports we have the concept of hair color and eye color. Can you make sure it’s in this dictionary, so we have semantic access to the dictionary and it says what the definition of hair color and eye color is?”
I say, “That’s easy. OK, I’ll put it in. What is your definition of hair color and eye color?”
He looked at me and said, “Well, that’s — that’s — that’s silly. You know, color of the hair.”
I say, “actually, not really. You mean biological color or observed hair color?”
He goes, “wow!”
I say, “Well, you know, there is a significant difference between the two as any woman (inaudible) will tell you, right?”
So the — the bottom-line to that one, is, if I look at my passport today, my U.S. passport, hair color and eye color have been removed.
So, that was an application of ISO 8000: getting somebody to look at their data requirements, look at the metadata that we are using, look at the definitions, and realize, “I’m asking the wrong question. The question has no real relevance to what I’m trying to do.”
So, that’s a practical application of ISO 8000. You know, the forms that immigration has here, when you come through customs, for example: what you’re gonna start seeing is those forms saying, in the bottom corner, that they are ISO 8000 compliant.
What does that mean?
That means that every question that’s been asked, goes to a definition and so nice to know they’re silly questions, like, you know, “what is your name?”
Well, that’s a pretty complex question in a lot of countries.
So, at the end of the day, it’s about when you ask for data, how explicit have you been, in terms of defining exactly what data you want.
That’s what ISO 8000 drives.
DATA QUALITY PRO: I understand, and obviously as we increase the portability of data, that becomes more important as well. So you talked about the different types of certification there. Can you just walk us through them? Are there four different types of certification? I’m not sure, I’m just looking at it.
PETER BENSON: We — we actually (inaudible) again, remember in the ISO process just like ISO 9001, there is a standard.
Anybody can take that standard and self-certify. You could take the ISO 8000 standard and say, “I, Dylan Jones, am ISO 8000 compliant.”
That’s right and you can do that.
But a lot of times you rely on methods or other people’s methods for going through the compliance process.
So ECCMA, as an association, we’re not-for-profit but we’ve come up with a process that we use to make a determination whether we believe the individual or organization is ISO 8000 compliant.
So, currently we offer four levels, four certificates. The first of them, the most common one, is the Master Data Quality Manager. That’s for the individual or the organization.
Typically, when we certify somebody, it’s an individual that works for an organization and that certification covers both of them.
So, what do we do?
During the certification process, and we do it in one day, it’s eight hours, it’s not a hard job, we basically go through the principles of the standard and, yeah, there is a test, right?
The test is, can you create a data requirements statement?
Can you specify requirements for data?
And we don’t make it too hard. There is a piece of software that helps you do that.
Having specified your requirements for data, can you formulate that into a request for data?
Again, there is a piece of software to help you do that, so it’s not very difficult.
And then finally, if you received that request for data, could you actually answer it?
And again, those three things are what we look at, in the certification process.
Can you generate a — a — a request for data?
Can you answer a request for data?
And of course, to do that, you have to be able to define data requirements.
So that really is the key to the — to the individual organization as the master data quality manager.
We have a simpler certification, which is actually, literally done in about 3 to 4 hours and that’s a Quality Master Data Provider.
Typically, that’s designed for suppliers, who are going to be simply answering questions.
All we want to know, is, if we send you a request for data, can you answer it?
Now, it’s just like speaking a foreign language. You know, listening and, you know, writing down what they’re saying, is not too hard. So, it’s a lot easier if somebody sends you the questions, you look at the questions and you answer and it’s really very, very, simple, right?
So, if I — if I gave you a passport application that was ISO 8000 compliant, and you filled it in, all the boxes correctly, guess what?
You’d be deemed to be certified as an ISO 8000 Quality Master Data provider.
Now, the other two levels of certification are a little bit different.
One is for software services — software applications and the other one is for data cleansing services.
A little bit more complicated because at that level, we want to make sure that they are able to use Web services to access dictionaries, to be able to look up concepts in an open technical dictionary, that they can create data requirements in XML, they can create queries in XML and they can respond to a query in XML.
So, there’s a little bit more work involved at the software level and also at the service level.
So, four levels, and typically they’re not designed to be complicated and therefore they are not designed to be high-cost.
And I don’t know, Dylan, if you want to touch on, you know, the fact that the process we are using is very, very different from an ISO 9000 process.
DATA QUALITY PRO: Yeah. We — we talked about that before the call, so, — (inaudible) because sort of thing, a lot of people will have experienced the 9000 standards, so can you, yeah, can you explain how — how they — how they differ there?
PETER BENSON: When we designed ISO 8000, a couple things were very important — that jumped out at us.
Number one, we’re talking about data and data in — in the definition we’re using, for these standards, is, processable by a computer.
So, it follows that if I have requirements that are processable by a computer and I have data that’s processable by a computer, I should be able to use a computer to evaluate whether or not the data is compliant, and that’s the huge difference between ISO 8000 and ISO 9001.
ISO 9001 says, “You have a process and you follow your process.”
And, the only way to check that is for a humanoid to turn up on your doorstep and look and see that you have these processes and look and see if you’re following these processes.
Well, you know, that’s an audit and that’s expensive, right?
In this case, I don’t really care how you generated the data. I care that the data meets the requirements.
So, ISO 8000 is only concerned with exchange of the data, right?
Data that comes out of a company. So every time you send me data, I can immediately and automatically verify that it is ISO 8000 compliant. At every transaction I can do that.
So, I don’t need to send in auditors to go and poke and prod you and go to check your systems. I can check, for every single transaction that you send me, that it is or is not ISO 8000 compliant.
That’s pretty much the difference.
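EDITOR’S NOTE: The per-transaction check described above can be illustrated with a minimal Python sketch. This is not the ISO 8000 or ECCMA validation procedure; the class `PropertyRequirement`, the function `check_compliance`, and the property identifiers are hypothetical names chosen for illustration. It only shows the general idea that a machine-readable requirement lets a computer test each incoming record without an on-site audit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyRequirement:
    """One requested property. All names here are illustrative, not from ISO 8000."""
    property_id: str   # hypothetical identifier of a dictionary concept
    data_type: type    # Python type the supplied value must have
    required: bool = True


def check_compliance(record, requirement):
    """Return a list of problems; an empty list means the record meets the requirement."""
    problems = []
    for prop in requirement:
        if prop.property_id not in record:
            if prop.required:
                problems.append(f"missing required property: {prop.property_id}")
            continue
        value = record[prop.property_id]
        if not isinstance(value, prop.data_type):
            problems.append(
                f"wrong type for {prop.property_id}: "
                f"expected {prop.data_type.__name__}, got {type(value).__name__}"
            )
    return problems


# Hypothetical requirement and two incoming transactions.
requirement = [
    PropertyRequirement("height_cm", float),
    PropertyRequirement("material", str),
]
good = {"height_cm": 75.0, "material": "oak"}
bad = {"height_cm": "tall"}

print(check_compliance(good, requirement))
print(check_compliance(bad, requirement))
```

Because the check runs on the exchanged data only, it can be repeated on every transaction, which is the contrast with a process audit drawn in the interview.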
DATA QUALITY PRO: I remember the ISO 9000 auditors going in and it’s obviously a quite lengthy and costly process, so hopefully the — the ISO 8000 standards should be sort of more widely adopted again because it’s something — it’s something that is obviously less intrusive and less costly.
So, on the software application side, what sort of software vendors are you actually seeing going down the route of ISO 8000?
Are you seeing any sort of data quality vendors or master data vendors getting involved?
PETER BENSON: No. Typically what we’re seeing at the moment is those companies who’ve been involved in data cleansing, cleaning master data, they’ve been getting (inaudible). That’s why, if you look at our website, you’ll see a lot of data cleansing service providers. Obviously, their customers are interested to know that the data they get back is portable data, that it’s coded using an open technical dictionary, that it’s not proprietary.
There’s lots of reasons why they wanna make sure that the service provider person doing the data cleansing, is doing it in accordance to the principles of ISO 8000.
So that’s — that’s the service side of — of data cleansing services.
The other side in terms of software applications is typically tools that are used to manage data quality.
So, these tools will be tools where you develop your dictionaries, you develop your, I was going to use the word ontology but that’s a (inaudible) come later, so dictionaries, plus your data requirements and your classifications.
So, there’s a number of tools out already that are compliant with ISO 8000, and we’re just starting to see the first MDM applications looking at getting certified as ISO 8000 compliant.
For an MDM application to be ISO 8000 compliant, what it really means, and the key thing, is data portability.
If you cannot get the data out of an application in a — in a form where the metadata is open and the model is open, it’s not ISO 8000 compliant.
And that’s gonna become more and more critical as we go down — as we go down the line, as more and more people actually follow and understand the importance of ISO 8000.
DATA QUALITY PRO: I understand. So, am I right in thinking this is an XML-based standard in terms of the portability of data?
PETER BENSON: Not quite. The way the standard is written, the syntax that you use, as long as it’s an open syntax, can be what you want.
So, it’s separate from whether it’s XML. I can have a spreadsheet which is ISO 8000 compliant, right?
I can have a database that’s ISO 8000 compliant. As long as the syntax is available and accessible, it’s not a problem.
The semantic encoding is typically where the first problem occurs: all the metadata that was used in your spreadsheet, your column headers, for example.
Well, where is the definition for that?
Now, you can do that either by including those definitions in your file, and that would be ISO 8000 compliant, or typically by referencing an open technical dictionary.
So, it’s the ability to resolve the metadata, the data labels, back to the definition, which is probably where the first problem of ISO 8000 starts to hit.
So, portable data: if somebody sent me, you know, comma-separated data, that’s fine. Syntax, I can read that. But if they didn’t give me the map of, you know, the model or the metadata, it’s not meaningful, right?
So, that’s part of what ISO 8000 requires: that when I receive data from somebody, I could import that into any other system and it’s still meaningful.
So, you know, date of birth is still labeled date of birth and it’s resolved to a date of birth, so no cryptic, you know, no cryptic metadata.
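EDITOR’S NOTE: As a rough illustration of resolving data labels back to definitions, the Python sketch below reads the header row of comma-separated data and reports any column label that does not resolve to an entry in a dictionary. The `DICTIONARY` contents and the function name `unresolved_labels` are assumptions made for this example; a real open technical dictionary would be far larger and would normally be referenced by concept identifiers rather than plain labels.

```python
import csv
import io

# Hypothetical stand-in for an open technical dictionary: label -> definition.
DICTIONARY = {
    "date_of_birth": "The calendar date on which a person was born.",
    "family_name": "The name a person shares with other members of their family.",
}


def unresolved_labels(csv_text, dictionary):
    """Return the column labels in the CSV header that have no dictionary definition."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file has no labels to resolve
    return [label for label in header if label.strip() not in dictionary]


portable = "family_name,date_of_birth\nBenson,1970-01-01\n"
cryptic = "family_name,dob2\nBenson,1970-01-01\n"

print(unresolved_labels(portable, DICTIONARY))
print(unresolved_labels(cryptic, DICTIONARY))
```

Both inputs are readable at the syntax level; only the first has labels that all resolve to a definition, which is the distinction between readable data and meaningful data made above.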
DATA QUALITY PRO: I understand and obviously by having that the — you — you — you can obviously embed the — the semantic rules that — that kind of, you know, enforce a requirement, I guess.
Is that — is that right, the — the idea of (inaudible) you actually
PETER BENSON: Correct
DATA QUALITY PRO: along — along with the data
PETER BENSON: You — you
DATA QUALITY PRO: you apply the semantics at the same time?
PETER BENSON: Correct. Now, at ECCMA, we use another standard to do that.
We use ISO 22745.
ISO 22745 is another standard that we are project leaders for, which came out of the NATO environment, that basically has an XML schema for defining data requirements.
So 22745 Part 30 is how you express your data requirements in XML, and there is also how you resolve against an open technical dictionary.
So, if you send me data which is ISO 22745 compliant, I can tell you whether it is quality data or not.
Now, as I said, under the ISO rules for how we write standards, we cannot mandate a specific process; we have to talk generically about how the process has to be performed.
So, ISO 8000 says, “you must have a syntax, you must do semantic encoding and it must meet requirements.”
It doesn’t tell you how to do it.
ISO 22745 tells you how to do it.
DATA QUALITY PRO: Oh, I see. OK. Just — just for people on the — on the session, (inaudible), “Peter’s gone and put together quite a detailed PDF, going through a lot of the terminology on (inaudible) concepts” and I’ll circulate that with everyone on the call. We’ll make it available on the website as well.
Interestingly, I was reading through the website and I was going through some of the documentation you sent and obviously, yeah, the goal you — you cited was the — the fast track.
So, it’s the better quality data and also a statistic: the recent test showed a 30 percent increase in the quality of data, and I guess, you know, figures like that are soon gonna make a lot of organizations sit up, particularly, as you say, in light of the recent issues in the finance sector and other sectors. So, I mean, what are some of the improvements that you’re seeing on the ground with regards to data quality, and how is the standard making a difference in some of the sectors you’re seeing?
PETER BENSON: Well, that study, in fact, I believe will be published pretty soon. You may want to put it on your site as well.
It comes out of the U.K. Ministry of Defense. They conducted a very in-depth study using the principles of ISO 8000 to request data from their suppliers, and what they looked at is their traditional method. There’s a rule that says if they’re going to buy something, they need a certain amount of data to manage it in their inventory, or to buy it, or whatever else, so they have requirements.
Now, the (inaudible) system was that when somebody sold them an item, they accompanied it with technical drawings or some description, and then a cataloguer would sit there and try to figure out, you know, how to describe it.
In the new system, using ISO 8000 and ISO 22745, they generated requests for data, sent them to the supplier and said, “Well, you know, you’re trying to sell us this, or you sold us this. These are the characteristics I need to know. Can you please answer these questions?”
Now, what was interesting is that, in their analysis, they looked at the speed with which they got the data back and the quantity of the data they got back. You know, those were things they were really measuring, and they gave some metrics in terms of the financial consequences of what they were doing, and it was huge, absolutely huge.
It demonstrated, and it comes out of the principle, that it’s better to ask for what you want.
If you tell the data provider, “this is exactly what I want to know,” you’re more likely to get the right answers back. And in terms of the 30 percent savings on their cost, they now have a very clear benchmark of how much it costs them to describe an item, and it was actually greater than 30 percent, but that was what we agreed on as the published figure.
DATA QUALITY PRO: (inaudible)
PETER BENSON: So they saved hundreds of thousands like this. They look to save tens of millions of pounds on their cataloguing effort by simply communicating more accurately with their suppliers about what data they need.
DATA QUALITY PRO: I understand. So obviously the physical data quality is going to improve, but obviously there’s not (inaudible) there because they’re not having to pore through catalogs, they’re not having to go and resolve ambiguity. There are clear definitions being passed, clear requests going forward and then clear information coming back, so, yes, I can see how the cost improvements would come through there.
So, finally, I guess the people on the call will probably wanna know a little bit about two things really. One will be fees, you know, what are the costs involved for some of the certifications, and also, if they are interested, what should they do next.
So, if you look at cost first and then, you know, what are the next steps for getting, you know, sort of — obviously contacting you and — and taking this forward?
PETER BENSON: Well, up ’til now, we’ve been running actual tutorials, in-person tutorials. It takes a day and I believe the cost is about $500 for that training session.
We’ve been asked to split it into an online tutorial over eight mini sessions. I believe that will be available on the 1st of December.
The cost of that, again, will be (inaudible) around $250 and that includes a certification process.
If you have a certificate, you need to renew it every year. I believe that’s between $50 and $100, depending on whether or not you’re a member of the association.
So, it — this is not — this is not a high-cost exercise, it is focused on the practical implications of ISO 8000, how to actually generate a request for data, how do you send a query and how do you get a reply back, that’s very, very, focused on what it does.
We are not, again, looking at, you know, the very complex issue of governance. That doesn’t come under the current ISO 8000, although it will do.
The other one that I’d like to bring (inaudible) to your attention (inaudible): there is a fundamental difference, which we found during the development of the standard, between data quality and information quality, and, you know, it took us two years to figure that one out.
So, what we’re looking here is data quality.
The fact that you’ve provided the data, right, doesn’t necessarily mean the data is useful.
So data quality does not necessarily mean that you would have quality information, although it is impossible to have quality information without quality data.
So that’s one way it works, the other way it doesn’t.
So, typically, to get an individual certified as a Master Data Quality Manager, as it currently stands, it’s a one-day training course.
The next one’s in October in Bethlehem and I believe there is a group in France that will be putting one on in the spring.
As of 1 December, it will be available online and it will allow people to go through the certification course themselves.
The advantage we have with this is that, because it’s not an audit process, as you go through the certification process, you’re creating three files. Those files are actually assessed as being compliant.
So, it’s much more positive, much more objective, rather than a subjective test.
DATA QUALITY PRO: I understand. Thank you very much and I — just if anybody wants to come forward with some questions on — on the session, please let me know.
If there is — I think one — one — we’ve had one question come in just a few seconds ago but I think, yeah, that obviously clears up the — the — the next steps scenario then.
I mean, is there any software that is available now that can help this compliance process?
I don’t necessarily — I don’t mean the — the online sort of training element but is there any sort of tools that you –the — the people can use to help integrate with their…
PETER BENSON: Yes.
DATA QUALITY PRO: existing
PETER BENSON: What we’re seeing…
DATA QUALITY PRO: …environments?
PETER BENSON: What we’re seeing is that there are a number of members of the association that are developing some very sophisticated tools.
Some are more expensive than others, but one tool in particular which is drawing attention is a very interesting ontology management tool.
Now, when we talk about ontologies, we start thinking about, you know, OWL and RDF and all that, fairly easy, not so complicated stuff, right?
But, in reality, an ontology is no more than a subset of a dictionary. So, what the tool helps us do is say, “here’s an open technical dictionary which has 4 million concepts in it, let me create a subset of what I’m actually going to use,” and then it allows you to take that and build your data requirements.
So a data requirement is not very complicated. A data requirement simply says, for an item, let’s say a table, I need to know these three things about the table: the height, the length and the material, right?
So, you know, that’s how you develop a data requirement.
A data requirement is also a form. You know, my passport form or my mortgage form, those are data requirements.
So, what the software does is allow users to very quickly, through point and click, build their data requirements, and then of course you can generate requests for data, you can generate cataloging tools that go with it and all the rest of it.
So there are — we’re starting to see some early tools being developed, some very sophisticated tools.
They will allow you to work inside an ISO 8000 environment of: I have a syntax, I know what semantic encoding is and I know how to build and manage my data requirements.
And again, that’s what separates us from those who say, you know, “I’ve got good quality data.”
My answer is, “OK and how did you come to that conclusion?”
“Well, it’s gotta be based on data requirements” and “show me your data requirements.”
And you get all sorts of runarounds.
“Well, Hmm.” You know, “what do you mean by data requirements?”
“No, no. What data do you need?”
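EDITOR’S NOTE: The table example in this answer (height, length and material) can be sketched in a few lines of Python. This is an illustration only and not how any ECCMA member tool works. `OPEN_DICTIONARY`, `subset_dictionary` and `build_request` are hypothetical names, and the definitions are invented for the example. The sketch takes a subset of a larger dictionary, treats that subset as the data requirement for an item, and turns it into a request for data.

```python
# Hypothetical stand-in for a large open technical dictionary: concept -> definition.
OPEN_DICTIONARY = {
    "height": "Vertical distance from the base of an item to its top.",
    "length": "Longest horizontal dimension of an item.",
    "material": "Substance from which an item is principally made.",
    "hair_color": "Observed color of a person's hair.",
    "date_of_birth": "The calendar date on which a person was born.",
}


def subset_dictionary(dictionary, concept_ids):
    """Pick the concepts actually needed; fail loudly if one is not defined."""
    missing = [c for c in concept_ids if c not in dictionary]
    if missing:
        raise KeyError(f"concepts not in dictionary: {missing}")
    return {c: dictionary[c] for c in concept_ids}


def build_request(item_name, requirement):
    """Turn a data requirement into one explicit question per property."""
    return [
        f"{item_name}: please provide '{concept}' ({definition})"
        for concept, definition in requirement.items()
    ]


# Data requirement for a table: the three things we need to know about it.
table_requirement = subset_dictionary(OPEN_DICTIONARY, ["height", "length", "material"])

for question in build_request("table", table_requirement):
    print(question)
```

Carrying the definition along with each question is a design choice of this sketch; it reflects the point made in the interview that a request is only as good as the clarity of what it asks for.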
DATA QUALITY PRO: (inaudible)
PETER BENSON: Well, there are tools out there. If you look at our members on the ECCMA website, we tend to highlight those companies that are working in that area, but in terms of members of ECCMA you’ve got, you know, Oracle, SAP, IBM. You’ve got the big companies as well.
They’re also working on the same problem.
We have a conference in October, I think it’s on our website, and there are presentations there from people as far ranging as the FBI, showing how they’re using some of these tools. I believe Boeing is showing how they’re using these in aviation, and there are presentations from, you know, refineries and oil facilities, obviously not the BP platform, I’m afraid.
DATA QUALITY PRO: Yeah, OK, that is understandable. So, just wrapping up, I guess, what is the roadmap for ISO 8000 and (inaudible), you know, where are you now and where are you hoping to take the standard moving forward?
PETER BENSON: Well, we have done a pretty good job on master data and that’s being generalized to transaction data. So we’re moving from master data up to transaction data.
That seems to be pretty much under control.
There is an initiative at the moment to develop governance recommendations. I hope those will be recommendations, not compliance issues. We’ll see where that goes. Also, now that we’ve got the data side pretty well nailed, I believe, ISO 8000 will be moving more into information quality.
Now, again, on the information quality issue, there are different characteristics of information that don’t apply to data and those tend to be in relations with third point; for example, timeliness is an information quality issue.
Well, that has nothing to do with the data. The data is whatever it is.
Whether you received it in a timely manner is only from your perspective.
So those are issues we are looking at as well.
DATA QUALITY PRO: I see. So…
PETER BENSON: (inaudible)
DATA QUALITY PRO: So your aim, obviously, as (inaudible), the focus is on master data, which I’m assuming is where most organizations have the most common issues, but the idea is to branch out and basically fill in the whole of the (inaudible) data, going into some of the information quality landscape as well, aren’t you?
I understand. OK.
PETER BENSON: But you have to be cautious in those areas where there is no objective form of measure.
We’re trying to keep away... Well, ISO 9001 does a good job for what it does; you know, there’s no point replicating that. So we’re looking from a data perspective: what can I do, how can I evaluate the data, to come to the conclusion?
DATA QUALITY PRO: And I guess that’s the challenge. If you want to have a heated argument in a pub, ask a lot of data quality and information quality professionals what the difference is between the two terms. There are so many differences of opinion there.
So I guess, for the time being, your emphasis is really on how you can actually measure. You’ve got the data standards passing between producers and consumers, and so the whole point of this standard, I guess, is to be able to electronically verify and instantly measure the data. So obviously, this is where we’re drifting to…
PETER BENSON: (inaudible)
DATA QUALITY PRO: (inaudible) ambiguity and subjectiveness, then it becomes harder to enforce that. Is that right?
PETER BENSON: Yes, but I do think that the jury is out on information versus data. I think there’s been enough work on that that there is clearly an authority in terms of what is information and what is data.
(inaudible) It is very amusing that the individual who promoted the view that the two were synonymous, that you could not have information without data or data without information, that they weren’t different and represented the same thing, sent a document to the committee that explained that.
And as luck would have it, and it was purely luck, the document left his machine as a PDF and arrived as a PDX, which nobody could open, of course.
Because it failed the first stage of data quality, the syntax was incorrectly defined, and so the answer was, you know, “we understand your document and we don’t even have to read it, because in sending the document you’ve proved that there is a fundamental difference between data quality and information quality.”
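Illustration (not part of the conversation): the PDF anecdote can be read as a syntax check that fails before anyone looks at the content. The sketch below relies only on the fact that PDF files begin with a `%PDF-` header; the function name and file names are hypothetical, and this is not how ISO 8000 itself specifies syntax checking.

```python
PDF_HEADER = b"%PDF-"  # PDF files begin with this header


def declared_syntax_matches(filename: str, content: bytes) -> bool:
    """Check a file's declared type (its extension) against its actual bytes.

    Only the PDF case is handled in this sketch. Any other extension is
    reported as False, because the receiver has no declared syntax to
    check the bytes against.
    """
    if filename.lower().endswith(".pdf"):
        return content.startswith(PDF_HEADER)
    return False


# Hypothetical payload: the same bytes, labeled two different ways.
payload = b"%PDF-1.4\n...rest of the document..."
print(declared_syntax_matches("position-paper.pdf", payload))  # True
print(declared_syntax_matches("position-paper.pdx", payload))  # False
```

The bytes are identical in both calls; only the declared syntax differs, which is why the check can be made without reading or agreeing with the document.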
DATA QUALITY PRO: (inaudible)
PETER BENSON: (inaudible) aside of, you know, the issues to do with syntax, semantic encoding, completeness, those are issues which either exist or don’t exist. I can measure it, I can do things with it.
Whether it means something valuable, whether it’s accurate in the real world, that’s a totally different decision point.
DATA QUALITY PRO: I understand. That sounds yeah (inaudible) whole different area. OK
PETER BENSON: ISO 8000 goes further (inaudible) and says that, “there is no such thing as accuracy.”
How is that for a thought for everybody?
“There are only assertions of accuracy.”
Somebody says, “this is accurate,” right?
And it’s either, “I warranty that it’s accurate,” in which case if it’s not, I’m gonna pay you or “I am asserting it’s accurate. Here’s how I came to that conclusion.”
But saying “this data is accurate” is nonsense.
Somebody should be making that assertion.
DATA QUALITY PRO: Oh, yeah. I briefly read that in the documentation, and obviously I’ll send out the documentation that you sent today.
So that talks about a URI, so the accuracy seems to relate to a URI. Could you briefly explain how that relates? Am I in the right kind of area there?
PETER BENSON: Yes
DATA QUALITY PRO: With your…
PETER BENSON: Yes, yes. We were cowards, right? Being cowards, as we are, (inaudible) should be said that, you know, if somebody makes a statement of accuracy, we need to know the organization identifier, who makes this statement, right?
So we need an organization identifier and then, OK, where is their statement, right?
So we (inaudible) point to it. We don’t say how the statement should be, and that may be something the group may want to go into later on in terms of the standards organizations, but we don’t make any requirements of how the warranty should be formulated, or how the assertions should be formulated.
We should be saying, “somebody has made that assertion and they’re gonna point us to something that we can make an evaluation, whether or not we agree and we accept that accuracy.”
DATA QUALITY PRO: So what is this? I’ve just got the slide up on the page
PETER BENSON: Yeah
DATA QUALITY PRO: now (inaudible) 130 accuracy element. So, I mean, what is a universal resource identifier?
Is that — is that a person?
Is that a… a value? Or…
PETER BENSON: No, no, no. A URI is like your URL. It simply says, “there’s a pointer on the Internet to a document somewhere,” right?
So the URI is like a web reference to an actual document, right?
So, I make an assertion that this data is accurate. I saw your name and address labels or whatever and I say, “You know, Dylan, this data is accurate.”
Well your question to me is, “Peter, are you guaranteeing that it’s accurate? Can you please send me the guarantee,” right?
Or, “are you simply asserting that it’s accurate? In which case can you send me a document that explains how you came to that conclusion,” right?
So, what you’re really asking for, is, pointers to documents…
DATA QUALITY PRO: And so…
PETER BENSON: … and the URI is simply the Web way of referencing a document, that’s all.
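Illustration (not part of the conversation): an accuracy claim as described here needs three things: who is making it, whether it is a warranty or an assertion, and a URI pointing at the supporting document. The Python sketch below models that; the class and field names, the organization identifier and the URI are all hypothetical and do not reproduce the ISO 8000-130 encoding.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlparse


class ClaimKind(Enum):
    WARRANTY = "warranty"    # "I guarantee it; if it is wrong, I pay"
    ASSERTION = "assertion"  # "I state it; here is how I concluded that"


@dataclass(frozen=True)
class AccuracyClaim:
    organization_id: str  # who is making the claim
    kind: ClaimKind
    statement_uri: str    # pointer to the warranty or the explanation

    def is_well_formed(self) -> bool:
        """True if the claim names an organization and points at a document."""
        parsed = urlparse(self.statement_uri)
        return bool(self.organization_id and parsed.scheme and parsed.netloc)


claim = AccuracyClaim(
    organization_id="ORG-0001",  # hypothetical identifier
    kind=ClaimKind.ASSERTION,
    statement_uri="https://example.org/accuracy/method.html",  # hypothetical
)
print(claim.is_well_formed())  # True
```

The check deliberately says nothing about what the referenced document contains, matching the point that the standard only requires a pointer and leaves the evaluation to the recipient.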
DATA QUALITY PRO: So in a subtle way, this starts to enforce, I guess, data governance, in the way your data governance policies will cover a lot of these areas. I guess a good example would be financial data, you know, end of year financial data being passed around an organization, where you basically want someone to warrant or assert that the data has been compiled correctly before that information is then passed off to a regulator or a compliance agency or something like that.
Is that the kind of scenario you’d say, yeah?
PETER BENSON: And if somebody says, “well, I compiled it correctly,” the answer is, “OK, can you show me how you did that? What were the rules you used to do that?”
Right, so it’s getting away from people making these claims that are just, you know, hot air: “my data is accurate.”
“OK, are you gonna warrant it? If I find it’s not, are you gonna reimburse me?”
Right, it’s financial data, you could be paying an awful lot, right?
So, what we’re trying to do then and again, remember 130 is supplemental to 120.
One hundred and twenty is provenance.
To have 130-accuracy, you have to know where the data came from.
One hundred and twenty, provenance, tells us where the data comes from, and it’s pretty simple. It says, you know, “who owns the database from which this data was extracted and what time and day was it extracted?”
So that gives us provenance.
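Illustration (not part of the conversation): provenance as described here is a label on each data element giving the owner of the source database and the extraction time. A minimal Python sketch, with hypothetical class names, field names and sample values (not the ISO 8000-120 encoding), might look like this:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Provenance:
    database_owner: str     # who owns the database the value came from
    extracted_at: datetime  # when the value was extracted


@dataclass(frozen=True)
class DataElement:
    name: str
    value: str
    provenance: Optional[Provenance] = None


def unlabeled(elements: List[DataElement]) -> List[str]:
    """Names of the data elements that carry no provenance label."""
    return [e.name for e in elements if e.provenance is None]


# Hypothetical elements: one labeled with its source, one not.
elements = [
    DataElement(
        "date_of_birth",
        "1970-01-01",
        Provenance("Example Registry Office",
                   datetime(2010, 6, 1, 9, 30, tzinfo=timezone.utc)),
    ),
    DataElement("surname", "Smith"),
]
print(unlabeled(elements))  # ['surname']
```

Labels are attached per element, not per message, so a data set only passes this check when the list of unlabeled elements is empty.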
Now, it allows us to trace where data comes from. Now, again, we are not talking about the message, we are talking about individual data elements. For example, in the passport example I gave earlier, basically, you know, it’s date of birth.
“Well, where did you get the date of birth?”
Right, or, “I know my date of birth.”
OK, you know, if I am the author of that, I’m the authoritative source, then that’s fine, but in reality they may want me to prove that with a document, right?
Because in fact, my date of birth is really, what is the date of birth of record?
Right, they’re not asking me when I was born.
They’re asking, in reality, you know, to point to a record of when I was born. It’s when it was recorded and where it was recorded.
So, the whole concept of authoritative source is part of — part of the ISO 8000 process.
Provenance allows me to track where this data came from and then I can decide and again this is information quality, I can decide whether I believe in the provenance or not.
I mean, the reason we chose the word provenance, you know, it’s like in art.
If this painting went through a series of people’s hands, provenance shows me the track record, but if there was a well-known forger in that path of provenance, then the value may be substantially less.
That’s the weakest link in the chain of provenance.
So ISO 8000-120, in claiming that your data is compliant with ISO 8000-120, simply says that “every data element is labeled with where this data came from.”
DATA QUALITY PRO: I understand…
PETER BENSON: And 130, it says, OK, “I can tell you where it came from and I am now claiming it’s accurate.”
DATA QUALITY PRO: I understand. I can see so many applications here of how this can really play such a huge role, certainly in the financial compliance sector, you know, the banking sector as well, and regulatory compliance, because it is also that proof of, you know, the provenance and obviously the accuracy too (inaudible) the (inaudible). It’s obviously something that can get lost as these data sets (inaudible) these huge corporations.
PETER BENSON: Absolutely correct. You see, they get broken apart, reassembled, all the rest of it, and you lose track of it and, you know, it’s amazing.
When you start tracing this through, you start to find that there really are some serious problems out there.
DATA QUALITY PRO: And I think, as you say, just applying the provenance itself will actually help tighten up a lot of these (inaudible) information chain defects that creep in.
I’m just looking at the time, you know, we’ve run over a little bit, but it’s been a really, really interesting discussion for me because there are a lot of topics here that really apply to some of the businesses, you know, I’ve worked with in the past.
So, I do hope that the standard becomes widely adopted, and I think it’s obviously a great bonus and a benefit for some of the professionals as well to get trained in some of these areas, so I hope people look at this and learn a little bit more.
So, thank you very much for your time today.
I’m gonna put a bunch of information around ISO 8000 on the site as well, but obviously if people have any further questions, then either send them directly to me at firstname.lastname@example.org or contact Peter at ECCMA, and we will give plenty of links and information so people can find out more.
But, once again, thank you, Peter, for your time today and thank you all.
PETER BENSON: Thank you
DATA QUALITY PRO: Thank you everyone for attending and posting a few questions in and I look forward to the next webinar.
Image credits: Creative Commons Jurvetson
Peter Benson is an expert in distributed information systems, content encoding and master data management. He designed one of the very first commercial electronic mail software applications, WordStar Messenger and was granted a landmark British patent in 1992 covering the use of electronic mail systems to maintain distributed databases.
Peter is the Project Leader for ISO 22745 and ISO 8000 as well as the ISO TC184/SC 4 Quality Committee convener. He is an expert in the development and maintenance of Master Data Quality as well as an internationally recognized proponent of Open Standards, which he believes are critical to protect data assets from the applications used to create and manipulate them.