DATA QUALITY PRO: Good
afternoon everyone. It's the top of the hour and time for another Data Quality Pro
webinar. Today we are delighted to be joined by Peter Benson, and we’re
going to be discussing ISO 8000, the master data quality standard.
The reason we set up this webinar is that ISO 8000 is one of the most
popular search terms on our website, so there's obviously a lot of interest
in what ISO 8000 can do.
So, we just wanted to speak to Peter and dig a little bit
deeper into some of the -- the various levels of certification and try and
understand a little bit more about the -- the benefits to organizations and
professionals alike.
Perhaps we should start with a brief overview of your role: could you
tell us what your role is on the ISO 8000 project?
PETER BENSON:
Yes. Thank you. I'm actually the project leader for ISO 8000. ISO
works through a series of technical committees, and the technical committee
this comes under is TC184, which deals with automation
systems. Its subcommittee SC4, Subcommittee 4, deals with data, and within
that committee a project was submitted for new standards for data
quality.
It was given the name ISO 8000, and I'm the project leader of
the group that develops the standard. We have an editor,
Dr. Gerald Radack.
So the way ISO works, my job as the project leader, is, to
make sure (inaudible) we keep on track, make sure that the standards actually
are developed on time, they go through their voting process, I have to review
and sign off on the standards of course.
I'm also the initiator of the standards which (inaudible)
happens, you know, actually myself and -- and the standard was funded
originally by the U.S. Defense Department and that's (inaudible) some
communication of, you know, why they're interested in standards but that's
where my role, inside ISO, is to make sure that we progress in an orderly
fashion with the development and meet the requirements of the user community
DATA QUALITY PRO:
OK
PETER BENSON:
That's a challenge
DATA QUALITY PRO:
OK, so, we're gonna try and keep this session to about 30 minutes today.
We’ll have roughly 20 minutes of discussion with Peter, and
then hopefully about 10 minutes of Q&A with
anybody on the call.
If you want to put questions to us during the call,
there’s a questions tab and a chat tab as well; you can
type your question into either of those panes. I'll be watching them throughout
the session and we’ll put those questions to Peter at the end.
OK, I guess one of the first things we should start
with, because there may be people on the call today who
have seen some of the announcements about the webinar
but don't fully understand what ISO 8000 is really about at a high
level:
Is it possible to give a brief overview of
ISO 8000 in particular? What are the goals for the standard, and what are you
aiming to achieve with it?
PETER BENSON:
Yes, yes. You know, standards by their very nature, the purpose of standards is
so that people can claim compliance with the standard.
So, when you look at a standard, it has certain compliance
clauses and it's, you know, why would we do that?
The reason is, again, that the motivation for the standard was
driven by buyers: companies looking for data, looking for quality data,
looking to get better master data in this case, that is, descriptions of organizations,
individuals, materials, whatever they are buying. Typically what they were
getting back from the data provider was badly formatted, there was no
metadata, and you had to try and figure out what was going
on.
So, the whole purpose of a standard is something where
people can claim compliance with it.
So ISO 8000 started -- the 100 Series is master data, with
the basic requirements to -- to allow (inaudible) this master data is quality
master data.
What are the characteristics that make this data quality data, as
opposed to other data that is not quality data?
And, of course what you're looking for, is, that people
start to claim that, you know, their software, their company, their services
are ISO 8000 compliant and that's very, very, important.
One of the things that came out of ISO 8000, we’ll talk a
little bit more about it, is, it reaches to the fundamental issue of portable
data.
The committee that we work with in ISO, their whole reason
for being, is, to make sure that data is interchangeable between different
applications. So data portability is a key to it and of course quality data is
by definition, portable data.
DATA QUALITY PRO: Fantastic. I guess, just so people
can frame it in their minds, can you walk us through a typical scenario,
like a commercial scenario, explaining the various roles,
you know, data producer, data consumer, so people can see the benefits
of how they would apply the ISO 8000 standard?
PETER BENSON: Oh, yes. In fact, let me give you some examples which are
pretty recent ones but which I think we'll all relate to. At the end of it, what
you're gonna find when you look at ISO 8000 is the issue of quality:
within ISO 9001, the definition of quality is, meets requirements.
If the data you provide meets my
requirements, it is quality data. So what do we mean by requirements for data, or
requests for data?
Every time you go into a web screen and you type in your
username and password, you are providing data in compliance with a data
requirement.
So, as you type in your username and password, if it works,
obviously it was quality data you put in, because it worked. It met the
requirements
Think of it more widely than that: every form that you fill
in is a request for data.
Your ability to comply with that request is dependent on
the quality of the query that you were given. You're trying to fill in your tax
form, or one of these forms we see, and you think,
"I'm not really sure what they want to know.”
The reason is that the metadata, the label they put, was
not of very good quality.
So the exchange of quality data works like this:
I'm going to ask you for data. My request for data is ISO
8000 compliant, and therefore your reply to me is gonna be ISO 8000 compliant.
Now, it's used today extensively in the master data
management, for example, in the ERP Material Masters, Vendor Masters, Customer
Masters, how do I evaluate the quality of that data?
Well, I have to measure the data I have, against the
requirement for data and ISO 8000 tells us how to do that and of course that
kicks off the next one.
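A minimal sketch of that measurement, assuming a data requirement can be reduced
to a list of required property labels. The names RequiredProperty,
missing_properties and share_meeting_requirement, and the sample records, are
hypothetical and only illustrate the idea; they are not taken from ISO 8000.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequiredProperty:
    """One property the data requirement asks for (hypothetical structure)."""
    label: str


def missing_properties(record: dict, requirement: list[RequiredProperty]) -> list[str]:
    """Return the labels of required properties the record lacks or leaves empty."""
    return [p.label for p in requirement if record.get(p.label) in (None, "")]


def share_meeting_requirement(records: list[dict], requirement: list[RequiredProperty]) -> float:
    """Fraction of records that supply every required property."""
    if not records:
        return 0.0
    compliant = sum(1 for r in records if not missing_properties(r, requirement))
    return compliant / len(records)


# Invented material master: the second record leaves a required property empty.
requirement = [RequiredProperty("description"), RequiredProperty("unit_of_measure")]
material_master = [
    {"description": "Hex bolt M8 x 40", "unit_of_measure": "EA"},
    {"description": "Bearing 6204", "unit_of_measure": ""},
    {"description": "Gasket, rubber", "unit_of_measure": "EA"},
]

print(missing_properties(material_master[1], requirement))
print(share_meeting_requirement(material_master, requirement))
```

The point of the design is that quality is never scored in the abstract: every
number comes from comparing a record with an explicit requirement.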
Well, if I don't have the data that meets my requirements,
how do I go and get it?
Well, how do I go and get it really means, how do I go and
ask for it?
So ISO 8000 sets out the basic principles of, what is
quality data? What is not quality data?
It focuses on the fact that, of course, you have to
deal with syntax and semantic encoding, and it deals with, you know, meets
requirements: what is a requirement, and how do I meet it?
So, ISO 8000 at the high level is designed to make it easier
for companies to ask their data providers for data and that's one of the
purposes of standards.
I can simply say to one of my suppliers, I need you to
send me data to allow me to manage what I bought from you, and the data must be
ISO 8000 compliant. That's the purpose of the standard.
It also allows my suppliers to declare that the data
they have sent me is ISO 8000 compliant.
So, the -- the purpose of the standard is to differentiate
those who understand the concept of master data quality, from those who do not.
So, as a company, as an individual, if I'm looking for a job
as a master data manager, the fact that I have an ISO 8000 certification says
I actually understand the fundamentals of how to define and measure
master data quality.
DATA QUALITY PRO: (inaudible)
PETER BENSON: Does that help?
DATA QUALITY PRO: It does, yeah. I
guess what would be useful would be to talk about a specific example.
I was looking at the list of companies you have there which
are certified, and, I mean, (INAUDIBLE) the ISO 8000 standard, you talk about
master data: does it only apply to plant and equipment type data, or can you
use it across other master data types as well?
PETER BENSON: No. That's where it started. In fact,
I would say the biggest push at the moment is in the pharmaceutical industry and
healthcare, but also in finance and mortgage companies.
Banks have realized that one of the reasons for their
recent, let's not call it a meltdown, because they didn't quite get that far,
but pretty close, was that the data was of such poor quality. They just didn't have
the right data, and the reason they didn't have the right data is that they didn't ask
for it correctly.
So they're going through a whole exercise at the moment;
when I say they, I mean quite a large number of banks and, here in the U.S., a
lot of mortgage companies and -- are looking at what -- how we didn't ask for
the right data, we don't even know -- if we did it (inaudible) we have was
correct or accurate, how do we measure that?
So what ISO 8000 starts with, if you look at
the first part of ISO 8000, is the foundation of
quality, which is syntax: there must be a syntax.
You send me data and there is no syntax, my computer is not
gonna be able to read it.
Well, that's trivial. We all accept that, it seems to work.
Then there must be semantic encoding. If you're going to send
me data, send me a spreadsheet for example, that's fine, but the labels on
the columns and the rows of the spreadsheet, right?
They must be unambiguous, and I'll give an example.
A couple of years ago, we were talking about ISO 8000 to Homeland
Defense. They said, "Well, you know, in our passports we
have the concept of hair color and eye color. Can you make sure it's in this
dictionary, so we have semantic access to the dictionary and it says
‘what is the definition of hair color and eye color’?”
I say, "That's easy. OK, I'll put it in. What is your
definition of hair color and eye color?”
He looked at me and said, "Well, that's -- that's -- that’s
silly. You know, color of the hair.”
I say, "actually, not really. You mean biological color or
observed hair color?”
He goes, "wow!”
I say, "Well, you know, there is a significant difference
between the two as any woman (inaudible) will tell you, right?”
So the -- the bottom-line to that one, is, if I look at my
passport today, my U.S. passport, hair color and eye color have been removed.
They’re irrelevant.
So, that was an application of ISO 8000: getting somebody
to look at their data requirements, look at the metadata that they are using,
look at the definitions, and realize, "I'm asking the wrong question. The
question has no real relevance to what I'm trying to do.”
So, that's a practical application of ISO 8000. You
know, the forms that immigration has here when you come through customs, for
example: what you're gonna start seeing is that those forms, in the bottom corner,
are gonna start saying they are ISO 8000 compliant.
What does that mean?
That means that every question that's been asked goes to a
definition, even for what seem to be silly questions like, you know, "what
is your name?”
Well, that's a pretty complex question in a lot of countries.
So, at the end of the day, it's about when you ask for data,
how explicit have you been, in terms of defining exactly what data you want.
That's what ISO 8000 drives.
DATA QUALITY PRO: I understand, and obviously as we increase
the portability of data, that becomes more important
as well. So you talked about the different types of certification
there; can you walk us through them? Are there four different types of
certification? I'm not sure, I'm just looking at it.
PETER BENSON: We -- we actually (inaudible) again, remember
in the ISO process just like ISO 9001, there is a standard.
Anybody can take that standard and self-certify against it. You could
take the ISO 8000 standard and say, "I,” you know, "Dylan Jones, am ISO 8000
compliant.”
That's right, and you can do that.
But a lot of times you rely on methods or other people's
methods for going through the compliance process.
So ECCMA, as an association, we’re not-for-profit but we’ve
come up with a process that we use to make a determination whether we believe
the individual or organization is ISO 8000 compliant.
So, currently we offer four levels, four certificates. The
first of them, the most common one, is the Master Data Quality Manager. That's
for the individual or the organization.
Typically, when we certify somebody, it's an individual that
works for an organization and that certification covers both of them.
So, what do we do?
During the certification process, which we do in one day,
it's eight hours, it's not a hard job, we basically go through the
principles of the standard and, yeah, there is a test,
right?
The test is, can you create a data requirements statement?
Can you specify requirements for data?
And we don't make it too hard. There is a piece of software
that helps you do that.
Having specified your requirements for data, can you
formulate that into a request for data?
Again, there is a piece of software to help you do that, so
it's not very difficult.
And then finally, if you received that request for data,
could you actually answer it?
And again, those three things are what we look at, in the
certification process.
Can you generate a -- a -- a request for data?
Can you answer a request for data?
And of course, to do that, you have to be able to define
data requirements.
So that really is the key to the individual or
organization as the Master Data Quality Manager.
We have a simpler certification, which is actually,
literally done in about 3 to 4 hours and that's a Quality Master Data Provider.
Typically, that's designed for suppliers, who are going to
be simply answering questions.
All we want to know, is, if we send you a request for data,
can you answer it?
Now, it's just like speaking a foreign language. You know,
listening and, you know, writing down what they're saying, is not too hard. So,
it's a lot easier if somebody sends you the questions, you look at the
questions and you answer and it's really very, very, simple, right?
So, if I gave you a passport
application that was ISO 8000 compliant, and you filled in all the boxes
correctly, guess what?
You’d be deemed to be certified as an ISO 8000
Quality Master Data Provider.
Simple!
Now, the other two levels of certification are a little bit
different.
One is for software applications and
the other one is for data cleansing services.
A little bit more complicated because at that level, we want
to make sure that they are able to use Web services to access dictionaries, to
be able to look up concepts in an open technical dictionary, that they can
create data requirements in XML, they can create queries in XML and they can
respond to a query in XML.
So, there's a little bit more work involved at the software
level and also the service level.
So, four levels, and typically they're not
designed to be complicated and therefore they are not designed to be high-cost.
And I don't know, Dylan, if you want to touch on, you know,
the process we are using, is very, very, different from an ISO-9000 process.
DATA QUALITY PRO: Yeah. We talked about that before
the call, (inaudible) because, sort of thing, a lot of people will have
experienced the 9000 standards, so can you explain how
they differ there?
PETER BENSON: (inaudible) answer would've been painful. When
we designed ISO 8000, a couple of things jumped out at us.
Number one, we're talking about data, and data, in the
definition we're using for these standards, is processable by a computer.
So, it follows that if I have requirements that are
processable by a computer and I have data that's processable by a computer,
that I should be able to use a computer to evaluate whether or not the data is
compliant and that's the huge difference between ISO 8000 and ISO-9001.
ISO-9001 says, "You have a process and you follow your
process.”
And, the only way to check that is for a humanoid to turn up
on your doorstep and look and see that you have these processes and look and
see if you're following these processes.
Well, you know, that's an audit and that's expensive, right?
And intrusive!
In this case, I don't really care how you generated the
data. I care that the data meets the requirements.
So, ISO 8000 is only concerned with exchange of the data,
right?
Data that comes out of a company, so every time you send me,
data I can immediately and automatically verify that it is ISO 8000 compliant.
At every transaction I can do that.
So, I don't need to send in auditors to go and poke you and
prod you and check your systems; I can check,
for every single transaction that you send me, whether it is or is not ISO 8000
compliant.
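A rough sketch of such a per-transaction check, under stated assumptions: the
message arrives as JSON, and a small in-memory dictionary stands in for an
agreed set of definitions. DICTIONARY, REQUIRED_LABELS and verify_transaction are
hypothetical names; this shows the three checks mentioned in the talk (syntax,
semantic encoding, meets requirements), not an official conformance test.

```python
import json

# Hypothetical definitions; a real exchange would reference an agreed dictionary.
DICTIONARY = {
    "date_of_birth": "Calendar date on which the person was born (YYYY-MM-DD).",
    "family_name": "Name shared by members of the person's family.",
}
# Hypothetical data requirement: the labels that must be answered.
REQUIRED_LABELS = {"date_of_birth", "family_name"}


def verify_transaction(message: str) -> list[str]:
    """Return a list of problems; an empty list means all three checks passed."""
    # 1. Syntax: the message must be machine-readable.
    try:
        data = json.loads(message)
    except json.JSONDecodeError:
        return ["syntax: message is not valid JSON"]
    if not isinstance(data, dict):
        return ["syntax: expected a JSON object of label/value pairs"]

    problems = []
    # 2. Semantic encoding: every label must resolve to a definition.
    for label in sorted(data):
        if label not in DICTIONARY:
            problems.append(f"semantics: no definition for label '{label}'")
    # 3. Requirements: every requested item must have a value.
    for label in sorted(REQUIRED_LABELS):
        if data.get(label) in (None, ""):
            problems.append(f"requirement: missing value for '{label}'")
    return problems


print(verify_transaction('{"date_of_birth": "1970-01-01", "family_name": "Smith"}'))
print(verify_transaction('{"dob": "1970-01-01", "family_name": "Smith"}'))
```

Because the check needs only the message and the requirement, it can run on
every exchange without anyone visiting the sender.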
DATA QUALITY PRO: So...
PETER BENSON: That's pretty much the difference.
DATA QUALITY PRO: Yes. Obviously, having been
on the receiving end of the 9000 series standards, I remember the
auditors going in, and it's obviously quite a lengthy and costly process.
So hopefully the ISO 8000 standard should be more
widely adopted, because it's something that is obviously
less intrusive and less costly.
So in terms of software applications,
what sort of vendors are you actually
seeing going down the route of ISO 8000?
Are you seeing sort of data quality vendors or master data
PETER BENSON: (inaudible)
DATA QUALITY PRO: data management vendors or -- or is it the
traditional kind of
PETER BENSON: (inaudible)
DATA QUALITY PRO: business app vendors?
PETER BENSON: No, typically what
we're seeing at the moment is those companies who’ve been
involved in data cleansing, cleaning master data; they've been getting
(inaudible). That's why, if you look at our website, you’ll see a lot of data
service providers doing cleansing. Obviously, their customers are
interested to know that the data they get back is portable data, that it's coded
using an open technical dictionary, that it's not proprietary.
There are lots of reasons why they wanna make sure that the service
provider doing the data cleansing is doing it in accordance with the
principles of ISO 8000.
So that's the service side, data cleansing
services.
The other side in terms of software applications is
typically tools that are used to manage data quality.
So, these will be tools where you develop your
dictionaries (I was going to use the
word ontology, but that's a (inaudible) come later), so dictionaries, plus
your data requirements and your classifications.
So, there are a number of tools out
already that are compliant with ISO 8000, and we're just starting to see
the first, you know, MDM applications looking at getting certified
as ISO 8000 compliant.
For an MDM application that is ISO 8000 compliant, what it
really means, and the key thing, is data portability.
If you cannot get the data out of an application in
a form where the metadata is open and the model is open, it's not ISO 8000
compliant.
And that's gonna become more and more critical
as we go down the line, as more and more people actually follow and
understand the importance of ISO 8000.
DATA QUALITY PRO: I understand. So, am I right in
thinking this is an XML-based standard in terms of the portability of data? Is
that correct?
PETER BENSON: Not quite. The way the standard is
written, for the syntax that you use, as long as it's an open syntax,
you can do what you want.
So, it doesn't matter, you know, whether it's XML. I can have a
spreadsheet which is ISO 8000 compliant, right?
I can have a database that's ISO 8000 compliant. As long as the
syntax is available and accessible, it's not a problem.
The semantic encoding is typically where the first problem
occurs: all the metadata that was used in your spreadsheet, your column
headers, for example.
Well, where is the definition for that?
Now, you can do that either by including those definitions
in your file, which would be ISO 8000 compliant, or typically by referencing an
open technical dictionary.
So, it's the ability to resolve the metadata, the
data labels, back to the definition, which is probably where the first
problem of ISO 8000 starts to hit.
So, portable data: if somebody sent me, you know, comma-separated
data, that's fine. Syntax, I can read that. But if they didn't give me
the map, you know, the model or the metadata,
it's not meaningful, right?
So, that's part of what ISO 8000 requires: when I
receive data from somebody, I can import it into any other system and it's
still meaningful.
So, you know, date of birth is still labeled date of birth
and it's resolved to a date of birth, so no cryptic, you know, no cryptic
metadata.
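A small sketch of that resolution step for comma-separated data, assuming the
definitions are available as a simple label-to-definition mapping. The function
unresolved_headers and the sample labels are hypothetical; the standard does not
prescribe this mechanism.

```python
import csv
import io


def unresolved_headers(csv_text: str, definitions: dict[str, str]) -> list[str]:
    """Return the column headers that have no definition, in column order."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file has no header row
    return [label for label in header if label.strip() not in definitions]


# Invented definitions; in practice these could be shipped inside the file
# or looked up in a referenced dictionary.
definitions = {
    "date_of_birth": "Calendar date on which the person was born.",
    "family_name": "Name shared by members of the person's family.",
}

cryptic = "DOB,FNAME\n1970-01-01,Smith\n"
clear = "date_of_birth,family_name\n1970-01-01,Smith\n"

print(unresolved_headers(cryptic, definitions))
print(unresolved_headers(clear, definitions))
```

Both files have a readable syntax; only the second carries labels that another
system could resolve, which is the distinction being drawn here.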
DATA QUALITY PRO: I understand, and obviously by having that
you can embed the semantic rules that
kind of, you know, enforce a requirement, I guess.
Is that right, the idea of (inaudible) you
actually
PETER BENSON: Correct
DATA QUALITY PRO: along -- along with the data
PETER BENSON: You -- you
DATA QUALITY PRO: you apply the semantics at the same time?
PETER BENSON: Correct. Now, for ECCMA, we use
another standard to do that.
We use ISO 22745.
ISO 22745 is another standard that we are project leaders
for, which came out of the NATO environment, that basically has an XML schema
for defining data requirements.
So 22745 Part 30 is how you express your data
requirements in XML, and there is also how you resolve an open technical
dictionary.
So, if you send me data which is ISO 22745 compliant, I can
tell you whether it is quality data or not.
Now, as I said, under the ISO rules on how we
write standards, we cannot mandate a specific process; we have to talk
generically about how the process has to be performed.
So, ISO 8000 says, "you must have a syntax, you must do
semantic encoding and it must meet requirements.”
It doesn't tell you how to do it.
ISO 22745 tells you how to do it.
DATA QUALITY PRO: Oh, I see. OK. Just for people on
the session, (inaudible), Peter’s gone and put together quite a
detailed PDF, going through a lot of the terminology and (inaudible) concepts,
and I'll circulate that to everyone on the call. We'll make it available on
the website as well.
Interestingly, I was reading through the website and
going through some of the documentation you sent, and obviously the goal
you cited was the fast track.
So, it's better quality data, and also a
statistic: the recent test showed a 30 percent increase in the quality of
data. I guess figures like that are soon gonna make a lot of
organizations sit up, particularly, as you say, in light of the recent
issues in the finance sector and other sectors. So, what are some of
the improvements that you're seeing on the ground with regard to data quality,
and how is the standard making a difference in some of the sectors
you’re seeing?
PETER BENSON: Well, that study, in fact, I believe
will be published pretty soon; you may want to put it on your site as well.
It comes out of the U.K. Ministry of Defense. They conducted a
very in-depth study using the principles of ISO 8000 to request data from their
suppliers, and what they looked at is their traditional method.
There's a rule that says, "if they're going to buy something, they need a
certain amount of data to manage it in their inventory or to buy it,”
or whatever else, so they have requirements.
Now, the (inaudible) system was that when somebody sold
them an item, they accompanied it with technical drawings or some description, and
then a cataloguer would sit there and try to figure out, you know, how to
describe it.
In the new system, using ISO 8000 and ISO 22745, they generated
requests for data, sent them to the supplier and said, "Well, you know, you're
trying to sell us this, or you sold us this; these are the characteristics I
need to know. Can you please answer these questions?”
Now, what was interesting is that
in their analysis, the speed with which
they got the data back and the quantity of the data they
got back, you know, those were the things they were really measuring. They
gave some metrics in terms of the financial consequences of what they were
doing, and it was huge, absolutely huge.
It demonstrated, and it comes out of the
principle, that it's better to ask for what you want.
If you tell the data
provider, this is exactly what I want to know, you're more likely to get the
right answers back. And in terms of the 30 percent savings on their cost, they
have a very clear benchmark of how much it cost them to describe an item, and
the saving was actually greater than 30 percent, but that was what we
agreed as the published figure.
DATA QUALITY PRO: (inaudible)
PETER BENSON: So they saved hundreds
of thousands like this. They looked to save tens of millions of
pounds on their cataloguing effort, by simply communicating more accurately
with their suppliers about what data they need.
DATA QUALITY PRO: I understand. So obviously the
data quality, the physical data quality, is going to improve, but
obviously there's not (inaudible) there because they're not having to pore
through catalogs, they're not having to go and resolve ambiguity;
there are clear definitions being passed, clear requests going
forward and then clear information coming back. So, yes, I can see how the
cost improvements would come through there.
So, finally, I guess the people on the call will
probably wanna know a little bit about two things: one will be
fees, you know, what are the costs involved for some of the
certifications, and also, if they are interested, what should they
do next.
So, if you look at cost first, and then, you know, what are
the next steps for contacting you and
taking this forward?
PETER BENSON: Well, up ‘til now, we've been running
actual tutorials, in-person tutorials. It takes a day, and I believe the cost
is about $500 for that training session.
We've been asked to split it into an
online tutorial over eight mini sessions; I believe that
will be available on the 1st of December.
The cost of that, again, will be (inaudible) around $250, and
that includes the certification process.
If you have a certificate, you need to renew
it every year. I believe that's between $50 and $100, depending on whether or
not you’re a member of the association.
So, this is not a high-cost exercise.
It is focused on the practical implications of ISO 8000: how to actually
generate a request for data, how you send a query and how you get a reply
back. It's very, very focused on what it does.
We are not, again, looking at, you know,
the very complex issue of governance; that doesn't come under the current ISO 8000,
although it will do.
The other thing that I’d like to bring (inaudible) to your
attention (inaudible): there is a fundamental difference, which
we found during the development of the standard, between data quality and
information quality, and, you know, it took us two years to figure
that one out.
So, what we're looking at here is data quality.
The fact that you've provided the data
doesn't necessarily mean the data is useful.
So data quality does not necessarily mean that you
have quality information, although it is impossible to have quality information
without quality data.
So it works one way; the other way it doesn't.
So, typically, to get an individual certified as a Master Data
Quality Manager, as it currently stands, it's a one-day training course.
The next one’s in October in Bethlehem, and I believe there
is a group in France that will be putting one on in the spring.
As of 1 December, it will be available online, and it will
allow people to go through the certification course themselves.
The advantage we have with this is that, because it's not an
audit process, as you go through the certification process you're creating
three files. Those files are actually assessed as being compliant.
So, it's much more positive, much more objective, rather
than a subjective test.
DATA QUALITY PRO: I understand. Thank you very much.
If anybody wants to come forward with some questions on the session,
please let me know.
I think we've had one question
come in just a few seconds ago, but I think, yeah, that obviously clears up the
next steps scenario then.
I mean, is there any software that is available
now that can help this compliance process?
I don't mean the online training element, but are there any
tools that people can use to help integrate with
their...
PETER BENSON: Yes.
DATA QUALITY PRO: existing
PETER BENSON: What we're seeing...
DATA QUALITY PRO: ...environments?
PETER BENSON: What we're seeing is that there are a number
of members of the association that are developing some very sophisticated
tools.
Some are more expensive than others, but one tool in
particular which is drawing attention is a very interesting ontology
management tool.
Now, when we talk about ontologies, we start thinking about,
you know, OWL and RDF and all that, fairly easy, not so complicated stuff,
right?
But, in reality, an ontology is no more than a subset of a
dictionary. So, what the tool helps us do is say, "here's an open technical
dictionary which has 4 million concepts in it; let me create a subset of what
I'm actually going to use,” and then it allows you to take that and build
your data requirements.
So a data requirement is not very complicated. A data
requirement simply says, for an item, let’s say a table, I need to know these
three things about the table: the height, the length and the material, right?
So, you know, those -- that's how you develop a data
requirement.
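The table example can be sketched in a few lines, assuming a tiny in-memory
mapping stands in for the open technical dictionary. DICTIONARY,
build_requirement, request_for_data and unanswered are hypothetical names, and the
definitions are invented for illustration.

```python
# Hypothetical stand-in for a dictionary of concepts and their definitions.
DICTIONARY = {
    "height": "Vertical distance from the floor to the top surface.",
    "length": "Longest horizontal dimension of the top surface.",
    "material": "Substance the item is primarily made of.",
    "color": "Observed color of the finished surface.",
}


def build_requirement(item: str, concepts: list[str]) -> dict:
    """Select the subset of dictionary concepts needed to describe an item."""
    unknown = [c for c in concepts if c not in DICTIONARY]
    if unknown:
        raise KeyError(f"concepts not in dictionary: {unknown}")
    return {"item": item, "properties": {c: DICTIONARY[c] for c in concepts}}


def request_for_data(requirement: dict) -> list[str]:
    """Turn the requirement into explicit questions, each with its definition."""
    return [
        f"{requirement['item']}: {name}? ({definition})"
        for name, definition in requirement["properties"].items()
    ]


def unanswered(requirement: dict, reply: dict) -> list[str]:
    """List the requested properties the reply leaves out or leaves empty."""
    return [n for n in requirement["properties"] if reply.get(n) in (None, "")]


table = build_requirement("table", ["height", "length", "material"])
for question in request_for_data(table):
    print(question)
print(unanswered(table, {"height": "750 mm", "material": "oak"}))
```

The same requirement object drives both the questions sent out and the check on
the reply, which mirrors the three tasks described for certification: define
the requirement, ask for the data, and answer the request.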
A data requirement is also a form: you know, my passport
form or my mortgage form, those are data requirements.
So, what the software does is allow users to very quickly,
through point and click, build their data requirements, and then of course
you can generate requests for data, you can generate cataloging tools that go
with it and all the rest of it.
So there are -- we’re starting to see some early tools being
developed, some very sophisticated tools.
They will -- will allow you to work inside an ISO 8000
environment of, I have a syntax, I know what semantic encoding is and I know
how to build and manage my data requirements.
And again, that's what separates us from those who say, you
know, "I've got good quality data.”
My answer is, "OK and how did you come to that conclusion?”
"All right”
"Well, it's gotta be based on
data requirements” and "show me your data requirements.”
And you get all sorts of runarounds.
"Well, Hmm.” You know, "what do you mean by data
requirements?”
"No, no. What data do you need?”
DATA QUALITY PRO: (inaudible)
PETER BENSON: Well, there are -- there are tools out there,
if you look at our members on the ECCMA website, we tend to highlight those
companies that are working in that area but in terms of members of ECCMA you've
got, you know, Oracle, SAP, IBM.
You've got the big companies as well.
They're also working on the same problem.
We have a conference, you know, in October, I think it's on our
website and there's presentations there, from people as far ranging as the FBI,
showing how they're using some of these tools, too, I believe Boeing is showing
how they're using these in aviation and from, you know, refineries and oil, you
know, oil facilities, obviously not the BP platform, I'm afraid.
DATA QUALITY PRO: Yeah, yeah. OK that is understandable. So
the I -- I -- I guess so -- so what's your kind of roadmap then of --of just
wrapping up, I guess, where is -- where is the roadmap for ISO 8000 and
(inaudible) sort of, you know, where you are now and where you're hoping to
take the -- the standard moving forward.
PETER BENSON: Well we -- we -- we have done a pretty good
job on master data and that's being generalized to transaction data. So we're
moving from master data up to transaction data.
That -- that seems to be pretty much under control.
There is an initiative at the moment, to develop governance
recommendations. I hope those will be recommendations, not compliance issues.
We'll see where that goes and also ISO 8000 is going to -- not only have we
got the data side pretty well nailed, I believe, at least, we'll be moving more
into information quality.
Now, again the, information quality issue, there are
different characteristics of information that don't apply to data and those
tend to be in relations with third point, for example, timeliness is an
information quality issue.
Well that has nothing to do with the data. The data is
whatever it is.
Whether you received it in a timely manner, is only from
your perspective.
So those are issues we are looking at as well.
DATA QUALITY PRO: I see. So...
PETER BENSON: (inaudible)
DATA QUALITY PRO: So -- so -- so your -- your aim is to
obviously as (inaudible) the focus is on master data, which I'm assuming is the
-- the most common type of issues, most organizations have but the -- the idea
is -- is just to branch out and basically fill in the -- the -- the whole of
the -- the (inaudible) data go in and some information quality kind of -- a
landscape as well, aren't you?
I understand. OK.
Well...
PETER BENSON: But-- but you -- you be cautious on those
areas where there is no objective form of measure.
We're trying to keep away -- well ISO 9001 does a good job,
for what it does, you know, there's no point replicating that. So, when they're
looking from a data perspective, what can I do, how can I evaluate the data, to
come to the conclusion.
DATA QUALITY PRO: And -- and -- I -- I -- I guess that's --
that -- that that's the challenge and, you know, and I think this is where we
-- we -- if -- if you want to have a heated argument in a -- in a pub, ask a
lot of data quality and information quality professionals what's -- what's the
difference between the two terminologies, you know, this is -- so many
differences of opinion there.
So -- so I guess for the -- for the time, your -- your
emphasis is really on, you know, how you can actually measure, obviously
you've got the, you know, you've got the data standards passing
between producers and consumers and so the whole point of these standards, I
guess, is to be able to electronically verify, I guess and -- and -- and
instantly kind of, you know, sort of measure the data. So obviously, this is --
where we're drifting to
PETER BENSON: (inaudible)
DATA QUALITY PRO: (inaudible) ambiguity and subjectiveness
then -- then -- then the -- the -- the
-- it becomes harder to kind of enforce that. Is that is that right?
PETER BENSON: Yes but I do think that the jury is out on
information versus data. I -- I think there's been enough work on that, there
is clearly an authority, in terms of what is information and what is data.
(inaudible) it is very amusing that the individual that
promoted the fact that the two were synonymous and you could not have
information without data and data without information, it's different, they
didn't represent the same thing, sent a document to the committee that
explained that.
And as luck would have it, and it was purely luck, the --
the document was actually transmitted, left his machine as a PDF, arrived as a
PDX, which nobody could open of course.
Why?
Because it failed the first state of data quality, the
syntax was incorrectly defined and so the answer was, you know, "we understand
your document and -- and we don't even have to read it because in sending the
document you've proved that there is a fundamental difference between data
quality and information quality.”
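EDITOR'S NOTE: A minimal sketch of why the document in the anecdote fails at the syntax level before anyone reads it. It relies on one known fact, that a PDF file begins with the marker %PDF-. The function names and the idea of a "declared format" label are assumptions made for illustration, not something defined in the talk or in ISO 8000.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Syntax check only: a PDF file starts with the marker b'%PDF-'."""
    return data.startswith(b"%PDF-")


# Hypothetical registry of syntaxes the receiver knows how to verify.
CHECKERS = {"pdf": looks_like_pdf}


def syntax_ok(declared_format: str, data: bytes) -> bool:
    """Return True only if the declared syntax is known and the data conforms.

    A syntax the receiver cannot identify cannot be verified, so it fails,
    whatever the content would have said.
    """
    checker = CHECKERS.get(declared_format.lower())
    return checker is not None and checker(data)


document = b"%PDF-1.4\n(rest of the file)"

print(syntax_ok("pdf", document))      # declared correctly
print(syntax_ok("pdx", document))      # same bytes, wrong declaration
print(syntax_ok("pdf", b"plain text")) # right declaration, wrong content
```

The check says nothing about whether the document's argument is sound. That is the distinction being drawn here: the first result is a data quality measurement, the second is an information quality judgment.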
DATA QUALITY PRO: (inaudible)
PETER BENSON: (inaudible) aside of, you know, the issues to
do with syntax, semantic encoding, completeness, those are issues which are --
either they exist or they don't exist, I can measure it, I can do things with
it.
Whether it means something valuable, whether it's accurate
in the real world, that's a totally different decision point.
DATA QUALITY PRO: I understand. That sounds yeah (inaudible)
whole different area. OK
PETER BENSON: In fact ISO...
DATA QUALITY PRO: (inaudible)
PETER BENSON: ISO 8000 goes further (inaudible) and says
that, "there is no such thing as accuracy.”
How is that for a thought for everybody?
Right!
"There are only assertions of accuracy.”
Somebody says, "this is accurate,” right?
And it's either, "I warranty that it's accurate,” in which
case if it's not, I'm gonna pay you or "I am asserting it's accurate. Here's
how I came to that conclusion.”
But saying, "this data is accurate," is -- is nonsense.
Somebody should be making that assertion.
DATA QUALITY PRO: Oh, yeah. I -- I briefly read that through
the documentation and obviously I'll -- I’ll send the documentation out, that
you sent today.
So that talks about a URI, so the accuracy seems to
relate to a URI which is -- could you briefly explain what -- how that kind
of relates in the -- I'm on the right kind of area then?
PETER BENSON: Yes
DATA QUALITY PRO: With your...
PETER BENSON: Yes, yes. We -- we were cowards, right? Being
cowards, as we are, (inaudible) should be said that, you know, accuracy -- if
somebody makes a statement of accuracy, so we need to know the organization
identifiers, who makes this statement, right?
So we need an organization identifier and then OK, where is
their statement, right?
So we (inaudible) point to it. We don't say, how the
statement should be and that may be something the group may want to go into
later on in terms of the standards organizations but we don't make any
requirements of how the warranty should be formulated, or how the assertions
should be formulated.
We should be saying, "somebody has made that assertion and
they're gonna point us to something that we can make an evaluation, whether or
not we agree and we accept that accuracy.”
DATA QUALITY PRO: So -- so -- so what is -- what is this --
I've just got the slide up on the -- on the page
PETER BENSON: Yeah
DATA QUALITY PRO: now (inaudible) 130 accuracy, element, so,
I mean, so what is a universal resource identifier?
Is that -- is that a person?
Is that a...
PETER BENSON: (inaudible)
DATA QUALITY PRO: a value? Or...
PETER BENSON: No, no, no. A URI is -- is like your URL. It
simply says, "there’s a pointer on the Internet to a document somewhere,”
right?
So the URI is actually a document -- it's like a, it's a --
it's a web reference to an actual document, right?
So, I make an assertion that this data is accurate. I saw
your name and address labels or whatever and I say, "You know, Dylan, this data
is accurate.”
Well your question to me is, "Peter, are you guaranteeing
that it's accurate? Can you please send me the guarantee,” right?
Or, "are you simply asserting that it's accurate? In which
case can you send me a document that explains how you came to that conclusion,”
right?
So, what you're really asking for, is, pointers to
documents...
DATA QUALITY PRO: And so...
PETER BENSON: ... and the URI is simply the Web way of -- of
referencing a document, that's all.
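EDITOR'S NOTE: A minimal sketch of the accuracy claim as described in this exchange: somebody either warrants or asserts accuracy, they are identified by an organization identifier, and they point to a document by URI. The class and field names, the identifier "ORG-0001" and the example.org address are hypothetical; this is not the data model of ISO 8000-130. The check only confirms that a pointer is present and shaped like a URI. It does not fetch or judge the document.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlparse


class ClaimKind(Enum):
    WARRANTY = "warranty"    # "if it's not accurate, I'm going to pay you"
    ASSERTION = "assertion"  # "here's how I came to that conclusion"


@dataclass(frozen=True)
class AccuracyClaim:
    kind: ClaimKind
    organization_id: str  # who is making the statement
    document_uri: str     # where the warranty or explanation can be found

    def is_well_formed(self) -> bool:
        """True if the claim names an organization and points somewhere."""
        parsed = urlparse(self.document_uri)
        has_target = bool(parsed.netloc or parsed.path)
        return bool(self.organization_id) and bool(parsed.scheme) and has_target


claim = AccuracyClaim(
    kind=ClaimKind.ASSERTION,
    organization_id="ORG-0001",
    document_uri="https://example.org/assertions/address-data.pdf",
)
print(claim.is_well_formed())

# "My data is accurate" with no pointer to a document is not a usable claim.
hot_air = AccuracyClaim(ClaimKind.ASSERTION, "ORG-0001", "trust me")
print(hot_air.is_well_formed())
```

Whether the receiver accepts the claim after reading the referenced document is left to the receiver, as described above.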
DATA QUALITY PRO: So -- so in -- in a subtle way, this
starts to enforce, I guess data governance, the way, you know, your data
governance policies will kind of cover a lot of these areas where, you know,
for -- for data -- I guess a good example would be so for financial data, you
know, end of year financial data being passed around their organization that --
that you -- you -- you basically want someone to warrant or assert that the -- the
data has been compiled correctly and -- and before that information is then
passed off to a regulator or a compliance agency or something like that.
Is that the kind of scenario you'd say, yeah?
PETER BENSON: And -- and if somebody says, "well I compiled
it correctly," the answer is, "OK, can you show me how you did that? What were
the rules you did to do that?"
Right, so it's getting away from people making these --
these claims that are just, you know, hot air, "my data is accurate.”
"OK are you -- are you just gonna warrant here? If I find
it's not, are you gonna reimburse me?”
Right, it's financial data, you could be paying an awful
lot, right?
So, what we're trying to do then and again, remember 130 is
supplemental to 120.
One hundred and twenty is provenance.
To have 130-accuracy, you have to know where the data came
from.
One hundred and twenty is provenance, tells us, where is
this, the data comes from and that -- it's pretty simple. It says, you know,
"who owns the database from which this data was extracted and what time and day
it was extracted?”
So that gives us provenance.
Now, it allows us to trace where data comes from. Now,
again, we are not talking about the message, we are talking about individual
data elements. For example, in the passport example, I did earlier, basically,
you know, it's date of birth.
"Well, where did you get the date of birth?”
Right, or, "I know my date of birth.”
OK, you know, if it's -- if I am the author of that, I'm the
authoritative source, then that's fine but in reality they may want me to prove
that, with a document, right?
Because in fact, my date of birth, it’s -- it's what is the
date of birth of record?
Right, they're not asking me when I was born.
They're asking, in reality, you know, to point to a record
of when I was born. It's when it was recorded and where it was recorded.
So, the whole concept of authoritative source is part of --
part of the ISO 8000 process.
Provenance allows me to track where this data came from and
then I can decide and again this is information quality, I can decide whether I
believe in the provenance or not.
I mean if you -- the reason we chose the word provenance,
you know, it's like in art.
If this painting went through a series of people’s hands,
and provenance shows me the track -- the track record, but if there was a
well-known forger in that path of provenance, then the value may be
substantially less.
That's the weakest link in the chain of provenance.
So -- so ISO 8000-120, in claiming that your data is
compliant with ISO 8000-120, simply says that, "every data element is labeled
with where this data came from.”
DATA QUALITY PRO: I understand...
PETER BENSON: And 130, it says, OK, "I can tell you where it
came from and I am now claiming it's accurate.”
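EDITOR'S NOTE: A minimal sketch of provenance at the level of individual data elements, following the description above: each element carries who owns the database it was extracted from and when it was extracted, and an accuracy claim only makes sense on top of that. All names, the registry owner, the date and the example.org address are invented for illustration, and the two check functions are a loose reading of the talk, not conformance tests for ISO 8000-120 or ISO 8000-130.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    database_owner: str     # who owns the database the value was extracted from
    extracted_at: datetime  # time and day it was extracted


@dataclass(frozen=True)
class DataElement:
    name: str
    value: str
    provenance: Optional[Provenance] = None
    accuracy_claim_uri: Optional[str] = None  # pointer to a warranty or assertion


def all_have_provenance(elements) -> bool:
    """Every element is labeled with where it came from."""
    return all(e.provenance is not None for e in elements)


def all_have_accuracy_claims(elements) -> bool:
    """Accuracy is supplemental to provenance: a claim without provenance does not count."""
    return all(
        e.provenance is not None and bool(e.accuracy_claim_uri) for e in elements
    )


extracted = datetime(2010, 6, 1, 9, 30, tzinfo=timezone.utc)  # illustrative
date_of_birth = DataElement(
    "date_of_birth", "1970-01-01", Provenance("Example Registry Office", extracted)
)
name = DataElement("name", "A. Person")  # no provenance recorded

print(all_have_provenance([date_of_birth]))        # labeled
print(all_have_provenance([date_of_birth, name]))  # one element is unlabeled
print(all_have_accuracy_claims([date_of_birth]))   # provenance but no claim yet

claimed = replace(
    date_of_birth, accuracy_claim_uri="https://example.org/assertions/dob.pdf"
)
print(all_have_accuracy_claims([claimed]))
```

Because the label travels with each element rather than with the message, it can survive the data set being broken apart and reassembled.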
DATA QUALITY PRO: I understand. I --I --I can see so many
applications here of how this can obviously it can mean... really play such a
huge role in the -- certainly in financial, compliance sector, you know, the
banking sector as well and the regulatory compliance because it's -- it's -- it
is also that proof of -- that proof of, you know, the provenance and obviously
the accuracy about too (inaudible) the -- the -- the (inaudible) when I say --
it's obviously something that can get lost as the -- these data sets
(inaudible) these huge corporations.
So I'm...
PETER BENSON: Absolutely correct, you see, they -- they get
broken apart, reassembled, all the rest
of it and you lose track of it and, you know, it's -- it's -- it's amazing.
When you start doing -- tracing this through, you start to
find that there really is a -- some serious problems out there.
DATA QUALITY PRO: And I think -- just they say, just
applying the provenance itself will actually help sort of tighten up a lot of
these (inaudible) information chain defects that creep in.
I'm -- I'm -- I'm just looking at the time, you know, we've
--we've run over a little bit but it's been a really, really, interesting
discussion for me because this -- this -- a lot of topics here that really
apply to some of the -- the businesses, you know, I've -- I've worked with in
the past.
So, I mean I've, I -- I do hope that the standard becomes
widely adopted and I -- I think it's obviously a great -- it's a great bonus
and a benefit for some of the professionals as well, to kind of get trained in
some of these areas, so I hope people look at this and learn a little bit more.
So, thank you very much for your time today.
I'm gonna put a bunch of information around ISO 8000 on the
site as well but obviously if people have any further questions, then either
send -- send them directly to me at editor@dataqualitypro.com and obviously you
can contact the -- Peter at ECCMA but we -- we will give plenty of links and
information so people can find out more.
But, once again thank you Peter, for your time today and
thank you all
PETER BENSON: Thank you
DATA QUALITY PRO: Thank you everyone for attending and
posting a few questions in and I look forward to the next webinar.
Thank you.
END