How do you manage Customer and Supplier entities on a logical model in a way that doesn’t lead to data quality issues?
John Owens is a long-time contributor to Data Quality Pro and sits on our expert panel. He is an expert in business and systems modelling, creator of the Integrated Modelling Method and currently lives in New Zealand.
I asked John to expand on an earlier post he had written on his own website, Logical Data Model for Customer & Supplier, because he believes it raises a number of data quality issues for companies that get it wrong.
Data Quality Pro: Why should Customer & Supplier entities never appear on a Logical Data Model?
John Owens: The short answer is because they are derivable data entities. This means that ‘Customer’, ‘Supplier’ and indeed ‘Employee’ are already contained in the data within the enterprise and, as such, can be derived from that data. ‘Customer’, ‘Supplier’ and ‘Employee’ are in fact derivable roles that are played by that pivotal core entity called ‘Third Party’.
Data Quality Pro: What kind of problems does this create from a data quality perspective?
John Owens: It creates a lot of problems because it results in a lot of duplicated data about legal entities, be they individuals or corporations, with whom the enterprise has entered into commercial, and therefore legal, agreements. Any duplication or confusion regarding the principals in these agreements puts the business at both legal and financial risk.
However, let’s take a step back to see how the split of these three roles developed into the total separation that exists today in far too many enterprises.
It has its roots in legacy systems that were built in departmental silos. The Sales Department would build (or buy) a system to manage customers. The Purchasing Department in its silo would do the same for suppliers and the HR or Personnel Department the same for employees.
Twenty years ago this seemed a sensible approach as none of these departments ever interacted with each other and Customer, Supplier and Employee were entirely separate things, weren’t they? If they were not, tough luck, as that was the way vendors structured their software and, if you asked the IT department to build an integrated system, you would have a very long wait for a system that would cost the Earth and, by the time it was implemented, be totally out of date!
When the concept of corporate processes that transcended the traditional silo mentality of large enterprises was introduced, it became very clear (for those with eyes to see) that the apparent separation of Customer, Supplier and Employee was an illusion.
People or organisations to whom we sold things (Customers) could also be people or organisations from whom we purchased things (Suppliers), and these people could also be people employed by us (Employees)! So they were one and the same thing: a legal entity other than the business, i.e. a Third Party. The terms ‘Customer’, ‘Supplier’ and ‘Employee’ simply described relationships we had with the Third Party, depending on what commercial transaction we had had with them.
Logical Data Model, created by John Owens
Data Quality Pro: Many companies will obviously buy off-the-shelf software, for example using Salesforce.com to manage their customer data. Are Logical Data Models still relevant to them?
John Owens: An essential step when evaluating off-the-shelf software prior to purchase is to carry out a gap analysis against the Logical Data Model. Most enterprises fail to do this. They have a checklist of features and functions that they tick off as they evaluate each piece of software. What they fail to do is answer the critical question, “Will this software support our overall data needs?”
This question needs to be answered in two ways:
1) Will it support our data structure? and
2) Will it enable us to map its view of, say, Customer to our corporate definitions of Third Party?
If the answer to 1) above is “No”, then this is a show stopper and the application needs to be struck off the list.
If the answer to 2) is “No”, then this presents a large risk to the business in terms of data integrity and quality. Management then need to decide whether or not the overall benefits of the package outweigh the risks.
Data Quality Pro: Why is Option 1 regarding data structure a show stopper?
John Owens: Most applications or packages allow you to add and define additional fields for the attributes of entities, giving the impression that they are fully configurable. What most people fail to realise is that it is not the lack of attributes that causes failure but the absence of relationships between entities or, in database terms, foreign keys between tables.
If the software does not support the business structures in terms of all of the required relationships, then no matter how many features it has, it will not support the business needs. Ultimately it will be an operational and financial failure.
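As a rough illustration of this kind of gap analysis, the sketch below compares the relationships required by a Logical Data Model with those a candidate package supports. The entity names, the `(child, parent)` representation and the `relationship_gaps` helper are all hypothetical, chosen only to make the idea concrete.

```python
# Hypothetical sketch: compare the relationships required by the Logical Data
# Model with the foreign keys a candidate package actually supports.

# Each relationship is written as (child_entity, parent_entity).
# All entity names below are illustrative assumptions.
required_relationships = {
    ("sales_order", "third_party"),
    ("purchase_order", "third_party"),
    ("employment_contract", "third_party"),
}

# Relationships found in the candidate package's schema (assumed example).
# This package keeps separate customer and supplier tables.
package_relationships = {
    ("sales_order", "customer"),
    ("purchase_order", "supplier"),
}


def relationship_gaps(required, supported):
    """Return the required relationships the package lacks, sorted."""
    return sorted(required - supported)


gaps = relationship_gaps(required_relationships, package_relationships)
for child, parent in gaps:
    print(f"Missing relationship: {child} -> {parent}")
```

In this made-up case none of the required relationships to Third Party exist in the package, and adding extra attribute fields would not close that gap.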
Data Quality Pro: You say that “From the above model we can derive Data Views for Customer, Supplier, Employee and, over and above that, Creditor and Debtor. This is because all of these are derivable roles for the true Data Entity of Party.” What do you mean by a “data view”, and can you explain the process of creating these data views for Customer, Supplier and Employee so that companies can get this right?
John Owens: Some relational database management systems (RDBMS), for example Oracle, allow you to define “Views”. These Views can be accessed by applications in the same way as tables, but they are actually made up of data extracted through queries on the database based on particular criteria.
So, to create a Data View of ‘Customer’ you would simply do a selection of all Third Parties to whom the enterprise had sold products or services during the period in question.
A View of Supplier would be all Third Parties from whom the enterprise had purchased products or services during the period in question.
Employee would be any Third Party with whom the enterprise had a contract of employment that was current during the period in question.
Similar queries would produce Views for Debtors and for Creditors.
As this shows, once the core entity of Third Party is put in place, all of these others become derivable entities and the information about them needs to be held once and once only, thus avoiding duplicated and contradictory data.
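The views described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration rather than the model from John's post: the table and column names, the sample rows and the fixed 2023 reporting period are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Third Party is held once; every other table refers to it (assumed schema).
CREATE TABLE third_party (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE sale (
    id INTEGER PRIMARY KEY,
    third_party_id INTEGER NOT NULL REFERENCES third_party(id),
    sale_date TEXT NOT NULL
);
CREATE TABLE purchase (
    id INTEGER PRIMARY KEY,
    third_party_id INTEGER NOT NULL REFERENCES third_party(id),
    purchase_date TEXT NOT NULL
);
CREATE TABLE employment_contract (
    id INTEGER PRIMARY KEY,
    third_party_id INTEGER NOT NULL REFERENCES third_party(id),
    start_date TEXT NOT NULL,
    end_date TEXT
);

-- Roles are derived as views; the period is fixed to 2023 for illustration.
CREATE VIEW customer AS
SELECT DISTINCT tp.id, tp.name
FROM third_party tp JOIN sale s ON s.third_party_id = tp.id
WHERE s.sale_date BETWEEN '2023-01-01' AND '2023-12-31';

CREATE VIEW supplier AS
SELECT DISTINCT tp.id, tp.name
FROM third_party tp JOIN purchase p ON p.third_party_id = tp.id
WHERE p.purchase_date BETWEEN '2023-01-01' AND '2023-12-31';

-- A contract is current if it overlaps the period at all.
CREATE VIEW employee AS
SELECT DISTINCT tp.id, tp.name
FROM third_party tp JOIN employment_contract ec ON ec.third_party_id = tp.id
WHERE ec.start_date <= '2023-12-31'
  AND (ec.end_date IS NULL OR ec.end_date >= '2023-01-01');
""")

# Sample data (invented): three Third Parties, some playing several roles.
conn.executemany(
    "INSERT INTO third_party (id, name) VALUES (?, ?)",
    [(1, "Acme Ltd"), (2, "Jane Smith"), (3, "Bolt Supplies")],
)
conn.executemany(
    "INSERT INTO sale (third_party_id, sale_date) VALUES (?, ?)",
    [(1, "2023-03-01"), (2, "2023-05-10")],
)
conn.executemany(
    "INSERT INTO purchase (third_party_id, purchase_date) VALUES (?, ?)",
    [(1, "2023-04-01"), (3, "2023-06-01")],
)
conn.execute(
    "INSERT INTO employment_contract (third_party_id, start_date, end_date) "
    "VALUES (?, ?, ?)",
    (2, "2020-01-01", None),
)


def names_in(view):
    """List names in one of the fixed role views (never user input)."""
    rows = conn.execute(f"SELECT name FROM {view} ORDER BY name").fetchall()
    return [row[0] for row in rows]


for view in ("customer", "supplier", "employee"):
    print(view, names_in(view))
```

In this sample, Acme Ltd appears as both a customer and a supplier, and Jane Smith as both a customer and an employee, yet each name is stored only once in `third_party`.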
Data Quality Pro: Do you think this is why so many companies are relying on Master Data Management and Single Customer View solutions, to resolve these inherent deficiencies in their modelling and application design?
John Owens: It is. However, they are also relying on what is, for most enterprises, flawed Master Data Management (MDM).
An essential tool in MDM is the Logical Data Model. The fact is that without a Logical Data Model, no enterprise can be said to be seriously doing MDM!
A look at a Logical Data Model, one properly constructed using the Dead Crows Fly East layout, would immediately show MDM practitioners that Customer and Supplier are in fact the same thing, that is, derivable roles for Third Party. With a little more work Employee would be seen to also fall into this category.
What is happening is that too many data quality managers and practitioners are relying on technology to try to provide answers rather than developing their essential core skills, techniques and disciplines. In any field other than computing and data quality, this would be seen as crazy behaviour.
Nobody in electrical engineering, for instance, would assume that the latest gizmo, like a computer controlled soldering iron – one that could automatically detect each terminal type and preset the temperature – would enable a person to build a sophisticated piece of electronic equipment if the person holding the soldering iron, a) could not read a wiring diagram and b) did not possess a wiring diagram for whatever it was they were planning to construct.
The Logical Data Model is the wiring diagram for Data Quality. No LDM, no DQ!!
Data Quality Pro: This is something, as you know, that we’re passionate about here on Data Quality Pro: explaining all of the supporting disciplines that help organisations break this continual cycle of data quality failures. What other disciplines and practices do you feel are absolutely imperative for companies to adopt in order to prevent long-term data quality failures?
John Owens: Data quality is, contrary to popular belief, relatively simple to achieve, provided you start in the right place and follow some fundamental rules and critical truths.
- The only data needed in an enterprise is that required to support the Business Functions of the enterprise. There are no exceptions to this. Therefore, Data Quality starts with knowing and modelling the Business Functions. This also brings a stark truth: without modelling its Business Functions, an enterprise can never achieve Data Quality.
- There are no such things as Data Rules! Rather, there are business rules that require or constrain particular values for the attributes of data entities when carrying out a specific Business Function. These rules, together with the function’s logic, will be an integrated part of the Business Function in question.
- Data Structure Comes from Function. All data entities in an enterprise, their attributes and their relationships with other entities are, and must always be, derived from the same source materials as were the Business Functions.
- UIDs Are Not Codes. It is essential to clearly identify and document the Unique Identifiers (UIDs) of each Data Entity. This is the only sure way to eliminate data duplication. Incidentally, UIDs will never be codes or unique keys.
The above four rules are the foundation for Data Quality at the business level.
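One possible reading of the fourth rule can be sketched with `sqlite3`: a system-generated code does nothing to stop the same legal entity being recorded twice, whereas a documented business identifier does. The choice of country and registration number as the identifying attributes, the table names, and the use of a database uniqueness constraint to enforce them are all assumptions of this sketch, not something the interview specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Identified only by a generated code (hypothetical table).
CREATE TABLE party_code_only (
    code INTEGER PRIMARY KEY,
    legal_name TEXT NOT NULL,
    country TEXT NOT NULL,
    registration_number TEXT NOT NULL
);
-- Same columns, plus an assumed business identifier enforced as unique.
CREATE TABLE party_with_uid (
    code INTEGER PRIMARY KEY,
    legal_name TEXT NOT NULL,
    country TEXT NOT NULL,
    registration_number TEXT NOT NULL,
    UNIQUE (country, registration_number)
);
""")

party = ("Acme Ltd", "NZ", "1234567")  # invented sample data

# With only a code, the same legal entity is accepted twice under two codes.
for _ in range(2):
    conn.execute(
        "INSERT INTO party_code_only (legal_name, country, registration_number) "
        "VALUES (?, ?, ?)",
        party,
    )
duplicates = conn.execute("SELECT COUNT(*) FROM party_code_only").fetchone()[0]

# With the business identifier enforced, the second insert is rejected.
insert_with_uid = (
    "INSERT INTO party_with_uid (legal_name, country, registration_number) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert_with_uid, party)
try:
    conn.execute(insert_with_uid, party)
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("rows in code-only table:", duplicates)
print("duplicate rejected when UID enforced:", rejected)
```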
Will they give data quality on their own? No. The enterprise still has to ensure that the systems it builds and its daily procedures manage its data in a way that conforms to the requirements defined by these rules. But that is part of doing business day-to-day.
However, if an enterprise does NOT do all of the above, then the chances of it creating and maintaining, through its day-to-day activities, data that is of high quality and in a form that will support its business needs are very low.
It is also worth noting that, if the above rules have not been applied, then no amount of post-creation data manipulation and cleansing will move the data any closer to what is required, because what is required has never been defined.
Please refer to the following posts for additional advice on this topic:
- There’s No Such Thing as Customer
- Logical Data Model for Customer and Supplier
- Customer & Supplier are Roles – But What Type?
John Owens is passionate about bringing simplicity, power and elegance to the world of Business Systems Analysis, Business Process Modeling and BPM.
He is an international consultant and mentor to a wide range of enterprises of all sizes in the UK, Ireland, Europe and New Zealand.