Beginners Guide to Master Data Management (MDM)

by Dylan Jones, Editor

Unless you've been living under a rock, you will have heard of Master Data Management (MDM), the information management discipline that presents great opportunities for data quality and data governance professionals.

Underpinning MDM is the need for an effective data quality management strategy and an appropriate toolset. With so many organisations dipping their toes into the choppy waters of MDM, we thought it high time to provide an overview for those getting started or wanting to learn more.

So what exactly is MDM?

The first stumbling block you'll face with MDM comes when your peers or CEO ask you to explain yet another mystic three-letter acronym to emerge from the world of data.

If you're looking for a simple explanation then this list provides some of the most commonly accepted definitions of MDM.

A set of disciplines, processes and technologies, for ensuring the accuracy, completeness, timeliness and consistency of multiple domains of enterprise data - across applications, systems and databases, and across multiple business processes, functional areas, organizations, geographies and channels.

Dan Power 
CEO, Hub Designs

The set of disciplines and methods to ensure the currency, meaning, quality, and deployment of a company’s reference data within and across subject areas

Jill Dyché
Vice President, SAS Best Practices

MDM is the practice of defining and maintaining consistent definitions of business entities, then sharing them via integration techniques across multiple IT systems within an enterprise and sometimes beyond to partnering companies or customers

Philip Russom Ph.D. 
Industry Analyst, TDWI

MDM is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise’s official shared master data assets.

Master data is the consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and chart of accounts.


Confused by the difference in MDM definitions?

Well, you're not alone.

If you spend time on the forums and communities that focus on MDM-related subjects, you'll realise that MDM is still in its infancy compared to other disciplines. It is maturing rapidly, but disagreements on what constitutes MDM are not uncommon.

If you are trying to provide a definition to members of your organisation you will certainly need to choose your words carefully.

Jill Dyché picked up on the importance of this in an article for Enterprise Systems, quoting Brian Rensing from Procter & Gamble, who commented that:

"You have to start MDM by attacking a problem that hurts people in their day-to-day jobs -- and use easy words."

Use a definition that meets your needs; don't go for a complex explanation that is meaningless to your audience.

Common MDM themes

Despite the confusion over a clear definition of MDM, some constant themes come through when you research the thought-leaders and publications focused on the subject:

  • MDM is focused on Master or Reference Data (yes an obvious point but important to make the distinction with other information such as transactional data)
  • Certain dimensions of data quality are critical to enabling effective MDM (e.g. timeliness, accuracy, completeness, meaning)
  • Continuous data improvement and a well-managed data quality strategy are essential
  • Technology is a key component in providing an MDM platform, but MDM requires a whole lot more than just a technical solution; data governance in particular is essential and will be the subject of a separate article
  • Harmonizing and synchronising multiple data items is extremely important in creating a "single version of the truth" for your business objects
  • MDM typically delivers a "hub" infrastructure to source and distribute master data
  • Creating a single, shared reference point for a business entity is pivotal
  • Fostering cross-organisational commitment and change management through a data governance framework is also essential
  • MDM is in its infancy as a discipline: there are relatively few experienced practitioners, it can be tough to implement and it can take significant effort to achieve buy-in at a senior level

If the last point hasn't dulled your enthusiasm let's explore some more of the key elements of MDM.

What is meant by Master Data and Reference Data?

Every organisation typically has data on customers, products, employees and physical assets but these data items are seldom held in one location.

They are often physically scattered around the business in various applications, spreadsheets and even physical media such as paper files and printed reports. What makes matters worse is that different parts of the business will have different concepts and definitions for the same business entity and relationship.

For example, an employee may be recorded in an employer's payroll, HR, training and expense management systems, but back in the real world they are the same person.

Typical examples of master data include (sourced from Master Data Management by David Loshin):

Customers, Employees, Vendors, Suppliers, Parts, Products, Locations, Contact Mechanisms, Profiles, Accounting Items, Contracts, Policies.
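To make the scattered-employee example concrete, here is a minimal sketch of grouping records from several systems under one real-world person. The system names, field names, sample records and the choice of a normalised email address as the match key are all assumptions made purely for illustration; real matching usually needs far richer rules.

```python
from collections import defaultdict

# Hypothetical records for employees, held in four separate systems.
records = [
    {"system": "payroll", "name": "Jane Smith", "email": "Jane.Smith@example.com"},
    {"system": "hr", "name": "SMITH, JANE", "email": "jane.smith@example.com "},
    {"system": "training", "name": "J. Smith", "email": "jane.smith@example.com"},
    {"system": "expenses", "name": "Jane Smith", "email": " JANE.SMITH@EXAMPLE.COM"},
    {"system": "payroll", "name": "Bob Jones", "email": "bob.jones@example.com"},
]


def match_key(record):
    """Normalise the email so trivial formatting differences do not block a match."""
    return record["email"].strip().lower()


def group_by_entity(source_records):
    """Group source records that appear to describe the same real-world person."""
    groups = defaultdict(list)
    for record in source_records:
        groups[match_key(record)].append(record["system"])
    return dict(groups)


entities = group_by_entity(records)
for key, systems in entities.items():
    print(key, "is held in:", ", ".join(systems))
```

Five source records collapse to two people here, which is the kind of consolidation a master data record is meant to capture.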

What is the difference between Master Data and Reference Data?

Malcolm Chisholm, an expert author on reference data, explains that Reference Data is

"...any kind of data that is used solely to categorize other data found in a database, or solely for relating data in a database to information beyond the boundaries of the enterprise."

External data, therefore, is a typical form of reference data, whereas standard business objects such as customer, employee, parts and so on are classed as master data. When building MDM strategies, external data becomes incredibly important for creating a surrogate source of "truth". Some people consider reference data (such as standardised lists of values) to be one type (or domain) of master data.

As Dan Power commented in a recent blog post:

"...when you don’t know what you don’t know, having an external content provider can be a big help."

External data can play a major part in ensuring that you have a surrogate source of data to validate and enrich your existing business entities.
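As a rough illustration of reference data being used "solely to categorize other data", the sketch below checks customer records against a small list of country codes. The code list, customer rows and function name are hypothetical; in practice the list would come from an external or centrally governed source.

```python
# Hypothetical reference data: a tiny extract of two-letter country codes.
COUNTRY_CODES = {"GB": "United Kingdom", "US": "United States", "DE": "Germany"}

# Hypothetical master data rows that use the reference data to categorise customers.
customers = [
    {"id": 1, "name": "Acme Ltd", "country": "GB"},
    {"id": 2, "name": "Globex", "country": "UK"},  # not in the reference list
    {"id": 3, "name": "Initech", "country": "US"},
]


def rows_with_unknown_country(rows, reference=COUNTRY_CODES):
    """Return the ids of rows whose country code is missing from the reference data."""
    return [row["id"] for row in rows if row["country"] not in reference]


unknown = rows_with_unknown_country(customers)
print("Customers needing a country fix:", unknown)
```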

Why is data quality so critical to MDM?

If we read through some of the definitions above we can see obvious references to data quality.

We could also take the viewpoint that MDM is in itself a component of an information quality strategy because it resolves many of the issues that plague a typical information quality framework (e.g. lack of timely data, duplicates etc.).

MDM pulls together multiple data items that relate to the same logical object, and herein lies a problem that members of our sister site Data Migration Pro often face when undertaking system consolidation exercises.

There is typically no agreement on how common data items should be stored, so when we try to combine disparate records for the same business entity we often have to make arduous decisions on which source to select as the most trusted and accurate.

However, the problem for MDM is even greater. On a data migration project, for example, we can have many months to crack the data problem, but we simply don't have that luxury in MDM initiatives.

MDM relies on near real-time data consolidation, so these complex rules often need to be hard-wired into the infrastructure, which gives some indication of just how complex MDM can be to implement.
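One way to picture such hard-wired rules is a per-attribute ranking of source systems, with the most trusted non-empty value winning. The sketch below assumes hypothetical source names, attributes and rankings; it is an illustration of the idea, not a description of how any particular MDM product works.

```python
# Hypothetical trust ranking per attribute: sources listed first are trusted most.
SOURCE_PRIORITY = {
    "address": ["crm", "billing", "web"],
    "phone": ["billing", "crm", "web"],
}


def build_golden_record(candidates, priority=SOURCE_PRIORITY):
    """For each attribute, keep the first non-empty value from the most trusted source."""
    by_source = {candidate["source"]: candidate for candidate in candidates}
    golden = {}
    for attribute, sources in priority.items():
        for source in sources:
            value = by_source.get(source, {}).get(attribute)
            if value:  # skip missing, None and empty-string values
                golden[attribute] = value
                break
    return golden


# Three hypothetical versions of the same customer.
candidates = [
    {"source": "web", "address": "1 High St", "phone": "555-0100"},
    {"source": "billing", "address": "", "phone": "555-0199"},
    {"source": "crm", "address": "1 High Street, Leeds", "phone": None},
]

golden = build_golden_record(candidates)
print(golden)
```

Here the address survives from the CRM record and the phone number from billing, because each is the most trusted source that actually holds a value.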

Another thorny issue is the subject of 'MDM silo politics' as Jill Dyché, discussing MDM in Enterprise Systems, points out:

"People have truly realized that knowledge is indeed power, so they don't want to give up their data or collaborate in its re-definition or consolidation. They'd rather maintain control of their dirty, poorly-defined, incomplete customer records!"

(Dan Power has also written a good overview of the political aspects of MDM.)

If you've ever tried to foster cross-organisation commitment for any kind of data management initiative you'll know how tough it can be to get senior agreement and commitment to cooperate. Stakeholders may perceive MDM as undermining their authority by devolving ownership to a more enterprise-wide structure.

Data Governance for MDM is pivotal

Without data governance there is little chance of MDM succeeding, so it makes perfect sense to build out an MDM strategy only when you have a well-managed data governance framework in place covering the relevant business subject areas.

The fact that data quality is so integral to MDM is clearly of great benefit to Data Quality Pro members who have data quality skills and are keen to progress their careers. They now have an additional and growing sector that is desperate for these skills.

What technology is required for MDM?

In a recent blog post, Dan Power mentions five components of MDM. Three of these relate to the technology element of MDM:

  • A "hub"
  • Data integration or middleware
  • Data quality capabilities


According to Dan, hubs come in three flavours:

  • A persistent hub takes all of the business critical data into the hub from the source system.
  • In a registry hub only the identifying information and key record identifiers are copied to the hub.
  • In a hybrid hub an element of both options is used allowing more fine-grained control of what goes into the hub.
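The difference between the three hub styles can be sketched as a question of which attributes get copied from a source record. The field names, and the choice of which fields count as "identifying" or as hybrid extras, are assumptions for illustration only.

```python
from enum import Enum


class HubStyle(Enum):
    PERSISTENT = "persistent"
    REGISTRY = "registry"
    HYBRID = "hybrid"


# Hypothetical choices: which fields identify a person, and which extras a hybrid hub keeps.
IDENTIFYING_FIELDS = {"name", "date_of_birth"}
HYBRID_EXTRA_FIELDS = {"email"}


def hub_entry(source_system, source_key, record, style):
    """Return what the hub would hold for one source record under the given style."""
    if style is HubStyle.PERSISTENT:
        kept = set(record)  # copy all of the record's data
    elif style is HubStyle.REGISTRY:
        kept = IDENTIFYING_FIELDS  # identifying information only
    else:
        kept = IDENTIFYING_FIELDS | HYBRID_EXTRA_FIELDS  # selected extras as well
    return {
        "source_system": source_system,
        "source_key": source_key,  # key record identifier back in the source
        "attributes": {k: v for k, v in record.items() if k in kept},
    }


record = {
    "name": "Jane Smith",
    "date_of_birth": "1980-02-14",
    "email": "jane.smith@example.com",
    "credit_limit": 5000,
}

for style in HubStyle:
    entry = hub_entry("crm", "C-1001", record, style)
    print(style.value, "->", sorted(entry["attributes"]))
```

In every style the hub keeps the source system and source key, so a registry hub can still point back to the full record where it lives.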

Gartner and the MDM Institute have published similar definitions of hub architectures.

Data integration or middleware

Dan highlights the need to synchronise data across the disparate system landscape.

There is also a need to synchronise any data quality improvements that take place so that the benefits are maintained and quality is continuously improved. There are also various other interfacing and workflow type technologies that are incorporated in a typical MDM "stack" structure.

Data Quality Tools

According to Berson/Dubov, these fall into a number of categories, including:

  • Data Quality Auditing
  • Data Quality Cleansing
  • Data Quality Parsing/Standardisation
  • Hybrids

The hybrid tool contains elements of the other data quality functions and may also incorporate ETL (extract/transform/load) capabilities.

The other functions are typical of most data quality initiatives.
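As a small taste of what auditing and parsing/standardisation involve, the sketch below measures how complete a field is and tidies up its values. The sample rows are invented, and the postcode rule is a deliberately simplified, UK-style assumption rather than a full validation.

```python
import re


def completeness(rows, field):
    """Auditing: the share of rows in which the given field is populated."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if (row.get(field) or "").strip())
    return filled / len(rows)


def standardise_postcode(raw):
    """Standardisation: upper-case the value and put one space before the last three characters."""
    compact = re.sub(r"\s+", "", raw).upper()
    if len(compact) <= 3:
        return compact  # too short to split; leave for manual review
    return f"{compact[:-3]} {compact[-3:]}"


# Hypothetical customer rows with inconsistent or missing postcodes.
rows = [
    {"postcode": "sw1a1aa"},
    {"postcode": ""},
    {"postcode": None},
    {"postcode": " m1  1ae "},
]

score = completeness(rows, "postcode")
cleaned = [standardise_postcode(row["postcode"]) for row in rows if row["postcode"]]
print(f"Postcode completeness: {score:.0%}")
print(cleaned)
```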

Why is MDM becoming so popular?

Citing the open source MDM Solution Offering available on MIKE2.0, the following reasons are provided for the rise in popularity of MDM:

  • MDM issues impact the business. What is a business without its customers, its products and its employees? Master data is some of the most important data that an organisation holds and there is no choice but to fix the issues of the past; even minor issues with master data cause viral problems when propagated across a federated environment. A recognition that enterprise MDM defines competitive advantage has grown significantly in the last decade.
  • Increasing complexity and globalization. Master Data Management really hits right to the point of the drivers for an Information Development approach. Organisations are becoming increasingly federated, with more information and integration globally than ever before. Reducing the complexity is essential to a successful approach. Globalization led to a variety of additional problems and complications from the data management perspective. This includes multi-lingual and multi-character set issues, and 24x7 data availability needs driven by global operations. The number of channels through which enterprises receive and provide information has also grown significantly with the recent evolution of the Internet and voice recognition technologies.
  • All sides see a major opportunity. MDM is a big, complex problem and is therefore an opportunity for product vendors and systems integrators. New MDM technologies referred to as MDM data hubs have been developed. Even though the data hubs may look like their predecessors Operational Data Stores (ODS), modern data hub technologies are SOA enabled and leverage a number of other modern technologies not commonly used by the old traditional ODS. As the problem is an information management problem, every information management vendor has a "solution". Application-centric vendors (which started the MDM trend) also see this as a major opportunity to expand their integration and application scope. Organisations with MDM issues are doing a variant of the same approach: they face a variety of challenges in the information management space and this provides them with a collective way to frame the problem. This situation is similar to that which arose with compliance initiatives a few years ago.
  • Compliance initiatives add corporate pressure. Driven by the War on Terror and corporate scandals in the US, compliance initiatives have put additional pressures on enterprises. Without a sound MDM solution, enterprises face increasingly difficult problems in supporting evolving regulatory requirements.

What are the challenges presented by MDM?

The hurdles to overcome in delivering an MDM initiative are very similar to those that many of our members on Data Migration Pro will no doubt have witnessed. Again citing the MDM Solution Offering available on MIKE2.0:

  • Complexity: Organisations typically have complex data quality issues with master data, especially with customer and address data from legacy systems
  • Overlap: There is often a high degree of overlap in master data, e.g. large organisations storing customer data across many systems in the enterprise
  • Modelling: Organisations typically lack a Data Mastering Model which defines primary masters, secondary masters and slaves of master data and therefore makes integration of master data complex
  • Standards: It is often difficult to come to a common agreement on domain values that are stored across a number of systems, especially product data
  • Governance: Poor information governance (stewardship, ownership, policies) around master data leads to complexity across the organisation

Research from other experts in the field also leads to other more practical concerns:

  • Finding skilled practitioners/implementation partners to help deliver projects
  • Technology selection: MDM is an emerging market with lots of vendor movement, so it is difficult to select one outright "winner"
  • Creating a compelling business case and getting senior sponsorship is always tough
  • Deciding where to start, prioritisation and focus are often a challenge
  • Educating the business on why MDM is so important
