How do you help a rapidly growing company get started with data governance so that it supports the strategic aspirations of the leadership?
In this interview I speak with Fred Kimmelbein, an Enterprise Data Architect for Scripps Networks Interactive. Fred is currently tasked with helping to mature a data governance strategy across the entire organisation.
Fred shares a number of practical techniques for getting a data governance program moving, setting targets for data quality and mapping the needs of the business into data governance requirements that everyone can understand and buy into.
Dylan: Your title is Enterprise Data Architect but you’re focusing a great deal on data governance at the moment. For others looking to make similar transitions, how has your data architect background helped you create success with data governance projects?
Fred: Data Architecture at the enterprise level is about understanding business data: where it is, who is using it and how it is being used. With that understanding, and the drivers to create new data and find new sources of information, you must work with the business to discover what they want and how they want it. This helps you create conceptual models for how the business operates.
Architecture and understanding the business data is just one part of overall Data Governance; though it touches many significant elements of Governance, it is generally limited to the design aspects of data. I moved into this area here at Scripps specifically because of a need I identified in the company to improve our maturity around Data Governance.
The company overall is relatively young and has experienced explosive growth in many areas, which is great for the company. Because of the demands of the business and that explosive growth, the company has had to be laser-focused on making things work. That was great for the first 20 or so years, but we have realized a need to improve on the current state. This is fairly typical of any company that experiences the kind of growth Scripps has, which is why we're now starting to develop Data Governance programs to improve our delivery to the business.
Dylan: On the data governance theme, what were some of the initial obstacles you faced when trying to kick off data governance for the first time, and how did you overcome them?
Fred: We are still in the early phases, so the question really is: what challenges are you facing and how do you plan to overcome them? In that light I can say that the greatest challenge is building the understanding across the enterprise that we need Data Governance.
An obstacle is just a hurdle you haven't crossed yet. So, to educate the enterprise on why Data Governance is important to overall operations, we've been using venues like lunch-and-learn sessions as well as email, and it was one of the topics at this year's annual Techtoberfest.
Starting next year we are holding a Data Summit where IT will work with the business to convey cross-departmental needs for data and how the business could take greater action to determine the path and growth of its information.
The ultimate goal is to educate the business on how data becomes information, information becomes knowledge, and knowledge enables action.
If we can achieve that one simple goal the tables will turn and Data Governance will become a business demand rather than an obstacle. We’re on track and have started the education process.
Dylan: Many of our past interviewees have expressed frustration with the term 'data governance' in that it can often alienate or confuse the business community. Have you had to use a different term, or adapt your communication approach, based on the type of audience you're trying to bring on board with data governance?
Fred: In this regard, what I've had to do is similar to any data mapping exercise. I've been asking people to describe and define the terms that make the most sense to them. For example, "I want my data to be good" implies a data quality issue and a need to focus on that area. From there I ask qualifying questions about what good data means to them and what about the data today makes it not good. I then map this internally to the MDM or Data Governance processes and capabilities that best address their needs.
When discussing things of an IT nature, it is often best to use terms the business can understand and then adjust their definitions to match known or accepted standards. This is often what trips up people in IT: speaking the language of the business is not something we often practice.
Dylan: You mentioned earlier that your current goal is to improve the maturity around Data Governance. What does that look like from a program activity perspective? You’ve mentioned rolling out a communication strategy and creating conceptual models but what are some of the other tangible gains you’ve achieved or are looking to achieve in the near term? I’m sure some of our readers getting started with data governance will find this particularly relevant.
Fred: Improving maturity in Data Governance is a multi-pronged approach that involves visiting with different groups and defining the things that matter most to them.
A fair example is where our business identified that it wanted to move the dial on integration and architecture around data. To do this, several foundational things need to occur, such as management alignment and the development of knowledge, skills and abilities (KSAs) in certain areas. For management alignment we need to identify the measurable aspects of an individual's behavior that can be shown to impact the areas where they interact with the governance processes. This allows us to improve the organization's knowledge of everyone's responsibility in regard to Data Governance, and ultimately floats the whole boat a little higher. In the end this will give us the ability to architect more efficiently as well as develop tighter integration with our data sources and federated data.
So, in short: develop KSAs that make sense, and identify the next near-term target for Data Governance that delivers value to the organization.
Dylan: Can you give an example of a past data quality project that you’re particularly proud of? Why was it so successful?
Fred: Data Quality is not about the 100% accurate delivery of data; it is about achieving the best possible result given your data capture methods, and growing toward a target that makes sense for your business.
I was working for a surety company where, through our data capture methods, we had multiple systems collecting similar data and storing it in the same repository.
Since the capture methods (different Java applications and the mainframe green screen) ultimately collected the data into one transactional system, a process had to be put in place to ensure we were not duplicating data.
Though I was not with the company when the project was completed, we had great success along the way in defining what the quality measures would look like and what was an acceptable measure of failure.
The target was 98% accuracy in matching people and businesses across the input methods, so we could guarantee that businesses seeking bonding through our agency were not given an excessive ability to impact our models for insuring them against their losses.
When we started this process, there were as many as 180 entries in the system representing a single company; by my departure we could identify no more than two occurrences of the same company represented in different ways. This was a significant achievement, considering the applications in use handed the data off to each other in a circular fashion and case (all upper vs. camel case) was an important factor in handling the data.
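The kind of consolidation Fred describes can be sketched in a few lines. This is a hypothetical illustration, not the surety company's actual process: it normalizes case, punctuation and whitespace before comparing names, then uses a simple similarity ratio to group near-duplicates. The function names and the 0.9 threshold are assumptions made for the example.

```python
# Illustrative sketch only: collapse duplicate company records that differ
# in case, punctuation, or spacing, the kind of variation that arises when
# a mixed-case Java app and an all-caps green screen feed one system.
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Canonical form: lowercase, strip punctuation, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name)
    return " ".join(cleaned.lower().split())

def dedupe(records, threshold=0.9):
    """Group records whose normalized names match exactly or nearly.

    Returns a list of clusters; each cluster holds the original spellings
    that would be merged into one master record.
    """
    clusters = []  # list of (canonical_name, [original spellings])
    for raw in records:
        key = normalize(raw)
        for canon, members in clusters:
            if key == canon or SequenceMatcher(None, key, canon).ratio() >= threshold:
                members.append(raw)
                break
        else:
            clusters.append((key, [raw]))
    return [members for _, members in clusters]

inputs = [
    "ACME CONSTRUCTION CO.",   # mainframe, all upper case
    "Acme Construction Co",    # Java app, mixed case
    "acme construction co.",
    "Beta Builders Inc",
]
groups = dedupe(inputs)
# The three Acme variants collapse into one cluster; Beta stays separate.
```

A production matcher would add blocking, phonetic keys, and human review for borderline scores, but the core idea (normalize first, then compare) is the same.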
Dylan: Some companies seem to head for “Five 9’s” levels of data quality, others seem to just pick an arbitrary number so I’m intrigued – how did you and the team agree on that final target figure?
Fred: Several passes of discussion occurred, and what it essentially came down to was how much off-hours support any of us wanted to put into maintaining the connections between systems. The closer that number got to zero, the more we decided the data quality needed to fit our quality-of-life considerations. No one wanted to be called at 2am to figure out why information passing between systems had created a duplicate record, so we settled on a number that felt reasonable as a measure the business could accept as "high quality".
I should also say that, after having gone through many of these exercises to determine what "high quality" means, it all boils down to this: what is the business willing to accept as serviceable information, and how accurate does the business want it at the end of the day? Combine that with how much time you want to spend irritating the support teams who have to answer the calls and federate or replicate the data, and you generally land in the right ballpark for a number everyone can live with.
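The target-setting logic described above (accept a batch when it meets the agreed figure, and wake someone up only when it misses) reduces to a trivial check. The function name, the 98% default, and the sample numbers below are illustrative assumptions, not details from the interview.

```python
# Hypothetical quality gate of the kind a 98% match-accuracy target implies:
# compute the share of correctly matched records in a batch and flag the
# batch for off-hours review only when it misses the agreed target.
def quality_gate(matched: int, total: int, target: float = 0.98):
    """Return (accuracy, needs_review) for a batch of cross-system matches."""
    accuracy = matched / total if total else 1.0
    return accuracy, accuracy < target

# Example batch: 4,920 of 5,000 records matched correctly.
acc, page_support = quality_gate(matched=4_920, total=5_000)
# acc is 0.984, above the 0.98 target, so no one gets paged.
```

The interesting work, as Fred notes, is not the arithmetic but the negotiation over what `target` should be.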
Dylan: Finally, on the Data Governance side again, you mentioned in the pre-interview that you had earlier created a central data repository for a large mortgage company. Can you explain the process you followed and the reason for implementation?
Fred: The idea of a central data repository was based on the need to give accurate and timely responses to mortgage companies that used my company for backing their mortgages as securities.
When I started with this company, we had a multitude of interrelated systems that used live replication to move data between them. That might seem like a foreign concept to some, but in the early 1990s tools to move data were in their infancy and could not give us the capability we needed, so we used the native engine in the database we had at the time.
The idea of collecting some of the moving parts into a centralized data store came from a research project many of us worked on at the time. We named it F.O.D., for Field of Dreams, after the Kevin Costner movie that was contemporary to our work. We chose the name because we knew that unless we built and proved the capability, there would be no effort to change the current model, and something absolutely needed to be done. This was a collection of data and systems people who had a "skin in the game" mindset.
Much of this work was done without really putting a program together around it; free-time investment, or "extra effort", was considered the norm at that time for people like us. That work ethic appears to be less evident today.
In any case, we did this knowing the approach was risky and that unless we built it, the business would not come to the conclusion that centralizing was necessary. The business would eventually suffer from the amount of data flowing through the systems and the complexity involved in putting the information together to deliver an automated answer.
Another driver was the on-call schedule. To be fair, it was a reasonable schedule, but everyone hated the Sybase on-call week, not due to the database engine itself, which was easy to deal with, but because we had built a tightly interwoven monster that was difficult to support.
About Fred Kimmelbein
An industry-leading Enterprise Data Architect focused most recently on Data Governance and Library Sciences, currently employed at Scripps Networks Interactive. Fred has 25 years of experience in data systems, engineering, architecture and modeling, having designed systems for CNA Surety, American Airlines, Freddie Mac, Canadian Airlines and Travelocity.
Fred has been published in The Journal of Imaging Services ("How to Plan and Map Data in Knowledge Systems"). He is a regular contributor to the LinkedIn groups for TDWI, DAMA and IBM Data Governance, and is currently pursuing a Bachelor's Degree in Media and Library Sciences.
His philosophy on data: it's everyone's responsibility to make it clean, timely and cost-effective. Decisions made regarding data should benefit the organization as a whole. Everyone should have access to the information necessary to perform their function. Data should be defined consistently throughout the organization.
You can find Fred Kimmelbein on LinkedIn:
Image rights: cc Flickr