Big Data Governance by Sunil Soares, a book review

Sunil Soares has carved a niche for himself as a thought-leader and practitioner within the world of Data Governance.

His first book, “The IBM Data Governance Unified Process”, provided a detailed account of the people, processes, and software tools required to implement a data governance program. The book focused on a framework of 14 primary steps that were in turn broken down into 100 sub-activities for practitioners to follow.

No doubt many practitioners struggle with the question of how to get buy-in for data governance. Sunil tackled that very problem with his follow-up book: “Selling Information Governance to the Business: Best Practices by Industry and Job Function”.

Sunil’s third book, released recently, focuses on another data governance challenge that organisations increasingly face: how do we govern immense volumes of unstructured, semi-structured or structured data?

In his latest book, “Big Data Governance: An Emerging Imperative”, Sunil presents a series of approaches aimed at companies that already have a big data initiative in place. He makes an excellent point early in the book: because big data is inherently prone to errors, duplication and other data challenges, it becomes all the more important to govern it with effective controls and processes.

The book is structured into 5 parts. Part 1 provides some early advice for getting started, Part 2 focuses on big data governance disciplines, Part 3 examines the governance of big data types, Part 4 looks at big data governance in some important industries and Part 5 looks at big data technology architectures.

The 4 V’s of Data Governance – Volume, Variety, Velocity (and Value)

Part 1 opens with an introduction to Big Data Governance and makes a good case for extending the traditional 3 V’s of Data Governance (volume, variety and velocity) by adding a fourth element, Value. This makes a lot of sense because the initial problem companies are going to face with big data is determining exactly where to get the “best bang for the buck”.

Sunil provides an early definition of Big Data Governance that is perhaps the most comprehensive I’ve witnessed to date. Importantly, he points out that Big Data Governance is merely part of a wider information governance program and should not therefore be seen as some kind of maverick, standalone initiative.

The book then outlines a Big Data Governance framework built around 3 dimensions: Big Data Types, Information Governance Disciplines, and Industries and Functions.

Cross-Sector Big Data Governance Focus

Still in Part 1, the book introduces current insights into the types of big data and associated challenges found in a range of industries. I was impressed by the depth Sunil goes into as he explores the current big data types in industries such as telecoms, retail, healthcare, banking, insurance, transport and a number of others. This is a great section to show your business leaders, as it really brings the topic alive when you see just how far-ranging big data is becoming across sectors from Education to Transportation.

From there, Sunil takes the reader on a tour of the many business functions that are impacted by big data, outlining many examples of what happens without effective governance in place.

Big Data Governance Maturity Assessment Model Included

Chapter 3 provides a maturity assessment of your big data governance capability. Here Sunil leans on the IBM Information Governance Council Maturity Model (you may remember our earlier interview with Steven Adler that covered the IGC) to describe the 11 categories of information governance maturity.

Sunil Soares, author of Big Data Governance: An Emerging Imperative

However, Sunil has extended the model to specifically cover big data goals, enablers and disciplines. The chapter effectively becomes a blueprint for measuring your own ability to deal with big data and will probably become one of the most practical sections of the book.

Creating The Big Data Governance Business Case

One of the most important sections for many companies is the one on creating a business case for data governance, in this case within the big data domain. Sunil provides a range of different approaches to developing a business case, and what makes this section really valuable is the short case study provided for each approach. He gives examples of how geospatial data, clickstream data and smart meter data can be incorporated into a business case, making it much easier for the reader to get a handle on data outside the traditional structured world of RDBMS etc.

Developing a Big Data Governance Roadmap

Big Data Governance Roadmap Included

Chapter 5 presents a strategy for developing a big data governance roadmap. Again, the roadmaps cited are based on real case study examples. You will obviously have to build your own if you are implementing a big data governance strategy, but the chapter does give you some useful insight into the timeframes over which other companies plan their data governance programs.

Part 2 of the book is where Sunil shares his extensive experience and invites other contributors to cover disciplines such as managing data quality for big data, ensuring security and implementing a data governance organisational framework.

The metadata chapter in particular is well structured with some great examples and key stages for the practitioner to follow.

The Big Data Quality chapter provides some interesting insights into the conceptual difference between big data and traditional business data. The sections on using big data to improve the quality of existing business data are particularly useful, showing the reader how big data can become part of the data quality solution and not just a challenge to tackle in its own right.

New Discipline = New Dimensions

It’s also interesting to see how new dimensions have to be introduced to cope with the inherent complexity and scale of big data. Again, the use of examples and case studies makes the topics far easier for the reader to digest. This is vital as relatively few people will have significant experience of big data to date.


I won’t go into exhaustive detail on the later chapters, but suffice to say they offer the same depth as the earlier sections, providing countless case studies, frameworks and practical steps for the practitioner to follow. It’s rare to find such a comprehensive resource covering so many topic areas, particularly for such a relatively recent discipline.

If your organisation is embarking on a big data initiative then it goes without saying that this book is mandatory reading, as it provides tremendous insight clearly born of delivery on the ground.

In parting, I would add that the book also shines a light on an exciting part of our industry that is innovating and impacting businesses, customers and practitioners at an incredible pace. In that sense the book is also visionary in nature, giving the reader a taste of what will undoubtedly unfold in many businesses. As such, the book appeals to a much wider audience than just those involved in big data.

I think anyone interested in the future of data management will gain value from reading Big Data Governance.

The Sunil Soares Data Governance “Trilogy”