Saving 1 Million Euros in the Utilities Industry with Kaizen and Data Quality
Featuring Mark Humphries
Mark Humphries is an information quality practitioner and business analyst at Essent in Belgium.
One of the toughest challenges for any company is demonstrating the financial impact of poor data quality. As a result, many companies go hunting for errors first, hoping to link them to business value later.
In this interview, past contributor Mark Humphries of Essent in Belgium shares practical advice on how to develop a complete data quality strategy that starts with customer needs and ultimately ends in major data quality improvement.
Dylan Jones: It’s been a while since we featured you on the site, Mark. What data quality work have you been involved with lately?
Mark Humphries: About a year ago now, in my role as Billing Manager, I was worried about the number of customers who called every month about their bills and the number of corrections that we were making as a result.
We had just started an ambitious growth plan, and we were already seeing success. One of my worries was that the growth would generate more calls and more corrections, and part of the deal with the growth plan was that we had to do it without increasing headcount.
It was clear to me that we had to do what we could so that fewer customers felt the need to call us and challenge their bills.
When we looked into this, one of the highlights was what we called bill shocks. For example, a customer pays €100 a month as part of his budget plan, but then at the annual meter reading he suddenly receives an annual bill for €200. The annual bill, based on measured consumption, should be a small correction – sometimes even a credit note if the customer has consumed less than we estimated.
So we started a Kaizen for this specific problem.
Dylan Jones: How are payments calculated?
Mark Humphries: We calculate the monthly payments based on expected consumption for the year. If the actual consumption is in line with this, then the annual bill will be a small adjustment.
First things first, how could we accurately measure our performance and get an idea for the potential improvements?
This isn’t obvious, because for one customer who is paying €500 a month an annual bill of €100 is a small adjustment, but for another who is paying €60 a month, €100 is a major problem.
Eventually we settled on a ratio – the net annual bill divided by the monthly advance amount.
In this way everything is normalised.
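As a minimal sketch of that index (the function name and figures are mine, not Essent’s – the figures simply re-use the examples above):

```python
def bill_shock_index(net_annual_bill: float, monthly_advance: float) -> float:
    """Bill-shock index: net annual bill divided by the monthly advance.

    Dividing by the advance normalises the measure, so customers on
    very different budget plans become directly comparable."""
    return net_annual_bill / monthly_advance

# The bill-shock example from above: a €200 annual bill on a €100/month plan
print(bill_shock_index(200, 100))            # 2.0
# A €100 annual bill is minor for someone paying €500 a month...
print(bill_shock_index(100, 500))            # 0.2
# ...but a major problem for someone paying €60 a month
print(round(bill_shock_index(100, 60), 2))   # 1.67
```

The closer the index sits to zero, the closer the monthly advances were to reality.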
Dylan Jones: So, what did you find?
Mark Humphries: Well, the nice thing we saw was that over the previous 12 months this index had steadily reduced from 5 to under 1, and was continuing towards zero, where it should be.
This was a welcome confirmation that a big data cleaning exercise from a year earlier really had worked. At that time the focus had been on estimated consumption and meter reading dates.
The bad news, though, was that there was far too much variability. Although the averages were good, the variation was too high. So yes, we still had a problem, but the question was whether it correlated with the calls that we were receiving.
So we explored the next step in the Kaizen methodology which is to analyse the facts.
We looked at the correlation between the observed deviation and a number of undesirable events, such as calls, corrected bills, customers who leave, and customers that we eventually drop due to bad payment. This was a big surprise for me, because we established a very clear link to the calls – not just for the large bills but also for the large repayments.
Dylan Jones: Ah, I see – interesting. Were there any tools or techniques used to help discover that link between the variation in billed amount and the number of calls?
Mark Humphries: A lot of the work was done in Excel, but we also used Minitab for the regression analysis. Minitab performs very well running on just a laptop and is good at generating a range of graphics that enabled us to visualise complex relationships.
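Minitab did the heavy lifting, but the underlying idea – checking whether the bill-shock index correlates with call volume – can be sketched in a few lines of Python. The data here is invented for illustration, not Essent’s real figures:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy data: monthly average bill-shock index vs. complaint calls that month
index = [0.5, 1.0, 1.8, 2.5, 3.2, 4.0]
calls = [20, 35, 60, 85, 110, 140]
print(round(pearson(index, calls), 3))  # 0.999 – a very strong positive link
```

A coefficient near +1, as in this toy example, is the kind of “very clear link” Mark describes between the billing deviation and the call volume.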
Dylan Jones: What about customer churn, did you find a link to people closing their account?
Mark Humphries: Good question. So what we found actually was that there was no link with customers leaving. Customers with an unexpectedly high bill called us and complained, but didn’t leave.
What did happen though with a small number of customers is that they didn’t pay that bill, and we ended up dropping them. This turned out to be very important because these were the customers on a tight budget.
Dylan Jones: I see. Can you give an example so our readers can see how you worked through the process?
Mark Humphries: Absolutely. It’s a great example actually of how data quality management really demands that you understand your business and in particular the needs and behaviours of your different customer segments.
So let’s say we were billing a customer €50 a month based on expected annual consumption worth €600 a year. That was OK and they could budget for it. Then suddenly we present them with a bill for €120. At that point they just don’t have the money and can’t pay, we end up dropping them and losing the €120.
If at the beginning of the year we had calculated €60 a month instead of €50 a month, they could have budgeted for it. It was all there in the data, and we could build a business case around call reduction and non-payment reduction. The only question now was where it was going wrong.
The next step took some time. We focused on what we knew at the beginning of the billing cycle, what could possibly have an impact on the calculation of the monthly bills, and how reality might turn out differently.
Dylan Jones: How did you make the calculations?
Mark Humphries: We created a large dataset based on two years of billing history and then used Minitab to perform a regression analysis of the various inputs. This took several iterations before we arrived at credible conclusions, but the final conclusions were beautifully simple.
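The Minitab work itself can’t be reproduced here, but a single-input version of such a regression – ordinary least squares fitted with the closed-form formulas – might look like this. The data is invented, and the binary input stands in for just one of the drivers discussed below:

```python
def ols(xs, ys):
    """Ordinary least squares for y ≈ a + b*x (closed form, one input)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Toy input: 1 = reading estimated at the start of the cycle, 0 = actually read,
# against the bill-shock index observed at the end of that cycle
estimated = [0, 0, 0, 0, 1, 1, 1, 1]
shock     = [0.2, 0.4, 0.3, 0.5, 1.8, 2.4, 2.0, 2.6]
a, b = ols(estimated, shock)
print(round(a, 2), round(b, 2))  # 0.35 1.85
```

With a binary input like this, the slope `b` is simply the difference between the two group averages – here, estimated readings add 1.85 to the expected bill-shock index, which is exactly the kind of “beautifully simple” conclusion a regression can surface.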
Dylan Jones: What was driving the variation?
Mark Humphries: The biggest drivers behind the variations were found to be:
- The quality of the meter reading at the start of the billing cycle
- The source of estimated consumption that we used to calculate the monthly bills, and
- The sales channel
Let me explain these one at a time.
Each meter reading that we receive is tagged with a quality indicator – read by the meter reading company, supplied by the customer or estimated.
What we saw is that when the customer or the meter reading company reads the meter, the billing goes fine, but when the reading is estimated, the probability of a high annual bill increases. And it’s the meter reading at the start of the cycle that counts.
With each meter reading we also receive expected consumption for the coming year from the meter reading company. However, when a customer moved house we ignored this and asked the customer what he expected to consume at his new address. It turns out that the meter reading company’s estimates, which are based on the previous occupant’s consumption, are far better than the customer’s. This is quite a remarkable conclusion, because it means that energy consumption depends more on the house or flat than on who’s living in it.
Lastly there was also a link with the sales channels, but that’s commercially sensitive information, so I can’t explain it any further.
We are currently in the process of implementing the improvements, but in the end the root cause comes down to data quality, specifically in this case the quality of the meter readings and the quality of the projected consumption.
The first solution is to react to estimated meter readings. We are contacting the customer and encouraging him to check his meter against the estimates. Since we are now tackling this at the beginning of the billing cycle, there is still time to correct the monthly amounts before a bill shock can build up.
For the second, we now know that the meter reading company knows best, and by default we will use their estimates. The projected savings are around €1M per year from these simple changes.
Dylan Jones: That’s a significant saving for one process – very impressive, particularly when you consider it over a three-year return period. Certainly enough to fund a data quality team! What other lessons can you share for members who want to learn from your experience and try this themselves?
Mark Humphries: Sure, well some of the most important lessons to draw from this story are:
- Firstly, although the root causes are to be found in data quality, this started as a process issue: “why are customers complaining about their bills?” In particular, by focusing on the customer and the bottom line you’re homing in on some very targeted areas that you know will get people’s attention.
- Secondly, we took the time to accurately measure the size of the problem and to make a valid estimate of the potential savings before we even looked for the root causes. Data quality is not just about profiling data and finding defects; you’ve really got to create a solid business case to help prioritise the most critical initiatives.
- Thirdly, the root cause analysis was very data intensive. All my data management and data quality experience was relevant at this stage, and it meant that we could justify our conclusions – they really were fact based. I enjoyed the moment when I was asked in the board room how big a sample we had taken and I could reply that the sample was all annual bills from the last two years.
The final lesson I would cite is that I had previously looked at the problem of estimated meter readings in isolation, and I had dismissed it as a priority.
In that exercise I concluded that only 1 estimated reading in 20 caused a problem, and that with less than 10% of readings estimated overall the impact was negligible. This time I came at the problem from a different angle: I started with an undesirable output and worked back to the root cause.
Dylan Jones: Can you quickly list the tools and techniques that you adopted in this data quality improvement initiative, so that others who are getting started have a shortlist of areas they know will help them?
Mark Humphries: To get at the data we just used old fashioned SQL and our own knowledge of the data model.
A lot of the initial investigative work was done with Excel. Then to really pull the correlations out of the data we used Minitab.
For the methodology we applied Kaizen.
None of this is high-end stuff, and I think that’s important. In this whole exercise we did a lot of iterations. I have highlighted the important stages, but there were also a lot of blind alleys that didn’t lead anywhere.
Having direct access to the data, being able to refine the dataset and perform new analyses was important.
The structure of the Kaizen approach, although simple, was also important, especially the emphasis on forgetting about the solution until you really understand the problem.
Dylan Jones: What I like most about this case study Mark is that you really started with the customer.
By working back from their pain and needs you were able to model the process and map the data to it correctly to support your analysis.
A lot of people do the reverse by profiling data, finding defects then hunting for business pain. As a result they often struggle to discover the real business impact.
I think you’ve raised some really valuable lessons, well done to you and your team, thanks for sharing.
Mark Humphries: Thanks Dylan, I hope other members find it useful.
Image credit: dno1967b, Flickr cc
Mark Humphries is an experienced Information Management expert who applies both data and process thinking to solve real problems with working solutions.
Mark adds value by using Data Quality techniques to find problems that no one knew existed, and then applies Process Improvement techniques to implement sustainable solutions that fix the root causes.
Mark is convinced that the data and process communities need to work better together if they are to realise their full potential.
In a good year Mark adds €1M to the bottom line.