5 Simple, Low Cost Ways To Improve Data Entry Data Quality
DQ Techniques | by Dylan Jones (Editor)
One of the single biggest causes of data defects in any organisation is poor quality information entered at the start of the information chain via data entry interfaces.
This post provides some simple, practical and cheap techniques to dramatically improve the data quality of human entered information.
1. Take a minimalist approach
Keep it simple: do you really need all that data?
Case in point: To join Data Quality Pro requires name, username, email and password fields to be populated. Earlier in the year there was a bug identified by a small group of users that prevented certain name constructs. Some people contacted me privately (thanks Daragh) about the bug but some went onto Twitter and voiced their frustration. I suspect several others gave up so perhaps we even lost some would-be members who couldn't enter their name in the format they wished.
Think about that for a second in a business context.
That was one field for a free membership service that gives you free software, resources and expert advice. The fact that Data Quality Pro is a free community site was irrelevant, people were frustrated and they openly vented it.
Compare this to a vendor webinar or white paper sign-up form that can often require more than 15 pieces of information just to get something you're mildly interested in. Most forms like these are creating far too many opportunities for data entry data quality defects to occur.
Look at this internally within your business and the applications your knowledge workers operate. Entering data is a time-consuming task that is, frankly, dull. Is it any wonder that businesses suffer from defective data when staff have to fill in often endless, cluttered forms?
You now have all the tools and techniques you need to profile the data stemming from your data entry forms. Look for the patterns. How do international customers cope with your forms? Are there frequently missing or abused entries in various fields? What issues occur again and again?
Use this information to strip down your data entry forms to the bare essentials and you will create far more contented workers and customers.
This is vital for data quality and essential for good business.
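As a minimal sketch of the kind of profiling described above, the following Python counts how often each form field is left empty or filled with a throwaway value. The placeholder list, the field names and the sample records are all hypothetical; substitute whatever your own forms actually capture.

```python
from collections import Counter

# Hypothetical values that often signal a skipped or abused field.
PLACEHOLDERS = {"", "n/a", "na", "none", "unknown", "test", "xxx", "-"}


def profile_fields(records):
    """Return, per field, the share of records that are empty or placeholder."""
    totals = Counter()
    suspect = Counter()
    for record in records:
        for field, value in record.items():
            totals[field] += 1
            text = "" if value is None else str(value).strip().lower()
            if text in PLACEHOLDERS:
                suspect[field] += 1
    return {field: suspect[field] / totals[field] for field in totals}


# Illustrative sample only, not real data.
sample = [
    {"name": "Ann Lee", "email": "ann@example.com", "fax": ""},
    {"name": "Bo Chan", "email": "bo@example.com", "fax": "n/a"},
    {"name": "test", "email": "cy@example.com", "fax": None},
    {"name": "Di Roy", "email": "di@example.com", "fax": "xxx"},
]
suspect_rates = profile_fields(sample)
```

A field whose rate is close to 1.0 (the fax field in this sample) is a strong candidate for removal from the form.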
2. Profile the good and the bad
Jim Harris penned a great post recently talking about his experience of living in a zip code that doesn't exist.
When a customer or worker enters data into a form are you recording the valid data AND the previous data that failed? The defects are just as important as the accepted data.
By collecting both sets of information and profiling them, we instantly create a picture of where our data entry process needs to be improved. If you don't store that information you're missing out on potential new business, so there is a clear commercial driver to do this.
Even on valid, accepted data, we will find scores of examples where data is incorrectly entered or subject to regular abuse. This information flows into downstream processes and creates extra work, costs and bad decisions.
You now have no excuse not to begin profiling your data entry data and using this intelligence to design new data entry processes.
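One way to profile the failures alongside the successes is to keep a log of every submission attempt and count the rejections by field and reason. The sketch below assumes a hypothetical log format (a list of dicts with `accepted`, `field` and `reason` keys); your own application will record attempts differently.

```python
from collections import Counter


def summarise_rejections(attempts):
    """Count rejected attempts by (field, reason), most frequent first."""
    counts = Counter(
        (attempt["field"], attempt["reason"])
        for attempt in attempts
        if not attempt["accepted"]
    )
    return counts.most_common()


# Hypothetical attempt log; the fields and reasons are illustrative only.
attempt_log = [
    {"accepted": False, "field": "zip_code", "reason": "not in reference list"},
    {"accepted": False, "field": "name", "reason": "invalid characters"},
    {"accepted": False, "field": "zip_code", "reason": "not in reference list"},
    {"accepted": True, "field": None, "reason": None},
]
hotspots = summarise_rejections(attempt_log)
```

The pairs at the top of `hotspots` show which validation rules reject the most entries and are the first places to review.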
3. Design helpful forms
I recently tried to enter some personal information into a tax return form which failed to accept the data. The form returned an error at the top of the page, using the particularly helpful phrase: "Error in form, please correct". Of course I had no choice but to persist, otherwise a hefty fine beckoned.
However, with business forms, if we feel helpless, then we walk away.
If a form fails to validate your information then it should be designed in such a way that it guides you through the correction process.
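To illustrate the difference from a blanket "Error in form" message, here is a small sketch of a validator that reports which field failed and how to fix it. The field names, the eight-character password rule and the deliberately loose email pattern are assumptions made for the example, not the rules of any particular site.

```python
import re

# Deliberately loose check: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_signup(form):
    """Return a dict mapping each failing field to a corrective message."""
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "Please enter your name."
    if not EMAIL_PATTERN.match(form.get("email", "").strip()):
        errors["email"] = "Please enter an email address such as name@example.com."
    if len(form.get("password", "")) < 8:
        errors["password"] = "Please choose a password of at least 8 characters."
    return errors


# Only the email field fails here, so only that field gets a message.
result = validate_signup(
    {"name": "Ann Lee", "email": "ann-at-example", "password": "longenough"}
)
```

Returning the messages keyed by field lets the form display each one next to the input it refers to, rather than at the top of the page.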
Also, why not add basic support and assistance:
- How many web forms do you see that don't provide a feedback or help option easily visible next to a data entry form?
- Do you have a visible call support number, email or Wiki page where your data entry workers can get help or advice?
These are just two simple ways to improve the user experience and quality of inbound data to your business.
4. Prevention is better than cure
Data cleansing downstream from the point of entry is costly, repetitive, time-consuming and error-prone. It can increase service lead-times and add unnecessary complexity to your information chains.
Look at the typical data cleansing functions you have implemented to cope with poor quality data in your business. These are typically either automated in software or performed manually by data workers.
Identify some quick wins:
- If you are standardising country names wouldn't it be easier to create a drop-down list based on accurate reference data?
- If the part codes are entered in a myriad of weird and wonderful formats and require pattern recognition and cleansing to standardise them, wouldn't it be easier to enforce standards on entry?
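Both quick wins above can be sketched in a few lines. The country table below is a tiny illustrative subset, and the part code standard (two letters, a hyphen, four digits) is purely hypothetical; use your own reference data and coding standard.

```python
import re

# Tiny illustrative subset; in practice load a maintained reference list.
COUNTRIES = {"DK": "Denmark", "GB": "United Kingdom", "US": "United States"}

# Hypothetical part code standard: two letters, a hyphen, four digits.
PART_CODE = re.compile(r"[A-Z]{2}-\d{4}")


def country_options():
    """Return (code, name) pairs for a drop-down, sorted by display name."""
    return sorted(COUNTRIES.items(), key=lambda item: item[1])


def normalise_part_code(raw):
    """Return the standardised code, or None if it breaks the standard."""
    candidate = raw.strip().upper()
    return candidate if PART_CODE.fullmatch(candidate) else None


accepted = normalise_part_code(" ab-1234 ")
rejected = normalise_part_code("AB1234")
```

Here `accepted` holds the standardised code and `rejected` is `None`, so the entry can be refused at the form instead of being cleansed downstream.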
Take for example the excellent open source DataCleaner product that we reviewed and created a tutorial for. This provides pattern analysis logic in a Java format which you can re-use in your applications.
Data Quality Pro members also get free ASCII pattern analysers for Oracle, VB Script and SQL Server. Why not use the VB code at the interface layer or the database functions down in the database transaction layer to validate the data?
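For readers unfamiliar with pattern analysis, the idea can be sketched in Python as follows. This is not the DataCleaner logic or the Data Quality Pro analysers, just a minimal illustration under a common convention: upper-case letters become "A", lower-case letters "a", digits "9", and everything else is kept as typed.

```python
from collections import Counter


def to_pattern(value):
    """Reduce a value to its character-class pattern, e.g. 'AB-12' -> 'AA-99'."""
    pattern = []
    for char in value:
        if char.isdigit():
            pattern.append("9")
        elif char.isalpha():
            pattern.append("A" if char.isupper() else "a")
        else:
            pattern.append(char)  # keep punctuation and spaces as typed
    return "".join(pattern)


def pattern_frequencies(values):
    """Count how often each pattern occurs, most frequent first."""
    return Counter(to_pattern(value) for value in values).most_common()


# Illustrative part codes only.
frequencies = pattern_frequencies(["AB-1234", "CD-5678", "ef 90", "GH-0001"])
```

A dominant pattern suggests the standard to enforce on entry, and the rare patterns show the formats users are actually struggling with.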
If you cannot prevent defects at source then ensure you have routines as close to the source as possible to trap defects before they flow into the business. Whenever you build an error-checking routine or clean-up process ensure that you have a feedback loop to the form designers so that logic can be implemented at source.
5. Gather user feedback and act on it
Here is a wild and crazy notion: why not ask users and customers what they think of the data entry process?
Several years ago I consulted on a project where two systems were increasingly suffering from poor data quality and becoming inconsistent as a result. It was clear that the data being entered by the field staff was a major cause of the issue. I followed an incredibly simple process of:
- Profiling the entered data (see point 2) to identify the defect hotspots
- Mining comments data which was found in the entered record (this found countless examples of frustrated field workers)
- Listening to field staff who had entered poor quality data
- Re-engineering forms and interfaces to meet their needs
Common sense, I know, but the client had already commenced a complex data cleansing process. It simply had never occurred to them that there was a simple reason behind the poor quality data flowing into their business. The forms were poorly designed, cumbersome, error-prone and didn't suit the working patterns and habits of field workers.
Solving data quality means going to the root of the source and implementing preventative measures that increase the satisfaction of the user experience. By creating simple, helpful, intuitive and preventative controls at the data entry source we can dramatically cut costs and complexity throughout the business.
There are some additional resources below to give you some further information on how to improve your data entry processes.
What other suggestions do you have for improving data quality in data entry processes? Please share your thoughts in the comments section below.
Useful Resources
Best Of: Technical Data Quality Tutorials
Free eBook Provides Practical Advice For Improving Web Form Data Quality
Reducing the need for scrap and rework with web data collection


Reader Comments (2)
Excellent 5 points.
Here are 2 additional points that I have found useful:
6. Have an error tolerant search
A common workflow when in-house personnel are entering new customers, suppliers, purchased products and other master data is to first search the database for a match. If the entity is not found, you create a new entity. When the search fails to find an existing match, we have a classic and frequent cause of either introducing duplicates or challenging the real-time checking.
An error tolerant search is able to find matches despite spelling differences, alternatively arranged words, various concatenations and many other challenges we face when searching for names, addresses and descriptions.
More about error tolerant search here.
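As a rough illustration of the idea (not the commenter's own solution), Python's standard `difflib` module can approximate an error tolerant search. The sketch sorts the words in each name so that "Smith, Anna" and "Anna Smith" compare as equal, then looks for close matches; the 0.8 cutoff and the sample names are assumptions chosen for the example.

```python
import difflib


def normalise(name):
    """Lower-case, drop commas and sort words so word order does not matter."""
    return " ".join(sorted(name.lower().replace(",", " ").split()))


def find_candidates(query, existing_names, cutoff=0.8):
    """Return existing names that closely match the query, best match first."""
    keyed = {}
    for name in existing_names:
        keyed.setdefault(normalise(name), name)
    matches = difflib.get_close_matches(
        normalise(query), list(keyed), n=5, cutoff=cutoff
    )
    return [keyed[match] for match in matches]


# Hypothetical master data.
existing = ["Smith, Anna", "Acme Trading Ltd", "Northwind Supplies"]
candidates = find_candidates("Anna Smyth", existing)
```

Showing `candidates` to the person entering the data before a new record is created gives them the chance to pick the existing entity instead of adding a duplicate.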
7. Verify or pick from external reference data:
This goes further than the list-based reference data on country names and codes.
Say you are going to add a business entity in your customer table. Instead of typing name, address and other data you may plug in to a business directory and select the entity from there. You will have the following advantages:
• Less typing (fewer errors)
• Data from a reliable (official) source
• Possibility of ongoing updates
As the term “low cost” is included in the title, here I have some tips around the possible costs of such solutions:
• Prices on external reference data are decreasing. There are huge differences between countries here. Regularly check the market to see whether ROI is turning positive in your markets and your financial scope.
• Solutions for searching and reference data integration may be rented, so that your gains more than pay for the costs.
Great ideas, Henrik, thanks for extending the list. As you say, both of these options have considerable advantages cost-wise.
Cheers for your input.