You mean cleaning isn’t just for my house?

For your organization to really get the benefits of its data, that data must be clean and accurate. So what is clean data? As per Wikipedia, data cleaning is “the process of detecting and correcting corrupt or inaccurate records from a record set, table, or database and refers to identifying incomplete, incorrect, inaccurate or irrelevant parts of the data and then replacing, modifying, or deleting the dirty or coarse data.” In short, it means making sure that you have:

  • Gone through your data and made sure that the individual records are accurate.
  • Removed duplicates or corrected data that has errors.
  • Checked records to make sure they have been formatted properly.

Without clean data, you are doing your organization a disservice. After spending years in reporting and analytics myself, I’ve seen what can happen when the source of the data is inaccurate. We always used to say, “garbage in = garbage out”. You will never get the full power of your data unless the source is correct. You need to make sure that there are standards in place so that everything is entered correctly in your database, no matter which software you use.

Data entry doesn’t really matter…does it?

Here’s a simple example to think about. Let’s say you want to find the total amount of donations for the last six months. But for some reason, when you try to get the total, you get an error message. Upon further investigation, someone on your team realizes that one of the entries is a text value instead of a number. The mistake is corrected, and the total is now available. This is a simple example; however, these kinds of errors can lead to larger discrepancies as your donor database grows. They also cost you time you don’t have while you try to figure out what the issue is.
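To make this concrete, here is a minimal Python sketch of how a team might spot text values hiding in an amount column before totalling it. The field names (`donor_id`, `amount`), the sample records, and the helper `split_valid_amounts` are all made up for illustration; your own database or spreadsheet will look different.

```python
from decimal import Decimal, InvalidOperation


def split_valid_amounts(rows):
    """Separate rows whose amount parses as a number from rows that need review."""
    valid, needs_review = [], []
    for row in rows:
        # Strip common formatting (dollar signs, thousands separators, spaces).
        raw = str(row["amount"]).replace("$", "").replace(",", "").strip()
        try:
            amount = Decimal(raw)
        except InvalidOperation:
            needs_review.append(row)
            continue
        # Decimal also accepts text such as "NaN"; treat that as an error too.
        if amount.is_finite():
            valid.append((row["donor_id"], amount))
        else:
            needs_review.append(row)
    return valid, needs_review


# Hypothetical sample data, not taken from a real donor database.
donations = [
    {"donor_id": 101, "amount": "250.00"},
    {"donor_id": 102, "amount": "$1,000"},
    {"donor_id": 103, "amount": "fifty"},
    {"donor_id": 104, "amount": 75},
]

valid, needs_review = split_valid_amounts(donations)
total = sum(amount for _, amount in valid)

print(f"Total of valid donations: {total}")
print(f"Entries to review: {needs_review}")
```

Rather than stopping at the first bad entry, this approach lists every record that needs a human to look at it, so the corrections can be made in one pass.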

I’m not sure where to start

If you have never cleaned your data, simply start with your contact information. Your goal will be to ensure that the data is accurate and makes sense. Here are a few tips to get started so you can generate error-free reports for your organization and your stakeholders.

5 tips to help ensure your data is accurate from the start:

1. Before you start any data cleaning, I highly recommend that you establish standards for how the data will be entered. With standards you will have consistency. For example, will postal codes be entered with or without spaces? Will dates be entered month, day, year or year, month, day? Will full city names and provinces be entered, or abbreviations? These are just a few things to think about when you are entering data. Why is this important? Looking at the example below, we can see how summarizing this data in the future, or providing a quick report, will become a nightmare fast.

At a quick glance, there appear to be four different donors, since you see four distinct donor ID numbers. However, on closer inspection there really are two different donors from the same household.

This can pose a problem. If you need to send mail to the donors below, are they going to receive four different mailings even though they are under the same household?

2. Once you have established standards, it is important that you have a method of data validation. Whatever application you are using to house your data, there should be a way to add a rule that restricts the type of data you can enter. For example, if you are entering provinces/territories, it may be a good idea to create a drop-down menu so that only the 10 provinces and 3 territories are available and the user cannot enter anything else. Or, if a user is entering dates, a rule can be added that they must be in MM-DD-YYYY format. So if a user enters 13-09-2019, the system will reject the entry and ask for it in the proper format.

3. Review and double-check. I cannot stress this enough. Especially if you will be creating reports from your data, the numbers have to be accurate. I also suggest, if possible, having someone else double-check the work. Sometimes you may have been looking at the data for so long that you miss something that someone with fresh eyes can catch instantly.

4. If you have a staff member who is comfortable with the data, try to take the time to have them train the other staff members who might also be responsible for entering data.

5. And finally, documentation. It’s important to have a document that describes the process for someone who is new or needs a refresher, and that supports your long-term succession plan. As with any organization, people will leave, and they’ll take their knowledge with them. Your documentation doesn’t have to be long, but if it can outline the basic steps, you’re on the right track.
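If your data lives in a spreadsheet export or you have someone comfortable with a little scripting, the standards and validation rules from tips 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes your standard is postal codes written with a space (“A1A 1A1”), two-letter province and territory abbreviations, and dates in MM-DD-YYYY format. The function names are made up for this example, and the postal code check only looks at the letter-digit pattern, not whether the code actually exists.

```python
import re
from datetime import datetime

# Canada's 10 provinces and 3 territories as two-letter abbreviations.
PROVINCES_AND_TERRITORIES = {
    "AB", "BC", "MB", "NB", "NL", "NS", "NT", "NU", "ON", "PE", "QC", "SK", "YT",
}

# Letter-digit-letter, space, digit-letter-digit (assumed house standard).
POSTAL_CODE_PATTERN = re.compile(r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]")

# Exactly two digits, two digits, four digits, separated by hyphens.
DATE_SHAPE = re.compile(r"[0-9]{2}-[0-9]{2}-[0-9]{4}")


def normalize_postal_code(raw):
    """Return the postal code as 'A1A 1A1', or None if it cannot be standardized."""
    compact = re.sub(r"[\s-]", "", raw).upper()
    if len(compact) != 6:
        return None
    candidate = f"{compact[:3]} {compact[3:]}"
    return candidate if POSTAL_CODE_PATTERN.fullmatch(candidate) else None


def is_valid_province(code):
    """Accept only the 13 abbreviations, like a drop-down menu would."""
    return code.strip().upper() in PROVINCES_AND_TERRITORIES


def is_valid_date(raw):
    """Accept only real calendar dates written as MM-DD-YYYY."""
    if not DATE_SHAPE.fullmatch(raw):
        return False
    try:
        datetime.strptime(raw, "%m-%d-%Y")
    except ValueError:
        return False
    return True


# Hypothetical entries, as a data-entry form might receive them.
print(normalize_postal_code("k1a0b1"))
print(is_valid_province("Ontario"))
print(is_valid_date("13-09-2019"))
```

Here “k1a0b1” is rewritten to the agreed standard, “Ontario” is rejected because the assumed standard is the abbreviation “ON”, and 13-09-2019 is rejected because there is no 13th month. Whatever standards you choose, writing them down this precisely also gives you a head start on the documentation in tip 5.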

This might seem like a lot, and your organization might have a little work ahead of it to get your data into a workable state. But know that it is worth it, and it will make your lives easier in the long run.

With over a decade in the financial sector, Rochelle Greaves has used her technical knowledge to help businesses measure key metrics and trends, driving strategic decision making. Currently, as the Co-founder and Director of Analytics & Strategy at Story Point Consulting, Rochelle continues to use her skills to help nonprofits increase their fundraising capacity and revenue. You can reach her at info@storypoint.ca.