

Author: Allegiance
Date: Fri, 16 Jan 2009 9:25 PM

Survey Data Cleansing: Five Steps for Cleaning Up Your Data

Preparing your survey data for analysis can be a messy process, mostly because data typically needs to be cleansed for various reasons. For example, respondents' answers may not match pre-defined choices or they may answer questions that don't really apply.

Using an online survey tool can eliminate many of the problems associated with paper surveys by limiting response choices and enabling participants to skip irrelevant questions. But even online survey data may contain records that exclude key variables or include duplicate responses from the same person. And if your survey is large, the task of cleaning up your data can, at first glance, seem a bit overwhelming.

However, it needn't be. For instance, I recently completed a survey analysis with 35,000 respondents who answered about 75 questions, which resulted in a 2,625,000-cell spreadsheet. Fortunately, editing and cleansing the data was fairly simple because I used a tried-and-true, five-step process:

Step 1: Make a copy of your data and use that version for data cleansing. This isn't as much of a step as it is a warning. Even the best-laid data cleansing plans sometimes have to be taken back to the drawing board. So, only delete records from a copy of your data and keep your original file on hand in case you need to put something back in.
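As a minimal sketch of this step, assuming the survey export is a single file on disk, the working copy can be made with Python's standard library. The function name `make_working_copy` and the `_working` suffix are illustrative choices, not part of any survey tool.

```python
import shutil
from pathlib import Path


def make_working_copy(original, suffix="_working"):
    """Copy the raw export and return the path of the copy to clean."""
    src = Path(original)
    dst = src.with_name(f"{src.stem}{suffix}{src.suffix}")
    if dst.exists():
        # Refuse to silently overwrite an earlier working copy.
        raise FileExistsError(f"{dst} already exists")
    shutil.copy2(src, dst)  # copy2 also preserves file timestamps
    return dst
```

Calling `make_working_copy("survey.csv")` would leave `survey.csv` untouched and return the path `survey_working.csv`, which is the only file the later steps should modify.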

Step 2: Conduct a few mini data cleansing trial runs. Export smaller subsets of your data to conduct data cleansing trial runs to refine your process. It's a lot easier to get the process down with a data set of 2,000 than 35,000. Plus, then you'll know the exact steps that you'll need to follow when you export all 35,000.
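One way to pull a trial subset is sketched below. It assumes the records are already loaded into a Python list and that a random sample is acceptable for a trial run; the article does not say how the subsets were chosen, so the sampling approach, the function name `trial_subset`, and the default size of 2,000 are assumptions for illustration.

```python
import random


def trial_subset(records, size=2000, seed=0):
    """Return a reproducible random subset of the records for a trial run."""
    records = list(records)
    if size >= len(records):
        return records
    rng = random.Random(seed)  # fixed seed so the same trial run can be repeated
    return rng.sample(records, size)
```

Using a fixed seed means the same 2,000 records come back each time, which makes it easier to compare one version of the cleansing process against the next.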

Step 3: Identify "crucial variables" in your survey efforts and define what constitutes "complete." In the survey I mentioned above, senior-level executives wanted to identify high-performing managers in geographically defined regions. To the company, these geographical regions were a crucial variable in their survey efforts, because without them, survey responses were useless and had to be deleted. In addition, the scores for each region were based on answers to 7 questions. The company decided that in order for a response to be considered complete, all 7 questions had to be answered.
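The completeness rule from this example could be expressed roughly as follows. This is only a sketch: it assumes each response is a dictionary, and the column names `region` and `q1` through `q7` are hypothetical stand-ins for whatever the real export uses.

```python
REGION_FIELD = "region"                        # hypothetical column name
SCORE_FIELDS = [f"q{i}" for i in range(1, 8)]  # hypothetical names for the 7 questions


def is_blank(value):
    """Treat missing values and whitespace-only strings as unanswered."""
    return value is None or str(value).strip() == ""


def is_complete(record):
    """Complete means the region and all 7 scored questions are present."""
    if is_blank(record.get(REGION_FIELD)):
        return False
    return all(not is_blank(record.get(field)) for field in SCORE_FIELDS)


def keep_complete(records):
    """Return only the records that pass the completeness rule."""
    return [r for r in records if is_complete(r)]
```

Keeping the rule in one small function makes it easy to change if the definition of "complete" is revised after a trial run.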

Step 4: Remove "speeders" and "flat-liners." Using an internet survey tool, we were able to place a date/time stamp on each response and find out how much time it took each person to complete it. We know from past experience that respondents who complete the survey too quickly (in less than 30%-50% of the median time) are likely not reading or answering the questions appropriately. The same is true for flat-liners (i.e., those who mark each answer the same), who are often speeders as well. They may have read the questions, but they don't really think about their answers. Therefore, it's best to remove speeders and flat-liners to keep meaningless responses out of your data.
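A rough sketch of both checks is shown below. It assumes completion times have already been computed in seconds per respondent, and it uses 30% of the median as the default cutoff; the article gives a range of 30%-50%, so the `fraction` parameter is a judgment call to adjust. The function names are illustrative.

```python
from statistics import median


def find_speeders(durations, fraction=0.3):
    """Return the ids whose completion time is below fraction * median time.

    durations maps a respondent id to completion time in seconds.
    """
    cutoff = fraction * median(durations.values())
    return {rid for rid, seconds in durations.items() if seconds < cutoff}


def is_flat_liner(answers):
    """True if the respondent gave the same value to every answered question."""
    answered = [a for a in answers if a not in (None, "")]
    # Require at least two answers so a single response is not flagged.
    return len(answered) > 1 and len(set(answered)) == 1
```

For example, with completion times of 600, 620, 100, 580, and 640 seconds, the median is 600 and the cutoff is 180 seconds, so only the 100-second response would be flagged as a speeder.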

Step 5: Eliminate duplicate responses. Usually, it's hard enough to get people to respond to a survey once. But some people actually care so much that they tell you twice, especially if there are some particularly juicy survey incentives involved (which may tempt them to try to increase their odds of winning) and/or if your managers are informed that their scores are somehow tied in with response rates (which may cause them to flood the system with duplicate favorable responses). Fortunately, all you have to do in those cases is match the entry information with your survey list, and then use the date/time stamp to identify and delete the later duplicate entries. I recommend keeping the first response and deleting any subsequent responses.
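The "keep the first response" rule might look like the sketch below. It assumes each response has already been matched to the survey list and carries a respondent identifier plus an ISO-formatted timestamp; the field names `respondent_id` and `submitted_at` are hypothetical.

```python
from datetime import datetime


def keep_first_response(records, id_field="respondent_id", time_field="submitted_at"):
    """Keep only the earliest response per respondent, based on the timestamp."""
    first = {}
    for record in records:
        rid = record[id_field]
        stamp = datetime.fromisoformat(record[time_field])
        # Store the record if this respondent is new or this response is earlier.
        if rid not in first or stamp < first[rid][0]:
            first[rid] = (stamp, record)
    # Return the surviving responses in chronological order.
    return [rec for _, rec in sorted(first.values(), key=lambda pair: pair[0])]
```

Because the comparison uses the timestamp rather than the order of rows in the file, the earliest response is kept even if the export is not sorted by date.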

Once you complete these steps, you'll not only have a cleaner and more accurate data set, but you'll also be able to ensure that each person who takes your survey only counts as one response instead of several in the results.

About the Author

Terence Fugazzi is the VP of Demand Marketing at Allegiance (http://www.allegiance.com). His company provides Customer Engagement Software that helps organizations grow and increase profitability through improved customer loyalty and engagement.




Creative Commons License
This article is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.