The good thing about online survey sites such as SurveyMonkey is that they make it easy to gather survey data.
The bad news is that important surveys should never be analyzed with the standard reporting tools these sites offer.
Important surveys deserve professional survey reports.
What is the right survey sample size?
How large should a survey sample be?
There are 4 common methods to determine a survey sample size:
- Arbitrarily (what often happens with online surveys). Let’s gather as many interviews as we can.
- Minimum cell size (Chi2). Based on the rule of thumb for the Chi-Squared test that each cell of a bivariate table should have an expected count of at least 5.
- Budget-based. We have $200. One interview costs $4. We can gather 50 interviews.
- According to a desired precision. Gather as many interviews as required to interpret marginal totals with an error level of +/-5%.
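The minimum cell size and budget-based rules from the list above are simple enough to sketch in a few lines of Python. This is only an illustration: the function names and the small example table are hypothetical, and the check is applied to expected counts (row total × column total / grand total), which is the usual reading of the Chi-Squared rule of thumb.

```python
def expected_counts(table):
    """Expected cell counts under independence for a two-way table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_min_cell_size(table, minimum=5):
    """True if every expected count reaches the minimum (rule of thumb: 5)."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


def budget_sample_size(budget, cost_per_interview):
    """Number of whole interviews a budget can pay for."""
    return int(budget // cost_per_interview)


# Made-up 2x2 cross-table, for illustration only.
small_table = [[12, 8], [9, 3]]
print(meets_min_cell_size(small_table))   # False: one expected count is below 5
print(budget_sample_size(200, 4))         # 50, as in the budget example above
```

Neither rule says anything about how precise the resulting estimates will be, which is why the precision-based method is the one worth automating.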
The first three methods are unreliable.
The last method should be preferred.
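For reference, the size of a sample drawn from an effectively infinite population (larger than a million or so) is typically determined with the following Bernoulli-based formula, commonly used in marketing research:
n = z² × p × (1 − p) / e²
where z is the standard normal score for the chosen confidence level (1.96 for 95%), p is the expected proportion (0.5 is the most conservative assumption) and e is the error level.
With z = 1.96, p = 0.5 and e = 0.05, this gives 384 interviews: the sample size of a survey conducted at the 5% error level and the 95% confidence level. The two values are separate parameters and are not meant to add up to 100%.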
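As a quick check of the 384 figure, here is a minimal Python sketch of the standard sample size formula for a proportion, n = z² × p × (1 − p) / e², assuming that this is the formula the text refers to. The function name and the default p = 0.5 (the most conservative choice) are assumptions made for illustration; the z-score comes from Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist


def sample_size(margin_of_error, confidence=0.95, p=0.5):
    """Sample size for estimating a proportion in a very large population."""
    # Two-sided z-score, e.g. about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / margin_of_error ** 2


print(round(sample_size(0.05)))  # 384
print(round(sample_size(0.10)))  # 96
```

The function returns an unrounded value on purpose: some practitioners round up to be safe, while the figure quoted above corresponds to rounding to the nearest whole interview.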
A different formula may apply when sampling is done without replacement, that is, when sample elements cannot be returned to the population, as in a lab test that cannot be repeated on the same item (for instance, a taste test on biscuits).
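The text does not say which special formula it has in mind. One commonly used adjustment for sampling without replacement from a finite population is the finite population correction, n = n0 / (1 + (n0 − 1) / N), where n0 is the infinite-population sample size and N is the population size. The sketch below assumes that adjustment; the function name and parameters are hypothetical.

```python
from statistics import NormalDist


def sample_size_finite(population_size, margin_of_error, confidence=0.95, p=0.5):
    """Sample size with the finite population correction (an assumed adjustment)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    # Shrink it to account for drawing without replacement from N elements.
    return n0 / (1 + (n0 - 1) / population_size)


print(round(sample_size_finite(1_000, 0.05)))      # 278
print(round(sample_size_finite(1_000_000, 0.05)))  # 384
```

The correction only matters when the sample is a noticeable fraction of the population: for a population of one million the result is practically the same as in the infinite case.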
When is a survey representative?
Every survey is representative of something. The real question is: representative of what?
Many people believe that large samples make surveys representative, but this is not true. Large samples only reduce the error level associated with the answers. In fact, at the 95% confidence level, samples of 96 and of 9,512 interviews may both be representative of the population they are drawn from. The difference lies in the error level: 10% for the former and 1% for the latter.
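The link between sample size and error level can be checked by inverting the usual formula: e = z × √(p × (1 − p) / n). The Python sketch below assumes p = 0.5 and a very large population; the function name is chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence=0.95, p=0.5):
    """Error level for a proportion estimated from n interviews."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


for n in (96, 384, 9512):
    print(n, f"{margin_of_error(n):.1%}")
# 96 10.0%
# 384 5.0%
# 9512 1.0%
```

Note that cutting the error level from 10% to 1% takes roughly a hundred times more interviews, since the error shrinks with the square root of the sample size.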
What ultimately makes a survey representative is its ability to reproduce the characteristics of the population it comes from.
This means the way interviewees are selected is what really impacts the ability of a survey to represent the population it refers to.
If our survey is supposed to estimate the median height of Californian women and we measure 9,512 women living in Los Angeles, the survey will not be representative of Californian women, no matter how large the sample, because the LA population is not representative of the Californian one.
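A small simulation can make this point concrete. The Python sketch below uses an entirely synthetic population, not real height data: two hypothetical regions whose typical values differ, with all numbers invented for illustration. It compares a large sample taken from one region only with a much smaller simple random sample taken from the whole population.

```python
import random
from statistics import median

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic population (NOT real data): two regions with different typical values.
region_a = [rng.gauss(160, 6) for _ in range(50_000)]
region_b = [rng.gauss(170, 6) for _ in range(50_000)]
population = region_a + region_b
true_median = median(population)

# Large sample, but drawn from one region only.
big_one_region = rng.sample(region_a, 9_512)
# Much smaller simple random sample drawn from the whole population.
small_random = rng.sample(population, 384)

one_region_error = abs(median(big_one_region) - true_median)
random_error = abs(median(small_random) - true_median)

print(f"error of 9,512 interviews from one region: {one_region_error:.2f}")
print(f"error of 384 random interviews overall:    {random_error:.2f}")
```

Under these assumptions the large one-region sample misses the population median by several units, while the small random sample lands much closer: the selection method, not the sample size, drives representativeness.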
The cross-tables of the demographic data are usually a good place to start looking for evidence of a survey's representativeness.
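One simple way to read such a cross-table is to compare the share of each demographic cell in the sample with the share of the same cell in the population. The Python sketch below uses made-up respondents and made-up population benchmarks (in practice these would come from a census or similar source); every name and number is hypothetical.

```python
from collections import Counter

# Hypothetical respondents, each described by (gender, age group).
respondents = (
    [("F", "18-34")] * 30
    + [("F", "35+")] * 10
    + [("M", "18-34")] * 45
    + [("M", "35+")] * 15
)

# Cross-table of the sample: count of respondents per demographic cell.
crosstab = Counter(respondents)
n = len(respondents)
sample_share = {cell: count / n for cell, count in crosstab.items()}

# Hypothetical population benchmark shares for the same cells.
population_share = {
    ("F", "18-34"): 0.20,
    ("F", "35+"): 0.30,
    ("M", "18-34"): 0.20,
    ("M", "35+"): 0.30,
}

# Positive gap: cell over-represented in the sample; negative: under-represented.
gaps = {
    cell: sample_share.get(cell, 0.0) - share
    for cell, share in population_share.items()
}
for cell, gap in sorted(gaps.items()):
    print(cell, f"{gap:+.0%}")
```

Large gaps in one or more cells suggest the sample does not mirror the population on those characteristics, whatever its overall size.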