
# Can design effect be less than 1 in cluster surveys?

This question was posted in the Assessment and Surveillance forum area and has 3 replies.

### Anonymous 1089

Normal user

21 Jun 2012, 15:18

If it's about a SMART nutrition survey, don't hesitate to post on the SMART methodology forum: experts will answer you quickly and registration is free... http://www.smartmethodology.org/index.php?option=com_fireboard&Itemid=293&lang=en

Self-employed

Technical expert

21 Jun 2012, 19:16

The design effect reflects the sum total of all the influences that the various sampling techniques have on the statistical variance of a given outcome in a specific sample. Cluster sampling increases the design effect if the outcome you are looking at (for example, wasting) is grouped in the population rather than evenly spread throughout, which means that the prevalence of the outcome differs from cluster to cluster. On the other hand, stratified sampling (that is, intentionally drawing portions of the sample from different subgroups in the population) often decreases the design effect. After you have entered the data and told the computer that you did cluster and stratified sampling, the effects of these two sampling schemes are taken into account when the computer calculates the variance of the outcome, and hence the design effect. If your outcome is spread evenly in the population, so there is little clustering, and if the prevalence of the outcome differs among the sample strata, then it is likely that the design effect will be less than 1.0. I'm no statistician, so if anyone knows better, please correct me, but I believe that if only cluster sampling is done, with no stratified sampling or other sampling technique which might decrease the design effect, the design effect theoretically cannot be less than 1.0.
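For the cluster-sampling part alone, the link between clustering and the design effect is often summarized with Kish's approximation. The symbols here are not from the post above and are introduced only for illustration: $m$ is the average number of sampled individuals per cluster and $\rho$ is the intracluster correlation of the outcome, which is larger the more the outcome is grouped within clusters.

$$
\mathrm{DEFF} \approx 1 + (m - 1)\,\rho
$$

As a rough illustration under these assumptions, with $m = 20$ and $\rho = 0.05$ the approximation gives $1 + 19 \times 0.05 = 1.95$, while an outcome spread evenly across clusters ($\rho$ close to 0) gives a design effect close to 1.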

### Kevin Sullivan

Professor

Normal user

21 Jun 2012, 20:54

In a two-stage cluster survey where clusters are selected randomly at the first stage and individuals are selected randomly at the second stage, the variance has two components: variability within clusters and variability between clusters. With PPS sampling, the variance formula is simpler and is based only on the variability between clusters (the within-cluster variance drops out of the formula). Unless given additional details, most statistical programs analyze cluster data as a one-stage cluster survey in which the only variance component is the variability between clusters. The variance for simple random sampling (SRS) is based on individuals: pq/(n-1), ignoring the finite population correction. The DEFF is calculated as (variance accounting for the cluster design) / (variance assuming SRS). Therefore, with PPS and one-stage cluster sampling, what drives the variance is the difference in estimates between clusters. If the clusters are more similar to each other than would be expected from random variability alone, the DEFF can be less than 1.
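The ratio described above can be sketched in a few lines of Python. This is only a minimal illustration, not what any particular survey package does: it assumes equal-sized clusters, estimates the cluster-design variance from the spread of the cluster proportions around their mean, and uses pq/(n-1) for the SRS variance. The function name `design_effect` and the two example data sets are made up for the illustration.

```python
from statistics import mean


def design_effect(cases_per_cluster, cluster_size):
    """DEFF for a proportion, treating the data as a one-stage cluster sample.

    Assumes every cluster has the same number of sampled individuals
    (hypothetical simplification) and ignores the finite population correction.
    """
    k = len(cases_per_cluster)
    if k < 2:
        raise ValueError("need at least two clusters")
    n = k * cluster_size

    # Proportion with the outcome in each cluster, and the overall proportion
    props = [cases / cluster_size for cases in cases_per_cluster]
    p = mean(props)
    if p in (0.0, 1.0):
        raise ValueError("SRS variance is zero when p is 0 or 1")

    # Variance of the overall proportion from between-cluster variability only
    var_cluster = sum((pi - p) ** 2 for pi in props) / (k * (k - 1))
    # Variance assuming simple random sampling of n individuals
    var_srs = p * (1 - p) / (n - 1)
    return var_cluster / var_srs


# Made-up data: 10 clusters of 20 children, 20 cases in total in both sets
similar = [2, 2, 2, 3, 2, 2, 1, 2, 2, 2]    # cases spread very evenly
clustered = [0, 0, 8, 0, 6, 0, 0, 5, 1, 0]  # cases concentrated in a few clusters

print(round(design_effect(similar, 20), 2))
print(round(design_effect(clustered, 20), 2))
```

Both made-up data sets have the same overall prevalence (10%), so the SRS variance is identical. Only the between-cluster spread differs, which is why the first set gives a DEFF well below 1 (about 0.12) and the second a DEFF well above 1 (about 5.3).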