# Comparing 2 surveys

This question was posted in the Assessment and Surveillance forum area and has 6 replies.

### Ranjith

Normal user

1 Aug 2012, 19:41

I have 2 anthropometric survey datasets. One survey was conducted using the 30x30 cluster sampling method (where children were selected using quota sampling) and the other using SMART methodology (20 HH x 32 clusters; households rather than children were selected as the basic sampling unit). Is it valid to compare the results from these 2 surveys? For example, to test for a statistically significant difference between the 2 GAM estimates?

### Mark Myatt

Consultant Epidemiologist

Frequent user

2 Aug 2012, 08:52

The purpose of the surveys was to estimate prevalence with useful precision. This means that you will not have an optimal design for comparing two proportions (e.g. in terms of statistical power). This does not mean that you cannot do it. The simplest approach is to look for non-overlapping CIs. If the two CIs do not overlap then this is evidence against the null hypothesis that the prevalences are the same. With some arithmetic this method can be extended to do things like estimate the difference between two prevalences or perform a hypothesis test. Kevin Sullivan has done this in this web-based calculator.

I hope this helps.

### Tamsin Walters

en-net moderator

Forum moderator

2 Aug 2012, 15:12

*From Bradley Woodruff:*

First of all, the sampling method influences only the precision. If done correctly, the sampling method does not change the actual point estimate. For example, let's say you do one survey of nutrition status in pre-school age children in a population using cluster sampling of households, and you include every eligible child in selected households. In another survey of the same population, you use simple random sampling of eligible children. If the prevalence of some nutrition outcome, for example wasting, has not changed between the surveys, the 2 surveys should produce similar estimates of prevalence. The only difference between the results will be the precision obtained. If the prevalence of wasting obtained by both surveys was 10%, the first survey might have a 95% confidence interval of 6% to 14%, while the second survey might have a 95% confidence interval of 8% to 12%. This is because cluster sampling usually results in a loss of precision. But if your sampling is unbiased, the point estimates should be similar.
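The intervals in this example can be roughly reproduced with a small sketch. It assumes a Wald (normal approximation) interval, a hypothetical sample size of 864 children in both surveys, and a design effect of 4 for the cluster survey. None of these values come from the post; they are chosen only so that the intervals come out close to 8% to 12% and 6% to 14%.

```python
from math import sqrt

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_ci(p, n, deff=1.0, z=Z95):
    """Normal-approximation CI for a proportion, inflated by a design effect."""
    se = sqrt(deff * p * (1 - p) / n)
    return p - z * se, p + z * se


# Hypothetical inputs: n = 864 and deff = 4 are illustrative assumptions.
srs_ci = wald_ci(0.10, 864)                # simple random sample
cluster_ci = wald_ci(0.10, 864, deff=4.0)  # cluster sample, design effect 4

print("SRS:     %.3f to %.3f" % srs_ci)
print("Cluster: %.3f to %.3f" % cluster_ci)
```

The point estimate is the same in both calls; only the width of the interval changes with the design effect.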

The best way to compare the results of surveys is to put the data from both surveys into one dataset. Be sure each child is identified with a cluster number (in the case of non-cluster sampling, each child will have his/her own cluster number because the cluster size is 1). Also assign to each child some code to identify if the child is from survey #1 or survey #2. Then do crosstabulations of survey number by nutrition outcome. The computer, if you are using an appropriate computer program which can account for cluster sampling, should give you a p value for the difference between the results of the 2 surveys.
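As a rough illustration of the combined-dataset idea, the sketch below computes a prevalence and a cluster-adjusted variance for each survey, then a z-test for the difference. It is a hand-rolled approximation (a linearized ratio-estimator variance with clusters as primary sampling units), not the output of any particular survey package. The record layout `(survey, cluster, is_case)`, the function names, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt, erfc


def survey_estimate(records, survey_id):
    """Prevalence and linearized variance for one survey (clusters as PSUs)."""
    cases = defaultdict(int)
    sizes = defaultdict(int)
    for survey, cluster, is_case in records:
        if survey == survey_id:
            cases[cluster] += int(is_case)
            sizes[cluster] += 1
    k = len(sizes)                      # number of clusters
    total = sum(sizes.values())         # number of children
    p = sum(cases.values()) / total
    resid = sum((cases[c] - p * sizes[c]) ** 2 for c in sizes)
    var = (k / (k - 1)) * resid / total ** 2
    return p, var


def compare_surveys(records, a=1, b=2):
    """Two-sided z-test for the difference between two survey prevalences."""
    p_a, var_a = survey_estimate(records, a)
    p_b, var_b = survey_estimate(records, b)
    z = (p_a - p_b) / sqrt(var_a + var_b)
    return p_a, p_b, erfc(abs(z) / sqrt(2))


def toy_survey(survey_id, cases_per_cluster, cluster_size=5):
    """Build hypothetical child-level records for one survey."""
    rows = []
    for cluster, n_cases in enumerate(cases_per_cluster, start=1):
        for child in range(cluster_size):
            rows.append((survey_id, cluster, child < n_cases))
    return rows


# Hypothetical data: 4 clusters of 5 children in each survey.
records = toy_survey(1, [0, 1, 1, 2]) + toy_survey(2, [1, 2, 2, 3])
p1, p2, p_value = compare_surveys(records)
print("prevalence 1 = %.2f, prevalence 2 = %.2f, p = %.3f" % (p1, p2, p_value))
```

When each child is given his or her own cluster number (cluster size 1), the variance formula reduces to p(1 - p)/(n - 1), so the same code handles a simple random sample.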

Of course, often you do not have the raw data to put into the same dataset. In this case, many programs will allow you to enter, for each of the 2 surveys, the estimate of prevalence along with some measure of precision, such as the standard error or confidence interval. The program will then compute a p value or some other measure of the statistical significance of the difference between the 2 surveys.

Do NOT just compare the confidence intervals of the 2 surveys. This is a very common mistake. Overlap of the confidence intervals from the 2 surveys does NOT mean that the difference is not statistically significant. However, lack of overlap does mean that the difference IS statistically significant. If you rule out statistical significance because confidence intervals overlap, you are underestimating the precision of your comparison. The statistical reason for this is that in calculating the p value for the difference between 2 survey results, you need to use the weighted pooled estimate of the variances for the 2 surveys because, under the null hypothesis that there is no actual difference between the surveys, you assume that the variances for the 2 surveys are the same. Nonetheless, as a quick screen, you can compare the confidence intervals, but you must keep in mind that if they DO overlap, then you can make no conclusions about the statistical significance of the difference and you must calculate the specific p value for this difference, either by hand or by using a computer program.
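A by-hand calculation from published estimates might look like the sketch below. It assumes each confidence interval is a symmetric 95% normal-approximation interval, so that a standard error can be back-calculated from its width. Note that it uses the unpooled standard error of the difference rather than the pooled variance described above, which is a simplification. The prevalences and intervals are hypothetical.

```python
from math import sqrt, erfc

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z95):
    """Back-calculate a standard error from a symmetric 95% CI."""
    return (upper - lower) / (2 * z)


def compare_estimates(p1, ci1, p2, ci2):
    """Two-sided z-test for the difference between two prevalences."""
    se_diff = sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    diff = p1 - p2
    z = diff / se_diff
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal p value
    ci_diff = (diff - Z95 * se_diff, diff + Z95 * se_diff)
    return diff, ci_diff, p_value


# Hypothetical surveys: 10% (7% to 13%) and 15% (12% to 18%).
diff, ci_diff, p_value = compare_estimates(0.10, (0.07, 0.13),
                                           0.15, (0.12, 0.18))
print("difference = %.3f (95%% CI %.3f to %.3f), p = %.4f"
      % (diff, ci_diff[0], ci_diff[1], p_value))
```

In this made-up case the two intervals overlap (12% to 13%), yet the p value is below 0.05, which is the situation the warning above is about.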

### Mark Myatt

Consultant Epidemiologist

Frequent user

4 Aug 2012, 11:47

Woody is correct. The simple approach of looking for an absence of overlap between two CIs is a *conservative* (i.e. low power) test. This means that it tends not to reject the null hypothesis when differences are not very large. The link I sent above takes the (more correct) approach outlined by Woody (above).
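Why the overlap check is conservative can be seen by comparing the smallest difference each approach will flag. The sketch below assumes symmetric normal-approximation intervals and an unpooled z-test; the function names and the standard errors are hypothetical.

```python
from math import sqrt

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def overlap_threshold(se1, se2, z=Z95):
    """Smallest |p1 - p2| at which two symmetric CIs stop overlapping."""
    return z * (se1 + se2)


def ztest_threshold(se1, se2, z=Z95):
    """Smallest |p1 - p2| that a two-sided z-test calls significant."""
    return z * sqrt(se1 ** 2 + se2 ** 2)


# Hypothetical standard errors of 1.5 percentage points in each survey.
se1 = se2 = 0.015
print("CIs stop overlapping at a difference of %.4f"
      % overlap_threshold(se1, se2))
print("z-test is significant at a difference of %.4f"
      % ztest_threshold(se1, se2))
```

Because se1 + se2 is never smaller than sqrt(se1² + se2²), the overlap check always needs a difference at least as large as the z-test does. With equal standard errors it needs one about 1.41 times larger.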

### Mark Myatt

Consultant Epidemiologist

Frequent user

2 Sep 2013, 08:40

Kevin,

Just checking on previous posts and find that this link:

http://www.sph.emory.edu/~cdckms/compare%202%20proportions.htm

appears to be dead.

Is there a new link?

Mark

### Tamsin Walters

en-net moderator

Forum moderator

3 Sep 2013, 09:27

*From Kevin:*

Hi Mark- our IT group changed the address - the new address is:

Web1.sph.emory.edu/cdckms

### Mark Myatt

Consultant Epidemiologist

Frequent user

3 Sep 2013, 10:37

Just fixing the link to Kevin's calculator. It is now here.