# Combining multiple independent surveys through weighting

This question was posted in the Assessment and Surveillance forum area and has 2 replies.

### Anonymous 81

Public Health Nutritionist

Normal user

21 Dec 2014, 13:43

### Mark Myatt

Frequent user

22 Dec 2014, 17:09

**each survey** are similar to each other. This can be as simple as a visual check using a "forest plot" of estimates and 95% CIs. Here is an example of **similarity**:

```
Survey 1 |-----*-----------|
Survey 2 |-----*----------|
Survey 3 |------*------------|
...
Survey N |-----*-----------|
         +--+--+--+--+--+--+--+--+--+--+--+--+--+
         8  9  10 11 12 13 14 15 16 17 18 19 20 21
                      Prevalence (%)
```

Note that the point estimates (marked by the "*") are close to each other and there is a lot of overlap of the 95% CIs. In this case an average will have meaning.

Here is an example of **dissimilarity**:

```
Survey 1 |-----*--------|
Survey 2             |-----*----------|
Survey 3                     |------*---------|
...
Survey N     |---*------|
         +--+--+--+--+--+--+--+--+--+--+--+--+--+
         8  9  10 11 12 13 14 15 16 17 18 19 20 21
                      Prevalence (%)
```

Note that the point estimates are widely spread and some of the CIs do not overlap much or at all. In this case an average will hide variation and would best be avoided.
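
For anyone who wants a quick numeric check to go with the forest plot, here is a minimal Python sketch that lists the pairs of surveys whose 95% CIs do not overlap at all. The function names and the example limits are made up for illustration. CI overlap is only a crude stand-in for the visual check described above, not a formal test of heterogeneity.

```python
from itertools import combinations


def intervals_overlap(a, b):
    """Return True if two (lower, upper) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_overlapping_pairs(cis):
    """List pairs of survey names whose CIs do not overlap.

    cis maps a survey name to its (LCL, UCL) tuple.
    """
    return [
        (name_a, name_b)
        for (name_a, ci_a), (name_b, ci_b) in combinations(cis.items(), 2)
        if not intervals_overlap(ci_a, ci_b)
    ]


# Hypothetical 95% limits (in %), invented for illustration only
similar = {"A": (9.0, 15.0), "B": (8.5, 14.0), "C": (9.5, 16.0)}
dissimilar = {"A": (8.0, 12.0), "B": (13.0, 19.0), "C": (10.0, 15.0)}

print(non_overlapping_pairs(similar))     # []
print(non_overlapping_pairs(dissimilar))  # [('A', 'B')]
```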

The pooled proportion is a population-weighted average of the proportions found by each survey:
```
                    p1 * w1 + p2 * w2 + ... + pn * wn
Pooled proportion = ---------------------------------
                           w1 + w2 + ... + wn
```

where:
```
p1 = proportion from survey 1
p2 = proportion from survey 2
.
. and so-on
.
w1 = population in area for survey 1
w2 = population in area for survey 2
.
. and so-on
.
```
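
The weighted average above can be sketched in a few lines of Python. This is only an illustration: the function name `pooled_proportion` and the input figures are made up, and proportions are assumed to be given as fractions rather than percentages.

```python
def pooled_proportion(proportions, populations):
    """Population-weighted average of survey proportions.

    proportions: p1..pn as fractions (e.g. 0.10 for 10%)
    populations: w1..wn, the population of each survey area
    """
    if len(proportions) != len(populations):
        raise ValueError("need one population per proportion")
    weighted_sum = sum(p * w for p, w in zip(proportions, populations))
    return weighted_sum / sum(populations)


# Made-up inputs: 10% in an area of 1,000 people and 20% in an area of 3,000
example = pooled_proportion([0.10, 0.20], [1000, 3000])
print(round(example, 3))  # 0.175
```

Note how the larger area pulls the pooled figure towards its own estimate.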

Complications arise when trying to pool variances. This is because the survey samples are complex and the variance is influenced by the proportion, the sample size, and the survey design effect.
One way to approach this problem is to calculate the standard error (SE) from the estimates and 95% CIs reported from each survey:
```
     Upper Confidence Limit - Lower Confidence Limit
SE = -----------------------------------------------
                        2 * 1.96
```
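
As a rough sketch, the same back-calculation in Python might look like this. The function name `se_from_ci` and the example limits are made up. Dividing by 2 * 1.96 assumes the reported 95% CI is roughly symmetric about the estimate (a normal approximation), so treat the result as approximate when the reported interval is clearly asymmetric.

```python
def se_from_ci(lcl, ucl, z=1.96):
    """Approximate standard error recovered from a reported CI.

    lcl, ucl: lower and upper confidence limits as fractions
    z: normal multiplier used to build the CI (1.96 for 95%)
    """
    if ucl < lcl:
        raise ValueError("upper limit must not be below lower limit")
    return (ucl - lcl) / (2 * z)


# Made-up limits: a 95% CI of 10% to 18%
se = se_from_ci(0.10, 0.18)
print(round(se, 4))  # 0.0204
```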

The pooled SE is:
```
                ( SE1^2 * w1 + SE2^2 * w2 + ... + SEn^2 * wn )
Pooled SE = sqrt( ------------------------------------------ )
                (             w1 + w2 + ... + wn             )
```

where:
```
SE1 = SE for survey 1
SE2 = SE for survey 2
.
. and so-on
.
w1 = population in area for survey 1
w2 = population in area for survey 2
.
. and so-on
.
```
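
Here is a minimal Python sketch of the pooled SE exactly as it is written above, that is, the square root of the population-weighted mean of the squared SEs. The function name `pooled_se` and the input figures are made up for illustration.

```python
from math import sqrt


def pooled_se(standard_errors, populations):
    """Pooled SE as defined in the post.

    standard_errors: SE1..SEn as fractions
    populations: w1..wn, the population of each survey area
    """
    if len(standard_errors) != len(populations):
        raise ValueError("need one population per standard error")
    weighted_var = sum(
        se ** 2 * w for se, w in zip(standard_errors, populations)
    ) / sum(populations)
    return sqrt(weighted_var)


# Made-up inputs: SE of 0.01 in an area of 1,000 and 0.02 in an area of 3,000
example = pooled_se([0.01, 0.02], [1000, 3000])
print(round(example, 4))  # 0.018
```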

The pooled estimate is:
```
Pooled estimate = Pooled proportion +/- 1.96 * Pooled SE
```

Here is an example with three surveys only ... the survey results are:
```
Survey    Population  p      LCL    UCL
--------  ----------  -----  -----  -----
Survey 1      23,670  12.7%   9.7%  16.1%
Survey 2      16,546   9.9%   6.3%  13.2%
Survey 3      19,201  13.5%   9.8%  18.0%
--------  ----------  -----  -----  -----
```

The pooled proportion is:
```
Survey    w       p      p * w
--------  ------  -----  -----
Survey 1  23,670  0.127  3,006
Survey 2  16,546  0.099  1,638
Survey 3  19,201  0.135  2,592
--------  ------  -----  -----
Sum       59,417         7,236

Pooled proportion = 7236 / 59417 = 0.122
```

The pooled SE is:
```
Survey    w       LCL    UCL    SE     SE^2      SE^2 * w
--------  ------  -----  -----  -----  --------  ---------
Survey 1  23,670  0.097  0.161  0.016  0.000256   6.059520
Survey 2  16,546  0.063  0.132  0.018  0.000324   5.360904
Survey 3  19,201  0.098  0.180  0.021  0.000441   8.467641
--------  ------  -----  -----  -----  --------  ---------
Sum       59,417                                 19.888065

Pooled SE = sqrt(19.888065 / 59417) = 0.0183
```

The pooled estimate is:
```
Point estimate = 0.122
95% LCL = 0.122 - 1.96 * 0.0183 = 0.086
95% UCL = 0.122 + 1.96 * 0.0183 = 0.158
```

or 12.2% (95% CI = 8.6% - 15.8%).

Important ...

(1) Someone should check my thinking and my arithmetic.

(2) When you do these sorts of calculations you should work at full precision throughout and round only at the end. I did not do this above, so there will be some accumulated rounding error in the final result.

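
One way to follow point (2) is to let a short script carry full precision from start to finish. The Python sketch below repeats the three-survey example using the populations, proportions and confidence limits from the calculation tables above, and rounds only when printing. Variable names are made up, and it applies the formulas as given in this post rather than any alternative method.

```python
from math import sqrt

Z = 1.96  # multiplier for a 95% CI, as used throughout the post

# (population, p, LCL, UCL) as used in the worked calculation
surveys = [
    (23670, 0.127, 0.097, 0.161),
    (16546, 0.099, 0.063, 0.132),
    (19201, 0.135, 0.098, 0.180),
]

total_w = sum(w for w, _, _, _ in surveys)

# Population-weighted average of the proportions
pooled_p = sum(p * w for w, p, _, _ in surveys) / total_w

# SE back-calculated from each CI, squared, weighted by population
pooled_var = sum(
    ((upper - lower) / (2 * Z)) ** 2 * w for w, _, lower, upper in surveys
) / total_w
pooled_se = sqrt(pooled_var)

pooled_lcl = pooled_p - Z * pooled_se
pooled_ucl = pooled_p + Z * pooled_se

# Round only here, at the very end
print(f"{pooled_p:.1%} (95% CI = {pooled_lcl:.1%} - {pooled_ucl:.1%})")
# prints: 12.2% (95% CI = 8.6% - 15.8%)
```

In this example the rounding error happens to be too small to change the result at one decimal place, but that will not always be the case.
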
I hope this is of some use.