
# Calculating variance and DEFF for stratified sampling

This question was posted in the Assessment and Surveillance forum area and has 4 replies. You can also reply via email; be sure to leave the subject unchanged.

### Mark Myatt

Consultant Epidemiologist

Frequent user

28 Mar 2014, 12:08

I think you want a confidence interval around a point estimate.

Here is a simple approach ...

Your data:

```
  Strata     N   n     P
  ------ ----- --- -----
       1  9870 225 0.813
       2 33599 219 0.548
       3 14130 212 0.311
  ------ ----- --- -----
```

Weight each stratum by its share of the total population:

```
  w = N / sum(N)
```

giving:
```
  Strata     N   n     P     w
  ------ ----- --- ----- -----
       1  9870 225 0.813 0.171
       2 33599 219 0.548 0.583
       3 14130 212 0.311 0.245
  ------ ----- --- ----- -----
```

The point estimate is:
```
  sum(P * w) = 0.813 * 0.171 + 0.548 * 0.583 + 0.311 * 0.245 = 0.535
```
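The weighting step can be sketched in Python as below. This is only an illustration: the language and the names `stratum_weights` and `weighted_estimate` are my own choices, not anything from the post, and the weights are kept at full precision rather than rounded to three decimals.

```python
# Stratum population sizes (N) and observed proportions (P) from the table.
N = [9870, 33599, 14130]
P = [0.813, 0.548, 0.311]


def stratum_weights(sizes):
    """Return each stratum's share of the total population."""
    total = sum(sizes)
    return [size / total for size in sizes]


def weighted_estimate(proportions, weights):
    """Return the population-weighted sum of the stratum proportions."""
    return sum(p * w for p, w in zip(proportions, weights))


w = stratum_weights(N)
estimate = weighted_estimate(P, w)

print([round(x, 3) for x in w])
print(round(estimate, 3))
```

Rounded to three decimals, this should reproduce the weights and the point estimate shown above.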

The variance is:
```
  sum((w^2 * P * (1 - P)) / n)
```

```
  Strata     N   n     P     w  (w^2 * P * (1 - P)) / n
  ------ ----- --- ----- -----  -----------------------
       1  9870 225 0.813 0.171             1.975795e-05
       2 33599 219 0.548 0.583             3.844253e-04
       3 14130 212 0.311 0.245             6.067027e-05
  ------ ----- --- ----- -----  -----------------------
                                       SUM = 0.00046485
                                 SQRT(SUM) = 0.02156046
```
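For anyone who would rather script this than fill in a table by hand, here is a rough Python sketch of the same variance formula. The function name `stratified_variance` is hypothetical, and, as in the table, no finite population correction is applied. Because the weights are not rounded here, the result differs from the table in the later decimal places.

```python
import math

# Population sizes (N), sample sizes (n) and proportions (P) per stratum.
N = [9870, 33599, 14130]
n = [225, 219, 212]
P = [0.813, 0.548, 0.311]


def stratified_variance(sizes, samples, proportions):
    """Sum of w^2 * P * (1 - P) / n over strata, with w = N / sum(N)."""
    total = sum(sizes)
    return sum(
        (N_h / total) ** 2 * p_h * (1 - p_h) / n_h
        for N_h, n_h, p_h in zip(sizes, samples, proportions)
    )


variance = stratified_variance(N, n, P)
se = math.sqrt(variance)  # the square root of the summed variance

print(variance, se)
```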

The 95% CI is then:
`Lower 95% CL = 0.535 - 1.96 * 0.02156046 = 0.4927 (49.27%)`

`Upper 95% CL = 0.535 + 1.96 * 0.02156046 = 0.5773 (57.73%)`
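The same interval as a small Python sketch, taking the point estimate and the square root of the summed variance straight from the figures above. The helper name `normal_ci` is my own, and 1.96 is the usual normal-approximation multiplier for 95% limits.

```python
def normal_ci(estimate, se, z=1.96):
    """Return (lower, upper) limits as estimate -/+ z * se."""
    margin = z * se
    return estimate - margin, estimate + margin


# Values quoted in the post.
lower, upper = normal_ci(0.535, 0.02156046)

print(f"Lower 95% CL = {lower:.4f}")
print(f"Upper 95% CL = {upper:.4f}")
```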

BTW: The design effect is the ratio of the variance calculated above to the variance that a simple random sample of the same total size would give. With your data:

```
c = round(225 * 0.813 + 219 * 0.548 + 212 * 0.311)
  = 369

n = 225 + 219 + 212
  = 656

p = 369 / 656
  = 0.5625

var = (p * (1 - p)) / (n - 1)
    = (0.5625 * (1 - 0.5625)) / (656 - 1)
    = 0.000375716

DEFF = 0.00046485 / 0.000375716
     = 1.24
```
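A hedged Python sketch of the design effect calculation follows. It rebuilds the pooled case count from each stratum's sample size and proportion, then divides the stratified variance quoted above by the simple-random-sample variance. The variable names are illustrative only, and the `n - 1` denominator simply follows the post.

```python
# Sample sizes (n) and proportions (P) per stratum.
n = [225, 219, 212]
P = [0.813, 0.548, 0.311]

# Stratified variance as quoted in the post (computed with rounded weights).
var_stratified = 0.00046485

# Pool the strata as if they were a single simple random sample.
cases = round(sum(n_h * p_h for n_h, p_h in zip(n, P)))
n_total = sum(n)
p_pooled = cases / n_total

# Variance of a proportion under simple random sampling.
var_srs = p_pooled * (1 - p_pooled) / (n_total - 1)

deff = var_stratified / var_srs
print(cases, n_total, round(deff, 2))
```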

Avoid rounding early (as I have done above).

I hope this helps.

You should check my arithmetic.

### Mark Myatt

Consultant Epidemiologist

Frequent user

28 Mar 2014, 14:31

Just to be clear ... rounding errors can accumulate and end up becoming quite large over a series of calculations ... I have been lazy above ... keep all numbers to full precision throughout and round only the final results. If you do calculations in a spreadsheet with raw numbers then you should be OK as the full precision is often retained "behind the scenes".
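To get a feel for how much early rounding matters here, this Python sketch computes the stratified variance twice: once with weights rounded to three decimals and once at full precision. The helper name `variance_with` is hypothetical; the data are those from the first post.

```python
# Population sizes (N), sample sizes (n) and proportions (P) per stratum.
N = [9870, 33599, 14130]
n = [225, 219, 212]
P = [0.813, 0.548, 0.311]


def variance_with(weights):
    """Sum of w^2 * P * (1 - P) / n using the supplied weights."""
    return sum(
        w_h ** 2 * p_h * (1 - p_h) / n_h
        for w_h, n_h, p_h in zip(weights, n, P)
    )


total = sum(N)
w_full = [N_h / total for N_h in N]        # full precision
w_rounded = [round(x, 3) for x in w_full]  # rounded early

var_full = variance_with(w_full)
var_rounded = variance_with(w_rounded)

print(var_rounded, var_full, var_full - var_rounded)
```

The gap is small for this data set, but it is carried into every later step (the confidence limits and the design effect), which is why it is safer to round only at the end.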