Calculating variance and DEFF for stratified sampling

This question was posted in the Assessment and Surveillance forum area and has 4 replies.

Anonymous 730

Normal user

28 Mar 2014, 00:40

I would like to request assistance in calculating variance for stratified PPS sampling. Let's say we do a 30 x 7 survey with 3 strata, with the following populations and prevalence of malnutrition:

    Strata   Population   Sample   Prevalence   Weight
    ------   ----------   ------   ----------   ------
    1             9,870      225        81.3%   0.4997
    2            33,599      219        54.8%   1.7475
    3            14,130      212        31.1%   0.7590
    ------   ----------   ------   ----------   ------

I calculated the weights from the population and sample for each of the strata, and also calculated the weighted prevalence, which is 53.22%. I would like assistance in calculating the variance and design effect in this case. Thanks in advance!

Mark Myatt

Epidemiologist at Brixton Health

Frequent user

28 Mar 2014, 12:08

I think you want a confidence interval around a point estimate. Here is a simple approach ...

Your data:

    Strata       N     n       P
    ------   -----   ---   -----
    1         9870   225   0.813
    2        33599   219   0.548
    3        14130   212   0.311
    ------   -----   ---   -----

Add weights as:

    w = N / sum(N)

giving:

    Strata       N     n       P       w
    ------   -----   ---   -----   -----
    1         9870   225   0.813   0.171
    2        33599   219   0.548   0.583
    3        14130   212   0.311   0.245
    ------   -----   ---   -----   -----

The point estimate is:

    sum(P * w) = 0.813 * 0.171 + 0.548 * 0.583 + 0.311 * 0.245 = 0.535

The variance is:

    sum((w^2 * p * (1 - p)) / n)

From your data:

    Strata       N     n       p       w   (w^2 * p * (1 - p)) / n
    ------   -----   ---   -----   -----   -----------------------
    1         9870   225   0.813   0.171   1.975795e-05
    2        33599   219   0.548   0.583   3.844253e-04
    3        14130   212   0.311   0.245   6.067027e-05
    ------   -----   ---   -----   -----   -----------------------
                                     SUM = 0.00046485
                               SQRT(SUM) = 0.02156046
                                           -----------------------

The 95% CI is then:

    Lower 95% CL = 0.535 - 1.96 * 0.02156046 = 0.4927 (49.27%)
    Upper 95% CL = 0.535 + 1.96 * 0.02156046 = 0.5773 (57.73%)

Is this what you need?

BTW : The design effect is the ratio of the variance (calculated above) to the variance calculated as if the data came from a simple random sample. With your data:

    c    = round(225 * 0.813 + 219 * 0.548 + 212 * 0.311) = 369
    n    = 225 + 219 + 212 = 656
    p    = 369 / 656 = 0.5625
    var  = (p * (1 - p)) / (n - 1) = (0.5625 * (1 - 0.5625)) / (656 - 1) = 0.000375716
    DEFF = 0.00046485 / 0.000375716 = 1.24

Avoid rounding early (as I have done above). I hope this helps. You should check my arithmetic.
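For anyone who prefers to script the calculation, here is a minimal Python sketch of the approach described above. It is only an illustration: the names `strata`, `stratified_estimate` and `srs_variance` are made up for this example, the stratum prevalences are the rounded figures quoted in the question, and the weights are kept at full precision, so the results differ slightly from the hand-worked figures in the later decimal places.

```python
from math import sqrt

# Stratum data from the question: population (N), sample size (n), prevalence (p).
strata = [
    {"N": 9870, "n": 225, "p": 0.813},
    {"N": 33599, "n": 219, "p": 0.548},
    {"N": 14130, "n": 212, "p": 0.311},
]


def stratified_estimate(strata, z=1.96):
    """Return (estimate, variance, (lower, upper)) using w = N / sum(N)."""
    total_population = sum(s["N"] for s in strata)
    weights = [s["N"] / total_population for s in strata]
    # Point estimate: sum(p * w)
    estimate = sum(w * s["p"] for w, s in zip(weights, strata))
    # Variance: sum((w^2 * p * (1 - p)) / n)
    variance = sum(
        (w ** 2 * s["p"] * (1 - s["p"])) / s["n"] for w, s in zip(weights, strata)
    )
    half_width = z * sqrt(variance)
    return estimate, variance, (estimate - half_width, estimate + half_width)


def srs_variance(strata):
    """Variance treating the pooled sample as one simple random sample."""
    n = sum(s["n"] for s in strata)
    cases = round(sum(s["n"] * s["p"] for s in strata))  # pooled number of cases
    p = cases / n
    return (p * (1 - p)) / (n - 1)


estimate, variance, (lower, upper) = stratified_estimate(strata)
deff = variance / srs_variance(strata)

print(f"Estimate: {estimate:.4f}")
print(f"Variance: {variance:.8f}")
print(f"95% CI:   {lower:.4f} to {upper:.4f}")
print(f"DEFF:     {deff:.2f}")
```

With the figures from the question this gives an estimate of roughly 0.535, a 95% CI of about 0.493 to 0.578, and a design effect of about 1.2.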

Anonymous 730

Normal user

28 Mar 2014, 12:44

Many thanks Mark, very very helpful!

Mark Myatt

Epidemiologist at Brixton Health

Frequent user

28 Mar 2014, 14:31

Just to be clear ... rounding errors can accumulate and end up becoming quite large over a series of calculations ... I have been lazy above ... keep all numbers to full precision throughout and round only the final results. If you do calculations in a spreadsheet with raw numbers then you should be OK as the full precision is often retained "behind the scenes".
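As a rough illustration of the point about rounding, the sketch below computes the stratified variance twice from the figures in the question: once with the weights rounded to three decimal places and once at full precision. The function name `stratified_variance` and its `digits` parameter are invented for this example.

```python
# Stratum data from the question as (population N, sample size n, prevalence p).
strata = [(9870, 225, 0.813), (33599, 219, 0.548), (14130, 212, 0.311)]


def stratified_variance(strata, digits=None):
    """sum((w^2 * p * (1 - p)) / n), optionally rounding each weight first."""
    total_population = sum(N for N, _, _ in strata)
    variance = 0.0
    for N, n, p in strata:
        w = N / total_population
        if digits is not None:
            w = round(w, digits)  # simulate rounding early
        variance += (w ** 2 * p * (1 - p)) / n
    return variance


rounded = stratified_variance(strata, digits=3)
full = stratified_variance(strata)

print(f"Weights rounded to 3 dp: {rounded:.8f}")
print(f"Full precision:          {full:.8f}")
print(f"Relative difference:     {(full - rounded) / full:.4%}")
```

The difference is well under 1% here because only one quantity is rounded and there are only three strata. It can grow when several intermediate results are rounded in turn.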

Anonymous 730

Normal user

28 Mar 2014, 14:52

Thank you very much
