Independent consultant

Normal user

19 Aug 2010, 07:07

We have a recurrent issue with sampling that I would be interested to get some feedback on. When designing baseline assessments for projects that typically last 1 year, the numbers in each age group for the IYCF indicators are quite small. For example, we are currently designing an integrated baseline looking at livelihoods, WASH, Health and Nutrition. We are sampling 400 households out of the 4012 households that are receiving the livelihoods interventions. This is not the total number of households in the target area, only the most vulnerable ones, which are receiving the key LH interventions. The nutrition interventions will be available to all mothers in the target area, though priority will be given to the most vulnerable when necessary. It is not realistic to sample more HH than this given resources, time, capacity etc. However, the total number of children aged 0-6 months will be approx. 48 (and for other indicators, such as introduction of complementary foods, the number is even smaller because the age group, 6-8 months, is narrower). We cannot restrict the sample to HH with children under 24 months for the IYCF indicators, as the health sector requires indicators for under-5s.
Exclusive BF rate is one of the key indicators, but I doubt it is possible to measure a change in this behaviour over a short period of time in a relatively small sample through a baseline/endline, or that we will even get an EBF rate which reflects the situation. Maybe we need more creative ways of measuring it, e.g. through mothers' groups or ongoing monitoring. This issue has arisen time and again, and I wonder if other people have faced it and how they have overcome it?
Many Thanks

Consultant Epidemiologist

Frequent user

19 Aug 2010, 16:32

I think the problem is arising because you are trying to measure multiple indicators with a single survey. This means that you have to select a sample size for the indicator that requires the best relative precision (usually an indicator with an expected prevalence of about 50%). Your problem is compounded by the fact that your indicators are measured at different levels (e.g. WASH indicators often apply to households) or in different groups (e.g. 6-59 months or 6-24 months or 6-36 months). This makes sample size calculations quite tricky. The basic idea is to find the sample size that meets your needs for every indicator. If you have an indicator that applies to a small proportion of the survey population then you will usually end up with a massive sample size for the entire survey.
Here's an example with just two indicators ... We want to estimate prevalence of GAM. We expect this to be about 10% and we want a precision of +/- 3%. We need a sample size of 384 children aged 6-59 months. Imagine we use this as our survey sample size. We now want to estimate something in the 6-8 month old children using our sample of 384 children. The sample size for that indicator will now be 384 * (3 / (59 - 6)) = c. 22 children. This is a small sample size. An estimate of (e.g.) 50% would have a 95% CI of 28% - 72%. I doubt this is precise enough for your purposes.
If we go about this the other way round, starting from a need to sample 384 kids aged 6-8 months out of a sample of kids aged 6-59 months, then the entire sample size would be something like 384 * ((59 - 6) / 3) = 6784 kids. Even if we apply some sort of finite population correction to this it will still be a very big number.
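For anyone who wants to check the arithmetic in the example, here is a minimal Python sketch. It assumes simple random sampling, the usual normal-approximation formula n = z^2 * p * (1 - p) / d^2, and the same rough age-share of 3 / (59 - 6) used above. The function and variable names are made up for illustration and do not come from any survey package.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def sample_size_for_proportion(p, d, z=Z_95):
    """Sample size to estimate a proportion p to within +/- d
    (simple random sampling, normal approximation, no design effect)."""
    return z ** 2 * p * (1 - p) / d ** 2


# Indicator 1: GAM, expected prevalence 10%, precision +/- 3%
n_gam = sample_size_for_proportion(0.10, 0.03)

# Rough share of 6-8 month olds among 6-59 month olds, as in the post
sub_months = 3
all_months = 59 - 6

# Children aged 6-8 months expected in a sample of 384 aged 6-59 months
n_sub = 384 * sub_months / all_months

# Total 6-59 month sample needed to reach 384 children aged 6-8 months
n_total = 384 * all_months / sub_months

# Half-width of a 95% CI around 50% with only 22 children
# (normal approximation; an exact interval is slightly wider)
half_width = Z_95 * math.sqrt(0.5 * 0.5 / 22)

print(round(n_gam), round(n_sub), round(n_total), round(half_width, 2))
```

The normal approximation gives a half-width of about 21 percentage points for 22 children, which is in line with the 28% - 72% interval quoted above. None of this allows for a design effect, which a cluster sample would normally need.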
So ... what to do? I can think of two approaches:
(1) Do a special survey that samples only the 6-8 month old children. This could be nested inside the main survey. This can be confusing for survey staff and care is needed to distinguish between the original and the special "top up" sample.
(2) Reduce sample size requirements by testing rather than estimation. Testing can be used when we have a standard to meet. Say we want our indicator to be above 80% ... we can just test whether the data are consistent with a prevalence > 80%. All you get is an answer to a yes / no question (i.e. whether the indicator is above 80%) rather than an estimate. The sample size calculation is quite simple. To give you an idea of the sample sizes required ... if you said that < 50% was bad and > 80% was good then a sample size of about 30 kids is needed to keep both types of error at about 5%.
If you go for (2) then you still need to work up to a bigger sample size (as in the example) but you do end up with a smaller final sample size ... using the example it would be 30 * ((59 - 6) / 3) = 530.
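To make approach (2) concrete, here is a minimal Python sketch of one way such a test could be set up. It is an assumption about the method, not necessarily the calculation behind the figure of 30 quoted above: classify the indicator as "good" when at least d of n sampled children meet it, and search for the smallest n where both misclassification risks, computed from the binomial distribution, are at most 5%. All function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0 when k < 0."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def error_rates(n, d, p_low, p_high):
    """Risks of the rule 'classify as good if X >= d' with sample size n.
    Returns (P(called good | true prevalence p_low),
             P(called bad  | true prevalence p_high))."""
    called_good_when_low = 1 - binom_cdf(d - 1, n, p_low)
    called_bad_when_high = binom_cdf(d - 1, n, p_high)
    return called_good_when_low, called_bad_when_high


def smallest_design(p_low, p_high, max_error=0.05, n_max=200):
    """Smallest (n, d) for which both risks are <= max_error, or None."""
    for n in range(1, n_max + 1):
        for d in range(n + 2):
            a, b = error_rates(n, d, p_low, p_high)
            if a <= max_error and b <= max_error:
                return n, d
    return None


# Check the design from the post: 30 children, bad < 50%, good > 80%.
# The threshold of 20 is an assumption chosen for illustration.
risk_low, risk_high = error_rates(30, 20, 0.5, 0.8)

# Exact search for the smallest workable sample size
design = smallest_design(0.5, 0.8)

# Scale 30 children aged 6-8 months up to a 6-59 month sample, as in the post
n_total_testing = 30 * (59 - 6) / 3

print(risk_low, risk_high, design, n_total_testing)
```

With 30 children and a threshold of 20, both risks come out below 5% under these assumptions. The exact search may return a sample size slightly different from the rounded figure of 30, depending on how the errors are defined.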
I'd be happy to help with sample size calculations for (2) above if needed ... we can do this as an example on this forum.
I hope this is useful. Perhaps there are other solutions.