
# Sample size calculation on IYCF indicators in small scale survey

This question was posted in the Infant and young child feeding interventions forum area and has 11 replies.

### Mark Myatt

Frequent user

23 Dec 2010, 11:18

This question should have been posted to the assessment forum.

You have to be very careful when investigating mortality. The main sampling issue is survivor bias (i.e. the sample tends to exclude households in which deaths have taken place and is, therefore, biased to include survivors). All of the other indicators relate to living infants and children and the sample will be of households with children. Any household that has lost all children will be excluded from the sample. It is important to realise that "all children" can mean just one child. Any survey done this way will underestimate mortality. If (e.g.) infant mortality is high in first-born children of young mothers then the magnitude of underestimation will be large.

Mortality is a difficult indicator. My recommendation is to always use a separate survey to estimate mortality. You can do the two surveys at the same time in the same villages but the sample will be different and is probably best collected by a separate enumerator. Some surveyors use a mixed sample which consists of the households with children AND the households that were "skipped" because they did not have children. I find this slow and inefficient, and it can lead to a tendency to rush the mortality component.

I hope this helps.

### Tamsin

Forum Moderator, ENN

Forum moderator

23 Dec 2010, 12:04

From Kirk Dearden:

A couple of thoughts and questions:

1. I'm assuming you are conducting a survey that is principally designed to capture something other than IYCF and that the nutrition questions are simply being added to that questionnaire (which appears to be focused first and foremost on mortality). Knowing this helps one decide on the most appropriate sampling strategy.
2. It seems to me that there should be 37 infants 0-5.9 months of age, not 33, because 333 divided by 9 age intervals between 6 and 59.9 months = 37 per 6-month age interval.
3. I don't follow how there will be 367 respondents in 408 households. Also, if the sample size goes up from 408 to 514, I wouldn't consider this a massive increase. Perhaps I am missing something.
4. I don't know all of the logistics and costs associated with conducting the survey in this particular context; however, given that a lot of the expense associated with a survey is finding the households in the first place and that the IYCF questions take up about 2 pages of a questionnaire (i.e., it wouldn't take long to administer them), I'd recommend conducting the IYCF survey in each one of the 514 houses where there is a child 0-59.9 months of age. If that simply isn't feasible, then you can simply collect IYCF data from the 408 households. In other words, they'd collect data from 79.4% of the households involved in the mortality survey (408/514). Perhaps the easiest way to do this is to ask during the mortality survey for age-eligible children (0-59.9 months) in each household in each cluster. Once you reach the necessary 25 kids for a given cluster (370 sample size divided by 15 clusters), they stop collecting IYCF data on kids for that particular cluster.

Hopefully, this is of some help.

Kirk
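The arithmetic in Kirk's points can be checked with a short sketch. The variable names are illustrative only, and reading the 370 as 333 + 37 is an assumption based on the figures quoted in the post.

```python
import math

# Figures quoted in the post
children_6_to_59 = 333      # sample of children 6-59.9 months
age_intervals = 9           # 6-month intervals between 6 and 59.9 months

# Children per 6-month interval, applied to the 0-5.9 month group
infants_0_to_5 = children_6_to_59 // age_intervals

# Assumption: the 370 in the post is 333 + 37
total_children = children_6_to_59 + infants_0_to_5

# Share of mortality-survey households that would also get the IYCF module
share_pct = round(408 / 514 * 100, 1)

# Children needed per cluster, rounded up to a whole child
clusters = 15
per_cluster = math.ceil(total_children / clusters)

print(infants_0_to_5, total_children, share_pct, per_cluster)  # 37 370 79.4 25
```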

### Kirk

Associate Professor, Boston University

Normal user

23 Dec 2010, 13:31

Thanks. This all sounds reasonable.

### Marie McGrath

ENN

Frequent user

23 Dec 2010, 15:18

Dear Muhiadin, I've had some e-chat with Mary Lung'aho on your question, who has been very involved with developing the IYCF Step by Step Guide, especially with regard to IYCF. Just a note to emphasise, with regard to the IYCF survey: a separate survey of mothers of children 0-<6 months and of mothers of 6-<24 months is not needed. The same questions should be asked of all mothers with children 0 up to 24 months of age. Best regards Marie & Mary

### Mark Myatt

Frequent user

30 Dec 2010, 16:03

Addressing the sample size issue ... this is tricky with multiple indicators, as these will have different expected values and need to be estimated with different levels of precision. The usual thing to do in such situations is to pick the most important indicator(s) and calculate the sample size based on expected value, desired precision, and expected design effect. If you do this for more than one indicator then you use the largest sample size.

Since you posted this in the "Infant and young child feeding interventions" forum I'd guess that some form of IYCF indicator is most important to you. The indicator is often a complex case-definition which is applied to each respondent and returns a pass / fail classification. This sort of indicator can be thought of as a coverage indicator. The most common programs for which coverage is estimated are EPI and CMAM. Both of these tend to use a precision of +/- 10% or better. A random sample of n = 96 gives this for any proportion. For a cluster-sampled survey we usually assume a design effect of 2 (unless we know better). This gives n = 96 * 2 = 192. Cluster-sampled surveys work best with many small clusters, with "many" usually defined as >= 30 clusters. Assume you need 30 clusters. The cluster sample size is then 192/30 = 6.4, which should be rounded up to 7. This is the same as the standard 30-by-7 EPI coverage survey.

This is good for estimating a single proportion. For many proportions (e.g. age-specific proportions) you will need a bigger sample size. For me, n = 210 collected as 30-by-7 would be good enough for M&E / surveillance. For nutritional anthropometry you could use the PROBIT approach, which works well at small sample sizes and is implemented in the SMART software. This sample size is also suited to the LQAS approach developed by FANTA for classifying the prevalence of wasting. In summary ... your proposal for n = 333 is plenty.

To address the mortality sample size issue ... I need more information (i.e. expected rates, required precision, duration of recall period, and average household size). You may want to use the new informant method (also developed by FANTA). I hope this helps.
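The 30-by-7 calculation above can be reproduced with a minimal sketch. It assumes the usual normal-approximation formula for a proportion, with z = 1.96 for 95% confidence and p = 0.5 as the worst case. The post gives only the resulting n = 96, so the formula and the function name `srs_sample_size` are assumptions.

```python
import math

def srs_sample_size(p=0.5, precision=0.10, z=1.96):
    """Simple-random-sample size for estimating a proportion p to within
    +/- precision, using the normal approximation (z = 1.96 for 95%)."""
    return z ** 2 * p * (1 - p) / precision ** 2

# p = 0.5 maximises p * (1 - p), so this n covers any proportion
n_srs = round(srs_sample_size())          # 96.04 rounds to 96

design_effect = 2                         # assumed for cluster sampling
n_cluster_survey = n_srs * design_effect  # 192

n_clusters = 30
per_cluster = math.ceil(n_cluster_survey / n_clusters)  # 6.4 rounds up to 7
n_collected = n_clusters * per_cluster                  # 30-by-7

print(n_srs, n_cluster_survey, per_cluster, n_collected)  # 96 192 7 210
```

Rounding the cluster size up is what turns the calculated 192 into the 210 actually collected.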

### Tamsin

Forum Moderator, ENN

Forum moderator

31 Dec 2010, 10:43

From Bradley Woodruff:

Dear All: Thanks to Mark for his very clear explanation. One additional point you should remember is that the standard IYCF indicators, as recommended by WHO and UNICEF, apply to age groups of differing width, and some of these age groups are quite narrow. For example, introduction of complementary foods is calculated only in children 6-8 months of age, while ever breastfed is calculated for all children less than 24 months of age. Therefore, if you are recruiting children in different age groups from the same sample and you are calculating a sample size for more than one IYCF indicator, you must find a common unit into which the sample size for each indicator can be converted so that you can compare the various sample sizes you have calculated.

For example, if you calculate that the sample sizes for introduction of complementary foods and ever breastfed are both 333, as recommended by Mark, and you are conducting a household survey, which indicator requires visiting more households? You do not know this until you convert the sample size of 333 children 6-8 months of age and the sample size of 333 children 0-23 months of age into the number of households you must visit to find the 333 children in each age group.

Let's work through an example. If we know, for example, that 18% of the population is less than 5 years old and that the average household size is 5.5 persons, we can calculate the average number of children in each age group who will be found in each household. If we assume that infant mortality and child mortality are negligible (which may not be true in some populations), then about 40% of the under five population is under two, and 7.2% of the population is less than 24 months of age (18% x 0.4). The age group 6-8 months represents 3 months of the 59 months in the first 5 years of life; therefore, children 6-8 months make up 0.92% of the population (18% x [3/59]).
Now multiply the average household size by the proportion of the total population made up of each target group to get the average number of children in each age group per household. For children less than 24 months, this would be 0.4 children per household (0.072 x 5.5 = 0.396). For children 6-8 months of age, this would be 0.05 children per household (0.0092 x 5.5 = 0.05).

Now to calculate how many households you will have to visit to find the 333 children for each indicator. For children less than 24 months of age, you will have to go to about 2.5 households to find one eligible child (1 / 0.4). To find 333 children, you will need to visit 833 households (333 / 0.4). For children 6-8 months of age, you will have to go to 20 households to find one eligible child (1 / 0.05). To find 333 children, you will need to visit 6,660 households! (333 / 0.05)

Therefore, you will have to go to many more households to find children for the indicator introduction of complementary food than for the indicator ever breastfed. So even though the sample size for children may be the same, when it comes to the sampling unit (in this example, households) the sample sizes are very different. In fact, because some of the IYCF indicators, such as introduction of complementary foods, are calculated in children from such a narrow age range, it is essentially impractical to measure them in household surveys. You may need to find some other sampling frame, such as population registrations or MCH registrations, from which to sample such children.

Regards, Woody
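Woody's conversion from children to households can be sketched as follows. The population figures are the ones quoted in the post, including the 3/59 age fraction. The function name `households_to_visit` and the rounding of children per household (to match the post's 0.4 and 0.05) are illustrative choices.

```python
import math

under5_share = 0.18        # share of population under 5 years
household_size = 5.5       # average persons per household
target_children = 333      # sample size for each indicator

# Population share of each target age group, as given in the post
under2_share = under5_share * 0.4        # about 7.2% are under 24 months
age_6_8_share = under5_share * 3 / 59    # about 0.92% are 6-8 months

def households_to_visit(target, children_per_household):
    """Households needed to find `target` eligible children, rounded up."""
    return math.ceil(target / children_per_household)

# Children per household, rounded as in the post (0.4 and 0.05)
under2_per_hh = round(under2_share * household_size, 1)
age_6_8_per_hh = round(age_6_8_share * household_size, 2)

hh_under2 = households_to_visit(target_children, under2_per_hh)
hh_6_8 = households_to_visit(target_children, age_6_8_per_hh)

print(hh_under2, hh_6_8)  # 833 6660
```

Skipping the intermediate rounding changes the totals slightly (about 841 and 6,616 households) but leaves the conclusion intact: the narrow age group needs roughly eight times as many household visits.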

### Mark Myatt

Frequent user

3 Jan 2011, 10:05

Woody is more familiar with the specifics of the indicator(s) than I am and what I write now may be silly. The indicator(s) vary with age. We should expect this since we don't (e.g.) consider exclusive breastfeeding to be appropriate for the older child. My thinking is that all you really want to do is apply a case-definition that tells you whether a given child is exposed to adequate or inadequate IYCF practices. The case-definition that is applied will have different sets of clauses for different age-groups but should still produce a binary pass / fail classification for every child regardless of their age. In this case you are estimating a single proportion and the sample size calculation is simple and the sample size required quite small. This "common indicator" approach was developed by Arimond and Ruel (International Food Policy Research Institute) for use in DHS surveys. Their indicator was called the ICFI. I think the IYCF should not be much different ... if it is then I think that a simpler indicator set is required that is better suited to simple low cost surveys. In the meantime, I suggest we use the ICFI with the addition of exclusive breastfeeding for the 0-6 month age-group. What you lose from this approach is an ability to "decompose" the indicator ... by this I mean that you might find that < 70% of children receive a "pass" classification but, with a small sample size, it may not be possible to identify the exact nature of failure. Just my tuppence.
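As a rough illustration of the "single pass / fail case-definition" idea, the sketch below applies different clauses by age group and then estimates one proportion. Everything in it is hypothetical: the `Child` fields, the feeding score and the pass mark are placeholders, not the actual ICFI or IYCF definitions.

```python
from dataclasses import dataclass

@dataclass
class Child:
    age_months: float
    exclusively_breastfed: bool  # only used for the youngest age group
    feeding_score: int           # hypothetical age-appropriate feeding score

# Hypothetical pass mark for older children; NOT taken from the ICFI
HYPOTHETICAL_PASS_SCORE = 4

def passes(child: Child) -> bool:
    """Binary case-definition with different clauses per age group."""
    if child.age_months < 6:
        return child.exclusively_breastfed
    return child.feeding_score >= HYPOTHETICAL_PASS_SCORE

def proportion_passing(children) -> float:
    """Single proportion estimated across all ages."""
    return sum(passes(c) for c in children) / len(children)

# Made-up records purely to exercise the functions
sample = [
    Child(3, True, 0),
    Child(4, False, 0),
    Child(9, False, 5),
    Child(18, False, 2),
]
print(proportion_passing(sample))  # 0.5
```

Because every child yields one pass / fail value, the sample size calculation is the one for a single proportion, at the cost of not seeing which clause caused a failure.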

### Alexandra Rutishauser-Perera

International Medical Corps

Normal user

1 Oct 2012, 20:56

Dear All, I have the same kind of question about a sample size... The KAP will include IYCF, WASH and GBV questions in a camp context... I calculated the sample size required for the IYCF indicators, so we have:

For Camp 1: 146 individuals

| Indicators | Prevalence (p1) | Expected Prevalence (p2) | Sample Size |
| --- | --- | --- | --- |
| Exclusive Breastfeeding under 6 Months | 55% | 71% | 141 |
| Minimum dietary diversity (6 – 23 months) | 73% | 86% | 146 |

For Camp 2: 233

| Indicators | Prevalence (p1) | Expected Prevalence (p2) | Sample Size |
| --- | --- | --- | --- |
| Exclusive Breastfeeding under 6 Months | 29% | 43% | 201 |
| Minimum dietary diversity (6 – 23 months) | 73% | 86% | 233 |

Based on the results above, the sample size is 146 for Camp 1 and 233 for Camp 2. However, I understood that this figure has to be multiplied by four to put into consideration the four distinct age groups namely 0 – 5, 6 – 11, 12 – 17, and 18 – 23. Therefore, the final sample size for Camp 1 would be 584 and 932 for Camp 2, which is HUGE! What am I doing wrong here? Thanks for your help...