
Modified EPI vs Segmentation in Urban Settings

This question was posted in the Assessment forum area and has 12 replies.


Anonymous 1354

Normal user

12 Apr 2012, 02:15

Hello,

I'd like to know the recommended methodology for a SMART survey in an urban setting with a large population (total population: 500,000; population per cluster: >2,000 households) where no HH database is available.
Thank you

Mark Myatt

Consultant Epidemiologist

Frequent user

12 Apr 2012, 15:07

Do you know the population in each of the potential clusters? If you do then you can use PPS to select clusters. If not then you can use a systematic sample (e.g. CSAS) to select clusters. The problem then is how to select households to sample. See this document for help with that (and cluster selection). You might also find this document on using satellite imagery useful.
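As a rough illustration of PPS cluster selection, here is a minimal Python sketch of systematic PPS. It is not taken from the documents mentioned above; the function name `pps_systematic` and the area names and populations are hypothetical. It assumes you have a population estimate for every potential cluster area.

```python
import random

def pps_systematic(populations, n_clusters, seed=None):
    """Systematic PPS: pick n_clusters areas with probability proportional to size.

    populations: dict of area name -> estimated population (hypothetical data).
    A large area may be selected more than once.
    """
    rng = random.Random(seed)
    names = list(populations)
    total = sum(populations.values())
    interval = total / n_clusters          # sampling interval
    start = rng.random() * interval        # random start within the first interval
    selected = []
    idx = 0
    cumulative = populations[names[0]]     # running cumulative population
    for i in range(n_clusters):
        target = start + i * interval
        # Move forward to the area whose cumulative range contains the target.
        while target >= cumulative and idx < len(names) - 1:
            idx += 1
            cumulative += populations[names[idx]]
        selected.append(names[idx])
    return selected

# Hypothetical example: three areas, ten clusters.
areas = {"Area A": 1000, "Area B": 3000, "Area C": 6000}
print(pps_systematic(areas, n_clusters=10, seed=1))
```

An area selected more than once would need to be split so that each selection becomes a separate cluster.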

Please send a follow-up question if you need anything clarified.

I hope this is of some use.

Tariq Khan

Normal user

12 Apr 2012, 19:42

If you have identified your clusters and a cluster has more than 2,000 HH, you are better off segmenting it (based on, say, administrative structure) and selecting one segment at random. The modified method is no longer used at all because of its inflexibility.
Here I would also mention that ENA SMART will give you clusters based on population size, but for segmentation you use the number of HH in each segment, and this can be done in the field.
Mr. Myatt is right: using satellite imagery will give you a good idea of an area's structure and boundaries.
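The segment selection described above could be sketched as follows. This assumes the segment is drawn with probability proportional to its household count, which is one common choice when segments are unequal in size; with roughly equal segments a simple random draw would do. The function name `pick_segment` and the segment counts are hypothetical.

```python
import random

def pick_segment(segment_households, seed=None):
    """Pick one segment, weighting each segment by its number of households.

    segment_households: dict of segment name -> HH count (e.g. counted in the field).
    """
    rng = random.Random(seed)
    names = list(segment_households)
    weights = [segment_households[name] for name in names]
    # random.choices draws with replacement using the given weights; k=1 gives one segment.
    return rng.choices(names, weights=weights, k=1)[0]

# Hypothetical cluster of about 2,100 HH split along administrative lines.
segments = {"Block 1": 600, "Block 2": 900, "Block 3": 600}
print(pick_segment(segments))
```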

Anonymous 1354

Normal user

13 Apr 2012, 01:27

Hello

Thanks for your quick replies. I have already selected the clusters using PPS and I have an average of 2,000 people per cluster zone. The area to survey is composed of urban areas such as densely populated slums and some more rural areas (rice paddies etc.).
I think that for the urban areas I can do the segmentation and use systematic sampling.
For the rural areas, I was considering using the modified EPI method, as it is recommended when HH are not well organized.

What do you think?

Thank you very much

Anonymous 81

Public Health Nutritionist

Normal user

13 Apr 2012, 06:02

Dear Diane,

As per your reply, it seems that your survey area is composed of urban and rural settings. If so, is it one survey, or two separate surveys, one for each setting (urban and rural)? If the settings are not homogeneous in terms of livelihoods and other social services, a single survey might not reflect the actual situation.

Anonymous 1354

Normal user

13 Apr 2012, 06:52

Hello,

Actually, the area is a big city: downtown is mainly urban and densely populated, while the outskirts of the city are more rural. This city suffered from recent flooding and we are targeting the affected areas. I have chosen a DE of 1.7 to compensate for the differences (a SMART survey was done in the neighboring region with half urban, half rural settings; the DE found was 1.5).

Anonymous 81

Public Health Nutritionist

Normal user

13 Apr 2012, 07:27


I had a similar issue. Here is the link: http://www.en-net.org.uk/question/66.aspx , with responses from Mark Myatt and Michael Gold.



Tariq Khan

Normal user

13 Apr 2012, 07:51

Hi,
Diane, it is better to have a slightly higher D.E. than to have two surveys; I think they are not completely separate places. Increasing the D.E. is then a good option. In my experience, I had rural areas very close to urban ones, and in that case I increased the D.E. to as much as 2.

Mark Myatt

Consultant Epidemiologist

Frequent user

13 Apr 2012, 09:04

It is difficult to give a definitive answer to some of these issues. I think that you need to ask yourself if it makes sense to survey these two (urban / rural) or three (urban / peri-urban / rural) populations as one population. Will the result be readily interpretable? If not then you should consider 2 or more surveys. Raising the DEFF is an option but it does not solve the problem of interpretability of the survey result.

I am not convinced that it is reasonable to treat an entire city as a single population. In UK cities (e.g.) we usually see poverty in a belt around an affluent city centre surrounded by affluent suburbs (three populations minimum). In other settings we see peri-urban poverty.

Raising the DEFF will increase the calculated required sample size. The best way of collecting the larger sample size is to collect data from more clusters rather than just increasing the cluster size. You can reduce the DEFF by employing a within-cluster sampling strategy that captures more variation than the traditional proximity sample. The systematic HH selection process outlined in the urban sampling document does this by sampling from the entire cluster area rather than just a cluster of HHs. Using EPI3 or EPI5 (spin the bottle at each selected HH and take the 3rd (EPI3) or 5th (EPI5) HH in the indicated direction) in rural settlements does the same.

If you suspect a high DEFF is needed then you will have a big sample size. This will be best collected as more clusters. It will really not be much more work or money to do two surveys with lower DEFFs particularly in such a compact survey area.
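To see how the DEFF drives the required sample size, here is a minimal Python sketch using the usual formula for estimating a proportion, n = DEFF × z² × p × (1 − p) / d². The expected prevalence (15%) and precision (±3%) are hypothetical values chosen for illustration, and survey software may apply further corrections, so treat the numbers as indicative only.

```python
import math

def required_sample_size(prevalence, precision, deff, z=1.96):
    """Number of children needed to estimate a prevalence to +/- precision.

    z=1.96 corresponds to a 95% confidence interval.
    """
    # Sample size for a simple random sample ...
    srs_size = (z ** 2) * prevalence * (1 - prevalence) / precision ** 2
    # ... inflated by the design effect and rounded up.
    return math.ceil(deff * srs_size)

# Hypothetical inputs: 15% expected prevalence, +/- 3% precision.
for deff in (1.0, 1.5, 1.7, 2.0):
    print(deff, required_sample_size(0.15, 0.03, deff))
```

The required sample grows in direct proportion to the DEFF, which is why going from 1.5 to 2 is expensive.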

I hope this is of some help.

Please keep us informed of your work in this forum.

Anonymous 1354

Normal user

13 Apr 2012, 09:36

Hello Mark,

Thank you for your feedback. Actually, 2 months ago, an emergency nutrition assessment was done in this area using MUAC and the results didn't show any disparities between rural and urban settings. Likewise, a SMART survey done in a similar area (southern region) with half urban and half rural produced a DEFF for W/H of 1.4.
Would you still recommend a separation based on this information?
Thanks
Diane

Mark Myatt

Consultant Epidemiologist

Frequent user

13 Apr 2012, 10:54

It seems, then, that a single survey might suit. Please keep us informed of what you decide and how it worked.

I assume you will collect both MUAC and W/H.

Asfaw Addisu

Emergency Nutrition Specialist, UNICEF

Normal user

13 Apr 2012, 21:23

Hi Diane,
I would suggest the same option as Mark for urban sampling (PPS for cluster selection when the population size per cluster is known, or systematic sampling otherwise, plus segmentation of larger clusters). For the rural population, use cluster sampling with PPS plus segmentation; then, for sampling HHs within each cluster, it is better to use simple random sampling. The modified EPI method introduces bias, as the HHs selected are far from independent of one another.
The other issue regarding the DEFF is that it usually depends on the context of your geographic area, which in your case is the flooding that affected both the urban and rural populations. Flooding does not affect urban and rural populations equally: the rural population is likely to be more heavily affected than the urban one, for various reasons. Hence your design effect will still fail to account for the intra-population clustering effect (within urban and within rural).
If you believe this is unlikely in your case, it is OK to use a design effect of between 1.7 and 2. However, if the above scenario is more likely, then it is better to do two separate surveys, applying the above sampling conditions.

Mark Myatt

Consultant Epidemiologist

Frequent user

14 Apr 2012, 15:23

A few points that might be helpful ...

We have to be careful with terms. The term "modified EPI method" is vague. The type of sampling that is commonly used for the within-cluster sample in SMART surveys is usually a proximity sample, in which one household is selected at random with subsequent households selected by their proximity to the previously sampled household. We could call this "modified EPI" but "proximity sample" is more precise. At the opposite end there is the "QTR + EPIx" method, in which a community is divided into four parts of approximately equal population and a quarter of the cluster sample is taken from each of the quarters, with the first household selected at random and subsequent households selected using a random walk with a random direction (spin a bottle) and a fixed step size (EPI3 = every third house, good in small communities; EPI5 = every fifth house, good in large communities). The "QTR + EPIx" method has been shown to be broadly equivalent to a simple random sample but may still be described as a "modified EPI method". If you use a sampling method that gives the equivalent of a simple random sample then the DEFF will drop. The point is to use a sampling method that captures variability rather than one that reduces variability. It is difficult and time-consuming to do simple random sampling in many settings. This is why the "QTR + EPIx" and similar methods were developed.

DEFF is a phenomenon caused by cluster sampling, and which increases the sampling error or imprecision. Households / individuals within a cluster resemble each other because of their proximity, thus resulting in an overall loss in sampling variability (this is taken from here). More precisely:

    DEFF = 1 + r(n-1)

where:
                     between cluster variance
    r = --------------------------------------------------
        between cluster variance + within cluster variance

and:
    n = mean cluster size

A value of "r" of 0.05 (e.g.) means that the elements in a cluster are about 5% more likely to have the same value than two elements chosen at random from the survey population. Small values of "r" give better reliability (precision). See this article for more details and some illustrative simulations.

If you examine the formulae given above you will see that we can reduce the DEFF by:

(1) Increasing within cluster variance. This is why a simple random sample or "QTR + EPIx" is better than a proximity sample.

(2) Reducing cluster size. This means taking more smaller clusters.
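The two levers above can be checked numerically with a minimal Python sketch of the DEFF formula. The variance values and cluster sizes are hypothetical, chosen only so that the baseline gives r = 0.05.

```python
def deff_from_variances(between_var, within_var, mean_cluster_size):
    """DEFF = 1 + r(n - 1), with r = between / (between + within)."""
    r = between_var / (between_var + within_var)
    return 1 + r * (mean_cluster_size - 1)

# Hypothetical baseline: r = 1 / (1 + 19) = 0.05, clusters of 30.
baseline = deff_from_variances(1, 19, 30)

# (1) More within-cluster variance (r falls to 0.025), same cluster size.
more_within = deff_from_variances(1, 39, 30)

# (2) Same variances, smaller clusters of 12.
smaller_clusters = deff_from_variances(1, 19, 12)

print(round(baseline, 3), round(more_within, 3), round(smaller_clusters, 3))
```

Both changes bring the DEFF down from the baseline, and they can be combined.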

When sampling in compact / urban areas, we are able to take many small clusters at little extra cost. Rather than take a sample of (e.g.) 30 clusters of 30 children by "proximity" (the old "30-by-30" design) I might go for something like 45 clusters of 12 by "QTR + EPIx" or "systematic by door counts". The latter (i.e. 45-by-12) design has a smaller overall sample because of a smaller expected design effect.
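A rough way to compare the two designs is the effective sample size, i.e. the total sample divided by the DEFF. The Python sketch below does this under assumed values of "r": 0.05 for the proximity sample and 0.03 for the 45-by-12 design. Both values are hypothetical; the lower one simply reflects the expectation that a "QTR + EPIx" or systematic sample captures more within-cluster variation.

```python
def effective_sample_size(n_clusters, cluster_size, r):
    """Total sample size divided by DEFF = 1 + r(n - 1)."""
    total = n_clusters * cluster_size
    deff = 1 + r * (cluster_size - 1)
    return total / deff

# Assumed r values, for illustration only.
old_design = effective_sample_size(30, 30, r=0.05)   # 900 children, proximity sample
new_design = effective_sample_size(45, 12, r=0.03)   # 540 children, QTR + EPIx

print(round(old_design), round(new_design))
```

Under these assumptions the 45-by-12 design gives at least as much effective information from 540 children as the 30-by-30 design gives from 900.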

I think the issue of surveying two populations is more one of interpretability (i.e. you may end up with an average that does not apply to either population and so applies to none) than precision.

I hope this is of some use.

BTW : The use of small clusters coupled with a high variability within-cluster sample is part of the proposed RAM ("Rapid Assessment Method") survey design that we hope will begin field-tests later this year.
