
Stage 3 - Sample size issue - Help

This question was posted in the Coverage assessment forum area and has 15 replies.


Géraldine LE CUZIAT

ACF

Normal user

12 Dec 2011, 09:35

Dear all,

We are currently conducting a SQUEAC investigation in two townships of the Northern Rakhine State in Myanmar. We have been doing well so far but we are now facing a problem - as it seems that we will not be able to reach the minimum target sample size (n=52) of SAM children at stage 3 in one of the 2 townships. We don't have the total number so far - should be ready by tomorrow.

We conducted the small-area surveys in 2 villages per quadrat. The initial total was 10 quadrats, but we could not access 1 quadrat (QDT) for security reasons. Another QDT was left out as the team did not find any villages within it (remote villages in the mountains).

Against this background, we thought of 2 major options :
- Option 1: we may reduce the precision (11% instead of 10) - which will decrease the sample size to 38 - which should be feasible.
- Option 2 : we may go back and sample other villages in each of the 8 quadrats to increase the number of SAM children and eventually reach 52. We still have a bit of time ahead to do it.
Maybe there are other options that should be considered and that we did not think about.
From your experience, what would you suggest as a way forward ?
What shall we do if we are unable to reach the minimum sample size ?

Thanks for your prompt support,
Best Regards
Géraldine

Ernest Guevarra

Valid International

Technical expert

12 Dec 2011, 11:16

Géraldine,

From your email, the following conditions seem to be the case:

1. You are confident of the exhaustivity of your surveys in each of the quadrats (i.e. all or almost all SAM cases have been found);
2. You are keen to keep your precision at ±10% as much as possible; and,
3. You have time and resources to still do some more villages

From my experience, more sample is always better (and ideal) especially if you have the time and resources for it. So, your option 2 will be my main suggestion. However, this is hinged on whether #1 above is true (exhaustivity is certain) as doing more villages in the setting of poor exhaustivity will be a waste of time. If there are any doubts on your part about this, then if you are to do option 2, you need to improve exhaustivity as well.

What you need to consider here then is how many more villages per quadrat you should sample to get the remaining number of cases needed to reach your minimum target of 52. Because you have done the surveys already, you are a bit more informed now of what you can expect to find per village and can use this to estimate how many more villages per quadrat to sample. You can estimate this by:


# of add. villages to sample = # of additional SAM cases to find / average # of SAM cases found per village


Remember that you always need to round up this calculation (i.e. go for the higher number of villages). In any case, you are looking at at least 1 additional village per quadrat (8 additional villages) or more to get to your minimum target sample size.
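The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `additional_villages` and the example figures (36 cases found in 16 villages) are assumptions made for the sketch, not numbers from this survey.

```python
import math


def additional_villages(target_n, cases_found, villages_surveyed, quadrats):
    """Estimate how many more villages to sample to reach the target sample size.

    Returns (total additional villages, additional villages per quadrat),
    both rounded up as recommended above.
    """
    if cases_found <= 0 or villages_surveyed <= 0 or quadrats <= 0:
        raise ValueError("need at least one case, one village and one quadrat")
    remaining = max(target_n - cases_found, 0)
    if remaining == 0:
        return 0, 0
    avg_per_village = cases_found / villages_surveyed
    # Always round up: a fraction of a village still means one more village.
    total = math.ceil(remaining / avg_per_village)
    # Spread the top-up over every quadrat, again rounding up.
    per_quadrat = math.ceil(total / quadrats)
    return total, per_quadrat


# Hypothetical figures: 36 SAM cases found so far in 16 villages, 8 quadrats.
total, per_quadrat = additional_villages(
    target_n=52, cases_found=36, villages_surveyed=16, quadrats=8
)
print(total, per_quadrat)
```

Rounding up per quadrat, rather than only in total, keeps the top-up spread over all quadrats instead of concentrating it in a few.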

In the event that you are still unable to get your target sample size, use the sample size you reached, continue with the Bayesian conjugate analysis, and report the results along with the actual precision that you achieved. A precision of up to ±15% is still acceptable. It is always more acceptable to report your achieved precision after the survey than to change the premises/foundations of your sampling plan to fit/accommodate the sample size you have reached.

I hope this helps.

Mark Myatt

Consultant Epidemiologist

Frequent user

12 Dec 2011, 12:34

First ... I think that there is a confusion regarding SQUEAC terminology. A "small-area survey" is a survey used to test the hypothesis that coverage in an identified area that is (much) smaller than the entire program area is either above or below an acceptable level. The type of survey used in stage 3 of a SQUEAC investigation is a wide-area survey in the sense that it is a survey representing the entire program area.

Now ... back to your questions ...

I agree with what Ernest has to say. I have a few things to add ...

Option 1 : In terms of precision 11% rather than 10% is not a big difference. I would not be too concerned if (e.g.) a SQUEAC investigation returned a 95% CI of +/- 12% or +/- 13%. That said, 10% is the standard precision for coverage surveys of child survival interventions (after the EPI survey method).

Option 2 : You could go back if you have time and resources to do so. You could, instead, spend the available time working out how to best reform the program given all the SQUEAC findings available to you. It all depends on how precise you want the estimate to be and how you think your time and resources will be best used.

Note : If you decide to go back then you should go back to all the quadrats. Don't just sample until you get n = 52. This avoids a selection bias creeping in as the temptation is to choose the easier-to-access quadrats. A random top-up sample (i.e. sample at random from the quadrats until you get your sample size) will probably not work well with such a small number of quadrats and the likely small size of the required top-up.

If you fail to reach the minimum sample size you should check that you are not violating method assumptions:

(A) Check that the posterior alpha and the posterior beta are both greater than or equal to 10.

(B) Check that the sum

     prior alpha + prior beta + survey sample size 
is greater than or equal to 30.

These rules only apply if you analyse the survey data by hand using the methods described in the SQUEAC handbook. If you are using BayesSQUEAC to analyse your data then these rules do not apply since BayesSQUEAC uses a different analytical process.
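For analysis by hand, checks (A) and (B) can be sketched as below. The sketch assumes the usual beta-binomial conjugate update (posterior alpha = prior alpha + covered cases; posterior beta = prior beta + uncovered cases). The function name and the example prior are hypothetical and are not taken from the handbook.

```python
def check_conjugate_assumptions(prior_alpha, prior_beta, covered, sample_size):
    """Apply checks (A) and (B) to a hand-calculated conjugate analysis.

    Assumes the standard beta-binomial update for the posterior parameters.
    """
    if not 0 <= covered <= sample_size:
        raise ValueError("covered cases must be between 0 and the sample size")
    posterior_alpha = prior_alpha + covered
    posterior_beta = prior_beta + (sample_size - covered)
    return {
        "posterior_alpha": posterior_alpha,
        "posterior_beta": posterior_beta,
        # (A) both posterior shape parameters are at least 10
        "rule_a": posterior_alpha >= 10 and posterior_beta >= 10,
        # (B) prior alpha + prior beta + survey sample size is at least 30
        "rule_b": prior_alpha + prior_beta + sample_size >= 30,
    }


# Hypothetical example: prior Beta(12, 14), 15 covered cases out of n = 38.
result = check_conjugate_assumptions(12, 14, covered=15, sample_size=38)
print(result)
```

As noted above, these checks matter only for the hand calculation, not for an analysis done in BayesSQUEAC.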

More information can be found in main SQUEAC section and the technical appendix of the draft SQUEAC handbook.

I hope this helps.

Géraldine LE CUZIAT

ACF

Normal user

12 Dec 2011, 13:04

Many thanks for your replies - this helps a lot. We just got the final results from the field team and we surprisingly found more SAM cases in the last two quadrats over the past three days. We now have 82 cases.
Anyway, thanks for your detailed and very much appreciated support.

I have one remaining question regarding SQUEAC methodology. We are planning to redo the exercise in 6 months' time and follow the same methodology. I don't have much experience in SQUEAC, but I was wondering whether we would introduce any bias/flaws if we perform the wide-area survey (thanks Mark for correcting) the same way as we did. This means that we would go to the same villages in the same quadrats.
Any thoughts ?

Best regards,
Géraldine

Mark Myatt

Consultant Epidemiologist

Frequent user

12 Dec 2011, 15:00

Very good question!

Let me pontificate ...

It is an "observer effect". It is worthwhile thinking through this effect. Prevalence surveys (e.g.) rarely have much of an observer effect unless they adhere to the "no service - no survey" principle and provide a service to the cases found.

Coverage surveys following CSAS, SLEAC, SQUEAC, or S3M methods adhere to the "no service - no survey" principle. We find SAM cases and, if uncovered, refer them and inform outreach services / community-based volunteers. If we find complicated cases requiring inpatient care then we might transport them to the stabilisation centre directly. The effect of a coverage survey is, therefore, to immediately improve coverage. This is not a flaw. It is a desirable consequence of ethical behaviour (if only all ethical questions were this simple!).

Experiences from the early CSAS surveys demonstrated this effect but they also demonstrated two other effects :

(1) Coverage was improved as programs were reformed to take into account the barriers data collected by the coverage surveys. Improving coverage by reforming programs to alleviate identified problems is usually why we do coverage surveys.

(2) Recruiting and successfully treating cases from surveyed communities (in-program mortality is low and many cases are cured quickly because we find cases before they are very severe) led to the circulation of good opinions of the program and to sustained improved coverage in those communities. This is the "success breeds success" principle of CTC programming.

The observer effect can, with coverage surveys, have this double effect. This is, I think, very much a good thing. It means that coverage surveys have, in themselves, a sustained positive effect on coverage. I find it difficult to see this as a flaw or a bias.

Addressing your question directly ... You are correct. We do expect to see sustained improvements to coverage in the surveyed communities (i.e. unless we have a really bad program). If we go back to the same communities we will see this. The "flaw" or "bias" is that this might lead subsequent surveys to overestimate overall coverage. It also allows the indicator to be "gamed" (e.g. the program could just concentrate on a handful of villages and look great).

We need to avoid the bias due to the observer effect and minimise "gaming" opportunities. The obvious solution is to exclude previously sampled communities from the sample. This is not a good solution because (1) it excludes communities that have benefited from previous surveys and will bias the coverage estimate down, and (2) we'll pretty soon run out of communities to sample. A "nearest neighbour" approach of taking the near neighbours of already sampled communities potentially suffers from a "contamination effect" if the observer effect also has some effect on neighbouring communities. What we need to do is take a sample that is reasonably independent of previous samples. There are a couple of approaches that we could use:

(1) We stick with CSAS but move the grid a bit and / or alter the size of the grid a little.

(2) We use eccentric systematic area sampling. Here we take (e.g.) two communities at random from each quadrat. It is called "eccentric" because there is no requirement to select communities from the centre of the quadrat. Another name for this type of sample is stratified random sampling (the quadrats are the strata and we select communities at random from within each stratum).

You may be able to think of other approaches (if anyone does then PLEASE POST THEM HERE ... even if they are just ideas).

You can do (1) or (2) the next time you do a likelihood survey.
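Approach (2) amounts to a stratified random draw, which can be sketched as follows. This is a minimal sketch that assumes a list of communities is available for each quadrat. The function name, quadrat labels and village names are all hypothetical.

```python
import random


def eccentric_sample(villages_by_quadrat, per_quadrat=2, seed=None):
    """Draw communities at random from within each quadrat (stratum).

    villages_by_quadrat maps a quadrat label to the list of communities
    located inside that quadrat.
    """
    rng = random.Random(seed)
    sample = {}
    for quadrat, villages in villages_by_quadrat.items():
        # A quadrat with fewer communities than requested contributes all of them.
        k = min(per_quadrat, len(villages))
        sample[quadrat] = rng.sample(list(villages), k)
    return sample


# Hypothetical village lists for three quadrats.
villages = {
    "Q1": ["Village A", "Village B", "Village C", "Village D"],
    "Q2": ["Village E", "Village F", "Village G"],
    "Q3": ["Village H"],
}
print(eccentric_sample(villages, per_quadrat=2, seed=2011))
```

Using a different seed (or none) in each survey round gives a draw that is independent of earlier rounds, so previously sampled communities are neither forced in nor excluded.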

BTW : A good argument in favour of MUAC case-definitions is that it allows SMART type surveys to adhere to the "no service - no survey" principle (i.e. we identify cases at measurement rather than after data-entry) making these types of survey more ethical than they usually are.

I hope this helps.

Saul Guerrero

Director of Nutrition

Technical expert

12 Dec 2011, 16:41

Hi All,

I agree - great question. A couple of thoughts on the alternatives mentioned:

1. Moving the grid: this is nominally a good idea, as it means that we can adapt each wide-area survey to changes that may occur from one investigation to another (e.g. need for finer quadrats to explore spatial variations, more/less resources including time, etc) rather than adopting a standardised approach. I guess the challenge is that the grid is designed in a way that optimises inclusion of programme areas, and any changes could negatively affect this inclusivity or require additional resources (e.g. days) to cater for the new shape of the grid. I would be concerned that users would spend too much time trying to make sure that the grid is “different” from previous ones, rather than, let’s say, coming up with a good Prior, or collecting qualitative data.

2. Eccentric Sampling: strikes me as a good alternative, but what about the rationale for originally choosing to sample the village closest to the centre of the quadrat? Would we lose anything if we change this, or can eccentric sampling (because it uses the quadrats as strata) still yield the same spatial representation? If so, I would be in favour of randomised sampling within quadrats, but I think we would need to provide some guidance on which alternative to use (e.g. first time use village closest to the centre of the quadrat, from then on randomised sampling). I think this would help minimise confusion (for first timers) whilst giving those looking long-term a solution to avoid feeling like they work with “sentinel sites” (for all the reasons already mentioned).

Just a thought.

Best of luck to you all

Saul

Mark Myatt

Consultant Epidemiologist

Frequent user

12 Dec 2011, 21:12

I think you have it right Saul. The eccentric approach will yield a reasonably even spatial sample and allow for repeated surveys. I think that it might be best to always use this approach. No problem if you've done CSAS in round one. You can use the eccentric approach in subsequent rounds. I think that I should put this in the handbook. What do you think?

Saul Guerrero

Director of Nutrition

Technical expert

12 Dec 2011, 21:29

I think we should certainly include it. When I wrote my last post I initially proposed that we adopt eccentric sampling for ALL surveys (and not just ones being repeated) as this would minimise confusion. So yes, I'm all in favour of that.

Ernest Guevarra

Valid International

Technical expert

13 Dec 2011, 09:43

Just 2 comments on the use of the CSAS approach for the stage 3 of SQUEAC and for subsequent SQUEAC surveys.

1. The CSAS approach is just one way of doing the sampling/selection of villages for stage 3 of SQUEAC. The other way, which is also described in the SQUEAC handbook, is the use of a list of villages/communities/locations stratified either by defined administrative units or by other meaningful stratifications (e.g. clinic catchment areas). This is described in the case study of the national coverage survey for Sierra Leone:

http://www.brixtonhealth.com/handbookSQUEAC/nationalSLEAC.pdf

This method can be repeated in future SQUEACs without worrying about the issue that Géraldine brought up yesterday of sampling the same villages when a CSAS approach is used without modification in subsequent SQUEACs. Another advantage is that this does not require maps and even in the case of lists not being complete or not available, approaches can be used to come up with these lists as shown in this case study:

http://www.brixtonhealth.com/handbookSQUEAC/noMapNoList.pdf

Such a method is comparable to a CSAS approach as it also gives an even or near-even spatial spread of sampled villages/communities/locations.

I have a personal preference for this method in stage 3 sampling because of the points I raise above. From a practical perspective and from what I've experienced, it just is a more "natural" way for most people particularly ministries of health staff to organise or plan around sampling.

However, the common excuse of not having appropriate maps is becoming more and more irrelevant with the advent of Google Earth® and other open source mapping utilities that are improving quite rapidly. Satellite imagery provided by these products makes appropriate maps relatively more accessible than before. Also, in-country capacity for producing detailed maps has increased with more open source software available for this purpose. So, the CSAS approach is more viable than ever.

2. I don't think we need to prescribe an eccentric sampling approach right from the start. Practitioners who are doing their first SQUEAC can still go by the book and apply CSAS approach as described. Future SQUEACs can then use modifications of the original CSAS grids used and apply eccentric sampling approaches that will allow other villages/locations/communities within the quadrats to have an equal chance of being sampled.

The earlier suggestion by Mark and Saul to use the CSAS quadrats as "strata" sort of uses the principles of the list approach I mention above, in which the strata are defined by the grids and from here stratified random sampling or stratified systematic sampling can be done. The potential issue/challenge that may arise with this is ensuring that one has a complete list of villages/locations that are enclosed within the quadrat. Theoretically, this will be easy to do if one has very good detailed maps. However, this is not always the case and bias can be introduced if the list that you have for villages within a quadrat is not complete. Also, it is much harder to apply the no list approach described in the case study above because we will have to be asking for information about inclusion of villages in areas that are defined in a map rather than by what people or informants already know or have defined themselves. I say theoretically here because, practically, this might be a non-issue, but I would think that the scenario I describe here can feasibly arise.

An alternative to this might be that using the base grid used for the initial SQUEAC stage 3 sampling, each quadrat can be further divided into smaller quadrats creating a grid within a grid. These smaller grids within the bigger grids can then be randomly chosen depending on the number of villages required per bigger quadrat (with each smaller quadrat representing 1 village). Once the smaller quadrats have been chosen, the centroid of these quadrats can be located and the closest village to the centroid sampled.
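One way to picture the grid-within-a-grid idea is sketched below. The sketch assumes that village coordinates inside the quadrat are known and are on the same planar scale as the quadrat bounds. The function name, the 3 x 3 subdivision and the coordinates are all hypothetical; this is not the procedure used in the large-scale survey mentioned below.

```python
import math
import random


def grid_within_grid(quadrat, villages, n_villages, splits=3, seed=None):
    """Pick villages by randomly choosing sub-quadrats of one larger quadrat.

    quadrat  -- (xmin, ymin, xmax, ymax) bounds of the larger quadrat
    villages -- dict mapping village name to its (x, y) position
    Each chosen sub-quadrat contributes the not-yet-selected village that is
    closest to its centroid. n_villages must not exceed splits * splits.
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = quadrat
    width = (xmax - xmin) / splits
    height = (ymax - ymin) / splits
    cells = [(i, j) for i in range(splits) for j in range(splits)]
    selected = []
    for i, j in rng.sample(cells, n_villages):
        centroid = (xmin + (i + 0.5) * width, ymin + (j + 0.5) * height)
        candidates = [name for name in villages if name not in selected]
        if not candidates:
            break  # fewer villages in the quadrat than requested
        nearest = min(candidates, key=lambda name: math.dist(villages[name], centroid))
        selected.append(nearest)
    return selected


# Hypothetical quadrat of 10 x 10 map units containing four villages.
villages = {"A": (1, 1), "B": (5, 5), "C": (9, 9), "D": (2, 8)}
print(grid_within_grid((0, 0, 10, 10), villages, n_villages=2, splits=3, seed=1))
```

Excluding already selected villages from later draws is a design choice made for this sketch, so that two sub-quadrats never return the same village.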

We are hopefully going to implement something similar to this in a large-scale survey we are currently implementing and would present this as a case study to the coverage assessment community once we have some indication of how it will practically work.

Mark Myatt

Consultant Epidemiologist

Frequent user

13 Dec 2011, 12:43

Just to illustrate Ernest's comment on using Google Earth ... I have used CSAS in a setting where maps and lists were incomplete or unreliable due to population displacement (settled refugees, internally displaced due to loss of land to minefields, labour migration). We did this using Google Earth to locate villages and then used key informants (local MoH staff, local NGO staff, village health workers, village shop workers, &c.) to identify (i.e. name) them. You can see the resulting map here (the filled circles represent likely clinic catchments).

I think it a worthwhile exercise to try this before having to do it in the field since it can take quite a while to understand what you are seeing in satellite imagery. I'd do this in an area that I was familiar with. The setting discussed above is a flat and arid area and this made it relatively easy to spot roads and tracks which helped locate villages. The task may not be so easy in (e.g.) heavily wooded or mountainous areas. Another issue to consider is that accessing satellite images on-line requires quite high data bandwidths. This (i.e. making maps from satellite images) may be a role for the sort of remote support for SQUEAC that ACF has been working on.

You can find a case-study on using satellite imagery for sampling in urban settings here.

A question ... Do you think that we need to add something to the SQUEAC handbook describing the eccentric approach?

Ernest Guevarra

Valid International

Technical expert

13 Dec 2011, 13:14

Yes, I think we should. We have quite a bit of material on this to work with already. Let's see what we come up with in the coming days.

Géraldine LE CUZIAT

ACF

Normal user

13 Dec 2011, 17:52

Well, thanks for the stimulating brainstorming !
From my own experience in a few countries, the internet connection is clearly a barrier to accessing maps online and/or Google Earth. I would suggest adding the different options that have been mentioned/debated to the handbook and letting people decide which is the most feasible/best solution.
Thanks again,
Géraldine

Mark Myatt

Consultant Epidemiologist

Frequent user

14 Dec 2011, 10:41

Just having an internet connection can be a problem. A more common problem might be not having an internet connection of sufficient bandwidth to support exploration of satellite imagery using (e.g.) Google Earth. In that case there may be a role for remote support (I understand that ACF are working on this). I wonder if it might be a good idea to try this very soon. The map could be created remotely and sent to the field. A map in a vector format would be a small file. It may even be possible to send a form of coded drawing instructions in a plain text email or SMS text message. I have made an example here. It is easy to make a map this way with squared paper or some very simple software. Just thinking aloud.

Anyway ... I will add something to the handbook and post here when I have something ready.

Saul Guerrero

Director of Nutrition

Technical expert

14 Dec 2011, 11:18

Hi Mark

A couple of thoughts on the issue of mapping and the potential role of support actors (e.g. ACF) working remotely.

I agree that the availability of maps of programme areas is indeed limited in many places. I also agree with you and Géraldine that internet bandwidth is a major concern, especially once you move away from capital cities. I think that we could indeed start working on an approach that would enable field programmes to circumvent these barriers by working with people based elsewhere. We could indeed offer this kind of support, but I think we should think about how best to offer a systematic and consistent service, rather than on an ad hoc basis. In Haiti, I know that the humanitarian sector was able to link up with groups of people based in the US and Europe, to develop specific maps almost on a real-time basis to enable teams on the ground to do their activities (search and rescue in particular). We could think about something similar, in which we could engage with people who are into this kind of thing (mapping) and link them up with programmes in need of this kind of support. ACF could certainly use its relationship with the public to draw some support for it.

I also think that we need to take this discussion about mapping a step further. One thing that we have found over the last 18 months of doing SQUEAC almost on a monthly basis around the world is how much of a difference it makes to be able to visualise key bits of information (admissions and defaulters by village in particular) on a map, and how difficult it is to do this with existing (mostly MS Windows-based) software. We have tried a number of things (using Photoshop/Illustrator, for instance) to place this kind of information on existing (.pdf) maps, and the results have been good. The problem is that software like Photoshop/Illustrator is expensive and therefore widely unavailable, and it does require some degree of practice in order to make the most of it. We often talk about how great it would be to have much simpler software that we could use to generate simple maps of our programme areas, and that would enable programmes not only to INPUT data (per village) more easily the first time around, but also to UPDATE said data more easily. This way, the real reach of the programme could be visualised more easily and more regularly, something which we are pretty sure would lead to more proactive thinking about ways to improve coverage.

Perhaps we could open a different post to explore these issues of mapping further, and get some feedback from people in the field about what they want/think?

Best

S

Mark Myatt

Consultant Epidemiologist

Frequent user

14 Dec 2011, 12:24

I think that you are right about providing a systematic and consistent service. As regards mapping, there are NGOs that do this (e.g. MapAction). These NGOs seem to be more involved in natural emergencies and with UNOs such as OCHA. I've been around the block a few times and have not seen maps from such NGOs in use in the field. Perhaps that is our fault and we need only reach out. Perhaps all that is needed is to get / make maps or a CD of data (e.g. a high resolution satellite image of the program area) before surveying and take that to the field with us.

I agree that mapping of data is very useful for SQUEAC investigations. SQUEAC does have list-based, tabular, and graphic methods for working with spatial data but having a map is still very useful.

Good open-source software is available. All the maps in the SQUEAC handbook were produced using OpenOffice. Programs like InkScape and GIMP provide useable alternatives to programs such as Adobe Illustrator and Adobe Photoshop. Open-source GIS programs are available but, like all GIS software, can be difficult to use. EpiInfo has a mapping module. Perhaps we need to make an effort to review these bearing in mind that basic boundary files can be very expensive.

I agree that a new post is the best way to carry this forward.

Mark Myatt

Consultant Epidemiologist

Frequent user

14 Dec 2011, 13:38

Updated handbook material can be downloaded here.
