# S3M

This question was posted in the Coverage assessment forum area and has 5 replies.

### Anonymous 618

Normal user

12 Oct 2012, 13:38

### Mark Myatt

Consultant Epidemiologist

Frequent user

12 Oct 2012, 15:10

### Anonymous 2251

Normal user

9 Apr 2013, 12:57

### Mark Myatt

Consultant Epidemiologist

Frequent user

10 Apr 2013, 14:09

**Overview:** The S3M is a cluster-sampled survey method (just like SMART) but with clusters selected using a spatial sampling method rather than PPS (as is often done with SMART). Estimates and classifications are made for the overall survey area and for much smaller areas within it. The PPS sample is problematic as it tends (by design) to concentrate the sample in the most populous communities. PPS is also problematic if accurate population data are not available, which is typically the case in emergency contexts.

**Overall estimates:** PPS is a *prior* weighting scheme. S3M uses a *posterior* weighting scheme, as might be used with a stratified sample taking a fixed quota in each stratum. This allows S3M to produce overall estimates in which more populous communities receive higher weighting than less populous communities. The difference between the two approaches is that PPS weights before sampling and S3M weights after sampling. The issue of loss of sampling variation (design effect) is addressed in the within-PSU sample design (see below) and by using appropriate data-analysis techniques. Data from S3M surveys can be analysed using model-based approaches (e.g. the "svy" commands in STATA or the Complex Samples module in SPSS). We use *blocked and weighted bootstrap estimators* because these allow more flexibility than classical methods in terms of the statistics that can be used (e.g. an exact 95% CI on a median, used in analysing HDDS, is extremely difficult with model-based approaches but trivial with the bootstrap). The sample size used for overall estimates is usually forced upon us by the need to make more local estimates and classifications. This means that we tend to have larger overall sample sizes (and better precision) than a SMART-type survey.
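To make the blocked and weighted bootstrap concrete, here is a minimal sketch in Python. It is an illustration of the general idea, not the S3M analysis code: whole PSUs (blocks) are resampled with replacement, with selection probability proportional to population ("roulette wheel"), and the statistic is computed on each pooled replicate. All data and population figures are hypothetical.

```python
import random
import statistics

def blocked_weighted_bootstrap(psu_data, psu_pops, stat, reps=1999, seed=42):
    """Blocked, weighted bootstrap: resample whole PSUs with replacement,
    with probability proportional to PSU population, then compute the
    statistic on the pooled replicate. Returns a point estimate and a
    percentile 95% CI."""
    rng = random.Random(seed)
    k = len(psu_data)
    replicates = []
    for _ in range(reps):
        # "Roulette wheel" selection of k whole blocks
        blocks = rng.choices(psu_data, weights=psu_pops, k=k)
        pooled = [x for block in blocks for x in block]
        replicates.append(stat(pooled))
    replicates.sort()
    return (statistics.median(replicates),
            (replicates[int(0.025 * reps)], replicates[int(0.975 * reps)]))

# Hypothetical HDDS scores from three PSUs and their village populations
psu_data = [[4, 5, 6, 5, 7, 6], [3, 4, 4, 5, 3, 4], [6, 7, 8, 6, 7, 7]]
psu_pops = [1200, 400, 800]
est, (lo, hi) = blocked_weighted_bootstrap(psu_data, psu_pops, statistics.median)
```

Note how a CI on a median falls out with no distributional assumptions; the same machinery works unchanged for any statistic passed as `stat`.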

**Within-PSU sampling:** The map-segment-sample (MSS) technique produces a sample that is much closer to a simple random sample than the proximity sample used by SMART. This leads to lower design effects (better precision) and less bias. Work done during the development and testing of the EPI survey method indicates that SMART-type proximity samples are not appropriate for variables showing a centre-to-edge gradient or within-community clustering. These include education, pregnancy status, variables relating to child-care, epidemic diseases, socio-economic factors, and variables related to health care. This means that the SMART sample is a poor choice for many indicators. There is nothing to stop anyone using MSS as a within-PSU sampling strategy with SMART.
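A rough sketch of the map-segment-sample idea, under the assumption (mine, not a specification of the field protocol) that the community is mapped and divided into segments, one segment is picked with probability proportional to its household count, and households within it are then sampled at random:

```python
import random

def map_segment_sample(segment_households, n_sample, seed=7):
    """Sketch of an MSS-style within-PSU sample. segment_households is a
    list of household counts per map segment. A segment is chosen PPS on
    household count, then households in it are drawn by simple random
    sampling. Illustrative only; the actual MSS protocol differs in detail."""
    rng = random.Random(seed)
    seg = rng.choices(range(len(segment_households)),
                      weights=segment_households, k=1)[0]
    households = list(range(segment_households[seg]))
    return seg, rng.sample(households, min(n_sample, len(households)))

# Three hypothetical map segments with 40, 25, and 35 households
seg, sample = map_segment_sample([40, 25, 35], 12)
```

The point of the segmentation step is that every household, not just those near the centre, has a known chance of selection, which is what brings the sample closer to a simple random sample.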

**Local estimates / classifications:** A key reason to use S3M is to exploit the spatial sample to map indicator values at local levels. The key issue here is sample size. A local area is represented by three PSUs, which include data from between one and six neighbouring communities (the exact number is defined by average village size and the target population). Random (sampling) variation is damped by the data smoothing that arises from the sample design. We have developed and tested a number of small-sample indicators. These are indicators designed to perform well with small sample sizes (by "small" we mean *n* = 60 to *n* = 96 in total from three PSUs). These are sometimes standard indicators (e.g. FANTA's HDDS). Sometimes they are adaptations of MICS and DHS indicators (e.g. JMP's WASH indicator set, the S3M/RAM IYCF indicator set). Sometimes they are entirely new indicators (e.g. PROBIT for GAM). We now have indicators that cover most applications. Many of these are standard indicators. In May 2013 we plan to pilot a revised IM method for mortality (CMR) estimation. Where we cannot estimate with useful precision we use sequential sampling classifiers with useful accuracy and reliability (as we often do with SQUEAC and SLEAC) with sample sizes of *n* < 50 in total from three PSUs. This does not really answer your question. Perhaps a look at the pedigree of the components might ...
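The sequential sampling classifiers can be illustrated with a small simulation. This is a generic LQAS-type two-class sketch, with illustrative thresholds and prevalences (not S3M's actual values): draw *n* subjects without replacement from a finite population and classify "high" when more than *d* cases are found, then estimate the misclassification rates by repeated simulated sampling.

```python
import random

def classifier_error_rates(pop_size, n, d, prev_low, prev_high,
                           sims=2000, seed=3):
    """Simulate an LQAS-type classifier sampling n subjects without
    replacement from a finite population of pop_size; classify "high"
    when more than d cases are found. Returns the misclassification
    probabilities at the low and high trigger prevalences."""
    rng = random.Random(seed)

    def misclass_rate(prev, should_be_high):
        cases = int(round(prev * pop_size))
        frame = [1] * cases + [0] * (pop_size - cases)
        errors = 0
        for _ in range(sims):
            found = sum(rng.sample(frame, n))  # draw without replacement
            errors += ((found > d) != should_be_high)
        return errors / sims

    return misclass_rate(prev_low, False), misclass_rate(prev_high, True)

# Hypothetical: n = 40 from ~600 children, decision threshold d = 4,
# triggers at 5% and 20% prevalence
alpha, beta = classifier_error_rates(pop_size=600, n=40, d=4,
                                     prev_low=0.05, prev_high=0.20)
```

Finding a sample size then amounts to increasing `n` (and tuning `d`) until both error rates fall below the accuracy you need, which matches the "computer-based simulations of sampling from finite populations" described below.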

**Sampling on a hexagonal grid:** This is a modification of the standard CSAS / quadrat sampling method to improve the evenness of sampling. It has been used in the ASEAN PONJA assessments and in a number of S3M surveys in Niger, Sudan, and Ethiopia with good results.
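The geometric construction behind a hexagonal sampling grid is simple: rows of points spaced *d* apart, with rows *d*·√3⁄2 apart and alternate rows offset by *d*⁄2, so every point is equidistant from its six nearest neighbours. A generic sketch (not the S3M field procedure itself):

```python
import math

def hex_grid(xmin, xmax, ymin, ymax, d):
    """Generate sampling points on a hexagonal (triangular) lattice with
    spacing d covering the given bounding box. Alternate rows are offset
    by d/2 so nearest-neighbour distances are all equal to d."""
    points = []
    row_h = d * math.sqrt(3) / 2  # vertical distance between rows
    row = 0
    y = ymin
    while y <= ymax:
        x = xmin + (d / 2 if row % 2 else 0)  # offset odd rows
        while x <= xmax:
            points.append((x, y))
            x += d
        y += row_h
        row += 1
    return points

pts = hex_grid(0, 10, 0, 10, 2.0)
```

Compared with a square quadrat grid, this equal-spacing property is what gives the more even spatial coverage.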

**Use of spatial tessellation techniques:** Voronoi polygonisation is a common technique dating back three or four hundred years. Its first epidemiological use was in Snow's groundbreaking work on cholera. The use of triangulated irregular networks is a common technique in geostatistics and spatial epidemiology.
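By definition, a Voronoi tessellation assigns each point to the cell of its nearest seed, so the partition can be sketched without any polygon geometry (libraries such as scipy.spatial.Voronoi build the polygons explicitly). The coordinates below are invented for illustration; in Snow's cholera example the seeds would be water pumps.

```python
import math

def voronoi_assign(points, seeds):
    """Assign each point to the Voronoi cell of its nearest seed.
    This nearest-seed rule defines the same partition of space that
    explicit Voronoi polygon construction produces."""
    return [min(range(len(seeds)), key=lambda i: math.dist(p, seeds[i]))
            for p in points]

seeds = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]       # e.g. sampling locations
households = [(1.0, 1.0), (9.0, 1.0), (5.0, 7.0)]   # hypothetical points
cells = voronoi_assign(households, seeds)            # -> [0, 1, 2]
```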

**Posterior weighting** is a standard survey technique. It is described in all basic textbooks on survey design.
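The arithmetic of posterior weighting is just a population-weighted combination applied after sampling: each PSU contributes a fixed quota to the sample, and its estimate is weighted by its population when combining. A minimal sketch with invented figures:

```python
def posterior_weighted_mean(psu_estimates, psu_pops):
    """Combine equal-quota PSU estimates using population weights applied
    after sampling (posterior weighting), rather than weighting selection
    probabilities beforehand as PPS does."""
    total = sum(psu_pops)
    return sum(est * pop / total for est, pop in zip(psu_estimates, psu_pops))

# Three equal-quota PSUs with unequal (hypothetical) populations:
# the 1200-person PSU's 10% counts for three times the 400-person PSU's 30%
overall = posterior_weighted_mean([0.10, 0.30, 0.20], [1200, 400, 800])
```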

**Bootstrapping** is a modern (i.e. non-classical) statistical technique. We have extended the general approach to the analysis of data from cluster samples by borrowing techniques (i.e. blocking) from time-series analysis. Weighting is achieved by a standard "roulette wheel" algorithm. We tested our approach using the same data (Niger IYCF data) analysed by us using the bootstrap and by CDC using SAS. Results agreed to 4 decimal places.

The **MSS within-PSU sampling method** is taken from the literature associated with the testing and development of the EPI survey design method. MSS was used and tested in ASTRA trachoma surveys.

**Sequential sampling classifiers** use standard methods. Sample sizes are found using computer-based simulations of sampling from finite populations.

The small-sample **IYCF indicator set** is developed from DHS indicators.

The **PROBIT indicator** was developed on the recommendation of a WHO expert committee. The method has been tested and published. We now use an improved estimator. This technique is currently being tested on another database of SMART surveys by CDC.

The **IM method** was developed and tested by FANTA and LSHTM. We improve case-finding sensitivity by using multiple informants. We are concerned about a bias due to small numbers, but the bias will be consistent and so allow mapping / identification of mortality hotspots.

S3M is (like many survey methods) a combination of well understood and tested components. The proof of the pudding is in the eating ... (1) You are welcome to join us as an observer in any of our upcoming S3M surveys. (2) You are welcome to contact any of our partners to discuss their experiences with the method. You could try one yourself.
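For readers unfamiliar with the PROBIT approach mentioned above, the basic idea can be sketched briefly. Instead of counting cases directly, a normal distribution is fitted to the sample's weight-for-height z-scores and the GAM prevalence is read off as the proportion below the case-defining cut-off. This is only the basic form; the improved estimator referred to above is a refinement of it, and the WHZ figures here are invented for illustration.

```python
from statistics import NormalDist

def probit_gam(whz_mean, whz_sd, cutoff=-2.0):
    """Basic PROBIT-type prevalence estimator: assume WHZ is approximately
    normally distributed and return the fitted probability of falling
    below the case-defining cut-off (GAM at WHZ < -2)."""
    return NormalDist(whz_mean, whz_sd).cdf(cutoff)

# Hypothetical sample: mean WHZ = -0.85, SD = 1.1
gam = probit_gam(-0.85, 1.1)  # roughly 15% estimated GAM prevalence
```

Because the estimate uses the whole distribution rather than the handful of children below the cut-off, it is far less noisy at the small sample sizes (n = 60 to 96) used for local estimates.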

**BTW:** I am one of the developers of S3M so I am probably not the best person to ask for an objective view. I am happy to meet with your team or with statisticians of your choosing to discuss this work. I have not really thought that the validity of the overall method was an issue ... I am more concerned with practicability and the validity / utility of specific indicators.

### Bradley A. Woodruff

Self-employed

Frequent user

11 Apr 2013, 01:24

### Mark Myatt

Consultant Epidemiologist

Frequent user

11 Apr 2013, 10:06

If you have any problem posting a response, please contact the moderator at post@en-net.org.