# SMART surveys and interpretation of quality

This question was posted in the Assessment and Surveillance forum area and has 20 replies.

### Josek

Epidem. Afgan

Normal user

25 Jun 2013, 13:17

### James lual

Consultant,surveys

Normal user

25 Jun 2013, 15:26


### James lual

Consultant,surveys

Normal user

25 Jun 2013, 19:38

### Otieno K Musumba

M & E , IMC Kenya

Normal user

25 Jun 2013, 21:39

### Hamid Hussien

Nutrition Specialist Concern WW

Normal user

26 Jun 2013, 09:16

### Anonymous 81

Public Health Nutritionist

Normal user

26 Jun 2013, 10:07

### Mark Myatt

Frequent user

26 Jun 2013, 10:21

**Checks for digit preference** are useful: If there is consistent rounding up or rounding down then this is likely to introduce bias. Checks for digit preference can identify whether this may be a problem. It is not a definitive test, as you would still get digit preference if proper rounding were taking place (which would introduce little or no bias).

**Enumerator performance tests** are useful: We can see how good each enumerator is in terms of accuracy and reliability. We can use this to pick survey staff or select survey supervisors and to decide if more training or remedial training is needed.

Some of it is (IMO) just plain daft. The normality testing makes a very grand assumption that will almost always be untrue. We risk condemning a survey because an unreasonable assumption does not hold. The result is that we further marginalise the most marginal and at-risk populations, which will be responsible for the fat left-hand tail of the distribution (which prompts rejection of the survey). What SMART condemns as bad data is very likely useful and accurate data. See this post.

The worst case is that we have survey staff adding random elements to data so as to "avoid" age-heaping, and censoring accurate data to make a survey fit with one or other fatuous assumptions about how data should behave.

In response to (2) ... In many surveys that we do there are a few different sample sizes:

**The number of clusters (m):** We like this to be as **large** as is practicable.

**The number of children per cluster (n):** We like this to be as **small** as is practicable.

It is generally true that a bigger 'm' and a smaller 'n' reduce the design effect associated with a survey. We usually compromise on the size of 'm' and try to have as few clusters as possible. This keeps costs down. We usually keep this to about m = 30 if we are using proximity sampling to select households with eligible children. Work on this (looking at the "30-by-30" surveys that preceded SMART) suggests that a useful minimum of m = 25 clusters is probably OK.

We also have an **overall sample size**, which is the product of 'm' and 'n'. This is what we usually calculate when doing a sample size calculation, and we then work back to 'm' and 'n'.

The proportion of children aged 6-59 months can affect both 'm' and 'n'. If this is small and villages are small then there may only be (e.g.) N = 15 eligible children in most villages. This is often the case in pastoralist settings. In these settings it is not reasonable to have n = 25 because we cannot sample n = 25 children from N = 15 children. In this case we might have n = 10 (or n = 11, 12, 13, 14, or 15). Since 'n' is now small we will need to increase 'm' to meet our overall sample size requirement. In this case both 'm' and 'n' are influenced by population size.

Note that with a large 'm' and a small 'n' (and particularly when 'n' is close to taking a census from clusters) we will have a small design effect. This can lead to some savings. Here is an illustrative example:

```
START WITH:
guess at prevalence = 10%
required precision = 3%
overall sample size (simple random sample) = 384
guess at design effect = 2.0
overall sample size (cluster sample) = 384 * 2.0 = 768
number of clusters = 25
within-cluster sample size = 31
BUT :
average village population = 90
proportion aged 6 - 59 months = 17%
average village population aged 6 - 59 months = 90 * 0.17 = 15
SO :
New design effect = 1.25 (we have more small clusters)
New sample size = 384 * 1.25 = 480
within-cluster sample size = 15
number of clusters = 480 / 15 = 32
```
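
The arithmetic in this example can be reproduced with a short R sketch. The post does not state how the simple random sample size of 384 was obtained; the sketch assumes the usual normal-approximation formula n = z² p (1 - p) / d² with z = 1.96, which gives the same figure. The function name `srs_sample_size` and the variable names are illustrative only, and the design effects (2.0 and 1.25) are the guesses used in the example, not calculated values.

```r
# Assumed formula (not given in the post): sample size for estimating a
# proportion p to within +/- d under simple random sampling
srs_sample_size <- function(p, d, z = 1.96) {
  (z^2 * p * (1 - p)) / d^2
}

n_srs <- round(srs_sample_size(p = 0.10, d = 0.03))

# First plan: guessed design effect of 2.0 and 25 clusters
n_first <- n_srs * 2.0
per_cluster_first <- ceiling(n_first / 25)

# Villages are small, so the within-cluster sample size is capped by the
# number of eligible children per village
eligible_per_village <- round(90 * 0.17)

# Revised plan: smaller guessed design effect, more and smaller clusters
n_revised <- n_srs * 1.25
clusters_revised <- ceiling(n_revised / eligible_per_village)

c(n_srs = n_srs, n_first = n_first, per_cluster_first = per_cluster_first,
  eligible_per_village = eligible_per_village,
  n_revised = n_revised, clusters_revised = clusters_revised)
```

Changing the guessed prevalence, precision, or design effect shows how strongly the number of clusters depends on those guesses.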

The proportion aged 6 - 59 months can also affect the overall sample size.
Sample size formulae tend to assume that we sample from an infinitely large population. This assumption is reasonable when the sampling fraction (i.e. the ratio of the sample size to the population size) is small. The error (e.g. the width of the 95% CI) is essentially the same regardless of population size as long as the sampling fraction is less than about 5%.
In the example above we have n = 480. If the size of the population aged 6 - 59 months were about:
```
N = 480 * 20 = 9600
```

or larger then we need not worry.
If the sampling fraction was greater than about 5% then we would apply a "finite population correction" (FPC) in order to account for the added precision gained by sampling a large proportion of the population. The FPC can be calculated as:
```
FPC = sqrt((Population - Sample Size) / (Population - 1))
```

If we assume a population of 4,800 and a sample size of 480 then we have a sampling fraction of 10%. Since this is above about 5% we should calculate an FPC:
```
FPC = sqrt((4800 - 480) / (4800 - 1)) = 0.95
```

The required sample size is now:
```
n = 480 * 0.95 = 456
```

Continuing with our example ... we might collect this as 30 clusters of 15 (n = 450 is close enough to n = 456 to make no difference) and save ourselves a little work and a little money.
We do not usually apply an FPC in SMART surveys as (1) savings are usually small and (2) the SMART software does not adjust results to account for the sampling fraction.
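
The claim that precision barely depends on population size while the sampling fraction is small can be checked numerically. The sketch below is an illustration, not part of the SMART method: it assumes a normal-approximation 95% CI for a prevalence of 10%, inflates the variance by the guessed design effect of 1.25, and multiplies the standard error by the FPC given above. Combining the design effect and the FPC in this simple way is an assumption, and `ci_half_width` is a hypothetical helper name.

```r
# Half-width of an approximate 95% CI for a proportion, with an optional
# finite population correction applied to the standard error
ci_half_width <- function(p, n, deff = 1, N = Inf, z = 1.96) {
  fpc <- if (is.finite(N)) sqrt((N - n) / (N - 1)) else 1
  z * sqrt(deff * p * (1 - p) / n) * fpc
}

# Same sample (n = 480) drawn from populations of different sizes
populations <- c(4800, 9600, 96000, Inf)
half_widths <- sapply(populations, function(N) {
  ci_half_width(p = 0.10, n = 480, deff = 1.25, N = N)
})

data.frame(population        = populations,
           sampling_fraction = 480 / populations,
           half_width_pct    = round(100 * half_widths, 2))
```

The half-width narrows noticeably only when the sampling fraction is well above 5%; for large populations it is essentially the same as for an infinite population.
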
I hope this is of some use.
### Mark Myatt

Frequent user

26 Jun 2013, 10:36

```
new.n = (old.n * population) / (old.n + (population - 1))
```

Continuing with the example we get :
```
new.n = (480 * 4800) / (480 + (4800 - 1)) = 436
```

which we might collect as 29 clusters of 15.
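
As a quick check, the corrected formula can be wrapped in a small R function. The function name `adjust_for_finite_population` is illustrative only. Rounding to the nearest whole number reproduces the figures above; rounding up would be slightly more conservative.

```r
# Sample size adjusted for a finite population, using the formula above
adjust_for_finite_population <- function(n, population) {
  (n * population) / (n + (population - 1))
}

new_n <- adjust_for_finite_population(n = 480, population = 4800)

round(new_n)        # adjusted overall sample size
round(new_n / 15)   # number of clusters of 15 children
```
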
Sorry for any confusion.
The moral here (for me) is to check what I write before posting.
I fear my mind is going.

### Bradley A. Woodruff

Self-employed

Technical expert

26 Jun 2013, 16:12

### Anonymous 81

Public Health Nutritionist

Normal user

26 Jun 2013, 16:22

### Victoria Sauveplane

Senior Program Manager, Action Against Hunger CA

Normal user

26 Jun 2013, 17:18

### James lual

Consultant,surveys

Normal user

26 Jun 2013, 19:08

### Juergen Erhardt

Normal user

26 Jun 2013, 20:11

### Mark Myatt

Frequent user

26 Jun 2013, 20:54

### Tariq Khan

Normal user

27 Jun 2013, 20:52

### Mark Myatt

Frequent user

28 Jun 2013, 10:57

```
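# Simulate surveys with no digit preference and count how often a
# chi-square test wrongly "detects" digit preference at p < 0.05
nSurveys <- 100000
failureCount <- 0
for (test in 1:nSurveys) {
  # 500 heights between 650 and 1100 (presumably mm), drawn uniformly
  # at random, so there is no digit preference
  randomHeights <- round(runif(n = 500, min = 650, max = 1100), 0)
  # Final digit of each height
  finalDigits <- substr(randomHeights, nchar(randomHeights), nchar(randomHeights))
  # Chi-square test against equal frequencies of the final digits
  p <- chisq.test(table(finalDigits))$p.value
  if (p < 0.05) {
    failureCount <- failureCount + 1
  }
}
# Proportion of "good" surveys that fail the test
failedProportion <- failureCount / nSurveys
failedProportion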

This simulates 100,000 surveys with no digit preference (just random variation). The result is that the null is rejected (as expected) in 5% (actually 4.968% in the simulation that I ran) of the surveys. This means that 1 in 20 good surveys is rejected.
As the sample size increases even small deviations from expected distributions can be statistically significant. This will usually only be a problem with very large sample sizes. The opposite is also true. With small sample sizes we might not detect clear digit preference.
Here are some examples ... clear digit preference for last digit = 0 or 5:
```
last digit count
---------- -----
0 10
1 5
2 5
3 5
4 5
5 10
6 5
7 5
8 5
9 5
---------- -----
60
-----
chi-square = 6.6667, df = 9, p-value = 0.6718
```

but we fail to reject the null of no digit preference.
Here is the same pattern but with 10 times the sample size:
```
last digit count
---------- -----
0 100
1 50
2 50
3 50
4 50
5 100
6 50
7 50
8 50
9 50
---------- -----
600
-----
chi-square = 66.6667, df = 9, p-value = 0.0000
```

we reject the null of no digit preference.
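
The two results above can be reproduced with `chisq.test()`, which tests a vector of counts against equal expected frequencies when no other arguments are given. This is a minimal sketch; the variable names are illustrative.

```r
# Counts of final digits 0 to 9 from the first example (n = 60)
small_counts <- c(10, 5, 5, 5, 5, 10, 5, 5, 5, 5)
names(small_counts) <- 0:9

# Same pattern with 10 times the sample size (n = 600)
large_counts <- small_counts * 10

small_test <- chisq.test(small_counts)
large_test <- chisq.test(large_counts)

# Same proportions in both samples, very different test results
data.frame(n          = c(sum(small_counts), sum(large_counts)),
           chi_square = c(small_test$statistic, large_test$statistic),
           p_value    = c(small_test$p.value, large_test$p.value))
```

The proportions of each final digit are identical in the two samples; only the sample size differs.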
This makes it very difficult to use simple significance tests for monitoring when (e.g.) a team might bring in data from two clusters per day (i.e. n = 60 or less).
Since plausibility tests can "detect" problems that are not there (about 5% of the time) and can (with small samples) fail to detect real problems, we should be careful when we use them. It is probably better to "eyeball" (visually inspect) the data to see if we have (e.g.) too many ".0" or ".5" final digits for height.
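
One way to "eyeball" final digits is to tabulate them and print a crude text bar chart. The sketch below uses made-up data: 60 heights in cm recorded to one decimal place, of which 20 are deliberately rounded to the nearest 0.5 cm to mimic digit preference. The data and variable names are hypothetical and are there only to show the tabulation.

```r
set.seed(1)

# Hypothetical data: 60 heights in cm, recorded to one decimal place
true_heights <- runif(60, min = 65, max = 110)
heights <- round(true_heights, 1)

# Mimic digit preference: 20 of the 60 measurements are rounded to the
# nearest 0.5 cm
heaped <- sample(60, 20)
heights[heaped] <- round(true_heights[heaped] * 2) / 2

# Final (first decimal) digit of each height; the factor keeps all ten
# digits in the table even if one of them never occurs
final_digit <- factor(round(heights * 10) %% 10, levels = 0:9)
digit_counts <- table(final_digit)

# Crude text bar chart: one "#" per measurement
for (d in names(digit_counts)) {
  cat(d, strrep("#", as.integer(digit_counts[d])), "\n")
}
```

With real survey data, `heights` would be the recorded heights from a team's daily data; an excess of ".0" and ".5" shows up as noticeably longer bars for digits 0 and 5.
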
I hope this is of some use.

### Andrew Seal

UCL and NIE Regional Training Initiative

Frequent user

28 Jun 2013, 14:52

*should* be or that those that are not normal or have a larger SD are wrong! This thinking is a classic illustration of the problem, or limitation, of inductive reasoning and why black swan theory is so important. Put another way, statistical theory should help describe biology, not try and define what it is. So, my plea is for survey coordinators to use the very useful features of the ENA plausibility to help monitor and improve data quality during survey implementation, but never use the plausibility score by itself to reject, or accept, a survey report.

### Bradley A. Woodruff

Self-employed

Technical expert

28 Jun 2013, 14:54

### Bradley A. Woodruff

Self-employed

Technical expert

28 Jun 2013, 15:02

### Andrew Seal

UCL and NIE Regional Training Initiative

Frequent user

28 Jun 2013, 15:11

