When the survey SD value is outside the acceptable range, especially < 0.8, how should prevalence estimation proceed?

This question was posted in the Assessment forum area and has 1 reply.

nicky

Normal user

8 Apr 2014, 11:17

The plausibility check report came out with an SD value of 0.79, which is below the acceptable range, and it did not report the prevalence estimates calculated with the current SD and with an SD of 1. In that case:

- how should we go ahead with the estimation of prevalence,
- what kind of correction or additional information is needed when presenting this result,
- is it technically safe to use the observed prevalence, and
- what kinds of things can lead to this kind of SD value?

(It seems I asked a lot of questions :) )

I have also attached a sample of the plausibility check report section on "Evaluation of SD".

Evaluation of Standard deviation, Normal distribution, Skewness and Kurtosis using the 3 exclusion (Flag) procedures

.                              no exclusion   exclusion from    exclusion from
.                                             reference mean    observed mean
.                                             (WHO flags)       (SMART flags)
WHZ
Standard Deviation SD:         0.79           0.79              0.79
(The SD should be between 0.8 and 1.2)
Prevalence (< -2)
observed:
calculated with current SD:
calculated with a SD of 1:

Mark Myatt

Consultant Epidemiologist

Frequent user

9 Apr 2014, 10:43

Regular readers of this forum may know that I am sceptical about these types of plausibility checks, as they are based on assumptions that are (IMO) not very plausible.

I am not sure but I think this:

  calculated with current SD: 
  calculated with a SD of 1: 

refers to a PROBIT estimator. This type of estimator is not often used with SMART data (it is more often used with RAM survey data). You probably want to use a classical approach (i.e. number of cases divided by the number of children). If you use the PROBIT approach then I suggest you use the observed SD as this is likely to be more accurate than an assumed SD.
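To make the difference between the two approaches concrete, here is a minimal Python sketch. It assumes the PROBIT-style estimate is simply the area of a normal curve below -2, using the sample mean and either the observed SD or an assumed SD of 1. Real PROBIT implementations may differ (for example in how the centre and spread are estimated). The function names and the WHZ values are made up for illustration only.

```python
from statistics import NormalDist, mean, stdev


def classical_prevalence(whz, cutoff=-2.0):
    """Classical estimate: number of cases divided by number of children."""
    cases = sum(1 for z in whz if z < cutoff)
    return cases / len(whz)


def probit_prevalence(whz, cutoff=-2.0, sd=None):
    """Normal-model estimate: area below the cutoff for a normal curve
    centred on the sample mean. Uses the observed SD unless `sd` is given
    (e.g. sd=1.0 for the "assumed SD of 1" variant)."""
    mu = mean(whz)
    sigma = stdev(whz) if sd is None else sd
    return NormalDist(mu, sigma).cdf(cutoff)


# Made-up WHZ values, for illustration only.
whz = [-2.5, -1.0, -0.5, 0.0, 0.5]
print("classical:         ", classical_prevalence(whz))
print("probit, observed SD:", probit_prevalence(whz))
print("probit, SD of 1:    ", probit_prevalence(whz, sd=1.0))
```

When the mean WHZ is above the cutoff, a smaller SD gives a smaller normal-model estimate, so forcing an SD of 1 on data with an observed SD of 0.79 would tend to push the estimate upwards.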

I assume that you have checked your data and the issue is not a lot of missing WHZ values.

To answer your questions (to give definitive answers I would need to see the data):

(1) Yes, you should go ahead and estimate prevalence. Even if the rule that the SD should be between 0.8 and 1.2 has some value, these are very crisp boundaries, and 0.79 is so close to 0.8 that rejection would (IMO) be legalistic.

(2) You should report that the SD was a little lower than expected.

(3) I think it is safe to use the estimated prevalence. SMART people may say otherwise. They may be right.

(4) A low SD means that the spread of WHZ values is smaller than expected under a specific set of assumptions. It means that you have more children close to the mean WHZ value and fewer children in the "tails" of the distribution than expected. Your children are more like each other than they "should" be. There are several things that can make this happen. Here is an incomplete list:

(i) Errors in measurement.

(ii) Errors in sampling (sometimes sick children are "hidden" and this will lead to a smaller than expected left tail, i.e. too few very low values).

(iii) Fake data. You might see this if data were collected from a few cases and then just copied over and over again or data were generated and then trimmed a little to remove extreme values. I have only very occasionally seen such fraud and that was usually about adding (or subtracting) a little to (or from) height to increase (or reduce) the estimated prevalence.

(iv) Design effect - WHZ might be associated with size of village and PPS places the sample in larger villages.

(v) A "sharing" or fair society - all people are starving together rather than at-risk groups moving away from the less at-risk groups at separate paces (which might lead to SD > 1 and a large left tail).

There are others (I leave these for others to propose).

There are things you can do that can help narrow down the possibilities. A small left-tail would favour (ii) and maybe (v). A narrow distribution with similar tails might favour (iii), (iv), or (v). You can check the data to see if there are many children with the same data to investigate (iii).
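These two checks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each child is represented as a tuple (sex, age in months, weight in kg, height in cm), which is a hypothetical layout, and the "tails" are taken to be values more than one observed SD from the mean, which is an arbitrary choice. The function names are made up.

```python
from collections import Counter
from statistics import mean, stdev


def duplicate_records(records):
    """Return each record that appears more than once, with its count.
    Many identical records may point to copied data."""
    counts = Counter(records)
    return {rec: n for rec, n in counts.items() if n > 1}


def tail_counts(whz, k=1.0):
    """Count children more than k observed SDs below and above the mean.
    A much smaller left count than right count suggests a small left tail."""
    mu, sd = mean(whz), stdev(whz)
    left = sum(1 for z in whz if z < mu - k * sd)
    right = sum(1 for z in whz if z > mu + k * sd)
    return left, right


# Made-up records (sex, age_months, weight_kg, height_cm), illustration only.
records = [
    ("m", 24, 11.2, 85.0),
    ("f", 30, 12.0, 89.5),
    ("m", 24, 11.2, 85.0),
]
print(duplicate_records(records))

# Made-up WHZ values, illustration only.
print(tail_counts([-0.5, -0.2, 0.0, 0.1, 0.3, 2.5]))
```

A few exact duplicates will occur by chance in any real survey, so the counts need to be judged against the sample size rather than read as proof of anything.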

The main thing is that you report the issue in any report you make from these data.

I hope this is of some use.
