Analysis of survey data

Analysis of survey data in EPI INFO 6 and SMART (both against the NCHS reference) seems to give different results. Has anyone else faced the same problem? What is the reason? How can it be rectified?

It is a long time since I used Epi Info and I have never used the SMART software (I think it is a Windows-only package). There are a few things that could be happening ... (1) Oedema is treated differently. I know that some versions of EPINUT did not treat oedema properly (i.e. they reported prevalence only by WHZ < -2 and < -3). (2) Censoring / flagging - The packages might be censoring (i.e. excluding from analysis) the odd records (e.g. W/A < -5) differently. (3) Estimators - There are different ways of calculating estimates and confidence intervals and the two packages might differ. How big is the difference you are seeing? I think the first step would be to look at the discordant cases (i.e. children who are cases by EPINUT but not by SMART, and vice versa).
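The discordant-case check suggested above could be sketched roughly as follows. This is only an illustration in Python: the child IDs, z-scores and oedema values are made up, and the function names (`is_case`, `discordant`) are hypothetical, not part of Epi Info, EPINUT or SMART. It assumes each package's per-child results have been exported and keyed by a shared child ID.

```python
# Hypothetical per-child exports: child ID -> (WHZ, oedema present?).
# All values are invented for illustration only.
epi = {1: (-2.10, False), 2: (-1.99, False), 3: (-0.50, True), 4: (-3.20, False)}
smart = {1: (-2.10, False), 2: (-2.001, False), 3: (-0.50, True), 4: (-3.20, False)}


def is_case(whz, oedema, count_oedema):
    """A child is a case if WHZ < -2, or if oedema is present and counted."""
    return whz < -2 or (count_oedema and oedema)


def discordant(a, b, a_counts_oedema, b_counts_oedema):
    """List (child ID, case in a, case in b) where the two sources disagree."""
    rows = []
    for cid in sorted(a.keys() & b.keys()):
        case_a = is_case(*a[cid], a_counts_oedema)
        case_b = is_case(*b[cid], b_counts_oedema)
        if case_a != case_b:
            rows.append((cid, case_a, case_b))
    return rows


# Assumption: the first package ignores oedema, the second counts it.
for cid, case_a, case_b in discordant(epi, smart, False, True):
    print(cid, case_a, case_b)
```

Listing the disagreeing children one by one usually shows quickly whether the difference comes from oedema, from borderline z-scores, or from excluded records.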

Mark Myatt

Technical Expert

Answered:

15 years ago

There are several possible reasons.

1) Between Epi Info 5 and Epi Info 6 they changed the way oedema cases are handled, so to include oedema in Epi Info 6 one has to set the option deliberately (inclusion is not the default). SMART automatically includes oedema and gives results for MAM and SAM with oedema, and also for wasting separately (in the Beta version).

2) Small discrepancies can be due to rounding in Epi Info. Epi Info 6 calculates Z-score and %median to 2 decimal places, whereas SMART calculates to many decimal places. So a child who is -2.00 Z in Epi Info might actually be -2.001 or -1.999. In Epi Info neither would be counted as < -2 Z, but one would be counted as < -2 in SMART. This would only affect a very few children.

3) There is a choice of how to manage flags in SMART. In Epi Info the flags are very wide and are meant to include all children whose weight-for-height is biologically feasible. This leads to children with erroneous measurements being included in the tails. This is an option in SMART, but the method advocated is to flag children who are more than 3 SD away from the mean of the sample, on the basis that they are more likely to be errors of measurement than true values (some will be true values, but most are errors). SMART-beta gives an analysis of the distribution with the different flag methods applied, so you can see whether this makes a difference.

For the new WHO standards, SMART has been extensively tested against the SAS algorithms produced by WHO and gives exactly the same results. The confidence intervals are calculated using the SUDAAN procedure - as advocated by CDC - and used in SPSS etc. I am unsure how the CI is calculated in Epi Info 6, but the Windows version of Epi Info uses the SUDAAN procedure. By the way, the Windows version of Epi Info uses SMART as an add-in for its calculations.
If you think that there is a bug in SMART then please send the raw data either to me or to Juergen Erhardt (the programmer). We will examine the data and see why there is a discrepancy. We are always happy to have feedback about SMART, and it is undergoing continuous upgrading, so before you analyse your data please download the latest version and let us know of any problems with its functionality. Cheers Mike
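Points 2 and 3 above (rounding and flags) can be illustrated with a small sketch. It is not code from either package. The z-scores are invented, the fixed flag limits of -4 and +6 are an assumption used only for illustration, and "3 SD from the sample mean" is interpreted here as 3 z-score units around the observed mean. Note also that a later reply in this thread says Epi Info rounds for display only, so the rounding effect shown here may not apply to it.

```python
from statistics import mean


def count_below(zscores, cutoff=-2.0, decimals=None):
    """Count z-scores strictly below the cutoff, optionally rounding first."""
    if decimals is not None:
        zscores = [round(z, decimals) for z in zscores]
    return sum(z < cutoff for z in zscores)


def keep_fixed_range(zscores, low=-4.0, high=6.0):
    """Wide fixed flags: keep anything inside assumed plausibility limits."""
    return [z for z in zscores if low <= z <= high]


def keep_near_sample_mean(zscores, width=3.0):
    """Sample-based flags: keep values within `width` of the observed mean."""
    centre = mean(zscores)
    return [z for z in zscores if abs(z - centre) <= width]


# Invented weight-for-height z-scores, including two borderline children
# and two values out in the tails.
whz = [-2.001, -1.999, -0.8, -1.2, 0.3, -0.5, -1.6, 0.9, -3.9, 4.8]

print("below -2, full precision:", count_below(whz))
print("below -2, rounded to 2 dp:", count_below(whz, decimals=2))
print("kept with fixed flags:", len(keep_fixed_range(whz)))
print("kept with sample-mean flags:", len(keep_near_sample_mean(whz)))
```

In this made-up sample, rounding removes one borderline child from the count below -2, and the sample-based flags drop the two tail values that the wide fixed limits keep. Either effect on its own changes the estimated prevalence.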

Michael Golden

Answered:

15 years ago

Hi Everyone, The Epi Info DOS version used a Turbo Pascal program that was created from the original FORTRAN program for the NCHS growth curves. There was extensive testing of the Epi Info code against the original FORTRAN code, so the Epi Info results are accurate. The information is displayed to two decimal places in Epi Info, but all calculations are performed using double precision numeric values. I was not involved in the SMART program anthropometry development. As mentioned by others, the difference may be in the definition of extreme values. I always preferred to write my own code to define extreme values rather than having the program define them. Other issues are in exactly how age was defined and/or calculated. The most accurate method is to have a date of birth and the date of measurement and have the program calculate the age in months with decimal places. Note that for anthropometry purposes the length of a month is taken as 365.25 / 12 days (about 30.44 days), i.e., the anthropometry uses a "biologic" month, which is 1/12th of a year, rather than the calendar month, which can range from 28 to 31 days. Another issue is that the growth curves for height and weight are based on metric units, and if the measurements were in other units, such as inches and pounds, these are converted to metric units. There could be inaccuracies in how the programs convert from other measurement units to metric. You may need to do a little investigation: have a file with all three z-scores calculated by both programs and compare these.
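The age and unit points above can be sketched as follows. This is a minimal illustration, not the Epi Info or SMART code; the function names are hypothetical. The conversion factors are the standard exact definitions (1 lb = 0.45359237 kg, 1 in = 2.54 cm).

```python
from datetime import date

# "Biologic" month: one twelfth of a 365.25-day year (30.4375 days).
DAYS_PER_MONTH = 365.25 / 12


def age_in_months(birth, measured):
    """Age in decimal months from date of birth to date of measurement."""
    return (measured - birth).days / DAYS_PER_MONTH


def pounds_to_kg(pounds):
    """Convert weight in pounds to kilograms."""
    return pounds * 0.45359237


def inches_to_cm(inches):
    """Convert length or height in inches to centimetres."""
    return inches * 2.54


# Example child (invented dates and measurements).
age = age_in_months(date(2008, 1, 15), date(2010, 1, 15))
print(round(age, 2), round(pounds_to_kg(22), 2), round(inches_to_cm(30), 1))
```

A child measured exactly two calendar years after birth comes out at slightly more than 24 months here because the interval includes a leap day. A package that counts calendar months, or that rounds age to whole months, would give a slightly different age and therefore slightly different age-based z-scores.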

Kevin Sullivan

Answered:

15 years ago