
                                       Quantitative  Literacy and Public Health

                                              Jerry Mirotznik, Ph.D., M.P.H.

                                               Health and Nutrition Sciences

 

Introduction

Public health is a distinct perspective and approach for reducing morbidity and mortality, and promoting health.  In contrast to clinical medicine, which mainly offers treatment after people are already sick, public health focuses on identifying risk factors for disease in order to prevent people from getting sick in the first place.  Another important distinction between these two approaches is their respective units of observation.  While clinical medicine focuses on disease in the individual, public health focuses on disease in the population or community.  To assess the health status of communities, public health is dependent on numbers, and hence is an essentially quantitative discipline.

A final distinguishing feature of public health is its non-experimental scientific approach.  Rather than manipulating exposures to see potentially harmful health effects, which would be unethical to do, the public health approach merely observes natural variation in those exposures to establish associations with disease outcomes.  Since observational designs do not involve random allocation of exposures, there is always the possibility that any association uncovered between a potential risk factor and disease outcome is an artifact of confounding.  As such, in epidemiologic research, for instance, we must always be aware of the possibility of confounding as a plausible alternative explanation for our results.  And consequently we must always be skeptical of our results; indeed, critical of numbers in general.

With this understanding as background, one can now suggest that what quantitative literacy means in the context of public health may differ, if not in kind then in degree, from what it means for other disciplines.  Specifically, in the context of public health, quantitative literacy does not have so much to do with teaching students to be producers of numbers or problem solvers, although these things are important and are taught.  Rather, quantitative literacy in public health predominantly concerns learning to be critical consumers of numbers.  Put another way, it has to do with teaching students that numbers, that data, do not speak for themselves and, in turn, teaching students how to read into numbers.

Four-Level Framework for Quantitative Literacy

How might students be taught to critically evaluate numbers?  Last spring, Tim Shortell, from Sociology, and I addressed this question for a workshop we conducted on using newspapers to teach quantitative literacy.  Through our collaboration, we developed a framework that can be applied to help students develop skills in evaluating numbers.  The framework suggests that in order to become a critical consumer of numbers one must pay attention to four levels of concern.

Level 1 focuses on the statistical constructs and strategies used to convey quantitative information.  It suggests that one must understand how statistical constructs, such as the mean, median, mode, standard deviation, etc., are calculated, what kind of data they are appropriate for and what kind of information those constructs convey.  Level 1 issues with regard to public health are briefly illustrated below.

Public health, and more particularly epidemiology, assesses the health status of populations by paying attention to three kinds of phenomena: death, disease and disability.  Since these phenomena are events that one either does or does not experience, epidemiology depends heavily on statistics for categorical variables.  Hence it makes great use of the proportion, the ratio, and the rate:

 

 

Proportion  =  [A / (A + B)]  ×  constant

Ratio  =  (A / B)  ×  constant

Rate  =  (Delta A / Delta B)  ×  constant

 

Based on these statistical constructs epidemiologists have developed a family of mortality measures, morbidity measures, and disability measures.  For example, a fundamental mortality measure is Crude Death defined as:

 

Crude Death  =  (# deaths / Total population)  ×  1,000

 

Crude Death is relatively quick and easy to determine and provides the broadest or most general summary of the mortality profile of a population.  It also conveys information about the overall probability of dying in a population.  The label "crude" is a synonym for general or non-specific.  The Crude Death measure is non-specific in that it does not provide information about which people are dying or what they are dying from.  To provide this type of information, epidemiology uses what are called specific mortality measures.  To address the question of who is dying, epidemiology has developed a set of age-, race-, and sex-specific mortality measures.  For instance, Age-Specific Mortality informs us about the probability of dying among people in a certain age category, e.g., 25 to 34.  The formulas for the age-, race-, and sex-specific measures are presented below.

Age-Specific Mortality  =  (# deaths in age category / Population in age category)  ×  100,000

Race-Specific Mortality  =  (# deaths in race category / Population in race category)  ×  100,000

Sex-Specific Mortality  =  (# deaths in sex category / Population in sex category)  ×  100,000

 

An important measure to address the question of what people are dying from is Cause-Specific Mortality.

 

Cause-Specific Mortality  =  (# deaths from specific cause / Total population)  ×  100,000

 

Cause-Specific Mortality indicates the probability of dying from a specific health disorder in a population.  This measure allows diseases to be ordered in terms of their “burden of mortality” and, as a result, their importance as a public health threat.

Exercise 1 illustrates an assignment for enhancing students’ familiarity and facility with mortality measures.  As can be seen, the exercise also provides an opportunity to introduce students to the graphic display of data, in this case the use of tables.

 

 

Exercise 1

Mortality Measures

Table 1

Mortality by Selected Age Groups, Males and Females, United States, 1991

            Males                      Females                    Total
Age                       Number                     Number                     Number
(years)     Population    of deaths    Population    of deaths    Population    of deaths
_________________________________________________________________________________________
15-24       18,797,000    27,549       17,816,000     8,903       36,613,000    36,452
25-34       21,609,000    43,709       21,476,000    15,919       43,085,000    59,628
35-44       19,507,000    60,552       19,847,000    27,570       39,354,000    88,122
_________________________________________________________________________________________

                                                                                                                                           

 

 

 

 

Table 2

Total Mortality from Selected Causes, Males and Females, United States, 1991

Cause                  Males        Females          Total
__________________________________________________________
All causes         1,121,665      1,047,853      2,169,518
Accidents             59,730         29,617         89,347
Cancer               272,380        242,277        514,657
Viral hepatitis        1,132            708          1,840
Infant deaths         21,008         15,758         36,766
Maternal deaths       XXXXXX            323            323
__________________________________________________________

                                                                                                                                            

The total population in 1991 was 252,683,000 (males = 123,431,000 and females = 129,257,000).  The total number of births was 4,110,907.

 

 

a) Calculate the Crude Death.

b) Calculate Cause-Specific Mortality for accidents, cancer, and viral hepatitis.

c) Repeat the calculations in b) for males and females separately.

d) Calculate the Age-Specific Mortality for people 25-34.

e) Repeat the calculations in d) for males and females separately.
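One way to work through Exercise 1 is sketched below in Python.  The counts are copied as printed from Tables 1 and 2 and the note beneath them; the helper name measure and the dictionary layout are illustrative choices, not part of the original exercise.  The sketch assumes that the sex-specific calculations in part c) use the male or female population, rather than the total population, as the denominator.

```python
# Counts as printed in Tables 1 and 2 and in the note beneath them (United States, 1991).
POPULATION = {"males": 123_431_000, "females": 129_257_000, "total": 252_683_000}

DEATHS = {
    "all causes":      {"males": 1_121_665, "females": 1_047_853, "total": 2_169_518},
    "accidents":       {"males": 59_730,    "females": 29_617,    "total": 89_347},
    "cancer":          {"males": 272_380,   "females": 242_277,   "total": 514_657},
    "viral hepatitis": {"males": 1_132,     "females": 708,       "total": 1_840},
}

# Ages 25-34: (population, number of deaths) for each group.
AGE_25_34 = {
    "males": (21_609_000, 43_709),
    "females": (21_476_000, 15_919),
    "total": (43_085_000, 59_628),
}


def measure(events, population, constant):
    """Return (events / population) x constant."""
    return events / population * constant


# a) Crude Death per 1,000.
crude_death = measure(DEATHS["all causes"]["total"], POPULATION["total"], 1_000)

# b) and c) Cause-Specific Mortality per 100,000.
# Assumption: for part c) the denominator is the male or female population.
cause_specific = {
    cause: {g: measure(n, POPULATION[g], 100_000) for g, n in counts.items()}
    for cause, counts in DEATHS.items()
    if cause != "all causes"
}

# d) and e) Age-Specific Mortality per 100,000 for ages 25-34.
age_specific_25_34 = {g: measure(d, p, 100_000) for g, (p, d) in AGE_25_34.items()}

print(f"Crude Death: {crude_death:.2f} per 1,000")
for cause, by_group in cause_specific.items():
    for g, value in by_group.items():
        print(f"{cause} ({g}): {value:.2f} per 100,000")
for g, value in age_specific_25_34.items():
    print(f"Age 25-34 ({g}): {value:.2f} per 100,000")
```

Every measure in the exercise reduces to the same three ingredients (a numerator, a denominator, and a constant), which is why a single helper is enough.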

 

 

Level 2 pays attention to numerical fallacies, that is, the misinterpretation or misuse of numbers.  Several numerical fallacies in public health have to do with a failure to identify an appropriate denominator or population figure.  One error of this sort is referred to as “Just Numerator Data.”  A classic example of this error involves an attempt to draw conclusions about risk to a population based on data drawn from patients seen in a clinical setting (Colton, 1974).  Investigators identified all female patients with carotid stroke in a neuro-surgical unit during a ten-year period.  Twenty-three of the 65 patients were pregnant at the time of stroke.  Table 3 presents the age distribution of the pregnant and non-pregnant stroke patients.

 

 

Table 3

Age distribution of non-pregnant and pregnant women investigated for stroke

Age (yr.)      Non-pregnant      Pregnant
_________________________________________
15-19                1               -
20-24                2               7
25-29                4               6
30-34                4               7
35-39               10               3
40-44               21               -
_________________________________________
Total               42              23

                                                                                                                                                


Looking at these data it can be seen that about 74% of the non-pregnant stroke patients were 35 to 44 years of age, while 13% of the pregnant stroke patients were that age.  Based on this it was concluded that non-pregnant women become more prone to stroke at later ages, i.e., that older non-pregnant women are at greater risk of stroke than older pregnant women.

Can one draw this conclusion from these data?  In fact, one cannot.  Why?  Because there is no information about the appropriate denominators, namely, the total number of non-pregnant and pregnant women in the population or community within which this neuro-surgical unit is located.  It is possible that there are so few older pregnant women with strokes not because they are less at risk, but rather because there are few older pregnant women in the population.  Indeed, if we had access to the total number of 35-to-44-year-old pregnant and non-pregnant women in the population, we might find that a greater proportion of older pregnant women have strokes and that consequently they are at greater risk of having this condition.
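The point can be made concrete with a short Python sketch.  The case counts come from Table 3, with the dashes read as zero; the population denominators are entirely HYPOTHETICAL, invented here only to show how the conclusion could reverse once denominators are known.  They are not data from Colton (1974).

```python
# Stroke cases by age group, from Table 3 (a dash in the table is read as 0).
CASES = {
    "non_pregnant": {"15-19": 1, "20-24": 2, "25-29": 4, "30-34": 4, "35-39": 10, "40-44": 21},
    "pregnant":     {"15-19": 0, "20-24": 7, "25-29": 6, "30-34": 7, "35-39": 3,  "40-44": 0},
}
OLDER = ("35-39", "40-44")

# HYPOTHETICAL numbers of women aged 35-44 in the community served by the unit.
# These figures are made up for illustration; the source provides no denominators.
HYPOTHETICAL_POP_35_44 = {"non_pregnant": 100_000, "pregnant": 2_000}


def share_of_cases_older(group):
    """Numerator-only figure: fraction of a group's stroke cases aged 35-44."""
    cases = CASES[group]
    return sum(cases[age] for age in OLDER) / sum(cases.values())


def risk_per_100k_older(group):
    """Risk among women aged 35-44, using the hypothetical denominators."""
    older_cases = sum(CASES[group][age] for age in OLDER)
    return older_cases / HYPOTHETICAL_POP_35_44[group] * 100_000


for group in CASES:
    print(
        f"{group}: {share_of_cases_older(group):.0%} of cases are aged 35-44; "
        f"hypothetical risk = {risk_per_100k_older(group):.0f} per 100,000"
    )
```

Under these invented denominators the older pregnant women have the higher risk, even though they account for the smaller share of cases.  Different denominators would give a different answer, which is exactly why the numerator data alone cannot settle the question.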

So the “Just Numerator Data” error is characterized by the absence of an “epidemiologic denominator,” i.e., of data on a population composed of treated cases, untreated cases, and “non-cases,” that is, people who do not have the disease of interest.

Three additional numerical errors could be discussed in this context.  “Wrong Denominator Error” occurs when an epidemiologic denominator, that is, a population figure, is available but describes the wrong population.  “Errors in Cross-Level Inference” occur when data are collected at one level of observation, e.g., a population of countries, but conclusions are drawn about another level, e.g., the population of individuals within a country.  An association between an exposure and a disease outcome at the former level may not hold at the latter level.  A last example of the misuse or misinterpretation of numbers is the failure, when comparing different groups or populations, to consider age as a confounder.
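The last of these errors, ignoring age as a confounder, can be illustrated with a small Python sketch.  All figures are HYPOTHETICAL: two invented communities are given identical age-specific death rates but different age structures, so any difference in their Crude Death reflects age composition alone.

```python
# HYPOTHETICAL age-specific death rates per 1,000, identical in both communities.
RATE_PER_1000 = {"young": 2.0, "old": 40.0}

# HYPOTHETICAL age structures: Community B has a much older population.
POPULATION = {
    "Community A": {"young": 90_000, "old": 10_000},
    "Community B": {"young": 60_000, "old": 40_000},
}


def crude_death_per_1000(age_structure):
    """Crude Death = (total deaths / total population) x 1,000."""
    deaths = sum(n * RATE_PER_1000[age] / 1_000 for age, n in age_structure.items())
    return deaths / sum(age_structure.values()) * 1_000


crude = {name: crude_death_per_1000(pop) for name, pop in POPULATION.items()}
for name, value in crude.items():
    print(f"{name}: Crude Death = {value:.1f} per 1,000")
```

Community B's Crude Death comes out roughly three times that of Community A (17.2 versus 5.8 per 1,000), although a person of any given age faces the same risk of dying in either community.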

Level 3 involves understanding the measurement process that generates the numbers or data.  There are three broad dimensions of importance regarding measurement:

1. Conceptualization - how the variables upon which the data are based are defined;

2. Operationalization - what procedures or rules are used to assign numbers to, or quantify, the variables;

3. Accuracy - how valid and reliable the measured scores are.

These dimensions can be meaningfully illustrated in terms of one of the most important current topics in public health: health disparities, or inequalities in health, in particular those between Whites and Blacks.

Research indicates that with regard to almost every indicator of health status Blacks are at a disadvantage.  For example, during the period 1997 through 1999 Crude Death for Whites was 858.4 per 100,000 and for Blacks 1,141.9.  Infant Mortality during 1982-1998 for Whites was 6 per 1,000 and more than double for Blacks, 13.9.  Whites live on average 77.1 years and Blacks 71.4 years (Health, United States, 2001).  These data were collected by the Federal Government.  In particular, the Census Bureau collects population data, or epidemiologic denominator data, and the Centers for Disease Control and Prevention (CDC) collects mortality data, or epidemiologic numerator data.

For most of its history, the variable race, when used by the Federal Government, was defined in terms of biological, anthropological, and genetic differences.  More recently, according to the Office of Management and Budget, which sets standards for collecting, presenting and maintaining data on race, race "reflects a social definition."  Exactly what race means, however, is unclear.

This lack of clarity is one of the reasons the Institute of Medicine (IOM) now suggests that the Federal Government revamp its population taxonomy and substitute the concept of ethnicity for that of race.  This, the IOM maintains, would lead to a conceptual shift away from the emphasis on fundamental biological differences among "racial" groups to an appreciation of the range of cultural and behavioral attitudes, beliefs, lifestyle patterns, diet, environmental living conditions and other factors that may affect health risks.  However, it can be argued that ethnicity is a variable whose meaning is almost as problematic as that of race.

How is race operationalized?  Question 8 of the 2000 decennial Census states, “What is person 1's race?  Mark (X) one or more races to indicate what this person considers himself/herself to be.”  The question is followed by 15 response options representing 6 major categories: 1. White; 2. Black, African American, or Negro; 3. American Indian or Alaska Native; 4. Asian; 5. Native Hawaiian or other Pacific Islander; and 6. Some other race (United States Bureau of the Census).  The 2000 Census was the first in which respondents could check more than one racial category.  The Census also allows for a designation regarding ethnicity, in particular Hispanic ethnicity versus non-Hispanic ethnicity.

What questions can we raise about the accuracy of the Census Bureau's measurement of racial categories?  An important question concerns the differential undercount of certain racial groups.  Blacks have on average about a 4.7% larger undercount than Whites.  And certain gender/age segments of the Black community have an extraordinarily high undercount.  For instance, it is estimated that 14% of Black Males ages 30-34 are missed, while the respective figure for non-Black Males in that age category is 3.8% (United States Bureau of the Census).  This larger undercount has, of course, implications for mortality data.  As mentioned, census data are used to calculate denominators for mortality measures.  The use of a denominator that is undercounted inflates or overestimates the measure in exact proportion to the undercount in the denominator.  This suggests that Black-White differences in mortality may have been overestimated.
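A rough Python sketch of this arithmetic follows.  It uses the undercount estimates quoted above for males ages 30-34 and assumes, purely for illustration, that the death counts in the numerator are complete and correctly classified.  If a fraction u of a group is missed by the census, the measured rate equals the true rate divided by (1 - u); the function name inflation_factor is an illustrative choice.

```python
def inflation_factor(undercount):
    """Factor by which a rate is overstated when its denominator misses
    the given fraction of the population (numerator assumed accurate)."""
    if not 0.0 <= undercount < 1.0:
        raise ValueError("undercount must be a fraction in [0, 1)")
    return 1.0 / (1.0 - undercount)


# Undercount estimates quoted in the text for males ages 30-34.
UNDERCOUNT = {"Black males 30-34": 0.14, "non-Black males 30-34": 0.038}

factors = {group: inflation_factor(u) for group, u in UNDERCOUNT.items()}

# How much the Black/non-Black ratio of mortality rates would be overstated
# if the undercount were the only source of error.
ratio_overstatement = factors["Black males 30-34"] / factors["non-Black males 30-34"]

for group, factor in factors.items():
    print(f"{group}: measured rate = {factor:.3f} x true rate")
print(f"Rate ratio overstated by a factor of {ratio_overstatement:.3f}")
```

Under these assumptions the measured mortality rate for Black males ages 30-34 would be about 16% too high, the rate for non-Black males about 4% too high, and the ratio between the two overstated by roughly 12%.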

Another problem of the race measure concerns its self-report nature.  There are data that suggest such self-reports are unreliable, that people provide different answers regarding their racial status at different times (Williams, 1998).  This type of measurement error can attenuate any true differences that exist in mortality between Whites and Blacks. 

Errors in mortality differences between Whites and Blacks can also result from the way numerator data are collected.  These data derive from Death Certificates.  Death Certificate information about race is filled in by the funeral home director, coroner or medical examiner.  Guidelines recommend that the next of kin provide information about the race of the deceased.  However, research indicates that approximately half of the funeral home directors, coroners or medical examiners make their own determination of the deceased's race.  This can result in what epidemiology calls misclassification error.  Such error can also lead to biased estimates.

Of course, death certificates also provide information for mortality statistics on the cause of death.  This information is entered onto the certificates by a physician.  There is much research to suggest that cause of death data are of questionable accuracy.  This in turn needs to be considered in evaluating Black-White differences in mortality statistics.

Level 4 involves understanding the political and value judgments underlying the measurement process and the analysis of data.  Census and mortality data are collected by the Federal Government, a political entity.  This entity is susceptible to and influenced by political forces.  Those political forces, with their value orientations, in turn influence the data the Federal Government collects.  As a quick illustration, it has been suggested that the 2000 Census provided the option of selecting multiple racial categories because of political pressure brought to bear on the Federal Government by multiracial individuals who felt the census did not acknowledge their existence.  A second, perhaps even more obvious, example of the influence of values on data concerns the Bush Administration’s refusal to adjust the census data for the rather well established undercount of racial minorities.  It has been suggested that adjusting the census would lead to an increase in the number of Democratic Representatives, thus putting Republicans at a political disadvantage.

Conclusion

It is more than theoretically plausible that Quantitative Literacy might come to signify different things in the context of different disciplines and perspectives.  With regard to Public Health, Quantitative Literacy could most meaningfully be thought of as focusing on teaching students to be critical consumers of numbers/data.  It has been suggested that to become critical consumers students must learn to think of numbers in terms of four levels: Level 1: Statistical constructs and strategies used to convey quantitative information; Level 2: Numerical fallacies; Level 3: The measurement process; Level 4: The political and value judgments that influence numbers/data.  Using such a framework, it is hoped that our students will eventually learn to approach data in an active rather than passive manner, always questioning the meaning and worth of data, never simply taking it for granted that data speak for themselves.

 

 

References

Colton, Theodore. (1974). Statistics in Medicine.  Boston, MA: Little, Brown and Company.

Health, United States, 2001.  Federal Data.  United States Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics, http://www.cdc.gov/.

United States Bureau of the Census. United States Department of Commerce, http://www.census.gov/.

Williams, David R. (1998). African-American health: The role of the social environment.  Journal of Urban Health: Bulletin of the New York Academy of Medicine, 75, 300-322.