National Center for Health Statistics

Edward J. Sondik, Ph.D., Director
Jack R. Anderson, Deputy Director
Jack R. Anderson, Acting Associate Director for International Statistics
Lester R. Curtin, Ph.D., Acting Associate Director for Research and Methodology
Jennifer H. Madans, Ph.D., Acting Associate Director for Analysis, Epidemiology, and Health Promotion
P. Douglas Williams, Acting Associate Director for Data Standards, Program Development, and Extramural Programs
Edward L. Hunter, Associate Director for Planning, Budget and Legislation
Jennifer H. Madans, Ph.D., Acting Associate Director for Vital and Health Statistics Systems
Douglas Zinn, Acting Associate Director for Management
Charles J. Rothwell, Associate Director for Data Processing and Services

Division of Data Services
Philip R. Beattie, M.S.P.H., Director
Margot Palmer, Deputy Director

Division of Health and Utilization Analysis
Diane M. Makuc, Dr.P.H., Director

Compressed Mortality File 1968-88 on CD-ROM
(CD-ROM Series 20, No. 2A ASCII Version)

Compressed Mortality File 1968-88

Introduction
NCHS Data Use Agreement
Files on CD-ROM
Description of Mortality File and File Layout
Description of Population File and File Layout
Guidelines for Citation of Data

Introduction

The Compressed Mortality File 1968-88 (CMF 1968-88) is a county-level mortality and population data file for the United States spanning the years 1968-88. The file permits the calculation of national, state, and county death rates for race-sex-age groups of interest.

The mortality file contains only a select set of key analysis variables, namely, 1) state and county of residence, 2) year of death (rather than the full date of death), 3) race (recoded to white, black, other races), 4) sex, 5) age group at death (specific age recoded to 16 age groups), 6) underlying cause-of-death (4-digit ICD code), and 7) 69 or 72 cause-of-death recode.

The national, state, and county population estimates on the CMF are from the Bureau of the Census. The age, race, and sex detail of the population file matches that of the mortality file.

Details of the data use restrictions are given in the NCHS Data Use Agreement section. Further detail about the mortality and population files on the CMF can be found in the Documentation.
NCHS Data Use Agreement

The Public Health Service Act (Section 308(d)) provides that the data collected by the National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC), may be used only for the purpose of health statistical reporting and analysis. Any effort to determine the identity of any reported case is prohibited by this law.

NCHS does all it can to assure that the identity of data subjects cannot be disclosed. All direct identifiers, as well as any characteristics that might lead to identification, are omitted from the dataset. Any intentional identification or disclosure of a person or establishment violates the assurances of confidentiality given to the providers of the information. Therefore, users will:

1. Use the data in these datasets for statistical reporting and analysis only.
2. Make no use of the identity of any person or establishment discovered inadvertently and advise the Director, NCHS, of any such discovery.
3. Not link these datasets with individually identifiable data from other NCHS or non-NCHS datasets.

Files on CD-ROM

README.WPD
This file is in WordPerfect version 6.1 format. It includes general descriptions of the mortality and population files on the Compressed Mortality File 1968-88 and the file layouts.

DOCUMENT.PDF
This file is in PDF format. It contains the file documentation for the CMF for the period 1968-88: the NCHS Data Use Agreement, descriptions and record layouts for the mortality and population data files, detailed information about the mortality and population data, cause-of-death coding, computation of death rates, and a dictionary of the FIPS state and county codes and names.

SASCODE.TXT
This file is in ASCII text format. It provides sample PC SAS programs for creating a format library, a mortality file, and a population file from the data files on the CD-ROM.
Data Files

MORT6878   The mortality data file for 1968-78
MORT7988   The mortality data file for 1979-88
POP6878    The population data file for 1968-78
POP7988    The population data file for 1979-88

Description of the Mortality Files

The mortality data, for all years except 1972, are based on records for all deaths occurring in the United States. For 1972, the data are based on a 50 percent sample and weighted by a factor of 2. Deaths of foreign residents are excluded. Deaths of U.S. residents who died abroad are not included on this file. Appendix A in the Documentation provides a description of the vital statistics reporting system maintained by the NCHS.

The source records were condensed to 23 bytes by retaining only a select set of key analysis variables. The variables included on the condensed record are: 1) state and county of residence, 2) year of death (rather than the full date of death), 3) race (recoded to white, black, other races), 4) sex, 5) age group at death (specific age recoded to 16 age groups), 6) underlying cause-of-death (4-digit ICD code), and 7) 69 or 72 cause-of-death recode.

Including only these few variables on the file and recoding some of them into a limited number of categories resulted in numerous records having identical values on all of the variables. The number of records on the file was reduced substantially by aggregating records with identical values on all of the variables into one record. A count indicating the number of identical records was added to the aggregate record. For example, two white male residents of Clay County, Alabama, with ages between 35 and 44 years, died from "bronchus and lung, unspecified" (ICD 162.9) in 1979. Their records were combined into one, with a 2 in the count field.

Note that there are no records on the file with zero in the count field. If no deaths occurred for a particular combination of variable values, no record appears.

Specific details

1. Underlying cause-of-death for the years 1968-78 is classified in accordance with the Eighth Revision International Classification of Diseases, Adapted for Use in the United States (ICDA-8) codes. Cause-of-death for the years 1979-88 is classified in accordance with the International Classification of Diseases, Ninth Revision (ICD-9) codes. For a further description of the ICD codes see Appendix B in the Documentation or Volume II of the annual mortality volumes produced by the NCHS, such as Vital Statistics of the United States, 1978, Volume II-Mortality, Part A, or Vital Statistics of the United States, 1988, Volume II-Mortality, Part A. For a list of comparable ICD codes for the 8th and 9th revisions and estimated comparability ratios, see Appendix B in the Documentation.

2. The fourth digit of the ICD code can assume the values 0-9 and blank. If the fourth digit is blank, it is stored as a blank on this file. Care must be taken when reading the file to distinguish between blanks and zeros.

3. For injuries and poisonings, the external cause is coded (E800-E999) rather than the Nature of Injury (800-999). The letter "E" is not included in the code.

4. For 1988, if there were three or fewer deaths for a given Georgia county of residence (of deaths occurring in Georgia) with HIV infection (ICD codes *042-*044, 796.8) cited as a cause-of-death (underlying or non-underlying cause), these records were assigned a "missing" place of residence code (FIPS code = 13999).

5. The FIPS state and county codes contain leading zeros in both the 2-byte state code and the 3-byte county code.

File Specifications for the Mortality Files

File name   Years     Number of records   Record length   Format
MORT6878    1968-78           8,774,864              23   ASCII
MORT7988    1979-88          16,448,435              23   ASCII

The files are sorted by locations 6-9, 1-5, 10, 11-12, 13-16.
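
As a hedged illustration of the specific details for the mortality files, the following Python sketch reads one 23-byte mortality record by position, using the field locations given in the record layout for these files. The names MortalityRecord, parse_mortality_record, format_icd, and deaths_by_year are illustrative only and are not part of the CD-ROM (the sample programs supplied in SASCODE.TXT are written in SAS). The sample record is constructed for illustration: its recode value is a placeholder and the zero padding of its count field is an assumption. FIPS and ICD fields are kept as text so that leading zeros and a blank fourth digit are preserved.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class MortalityRecord(NamedTuple):
    state_fips: str    # locations 1-2, text so leading zeros survive
    county_fips: str   # locations 3-5
    year: int          # locations 6-9
    race_sex: int      # location 10 (1-6)
    age_group: str     # locations 11-12 ("01"-"16", "99" = unknown)
    icd: str           # locations 13-16, fourth character may be blank
    recode: str        # locations 17-19 (69 or 72 cause-of-death recode)
    deaths: int        # locations 20-23


def parse_mortality_record(line: str) -> MortalityRecord:
    """Slice one record by position (1-based locations -> 0-based slices)."""
    record = line.rstrip("\r\n")  # remove only the line ending, never blanks
    if len(record) != 23:
        raise ValueError(f"expected 23 characters, got {len(record)}")
    return MortalityRecord(
        state_fips=record[0:2],
        county_fips=record[2:5],
        year=int(record[5:9]),
        race_sex=int(record[9:10]),
        age_group=record[10:12],
        icd=record[12:16],          # not stripped: "162 " and "1620" differ
        recode=record[16:19],
        deaths=int(record[19:23]),  # int() accepts zero or space padding
    )


def format_icd(icd: str) -> str:
    """Show the stored code with a decimal point; a blank fourth digit is dropped."""
    category, fourth = icd[:3], icd[3:4]
    return category if fourth.strip() == "" else f"{category}.{fourth}"


def deaths_by_year(records: Iterable[MortalityRecord]) -> Counter:
    """Sum the count field, since each record may stand for several deaths."""
    totals: Counter = Counter()
    for rec in records:
        totals[rec.year] += rec.deaths
    return totals


# Made-up record: Alabama (01), county 027, 1979, white male, 35-44 years,
# ICD 162.9, placeholder recode "000", two deaths.
SAMPLE = "01027" "1979" "1" "11" "1629" "000" "0002"
```

Summing the count field rather than counting records matters because identical source records were aggregated into one record on the file.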
Field location   Size   Item and code outline                     Format

FIPS codes (see Appendices E and F in the Documentation)
1-2              2      FIPS state code                           Numeric
3-5              3      FIPS county code                          Numeric
6-9              4      Year of death                             Numeric
10               1      Race-sex                                  Numeric
                          1   White male
                          2   White female
                          3   Black male
                          4   Black female
                          5   Other male
                          6   Other female
11-12            2      Age at death                              Numeric
                          01  Under 1 day
                          02  1-6 days
                          03  7-27 days
                          04  28-364 days
                          05  1-4 years
                          06  5-9 years
                          07  10-14 years
                          08  15-19 years
                          09  20-24 years
                          10  25-34 years
                          11  35-44 years
                          12  45-54 years
                          13  55-64 years
                          14  65-74 years
                          15  75-84 years
                          16  85+ years
                          99  Unknown
13-16            4      ICD code for underlying cause-of-death    Numeric
                          1968-78: ICDA-8
                          1979-88: ICD-9
17-19            3      Cause-of-death recode                     Numeric
                          (see Appendix B in the Documentation)
                          1968-78: 69 Cause-of-Death Recode
                          1979-88: 72 Cause-of-Death Recode
20-23            4      Number of deaths                          Numeric

Description of the Population File

There are national, state, and county population estimates on the population file of the CMF. The population estimates are based on U.S. Bureau of the Census estimates of U.S. national, state, and county resident populations. The 1968-69 national estimates and all of the estimates for 1971-79 and 1981-88 are intercensal estimates of July 1 resident populations. The 1970 and 1980 population estimates are April 1 modified (modified age-race-sex) census counts. The 1968 and 1969 state and county population estimates were calculated by NCHS using linear extrapolation. A brief description of the population estimates is provided here; a more detailed description is provided in Appendix D in the Documentation.

Specific details

1. There is one record on the file for each geographic unit (total U.S., state, county) x year x race-sex group.

2. Modifications of the population estimates made by NCHS:

   a. To permit the calculation of infant mortality rates, NCHS live-birth data were substituted for the estimates of the population under one year of age.
      The race code for these records is derived from "race of mother".

   b. When the age group 1-4 years did not appear on the Census file, the age group 0-4 years was multiplied by 0.8 to obtain an estimate of the population 1-4 years.

   c. For non-censal years prior to 1992, the NCHS Division of Vital Statistics uses national population estimates rounded to the nearest 1,000 to calculate published death rates. On the CMF, the national population estimates for 1968-69 and 1971-79 are rounded to the nearest 1,000 in accordance with this practice. However, this means that calculation of rates for aggregate age, race, and/or sex groups involves using population estimates that were rounded before aggregation rather than after aggregation. As a result, national death rates for aggregate groups calculated using the rounded estimates on the CMF may differ slightly from those published by NCHS. The national population estimates for 1981-88 on the CMF are not rounded so that the user can round them after aggregating across subgroups and avoid the rounding error problem.

3. National, state, and county population estimates can be identified by using the FIPS code or the record type variable in location 140. National population records have a FIPS code of "00000". State population records have a valid 2-digit FIPS state code and a county code of "000" (see Appendix E in the Documentation). The record type variable assumes the value "1" for national records, "2" for state records, and "3" for county records. It is necessary to provide separate sets of estimates for each geographic level because the methodology used to produce the intercensal estimates (1971-79 and 1981-88) did not smooth them sufficiently. Thus, for the intercensal years, the sum of the population estimates of counties within a state may not equal the state population estimate, and the sum of all state population estimates or all county population estimates may not equal the national population estimates.
   For these years, the national population estimates should be used when calculating national death rates and the state population estimates should be used when calculating state death rates.

4. The FIPS state and county codes contain leading zeros in both the 2-byte state code and the 3-byte county code.

5. For 1988, there was an additional county in Georgia with a "missing" county code of "999". The six records for this county have population counts of zero.

6. Brief description of population estimates for individual years

1968-69 population estimates - National population estimates are U.S. Bureau of the Census intercensal estimates of the July 1 resident population. State and county population estimates were calculated by NCHS using linear extrapolation from the corresponding July 1, 1970 and July 1, 1971 estimates.

1970 population estimates - National, state, and county population estimates are from a modified version of the April 1, 1970 census. The original census counts were modified by the U.S. Bureau of the Census to correct: 1) errors discovered in the data, 2) race misclassification - persons of Hispanic origin who reported their race as "other" were recoded as "white".

1971-79 population estimates - National and county estimates are U.S. Bureau of the Census intercensal estimates of the July 1 resident population. The Bureau of the Census did not produce state population estimates by age, race, and sex for the 1970s. Therefore, the state population estimates for 1971-79 on this file are simply the sum of the population estimates for the counties in each state.

Three Virginia independent cities (Manassas, Manassas Park, and Poquoson) did not appear on the Census file prior to 1981. While these independent cities are not on the mortality file for 1968-78, they are on the file for 1979 onwards. Therefore, the 1979 populations for these three cities were estimated from the July 1, 1980 and July 1, 1981 estimates of these cities.
The 1979 population estimates for the counties containing the cities were reduced by the estimated city populations.

1980 population estimates - National, state, and county population estimates are from a modified version of the April 1, 1980 census. The original census counts were modified by the U.S. Bureau of the Census: 1) persons who reported their race as "other" (the majority being of Hispanic origin) were reassigned to one of the official race groups, 2) an adjustment was made for the overcount of centenarians.

April 1, 1980 population estimates for three Virginia independent cities (Manassas, Manassas Park, and Poquoson) had to be extrapolated from July 1, 1980 estimates. The April 1 populations for the three cities were calculated as a proportion of the April 1 county population, with the proportion obtained from the July 1, 1980 city/county estimates. The April 1 population estimates for the counties containing the three cities were reduced by the estimated April 1 city populations.

1981-88 population estimates - National, state, and county estimates are U.S. Bureau of the Census intercensal estimates of the July 1 resident population.

File Specifications for the Population Files

File name   Years     Number of records   Record length   Format
POP6878     1968-78             206,712             140   ASCII
POP7988     1979-88             189,966             140   ASCII

The files are sorted by locations 6-9, 1-5, 10.
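
The following Python sketch, offered as an illustration rather than a definitive implementation, reads one 140-byte population record using the field locations given in the record layout for the population files and computes a death rate from it. All names (PopulationRecord, parse_population_record, denominator, death_rate) are hypothetical. The scaling per 100,000 population is an assumed convention, the sketch assumes every numeric field is filled, and the sample record contains made-up values. Following the description of the population file, live births serve as the denominator for the infant age codes 01-04.

```python
from typing import Dict, NamedTuple

# Mortality age codes 05-16 match the twelve 8-byte population fields
# in locations 19-114, in file order.
AGE_CODES = [f"{code:02d}" for code in range(5, 17)]
INFANT_CODES = {"01", "02", "03", "04"}


class PopulationRecord(NamedTuple):
    state_fips: str              # locations 1-2
    county_fips: str             # locations 3-5
    year: int                    # locations 6-9
    race_sex: int                # location 10
    live_births: int             # locations 11-18
    population: Dict[str, int]   # mortality age code -> population estimate
    county_name: str             # locations 115-139
    record_type: int             # location 140: 1 national, 2 state, 3 county


def parse_population_record(line: str) -> PopulationRecord:
    """Slice one record by position (1-based locations -> 0-based slices)."""
    record = line.rstrip("\r\n")
    if len(record) != 140:
        raise ValueError(f"expected 140 characters, got {len(record)}")
    counts = [int(record[start:start + 8]) for start in range(18, 114, 8)]
    return PopulationRecord(
        state_fips=record[0:2],
        county_fips=record[2:5],
        year=int(record[5:9]),
        race_sex=int(record[9:10]),
        live_births=int(record[10:18]),
        population=dict(zip(AGE_CODES, counts)),
        county_name=record[114:139].strip(),
        record_type=int(record[139:140]),
    )


def denominator(pop: PopulationRecord, age_group: str) -> int:
    """Live births for infant codes 01-04, otherwise the matching age group."""
    if age_group in INFANT_CODES:
        return pop.live_births
    return pop.population[age_group]  # KeyError for "99" (unknown age)


def death_rate(deaths: int, population: int, per: int = 100_000) -> float:
    """Deaths per `per` population; zero populations (e.g. county 999) are rejected."""
    if population == 0:
        raise ValueError("population is zero; no rate can be computed")
    return deaths / population * per


# Made-up county record: 100 live births, age groups 1000, 2000, ..., 12000.
SAMPLE = (
    "01" "027" "1979" "1"
    + f"{100:8d}"
    + "".join(f"{n:8d}" for n in range(1000, 13000, 1000))
    + "Clay County".ljust(25)
    + "3"
)
```

When pairing deaths with a denominator, the record type should match the geographic level of the rate, because county estimates for the intercensal years may not sum to the state or national estimates.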
Field location   Size   Item and code outline                     Format

FIPS codes (see Appendices E and F)
1-2              2      FIPS state code                           Numeric
3-5              3      FIPS county code                          Numeric
6-9              4      Year                                      Numeric
10               1      Race-sex                                  Numeric
                          1   White male
                          2   White female
                          3   Black male
                          4   Black female
                          5   Other male
                          6   Other female
11-18            8      Number of live births                     Numeric
19-26            8      Population in age group: 1-4 years        Numeric
27-34            8      Population in age group: 5-9 years        Numeric
35-42            8      Population in age group: 10-14 years      Numeric
43-50            8      Population in age group: 15-19 years      Numeric
51-58            8      Population in age group: 20-24 years      Numeric
59-66            8      Population in age group: 25-34 years      Numeric
67-74            8      Population in age group: 35-44 years      Numeric
75-82            8      Population in age group: 45-54 years      Numeric
83-90            8      Population in age group: 55-64 years      Numeric
91-98            8      Population in age group: 65-74 years      Numeric
99-106           8      Population in age group: 75-84 years      Numeric
107-114          8      Population in age group: 85+ years        Numeric
115-139          25     County name                               Character
                          (see Appendix F in the Documentation)
140              1      Record type                               Numeric
                          1   National population record
                          2   State population record
                          3   County population record

Guidelines for Citation of Data

With the goal of mutual benefit, the National Center for Health Statistics (NCHS) requests that recipients of data files cooperate in certain actions related to their use.

Any published material derived from the data should acknowledge NCHS as the original source. The suggested citation to appear at the bottom of all tables is as follows:

   Source: National Center for Health Statistics (span of years used)

When cited in a bibliography, the citation should read:

   National Center for Health Statistics (2000). Data File Documentation, Compressed Mortality File, 1968-88 (machine readable data file and documentation, CD-ROM Series 20, No. 2A), National Center for Health Statistics, Hyattsville, Maryland.
The published material should also include a disclaimer that credits any analyses, interpretations, or conclusions reached to the author (recipient of the data file) and not to NCHS, which is responsible only for the initial data. Consumers who wish to publish a technical description of the data should make an effort to ensure that the description is consistent with that published by NCHS.