U.S. Federal Census Records 1850, 1860, 1900, 1910: Version C-3


Public Use Tape on the Aging of Veterans of the Union Army

California, Connecticut, Delaware, District of Columbia, Illinois, Indiana, Iowa, Kansas, Kentucky, Maine, Maryland, Massachusetts, Michigan, Minnesota, Missouri, New Hampshire, New Jersey, New Mexico, New York, Ohio, Pennsylvania, Vermont, West Virginia, and Wisconsin Regiments

These data comprise a portion of the historical data collected by the Early Indicators of Later Work Levels, Disease, and Death (EI) project. The goal of this project is to construct datasets suitable for longitudinal studies of factors affecting the aging process. The primary sample for the Early Indicators project consists of 35,747 white males mustered into the Union Army during the Civil War.

There are three principal datasets in the EI project. The largest is the "Military, Pension, and Medical Records," which is derived from military-related documents housed in the National Archives in Washington, D.C. These include both war-time records and applications made by veterans for pension support. Associated with these pension applications are detailed physical examinations completed by physicians, certifying the veterans' health and disability status. Information from these examinations is collected in the second major dataset, known as the "Surgeons' Certificates" dataset. Finally, the Census Records dataset contains the information that is available in the U.S. Federal Censuses of 1850, 1860, 1900, and 1910, though not all veterans can be successfully linked to Census documents. All individuals in the Early Indicators sample can be linked by a unique identification number, recidnum. All Early Indicators data were collected under the direction of the Department of Economics at Brigham Young University (BYU) and processed by the Center for Population Economics (CPE) at the University of Chicago.

Census questions vary across years, but information on the following topics was gathered in at least one Census: age, date of birth, children, color of skin, disability, education, employment status, gender, immigration/naturalization, language, literacy, marital status, occupation, parents' birthplace, property/home ownership, veteran status, and wealth. The dataset also contains quality codes and remarks, enumeration date and place, and Census record information such as the page number on the original Census manuscript.

The dataset contains 22,115 observations on 1,483 variables, from a total of 19,740 individuals for whom a Census record can be found. These data can be linked to the other EI datasets by the recruit's identification number, recidnum.
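Because there are more observations (22,115) than individuals (19,740), some recruits have more than one Census observation. The following is a minimal sketch of linking on recidnum under that condition. It assumes each record has already been read into a Python dict with a "recidnum" key; the function names, the sample identifiers, and every other field name are hypothetical and are not taken from the codebook.

```python
from collections import defaultdict


def group_by_recidnum(records):
    """Group record dicts by their recidnum value.

    A recruit may appear in more than one Census year, so each
    recidnum maps to a list of observations.
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["recidnum"]].append(rec)
    return dict(grouped)


def link_by_recidnum(census_records, other_records):
    """Return records for recruits present in both inputs, keyed by recidnum."""
    census = group_by_recidnum(census_records)
    other = group_by_recidnum(other_records)
    shared = sorted(census.keys() & other.keys())
    return {rid: {"census": census[rid], "other": other[rid]} for rid in shared}


# Hypothetical sample values, for illustration only.
census = [
    {"recidnum": "A1", "year": "1850"},
    {"recidnum": "A1", "year": "1900"},
    {"recidnum": "B2", "year": "1860"},
]
other = [{"recidnum": "A1"}, {"recidnum": "C3"}]
linked = link_by_recidnum(census, other)
```

Recruits found in only one of the two inputs are dropped here, which mirrors the note above that not every veteran can be linked to a Census record.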

A Data Extraction System and more information on the "Census Records" dataset are available at the Center for Population Economics at the University of Chicago.

The .Z and .zip data files are compressed ASCII files in axt format. An axt-format data file has one variable per line and a blank line between records. Both ".Z" and ".zip" files can be uncompressed with WinZip. In addition, ".Z" files can be uncompressed using the UNIX uncompress command, and ".zip" files can be unzipped with other decompression software. To check that you can uncompress these files, download the small files compress.Z or compress.zip. They give an example of how to read .Z and .zip ASCII files into SAS for UNIX without decompressing the files. Note that the cport file is not transferable to other formats using software such as Stat/Transfer. Also, many of the observations in the Early Indicators dataset are more than three times wider than the maximum width allowed by Stata. To download files in Internet Explorer, right-click on them and select "Save Target As...". If the PDF documents appear to be all blank pages, get the latest Acrobat Reader at www.adobe.com.
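For users working outside SAS, the axt layout described above (one variable per line, a blank line between records) can be read with a short script. The following is a sketch only. The record splitting follows the description above, but the assumption that each line holds a variable name followed by whitespace and then its value is not confirmed by this page and should be checked against the codebook. The function names are hypothetical, and the archive member name is not assumed: the first file in the archive is used unless one is given.

```python
import io
import zipfile


def iter_axt_records(lines):
    """Yield each record as a list of its non-blank lines.

    Blank lines are treated as record separators.
    """
    record = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.strip():
            record.append(line)
        elif record:
            yield record
            record = []
    if record:  # final record may not be followed by a blank line
        yield record


def split_variable(line):
    """Split one line into (name, value).

    ASSUMPTION: the name comes first, separated from the value by
    whitespace. Verify against the codebook before relying on this.
    """
    parts = line.strip().split(None, 1)
    name = parts[0]
    value = parts[1] if len(parts) > 1 else ""
    return name, value


def read_axt_from_zip(path, member=None):
    """Stream axt records from a .zip file without extracting it to disk."""
    with zipfile.ZipFile(path) as zf:
        member = member or zf.namelist()[0]
        with zf.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="ascii", errors="replace")
            yield from iter_axt_records(text)
```

A call such as `for record in read_axt_from_zip("cenaxt.zip"): ...` would then walk the file one record at a time. The Python standard library has no reader for ".Z" files, so those need to be uncompressed first with a separate tool.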

Internal users can access the data from a UNIX shell at /homes/data/cpe_census or on an NBER PC via Network Neighborhood --> NBER --> home --> data --> cpe_census

Updates and changes

Works referring to the dataset or codebook should contain the following citation:

Fogel, R. W. (2000) Public Use Tape on the Aging of Veterans of the Union Army:
       U.S. Federal Census Records, 1850, 1860, 1900, 1910, Version C-3.  Center for
       Population Economics, University of Chicago Graduate School of Business, and
       Department of Economics, Brigham Young University.


Data -- UNIX-compressed ASCII (6 MB) cen.axt.Z
Data -- Pkzipped ASCII (5 MB) cenaxt.zip
Data -- UNIX-compressed SAS cport file (11 MB) cen.cport.Z
Data -- Pkzipped SAS cport file (6 MB) cencport.zip
Data -- UNIX-compressed SPSS por file (10 MB) cen.por.Z
Data -- Pkzipped SPSS por file (8 MB) cenpor.zip
Codebook -- PDF cen.pdf
SAS program -- makes a SAS data file from the ASCII data cen.sas
SAS program -- exports the cport file to a SAS dataset cimport.sas
Variable Dictionary cen.txt
Alphabetic Variable Index cen.idx
link.lst -- count of recruits in each EI datafile link.lst
link.sas -- creates link.lst link.sas
link.rect.Z -- UNIX-compressed ASCII data for link.sas link.rect.Z
link.zip -- Pkzipped ASCII data for link.sas link.zip


Contact data@nber.org with questions, comments, or suggestions.

Last Update: June 12, 2001. Created by Jean Roth, January 4, 2001.