Special Collections Department
403 Parks Library
Iowa State University
Ames, IA 50011-2140
RS 13/24/0/5
Master Sample of Agriculture
Records, 1938-1944, undated
The Master Sample of Agriculture
Wayne A. Fuller
Iowa State University
The joint project of the U. S. Department of Agriculture, the U. S. Bureau of the Census, and the Statistical Laboratory, in which a national area sample of agriculture was developed, is described. The basis for the Master Sample project, the nature of the sample, and the impact of the project on survey sampling are discussed.
The Statistical Laboratory of Iowa State College had been in existence about five years when it entered into a "Project Agreement for Agricultural Research" with the U. S. Department of Agriculture. The agreement went into effect July 1, 1938, and initiated cooperation between the two institutions that continues to this day. As Snedecor, Gaskill, and Friley (1940) reported, the agreement with the Department of Agriculture resulted in a large increase in the staff of the Laboratory. Arnold J. King, Floyd E. Davis, George D. Harrell, Glen D. Simpson, Roy A. Bair, Dale E. McCarty, and Robert J. Monroe were stationed at Ames as employees of the U. S. Department of Agriculture and as resident collaborators. Six new positions in the Laboratory at the staff level were created and the clerical and computing staffs were enlarged. In the fall of 1938, four graduate assistants, Earl Houseman, Paul Homeyer, Emil Jebe, and Raymond Jessen, were working on the cooperative agreement.
Random sampling was not the accepted method in 1938 that it is today. One of the data collection procedures heavily used by the Department of Agriculture was the Rural Mail Carrier Survey. In this operation, rural postal carriers distributed questionnaires to the mailboxes of farmers. Those questionnaires returned by the farmers were processed to provide estimates of crop and livestock quantities. Neyman's (1934) paper "On the Two Different Aspects of the Representative Method: the Method of Stratified Sampling and the Method of Purposive Selection" illustrates the nature of the controversies in sampling. While this paper is often cited for its contribution to optimal allocation in stratified sampling, most of it is devoted to empirical comparisons that demonstrate the superiority of random sampling over purposive selection.
In the 1930s, the United States economy was struggling. For agriculture it was a time of physical and economic stress and a time of technological change. Both 1934 and 1936 had been marked by severe droughts and resulting dust storms in the Central Plains. Crop production was shifting from animal power to tractor power, and hybrid corn was being adopted. Because of the depressed economy, many government programs, including several farm programs, had been introduced under the so-called "New Deal." The administration of these programs required accurate data. It was at this time of demand for data and awakening interest in random sampling that the Department of Agriculture and the Statistical Laboratory entered into their cooperative agreement. The objectives of the 1938 Agreement were stated to be:
1. The development of efficient methods of sampling individual farms in taking economic surveys of American agriculture.
2. The development of efficient experimental designs in Field Plot Experiments that will permit differentiation among genetic, soil, cultural, and direct and indirect meteorological factors as they influence plant growth, yield and quality of crop production.
3. The development of appropriate techniques for discovering interrelations of yields of agriculturally important crops and their meteorological environments.
4. The discovery of adequate and valid procedures for the analysis of time series.
5. The examination of such available data in the Department of Agriculture and the Agricultural Experiment Stations as may give promise of useful information not yet extracted.
Consistent with these objectives, the Department of Agriculture and the Statistical Laboratory cooperated in research on experimental design (including plot shape and layout), time series, crop forecasting, and other statistical methodologies, as well as in the sampling investigations that are our topic.
A number of studies of sampling were undertaken shortly after the initiation of the cooperative agreement. Among publications resulting from these studies are King and Simpson (1940), King, McCarty, and McPeek (1942), Jessen (1942), and Strand and Jessen (1943). Snedecor and King (1942) review some of the sampling work being conducted at the Statistical Laboratory during those years. Cochran's (1942) work on regression estimation developed out of the study of alternative sampling units. Jessen's (1942) bulletin "Statistical Investigation of a Sample Survey for Obtaining Farm Facts" is a landmark study in which the area sampling technique was used. The sampling unit in Jessen's study was a quarter section of land (a square ½ mile on a side containing 160 acres). Quarter sections are units associated with the land survey which covers much of the United States and, in particular, most of the area obtained through the Louisiana Purchase. The sample consisted of all farms with headquarters in the sample quarter sections. A portion of the farms were reinterviewed after approximately one year. Jessen's study is also noteworthy for the use of cost and variance functions in an attempt to specify optimum designs. The validity and efficiency of the survey design and the use of subsampling and stratification were studied.
The "Nineteen County" study (King and Simpson, 1940) was initiated prior to the USDA-Statistical Laboratory Agreement, but was completed after King came to Ames. In the Nineteen County Study, acreages obtained from aerial photographs were compared to acreages obtained by direct measurements of fields and with data collected by other methods. Sampling units that were various multiples of the survey section were compared for sampling efficiency.
Aerial photographs covering a good portion of the agricultural area of the United States were used to obtain accurate acreages for individual farms, as required by the agricultural adjustment programs. In some ways, the appearance of aerial photographs on the sampling scene in the 1930s is comparable to the modern day appearance of data collected by satellites. The 1930s and early 1940s also saw the creation of another resource that has proved useful for sampling, namely, county highway maps. Largely through the activities of public works programs, county highway maps were constructed for almost every county in the United States. As the name implies, a primary purpose of the maps was to show the road network in the county, but the maps also included a good deal of "culture" in the rural parts of the county. In addition to showing the roads by type of surface, the maps showed rivers and streams, bridges, churches and schools, and dwellings; the dwellings were designated as farm, nonfarm, or vacant. These maps were the basic materials used in constructing the Master Sample of Agriculture and have been widely used in sampling ever since their creation. [Incidentally, another public works program, the WPA (Works Progress Administration), had a direct effect on our local statistical activities. The WPA was responsible for the construction of the original part of what is now Snedecor Hall. A plaque attesting to this fact is just inside the front door.]
Arnold King, in his 1945 paper "The Master Sample of Agriculture: I. Development and Use," credits the idea of the Master Sample to Rensis Likert of the Bureau of Agricultural Economics. Because the Bureau had a number of data collection projects, there was a felt need for a procedure that would provide samples for several nearly simultaneous studies. The idea was to have a large sample from which subsamples of farmers could be selected. Likert and King prepared a proposal for such a master sample, and in April 1943, the Bureau reached a decision that the Statistical Laboratory would design a national area sample of about 5,000 farms. In later usage, the term "Master Sample" has come to be applied to the materials used in the creation of the first sample. That is, the term is often applied to the frame rather than to the sample itself.
A meeting was held in Ames in August 1943 to discuss the construction of the Master Sample of Agriculture. Brooks (1977, p. 143) gives the following quote from a memorandum from King to P. L. Koenig and Likert:
To crystallize the ideas of different individuals as to how the master sample should be drawn, three committees were appointed. One committee, consisting of W. G. Cochran as Chairman, M. S. Girshick, Margaret Jarman Hagood, Miss Gertrude Cox, E. E. Houseman, R. J. Jessen, and Walter Hendricks, was set up to outline the methods of sampling that were to be used in various parts of the country. This committee received valuable suggestions from Mr. Cornfield and Dorothy Brady who generously took part in the discussions giving the committees the benefit of their wealth of experience in enumeration of sampling units.
A second committee, consisting of E. E. Houseman as Chairman, Miss Morrell, Miss Stone, and Mr. McCarty, was set up to outline the mechanics of drawing the sample. A third committee, consisting of E. M. Brooks as Chairman, J. R. Goodman, W. D. Goodsell, and A. R. Johnson, was set up to consider the possibility of having the AAA offices supply the information needed for the Master Sample and also to consider what and how much information should be obtained.
While the originally proposed master sample for agriculture was a large project, the scope of the project increased during its lifetime. The original plan called for 5,000 farms in the sample, but this was increased first to 25,000 and later to a sample of 300,000 farms to meet the requirements for the 1945 Census of Agriculture. At an early point in the development, the Bureau of the Census was made aware of the master sample project. The Bureau of Agricultural Economics may have brought the project to the attention of the Bureau of the Census by making inquiries about the possibility of obtaining schedules from the 1945 Agricultural Census for the farms in the Master Sample. The Bureau of the Census became interested in the Master Sample as a method of obtaining a sample of farms from which to obtain answers to a supplementary questionnaire that was to be a part of the 1945 Census of Agriculture. As a result, the Bureau of the Census entered into an agreement with the Statistical Laboratory and the Bureau of Agricultural Economics to contribute to the development of the Master Sample. The Bureau of the Census made large contributions of money, manpower, and technical assistance to the project. The Bureau of the Census later requested that the Master Sample be expanded to provide a framework for an area sample of the total rural population of the United States. The Master Sample work initiated a nearly continuous cooperation between the Bureau of the Census and the Statistical Laboratory that is currently reflected in a Joint Statistical Agreement.
The county highway maps were the primary materials used in creating the frame for the Master Sample of Agriculture. Fig. 1 is an example of such a map. This map is for Polk County, Iowa, and is an example map in the original master sample materials. The roads have a grid appearance, typical of the heavily populated part of the United States that was settled after the land survey.
In creating the Master Sample frame, every county was divided into three zones. Zone 1 contained all incorporated cities and towns. The city of Des Moines in the lower center of the figure is in Zone 1. In the terminology of the Bureau of the Census and of the Master Sample, an incorporated town or city, such as Des Moines, with population greater than 2,500 in the 1940 Census, was called an urban place. Certain other unincorporated, densely populated areas were also defined to be urban places by special rules.
Zone 2 was composed of unincorporated named towns having an estimated population of 100 or more and other unincorporated areas that appeared on the maps to have a population density of 100 or more persons per square mile, other than those designated as urban places by special rule. These areas were called rural places or rural name places. (In the terminology of the Bureau of the Census, incorporated towns with populations less than 2,500 are also called rural places.) One of the operational rules for determining whether or not to include an unnamed area of population concentration in Zone 2 seems to have been the following: If the area in question cannot be covered by a 25-cent piece on a highway map (scale 1 inch = 1 mile), include it in Zone 2; otherwise, leave it in Zone 3.
Zone 3, called open country, consisted of the remainder of the county. The existing minor civil division boundaries (civil township boundaries in Iowa and most other areas) were used to subdivide the open country zone. Because they were also used by the Bureau of the Census to define enumeration and tabulation units, it was natural to use these boundaries in the creation of the sampling materials. Within minor civil divisions, the open country zone was further subdivided into areas called count units. To the extent possible, "natural" features such as roads, railroads, and major streams were used to define count unit boundaries, but minor civil division boundaries and the urban and rural place boundaries used to delineate Zones 1 and 2 were, necessarily, also used as count unit boundaries in Zone 3. The count units were formed to contain roughly 6 to 30 expected farms. A solid square on the Polk County map denotes a farm operator's dwelling; a square that is diagonally solid black and white denotes a nonfarm dwelling. These squares were used in forming the count units. The indicated number of farms (INOF), the indicated number of dwellings (INOD), and the number of sampling units that could be formed from the count unit were recorded for each count unit.
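The three-zone division described above can be sketched as a small classification function. This is only an illustrative sketch: the function name and its parameters are hypothetical, and it leaves out the unincorporated areas that were designated urban places by special rule, because the text does not give those rules.

```python
def zone(incorporated, named_town, est_population, persons_per_sq_mile):
    """Return the Master Sample zone (1, 2, or 3) for a mapped area.

    All parameter names are illustrative assumptions, not terms from the
    original materials. Areas made urban "by special rule" are not handled.
    """
    # Zone 1: all incorporated cities and towns.
    if incorporated:
        return 1
    # Zone 2: unincorporated named towns with an estimated population
    # of 100 or more ...
    if named_town and est_population >= 100:
        return 2
    # ... and other unincorporated areas that appear to have a density
    # of 100 or more persons per square mile.
    if persons_per_sq_mile >= 100:
        return 2
    # Zone 3 (open country): the remainder of the county.
    return 3


if __name__ == "__main__":
    # Hypothetical areas, for illustration only.
    print(zone(True, False, 0, 0.0))     # an incorporated city
    print(zone(False, True, 150, 0.0))   # an unincorporated named town
    print(zone(False, False, 0, 10.0))   # open country
```

In the actual project these decisions were made by clerks reading the county highway maps, so the numeric inputs here stand in for visual judgments such as the 25-cent-piece rule.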
Obviously, there was a great deal of clerical work involved in creating these materials. The work was justified because of the gains in sampling efficiency, especially in the 1945 Census of Agriculture application. The fact that measures of size were available made possible the efficient sampling design. The final step in sample selection was the creation of the sampling units. Each count unit selected in the sample was subdivided into the number of sampling units designated for that count unit. One of these sampling units was then selected for the sample. The selected sampling units were called "area segments" or, simply, "segments." Fig. 2 is a map of Polk County with the actual sample segments for the Master Sample. The segments were designed to have an expected size of five farms on the basis of the 1940 Census. For the country as a whole, the average segment was about two and one half square miles in size, ranging from 0.71 square miles in Indiana to 108 square miles in Nevada. The sample of farms was composed of those with headquarters within the outlined sample segments. The segments in Polk County average about 4.3 expected farms. The reader is referred to the articles by Jessen (1945, 1947) for further details.
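The two-step selection just described (choose count units, then choose one sampling unit within each selected count unit) can be sketched as follows. The text does not say how the count units themselves were chosen, so this sketch assumes a systematic selection with probability proportional to the designated number of sampling units. The function name, the data, and the sampling interval are hypothetical.

```python
import random


def select_segments(count_units, interval, rng):
    """Select one sampling unit from each systematically chosen count unit.

    count_units: list of (count_unit_id, number_of_sampling_units).
    interval:    sampling interval, in sampling units (an assumption;
                 the text does not give the rate used at this step).
    Returns a list of (count_unit_id, index_of_selected_sampling_unit).
    """
    target = rng.randrange(interval)  # random start in [0, interval)
    cumulative = 0                    # sampling units passed so far
    selected = []
    for cu_id, n_su in count_units:
        if n_su > interval:
            raise ValueError("sketch assumes no count unit exceeds the interval")
        # The count unit is selected when the target falls inside its
        # run of sampling units, i.e. with probability n_su / interval.
        if target < cumulative + n_su:
            # Subdivide the count unit into n_su sampling units and pick
            # one at random (probability 1 / n_su).
            selected.append((cu_id, rng.randrange(n_su)))
            target += interval
        cumulative += n_su
    return selected


if __name__ == "__main__":
    # Hypothetical count units with their designated numbers of sampling units.
    units = [("A", 3), ("B", 6), ("C", 2), ("D", 4),
             ("E", 3), ("F", 6), ("G", 5), ("H", 7)]
    print(select_segments(units, interval=18, rng=random.Random(1945)))
```

Under these assumptions every sampling unit has the same overall chance of selection, (n_su / interval) x (1 / n_su) = 1 / interval, and only the selected count units need to be subdivided on the map.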
The Master Sample of Agriculture contained about 67,000 area segments in the roughly 3,070 counties of the United States. About 2,300 segments were in incorporated places, 4,300 in unincorporated rural places, and 60,400 in open country. The segments contained about 300,000 of the 6,000,000 farms in the United States at that time. The sample selection was completed in November, 1944, in time for the 1945 Census of Agriculture.
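As a quick consistency check on the figures quoted above, the following sketch adds up the segment counts by zone and derives the implied sampling fraction and average number of farms per segment. All inputs are the rounded values given in the text, so the derived quantities are approximate; the variable names are illustrative.

```python
# Rounded figures as quoted in the text.
segments_by_zone = {
    "incorporated places": 2_300,
    "unincorporated rural places": 4_300,
    "open country": 60_400,
}
sample_farms = 300_000
all_farms = 6_000_000

# The zone counts add up to the stated total of about 67,000 segments.
total_segments = sum(segments_by_zone.values())

# Share of all farms falling in the sample, from the rounded figures.
sampling_fraction = sample_farms / all_farms

# Implied average number of farms per segment, which can be compared
# with the design target of five expected farms per segment.
farms_per_segment = sample_farms / total_segments

print(total_segments)
print(f"{sampling_fraction:.1%}")
print(f"{farms_per_segment:.1f}")
```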
The administration of the project was handled largely by King. Jessen was primarily in charge of the technical aspects. Records that would give a precise measure of the size of the project are incomplete. The 1944-45 Annual Report of the Statistical Laboratory mentions "150 to 200 clerks and supervisors" working on the Master Sample. The clerical operation was housed in the Armory on the Iowa State campus and in rented quarters in Boone and Nevada. A number of USDA and Bureau of the Census staff were involved in the project. J. R. Grant, Morris Hansen, William Hurwitz, G. W. Morris, Jack Ogus, T. J. Reed, Joseph Steinberg, R. E. Straszheim, Benjamin Tepping, and Glen Veageront were among those working on the project who spent some time at Iowa State.
The final frame materials consisted of an envelope for each of the counties, containing the county map and the listing of the count units giving their measures of size in terms of the INOF, the INOD, and the number of sampling units. These materials required 38 drawers in filing cabinets. The open country materials were maintained by the Laboratory until the beginning of this year, when they were retired to the archives of the University Library.
The Master Sample was used in the 1945 Census of Agriculture by the Bureau of the Census. Farms in the Master Sample completed a supplementary schedule. The material in this supplementary schedule was tabulated for the sample and published as Census of Agriculture 1945 Special Reports. This was the only use of the entire 67,000 area units of the original Master Sample of Agriculture, but the frame was heavily used for a number of years.
The Master Sample materials were used to select a sample for the Bureau of Agricultural Economics in a quarterly survey of agriculture conducted in 1945 and 1946. This sample was a multistage sample with about 101 counties serving as the primary sampling units. The Bureau of Agricultural Economics conducted a survey of farms in 1947 that used a subsample of the original Master Sample in 816 counties (Jessen, 1947).
A general purpose sample for Iowa was designed using the Master Sample materials (Jessen, 1947). This sample was used in 1946 for a survey of food preferences and for a survey of morbidity and mortality of farm animals. During this early period, the Master Sample materials were used to draw samples in Pennsylvania, Illinois, Kansas, Oklahoma, and Missouri. Earl Houseman, who was with the USDA in Washington when the Master Sample was completed, estimates that he used the Master Sample frame to draw 60-80 samples per year during the 10 years after its construction. The materials were used by the Statistical Laboratory at Iowa State University into the 1970s. During that period, they were used to draw an estimated 155 samples.
The area sample currently used by the Statistical Reporting Service, U. S. Department of Agriculture, in the June Enumerative Survey to construct estimates of livestock numbers and crop acreages is a direct descendant of the Master Sample of Agriculture. The first samples of the 1950s were drawn from the original Master Sample frame, but the materials were updated for subsequent samples. The materials used in constructing the area samples are now regularly revised. Aerial photographs, satellite imagery, and other information are currently used to construct measures of size. For a number of years the Statistical Laboratory cooperated with the USDA in updating the Master Sample frame. A new frame for Iowa was completed in 1973.
The Statistical Laboratory cooperated with the Bureau of the Census in expanding the Master Sample of Agriculture into a sample for the portion of the population not living in cities having block statistics. This material was used in the sample for the survey of the labor force that is now the Current Population Survey.
The national sample of the land area of the United States designed for the Soil Conservation Service by the Statistical Laboratory in 1957 is also a national area sample. An expanded and modified version of that sample is currently in use; for a description see Goebel and Baker (1983). The SCS sample is, perhaps, a more direct lineal descendant of the Nineteen County Study (King and Simpson, 1940). This is because data are collected for the land in the segment, not for the operational farm with headquarters in the segment.
While large-scale area samples had been used in India and in Europe in the 1920s and 1930s (King, 1945; Stephan, 1948), the Master Sample of Agriculture represented a major step forward in sampling technique. It is distinguished by the extensive use of materials (aerial photos and highway maps) to obtain measures of size for the sampling units. As an outgrowth of earlier research on efficiency (e.g., Jessen, 1942), an attempt was made to optimize sampling units with respect to travel time and interviewing costs. Earlier area samples of human populations used relatively large civil divisions as sampling units and seldom, if ever, were able to construct reliable estimates of population totals. The sampling units of the Master Sample were relatively small and the units were designed by the statisticians, specifically for the purpose of the sample. Unbiased estimates of population totals could be constructed from the resulting sample. These innovations set standards for area sampling, both public and private, that continue to this day.
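To illustrate how an unbiased estimate of a population total can be formed from an equal-probability area sample, here is a minimal sketch of the standard expansion estimator. It is offered as a textbook illustration, not as the estimator documented for the Master Sample; the function name, the sampling interval, and the data are hypothetical.

```python
def expansion_estimate(segment_totals, sampling_interval):
    """Estimate a population total from an equal-probability sample.

    segment_totals:    observed totals (e.g. head of livestock) for the
                       farms with headquarters in each sample segment.
    sampling_interval: the inverse of the selection probability; each
                       segment is assumed to have been selected with
                       probability 1 / sampling_interval.
    """
    if sampling_interval < 1:
        raise ValueError("sampling_interval must be at least 1")
    # Each sample segment represents sampling_interval segments in the
    # population, so its total is weighted by that factor.
    return sampling_interval * sum(segment_totals)


if __name__ == "__main__":
    # Hypothetical segment totals and a hypothetical 1-in-20 sample.
    print(expansion_estimate([120, 85, 0, 210], sampling_interval=20))
```

Because every segment has a known, nonzero chance of selection, weighting each observed total by the inverse of that chance gives an estimate whose expectation over repeated samples equals the true total.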
I express my appreciation to T. A. Bancroft, Morris Hansen, Earl Houseman, Emil Jebe, Ben Tepping, and, particularly, Raymond Jessen and Arnold King for information about the Master Sample project and its background. Charles Caudill provided documents from the files of the Statistical Reporting Service, U. S. Department of Agriculture. Harold Baker conducted document research and provided the data on the use of the Master Sample frame at Iowa State University. Harold Baker, Raymond Jessen, and Oscar Kempthorne made useful comments on drafts of the manuscript. All remaining errors are the responsibility of the author.
References
Bureau of Agricultural Economics, U. S. Department of Agriculture (1936). Proceedings of Conference on Statistical Methods of Sampling Agricultural Data. Conference held July 14-17, 1936, Iowa State College, Ames, Iowa.
Brooks, E. M. (1977). As we recall: the growth of agricultural estimates, 1933-1961. Statistical Reporting Service, U. S. Department of Agriculture, Washington, D. C.
Cochran, W. G. (1942). Sampling theory when the sampling units are of unequal sizes. Journal of the American Statistical Association 37, 199-212.
Goebel, J. J. and Baker, H. D. (1983). The 1982 National Resources Inventory sample design and estimation procedures. Report prepared for the Soil Conservation Service. Iowa State University, Ames, Iowa.
Houseman, E. E. and Becker, J. A. (1967). A centenary profile of methods for agricultural surveys. American Statistician 21, 15-21.
Jessen, R. J. (1942). Statistical investigation of a sample survey for obtaining farm facts. Research Bulletin No. 304, Iowa Agricultural Experiment Station. Iowa State College, Ames, Iowa.
Jessen, R. J. (1945). The Master Sample of Agriculture: II. Design. Journal of the American Statistical Association 40, 46-56.
Jessen, R. J. (1947). The master sample project and its use in agricultural economics. Journal of Farm Economics 29, 531-540.
Jessen, R. J. (1978). Statistical Survey Techniques. Wiley, New York.
Jessen, R. J. and Houseman, E. E. (1944). Statistical investigations of farm sample surveys taken in Iowa, Florida, and California. Research Bulletin No. 324, Iowa Agricultural Experiment Station. Iowa State College, Ames, Iowa.
King, A. J. (1945). The Master Sample of Agriculture: I. Development and use. Journal of the American Statistical Association 40, 38-45.
King, A. J. and Simpson, G. D. (1940). New developments in agricultural sampling. Journal of Farm Economics 22, 341-349.
King, A. J., McCarty, D. E., and McPeek, M. (1942). An objective method of sampling wheat fields to estimate production and quality of wheat. U. S. D. A. Technical Bulletin No. 814.
Mahalanobis, P. C. (1944). On large-scale sample surveys. Philosophical Transactions of the Royal Society of London B, 231, 239-451.
Neyman, J. (1934). On the two different aspects of the representative method: the method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97, 558-606.
Reed, T. J. (1948). The Master Sample. Unpublished manuscript, Iowa State University, Ames, Iowa.
Snedecor, G. W., Gaskill, H. V., and Friley, C. E. (1940). The Statistical Laboratory of Iowa State College. The Iowa State College Bulletin 36, No. 50. Ames, Iowa.
Snedecor, G. W. and King, A. J. (1942). Recent developments in sampling for agricultural statistics. Journal of the American Statistical Association 37, 95-102.
Snedecor, G. W. and King, A. J. (1945). The Statistical Laboratory at Ames. The Agricultural Situation 29, 9-11.
Statistical Laboratory (1944-1965). Annual Reports (various issues). Iowa State University, Ames, Iowa.
Stephan, F. F. (1948). History of the uses of modern sampling procedures. Journal of the American Statistical Association 43, 12-39.
Strand, N. V. and Jessen, R. J. (1943). Some investigations on the suitability of the township as a unit for sampling Iowa agriculture. Research Bulletin No. 315, Iowa Agricultural Experiment Station, Iowa State College, Ames, Iowa.
U. S. Bureau of the Census (1945). United States Census of Agriculture: 1945. Special report on the 1945 sample Census of Agriculture. U. S. Government Printing Office. Washington, D. C.