3.2 Spatial and Demographic Patterns in Data Quality
One critical aspect of using the ACS is that these trade-offs between spatial and temporal resolution and uncertainty are not uniform; some places and some populations have more precise data. For example, the accompanying figure shows the distribution of the coefficient of variation (CV) of tract-level estimates of median income for African-American households. We have chosen this variable for the simple reason that it is an important indicator in urban and population geography in the U.S. Like the margin of error, the CV is a measure of uncertainty; it is simply the ratio of the standard error (SE) to the estimated value. The CV is a useful statistic because it expresses uncertainty as a relative percentage of the estimate: an SE of $10,000 would be a CV of 0.5 (50%) for a $20,000 estimate and 0.1 (or 10%) for a $100,000 income estimate. Each bar in the figure represents five percent of census tracts after they have been sorted by African-American median household income, such that the leftmost bar represents the poorest five percent of tracts and the rightmost bar represents the wealthiest five percent of tracts. Within each bar, we can see the relative distribution of quality for these census tracts, where quality is measured by the CV. The ACS user’s guide presents the CV as a way to assess an estimate’s “fitness for use” (U.S. Census Bureau 2009b, A-3), but does not provide formal criteria for what constitutes “fit” data. The National Research Council (NRC) suggests that a CV of 10 to 12 percent (or less) is a “reasonable standard of precision” (Citro 2007, p. 64). Environmental Systems Research Institute (ESRI) (2011) states that a CV less than 12 percent means “high reliability,” a CV in the 12 to 40 percent range means “moderate reliability,” and a CV over 40 percent means “low reliability.”
The figure shows that the “reliability” of these ACS data is strongly dependent upon the economic conditions of the tract; very poor or very affluent tracts yield lower quality (higher CV) estimates. For African-American median household income, more than 75 percent of all census tracts in the United States fail to meet the NRC “reasonable” standard of precision. These estimates are particularly bad for the poorest 15 percent of census tracts, where more than half fail to meet ESRI’s more liberal standard of “moderate reliability.”
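The CV and the reliability thresholds quoted above are simple enough to express directly. The following is a minimal sketch, not code from the ACS documentation or from ESRI: the function names are hypothetical, and the handling of values that fall exactly on 12 or 40 percent is an assumption, since the quoted ranges do not say which class a boundary value belongs to.

```python
def coefficient_of_variation(standard_error, estimate):
    """Return the CV: the ratio of the standard error to the estimate."""
    if estimate == 0:
        raise ValueError("the CV is undefined for an estimate of zero")
    return standard_error / abs(estimate)


def esri_reliability(cv):
    """Label a CV (as a proportion) with the ESRI (2011) classes quoted in the text.

    Assumption: exactly 0.12 and exactly 0.40 are both treated as "moderate".
    """
    if cv < 0.12:
        return "high"
    if cv <= 0.40:
        return "moderate"
    return "low"


def meets_nrc_standard(cv, threshold=0.12):
    """True if the CV is at or below the upper end of the NRC's 10 to 12 percent range."""
    return cv <= threshold


# The worked example from the text: the same $10,000 SE on two estimates.
cv_low_income = coefficient_of_variation(10_000, 20_000)    # 0.5
cv_high_income = coefficient_of_variation(10_000, 100_000)  # 0.1
```

Under these thresholds the $20,000 estimate would be labeled “low” reliability and would fail the NRC standard, while the $100,000 estimate would be labeled “high” and would pass.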
The relationship between data quality and income in the figure for African-American households is troubling and difficult to attribute to a single factor. While the USCB does not explicitly over- or under-sample areas based on their socio-economic characteristics, it does allocate more ACS samples to areas with low response in order to equalize coverage rates. In the 2007-2011 ACS tract data there is a very weak, substantively negligible correlation between sample size and the median household income of the tract (r = 0.02, p-value < 0.001). This suggests that systematic variation in sample size does not explain the gradient in that figure; that is, poorer and richer neighborhoods do not have appreciably different sample sizes. Overall coverage rates for the African-American population are low in the ACS (89% in 2011, United States Census Bureau 2013a); this type of omission may bias the ACS, but it does not directly increase uncertainty in small area estimates. Another potential explanation lies in the fact that some questions are not completed by the survey respondent, causing the USCB to “impute” the value based on a set of decision rules. The imputation rates for income variables are high, approaching 20%; however, this rate is not available disaggregated by race or income (United States Census Bureau 2013b).
The pattern of association between income and data quality holds for the entire population, not just African-American households. A second figure shows the CV of median household income for all census tracts in the 150 largest U.S. cities (defined using the USCB’s Metropolitan Statistical Area (MSA) boundaries). Income is scaled such that for each tract we calculate its income percentile relative to all tracts in the MSA. The leftmost bar in this figure represents the first percentile in each MSA, that is, the poorest 1 percent of tracts in each city; the dollar amount defining the first percentile category is city specific. This approach was taken because there is significant regional variation in income, and without this correction the relationship between income and data quality is difficult to see. This type of correction was not necessary for the figure for African-American households. The income-percentile figure shows that poorer tracts, in the 150 largest cities, have lower quality estimates than wealthier tracts. Its bars show the interquartile range for each percentile; the median is denoted with a white line.
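The MSA-relative scaling of income described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used for the figure: the function name, the input layout of `(tract_id, msa_id, median_income)` tuples, the ceiling-based binning, and the rule that ties are broken by tract identifier are all hypothetical choices.

```python
from collections import defaultdict


def income_percentiles_by_msa(tracts):
    """Map each tract id to its income percentile (1 to 100) within its own MSA.

    `tracts` is an iterable of (tract_id, msa_id, median_income) tuples.
    """
    by_msa = defaultdict(list)
    for tract_id, msa_id, income in tracts:
        by_msa[msa_id].append((income, tract_id))

    percentiles = {}
    for members in by_msa.values():
        # Sort poorest first; ties in income fall back to the tract id.
        members.sort()
        n = len(members)
        for rank, (_, tract_id) in enumerate(members, start=1):
            # Integer ceiling of 100 * rank / n, so the richest tract is always 100
            # and, in a large MSA, the poorest 1 percent of tracts land in bin 1.
            percentiles[tract_id] = (100 * rank + n - 1) // n
    return percentiles


# Two hypothetical MSAs with very different income levels.
example = [
    ("a1", "MSA-A", 20_000), ("a2", "MSA-A", 30_000),
    ("a3", "MSA-A", 40_000), ("a4", "MSA-A", 50_000),
    ("b1", "MSA-B", 60_000), ("b2", "MSA-B", 80_000),
    ("b3", "MSA-B", 100_000), ("b4", "MSA-B", 120_000),
]
ranks = income_percentiles_by_msa(example)
```

In this toy input the poorest tract in each MSA receives the same percentile even though the dollar amounts differ by a factor of three, which is the city-specific behavior the text describes.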
This social pattern in data quality translates into geographic patterns as well. The CV for median household income, considering all tracts in the United States, has a Moran’s I of 0.22 (p-value < 0.001), meaning that tracts with a high (or low) CV tend to be surrounded by tracts with similar CVs. In particular, tracts in the center of cities have a higher CV for income than tracts in the suburbs. A distance-from-center figure shows the CV of median household income estimates for all tracts in the 150 largest metropolitan areas in the US. All tracts in each city are ordered based upon their distance from the “center” of the city, where center is determined using the city’s coordinates as listed in the USGS Geographic Names Information System. The use of a single center for a large polycentric city may be problematic, but the figure is illustrative of patterns nonetheless. As in the income-percentile figure, the first bar represents the first percentile; in this case the first percentile (leftmost bar) contains the tracts closest to the city center. The rightmost bar contains the 1 percent of tracts in each MSA that are furthest from the center. Relative distances were used on the X axis to control for the significant variation in MSA extent. The distance-from-center figure shows that data quality for median household income estimates varies systematically within urban areas.
One working hypothesis for the patterns in these figures is that samples of the same size yield estimates that vary in quality because of systematic variations in the demographic composition of census tracts. Keeping sample size constant, a sample taken from a diverse population will have more sampling error, and hence uncertainty, than one from a homogeneous population (see section 1.1).
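The claim that heterogeneity raises sampling error at a fixed sample size can be checked with a small simulation. The sketch below is purely illustrative and does not reproduce the ACS design: the two populations are synthetic uniform income distributions invented for the example, the sample size and number of replicates are arbitrary, and simple random sampling stands in for the much more complex ACS procedure.

```python
import random
import statistics


def sampling_sd_of_median(population, sample_size, replicates, rng):
    """Standard deviation of the sample median across repeated simple random samples."""
    medians = [
        statistics.median(rng.sample(population, sample_size))
        for _ in range(replicates)
    ]
    return statistics.stdev(medians)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Two hypothetical tracts centered on the same income but with different spread.
homogeneous = [rng.uniform(45_000, 55_000) for _ in range(2_000)]
diverse = [rng.uniform(10_000, 90_000) for _ in range(2_000)]

# Same sample size and same number of replicates for both tracts.
sd_homogeneous = sampling_sd_of_median(homogeneous, 100, 500, rng)
sd_diverse = sampling_sd_of_median(diverse, 100, 500, rng)
```

Because both tracts are sampled identically, any difference between `sd_diverse` and `sd_homogeneous` comes from the spread of incomes alone; the diverse tract should show a markedly larger sampling standard deviation for its median.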
A further figure shows that there is an association between the median household income in a tract and the amount of diversity in household incomes; it plots both the Gini coefficient on household income and the median household income for each tract in the US from the 2007-2011 ACS. The Gini coefficient is a measure of the equality of a distribution: if income were evenly distributed in the population such that all households had the same annual income, the Gini coefficient would equal 0 and the variance (or heterogeneity) in income would be zero. On the other hand, if a single household earned all of the income in a tract, the Gini coefficient would equal 1 and the variance in income would be high. One of the key determinants of uncertainty in survey estimates is the amount of heterogeneity (variability) in the population (see section 1.1 for a discussion). The pattern in the Gini figure parallels the pattern in the income-percentile figure: low income and high income tracts have a higher Gini coefficient and a higher variability in income. Middle income tracts, which have a lower Gini coefficient, have a substantially more even income distribution and a lower income variance. Sample size and estimation procedures are roughly constant across the range of median household incomes. It is possible that the pattern in the preceding figures exists because low and high income neighborhoods have more variance in income than middle income neighborhoods; holding sample size constant, more variation in the population means more uncertainty in the estimate. Additionally, this may hold for the distance-from-center figure: tracts in the center of cities may have more income diversity than those in the suburbs. Income diversity gradients have long been a feature of American urban life, and there is evidence of increasing neighborhood-level diversity in the US (Spielman and Logan 2013; Farrell and Lee 2011). Variation in the composition of neighborhoods, especially in the level of heterogeneity, can have an effect on ACS data quality.
Variation in neighborhood composition is one potential cause of the systematic variations in data quality. Given the complex design of the ACS it is difficult to estimate the net effect of neighborhood-level diversity, but it remains an important potential source of uncertainty in the ACS small area estimates. This income-diversity gradient is one of many potential demographic gradients that may affect ACS data quality. These demographic correlates of ACS data quality are underexplored in the literature and are an important area for future research.
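For readers who want to see the two limiting cases of the Gini coefficient concretely, the sketch below computes it from a list of household incomes. It is a minimal illustration with a hypothetical function name, using the standard rank-weighted formula on sorted data; it assumes non-negative incomes and is not how published ACS Gini values are produced.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes.

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    where x is sorted in ascending order and i runs from 1 to n.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Every household earns the same amount: perfect equality.
equal_tract = gini([5, 5, 5, 5])        # 0.0

# One household earns everything in a four-household tract.
concentrated_tract = gini([0, 0, 0, 100])  # 0.75
```

With only n households the maximum of this formula is (n - 1) / n, so the fully concentrated four-household tract yields 0.75 rather than 1; the value approaches 1 as the number of households grows, which is the limiting case described in the text.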