Thursday, January 16, 2014

Revisiting MOE: A Lesson About Margins of Error in the ACS for Median Household Income

A while back I posted about understanding the margins of error (MOEs) in the American Community Survey estimates. As you may recall, ALL estimates implicitly come with a margin of error - that is to say, some range within which the actual number for characteristic "X" probably falls. Because the ACS estimates are based on fairly small samples, statisticians like to build in some "fudge factor" and recognize that the estimate is probably right "plus or minus some margin of error". In other words, we all recognize that the sample estimate is probably not dead on the number we would get if we surveyed every single person. Instead we build a "confidence interval" around the sample estimate, and we are willing to say (typically) that we're 90% confident the actual number, if we surveyed the entire population, would fall within it.

Perhaps a good example would be something like median household income.

The 2012 ACS Five Year Estimate shows that the median household income for Oneida County is $49,148. Now this is based on a sample in which about one in 50 households were surveyed. If we surveyed ALL of the other households in the county would we get the same median income number? Possibly, but it's very, VERY unlikely. So instead, what demographers and statisticians like to do is take the margin of error (MOE) for this piece of data and construct a 90% confidence interval around this data point. This is done by going one MOE above and one MOE below the estimated number.

In the income case, we would add and subtract the MOE (which is +/- $999) and now say that we are 90% confident that the ACTUAL value of the median household income for the county lies between $48,149 and $50,147.
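
For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of that calculation. The function name confidence_interval is my own invention for illustration; the estimate and MOE are the Oneida County figures quoted above.

```python
def confidence_interval(estimate, moe):
    """Return (lower, upper): one MOE below and one MOE above the estimate."""
    return estimate - moe, estimate + moe


# Oneida County median household income, 2012 ACS five-year estimate.
lower, upper = confidence_interval(49_148, 999)
print(f"90% confidence interval: ${lower:,} to ${upper:,}")
```

The ACS publishes its MOEs at the 90% confidence level, which is why simply adding and subtracting the published MOE gives a 90% interval.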

Let's take this a step further and look at this visually. Here's the Oneida County median household income on a graph; the green box represents the estimated value, and the vertical line shows the range of the 90% confidence interval. We are 90% confident that the actual median household income lies between $48,149 and $50,147.

Now, how do the various towns compare to the county estimate? The way to tell is to plot out the median household income estimates AND their margins of error and see where they overlap. When the county's confidence interval overlaps with a town's, that means that they are, essentially, no different. Technically, it is safe to say that statistically we see no significant difference between the town and the county median household income.

On the other hand, where they do NOT overlap, that means that a town's median household income is either significantly higher or significantly lower than that of the county as a whole. The graph below shows all of the town median household income estimates and their 90% confidence intervals.
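
The overlap rule described above can also be sketched in a few lines of Python. This is only an illustration: the function name compare_to_reference is hypothetical, and the three town figures are made-up numbers, NOT actual ACS estimates. Only the county figures come from this post.

```python
def compare_to_reference(estimate, moe, ref_estimate, ref_moe):
    """Classify an estimate against a reference using the interval-overlap rule."""
    lower, upper = estimate - moe, estimate + moe
    ref_lower, ref_upper = ref_estimate - ref_moe, ref_estimate + ref_moe
    if lower > ref_upper:
        return "significantly higher"
    if upper < ref_lower:
        return "significantly lower"
    return "not significantly different"


COUNTY = (49_148, 999)  # Oneida County estimate and MOE from this post

# Hypothetical (estimate, MOE) pairs, invented purely for illustration.
towns = {
    "Town A": (62_000, 4_500),
    "Town B": (52_000, 5_000),
    "Town C": (31_000, 1_800),
}

for name, (estimate, moe) in towns.items():
    print(name, "is", compare_to_reference(estimate, moe, *COUNTY))
```

With these made-up numbers, Town A's whole interval sits above the county's, Town C's sits below it, and Town B's overlaps it even though Town B's estimate is higher than the county's.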

To make this a bit more understandable, I've drawn in red lines showing the 90% confidence interval for the county so you can more easily see how it overlaps, or doesn't overlap, each town's 90% confidence interval for median household income.

Looking at the right side of the graph, note how the 90% confidence intervals for Lee, Deerfield, Marshall, Trenton, Marcy and Westmoreland are all above the red line depicting the top edge of the county's confidence interval. However, for the town of Western, while the estimated value of its median household income (the green box for Western) is above the county's, the confidence intervals overlap. This means it's possible that the numbers, in fact, could be identical! So you'd have to say statistically that they are not significantly different!

On the left-hand side you can see that Utica, Annsville and Rome are all below the county median household income - their confidence intervals do not overlap the county's, so they are significantly lower than the county when it comes to median household income.
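
Reading overlap off the graph is a quick visual shortcut. For a borderline case like Western, the Census Bureau's ACS guidance also describes a formal test: convert each 90% MOE to a standard error by dividing by 1.645, then see whether the difference between the two estimates is large compared with the combined standard error. Here is a rough Python sketch of that test. The function names and the two sample towns are hypothetical, and the test treats the two estimates as independent, which is only an approximation when a town is part of the county it is being compared to.

```python
import math

Z_90 = 1.645  # critical value for the 90% confidence level used by the ACS


def z_score(est_a, moe_a, est_b, moe_b):
    """Difference between two estimates, in units of its combined standard error."""
    se_a = moe_a / Z_90  # convert each 90% MOE to a standard error
    se_b = moe_b / Z_90
    return (est_a - est_b) / math.sqrt(se_a**2 + se_b**2)


def significantly_different(est_a, moe_a, est_b, moe_b):
    """True when the two estimates differ at the 90% confidence level."""
    return abs(z_score(est_a, moe_a, est_b, moe_b)) > Z_90


COUNTY = (49_148, 999)  # Oneida County estimate and MOE from this post

# Hypothetical towns, invented for illustration; both intervals overlap the county's.
print(significantly_different(53_000, 4_000, *COUNTY))
print(significantly_different(54_000, 4_000, *COUNTY))
```

With these made-up numbers the first comparison is not significant, while the second is, even though its interval still overlaps the county's slightly. So the overlap check is the more cautious of the two: intervals that do not overlap always pass the formal test, but intervals that overlap only a little may deserve a second look.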

This same process could be done with any of the towns in order to compare them to other towns in the county. For example, what could you say about the Town of Trenton? Which towns are not significantly different from it when it comes to median household income? Which are significantly below it?

Whenever you are looking at ACS data, you need to be aware of these margins of error and what they say about the estimates, especially in comparison to other geographies!