I would put this topic in the category of pretty basic, but
useful to remember. Computing an index value is simple, as the calculation
itself shows. Interpreting it, or understanding how to interpret it, is where
an index gets more interesting.
The way to calculate an index is as follows.
The main thing you need to have before calculating an index is a data set.
Let's use population by state as an example. Let's say I have a sample
of 10 states and I want to prepare an index value based on the mean of the
population (so if there is a new value, I want to know where it stands
compared to the other states in my sample). The index itself is each value
divided by the base value (here, the mean), multiplied by 100.
Here is a sample of 10 states, with some basic descriptive statistics. All
figures are based on real data from the April 1, 2010 US Census Bureau stats
(Annual Estimates of the Population for the United States, Regions, States,
and Puerto Rico, NST-EST2013-01, link: http://www.census.gov/popest/data/state/totals/2013/):
Alabama: 4,779,739
Florida: 18,801,310
Idaho: 1,567,582
Utah: 2,763,885
New York: 19,378,102
California: 37,253,956
Ohio: 11,536,504
Vermont: 625,741
West Virginia: 1,852,994
Pennsylvania: 12,702,379
Minimum: 625,741
Maximum: 37,253,956
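To make the calculation concrete, here is a minimal Python sketch of a mean-based index for the 10 states listed above. The function name `index_values` and the variable names are my own choices for illustration; the only assumption about the method is that the index is each value divided by the base, times 100, which is what the index values quoted in this post work out to.

```python
# Population counts copied from the list above.
populations = {
    "Alabama": 4_779_739,
    "Florida": 18_801_310,
    "Idaho": 1_567_582,
    "Utah": 2_763_885,
    "New York": 19_378_102,
    "California": 37_253_956,
    "Ohio": 11_536_504,
    "Vermont": 625_741,
    "West Virginia": 1_852_994,
    "Pennsylvania": 12_702_379,
}


def index_values(values, base):
    """Express each value as a percentage of the base (the base itself is 100)."""
    return {name: round(value / base * 100, 2) for name, value in values.items()}


# Base value: the mean population of the sample.
mean_population = sum(populations.values()) / len(populations)
mean_index = index_values(populations, mean_population)

# Print the states from lowest to highest index value.
for state, idx in sorted(mean_index.items(), key=lambda item: item[1]):
    print(f"{state:15s} {idx:7.2f}")
```

An index of 100 means a state sits exactly at the base value; anything above or below 100 is read as a percentage above or below it.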
It is also helpful to plot the distribution of your values in a graph:
As you can tell, your index ranges from 5.62 (Vermont) on the low end to
334.83 (California) on the high end. To interpret this, take Ohio at 103.69:
if you take this index value minus 100, you can say that Ohio is 3.69% higher
than the average (11,126,219).
Math-ing it out: 103.69 - 100 = 3.69, then take 11,126,219 +
(0.0369 * 11,126,219) = est. 11,536,776.
Obviously, this is not exact because I only went out two decimal
places. The further you go out in decimal places, the closer you will get to
the 11,536,504 actual (you get the picture). If your state is below the
mean, this also works. Take Vermont, for example: 5.62 - 100 = -94.38, meaning
that its population is 94.38 percent LESS than the average (11,126,219).
Math-ing that out comes to
11,126,219 - (11,126,219 * 0.9438) = 11,126,219 - 10,500,925.49 = 625,293.51.
Again, this is not exactly the actual value of 625,741 because I went out only
2 decimal places (it's close enough, though, for example purposes).
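The interpretation arithmetic above can be sketched in a few lines of Python. The helper names `percent_from_base` and `estimate_from_index` are hypothetical, chosen just for this example, and the base used is the mean of the 10 sample states rounded to a whole number.

```python
def percent_from_base(index_value):
    """Percent above (positive) or below (negative) the base."""
    return index_value - 100


def estimate_from_index(index_value, base):
    """Recover an estimate of the original value from a rounded index value."""
    return base + base * percent_from_base(index_value) / 100


mean_population = 11_126_219  # mean of the 10 sample states, rounded

# Ohio sits above the mean, Vermont well below it.
ohio_estimate = estimate_from_index(103.69, mean_population)
vermont_estimate = estimate_from_index(5.62, mean_population)

print(f"Ohio:    {percent_from_base(103.69):+.2f}% -> {ohio_estimate:,.0f}")
print(f"Vermont: {percent_from_base(5.62):+.2f}% -> {vermont_estimate:,.0f}")
```

Because the index values were rounded to two decimals, the estimates land near, not exactly on, the actual populations of 11,536,504 and 625,741.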
Note that you can really use any 'base' value to compare against, depending
on your purpose. If you used the minimum, you would be creating an index that
compares against Vermont. If you use the median, you would be creating an
index against a midpoint value for the states in your sample. If you wanted
to, you could take an average of all states and use that as the base value for
comparing against the population of a region within another country (for
example). Really, your choice of base depends on your purpose, and you will
know what that is.
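As a rough illustration of swapping the base, the sketch below indexes the same 10 states against the minimum, the median, and the mean. The function name `index_against` and the `bases` dictionary are my own naming, not part of any standard method.

```python
from statistics import mean, median

# Population counts for the same 10 sample states.
populations = {
    "Alabama": 4_779_739,
    "Florida": 18_801_310,
    "Idaho": 1_567_582,
    "Utah": 2_763_885,
    "New York": 19_378_102,
    "California": 37_253_956,
    "Ohio": 11_536_504,
    "Vermont": 625_741,
    "West Virginia": 1_852_994,
    "Pennsylvania": 12_702_379,
}


def index_against(values, base):
    """Index every value against an arbitrary base (base = 100)."""
    return {name: value / base * 100 for name, value in values.items()}


# Three candidate base values taken from the sample itself.
bases = {
    "minimum": min(populations.values()),
    "median": median(populations.values()),
    "mean": mean(populations.values()),
}

for label, base in bases.items():
    ohio = index_against(populations, base)["Ohio"]
    print(f"Ohio indexed against the {label} ({base:,.1f}): {ohio:.2f}")
```

With the minimum as the base, Vermont itself comes out at exactly 100 and every other state reads as a multiple of Vermont.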
Now, let's say you want to simplify things and make index values that have
less variation (especially since in our sample, California is so high and
Vermont is so low). To create a less varied index, you can transform the
population first, then calculate your base value and create an adjusted index
(understanding that if you want to come out with a reasonable value that can
be interpreted, you have to un-transform afterwards). For simplicity's sake, I
will only look at the square root and log transformations. The results and
adjusted indexes come out as follows:
When interpreting the differences from your base, it's different. Let's say
you use the square root transformation and your state is Ohio.
Its index value for the square root transformation is 116.90. That means it is
about 16.90% higher than the average of the square roots of the populations.
Math-ing that would be 2906 + (2906 * 0.169) = 3397.114. Notice how closely
this matches the actual figure in the table above (the square root of Ohio's
population is about 3,396.54). To convert this back to the original population
figure, of course, you square it = about 11,540,384 (again, it's the rounding
that gives you the estimated number, not the exact one).
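Here is a minimal sketch of the transform, index, and un-transform steps for both transformations. The function name `transformed_index` is hypothetical, and I am assuming the natural log for the log transformation; a base-10 log would work the same way with `math.log10` and `10 ** x` as the inverse.

```python
import math

# Population counts for the same 10 sample states.
populations = {
    "Alabama": 4_779_739,
    "Florida": 18_801_310,
    "Idaho": 1_567_582,
    "Utah": 2_763_885,
    "New York": 19_378_102,
    "California": 37_253_956,
    "Ohio": 11_536_504,
    "Vermont": 625_741,
    "West Virginia": 1_852_994,
    "Pennsylvania": 12_702_379,
}


def transformed_index(values, transform):
    """Transform each value, then index against the mean of the transformed values."""
    transformed = {name: transform(value) for name, value in values.items()}
    base = sum(transformed.values()) / len(transformed)
    index = {name: t / base * 100 for name, t in transformed.items()}
    return index, base


sqrt_index, sqrt_base = transformed_index(populations, math.sqrt)
log_index, log_base = transformed_index(populations, math.log)  # natural log (assumption)

# Un-transform: rebuild the transformed value from the index, then invert it.
ohio_sqrt_estimate = (sqrt_base * sqrt_index["Ohio"] / 100) ** 2
ohio_log_estimate = math.exp(log_base * log_index["Ohio"] / 100)

print(f"Square root index for Ohio: {sqrt_index['Ohio']:.2f} (base {sqrt_base:,.2f})")
print(f"Log index for Ohio:         {log_index['Ohio']:.2f} (base {log_base:.4f})")
print(f"Back-transformed estimates: {ohio_sqrt_estimate:,.0f} and {ohio_log_estimate:,.0f}")
```

With unrounded index values the back-transformed estimates return Ohio's actual population; the gap in the hand calculation above comes purely from rounding the index and the base.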





