[OPE-L:6971] [OPE-L:463] Re: Value-price correlations

Allin Cottrell (cottrell@ricardo.ecn.wfu.edu)
Mon, 22 Feb 1999 00:46:37 -0500 (EST)

On Sun, 21 Feb 1999, David Laibman wrote:

> Surprise! For once, I find myself agreeing with Andrew
> and Alan! I have always been suspicious of the labor values
> derived from I-O data, and of the high correlation
> coefficients between them and observed prices.

I see that David shares Andrew's and Alan's skepticism, based on
the idea of spurious correlation of industry prices and values.
I have tried to tackle the underlying basis for this skepticism
before (a year or so ago?) but let me try harder.

I'd like to take Andrew's tale of the "Dog Ownership Theory of
Employment" (DOTE) as a starting point, since it's a good
illustration of genuine spurious correlation (so to speak), and
try to show why the case in hand is not analogous.

Recall Andrew's exposition: Somebody claims that employment is
governed by dog ownership, and proposes to verify this via a
cross-sectional regression of employment on number of dogs
owned, country by country. We expect a fairly high correlation,
but at the same time we can easily see that this provides no
support for the DOTE; such correlation as we find will be an
effect of variation in country size.

Please note first that in this case there is a clear and
uncontroversial measure of "size" of the countries. To unmask
the spurious correlation we would not "scale" the countries by
land area or number of lakes or llama population, or even GDP,
but by human population. We'd want to see the correlation
between dog ownership _per capita_ and employment _per capita_
across the countries (and we would not expect this to be
significantly different from zero).
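
A minimal simulation sketch of this point, using made-up numbers: the
population range, the per capita rates and the sample size below are all
assumptions chosen for illustration, and dog ownership is given no
influence on employment by construction.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_countries = 2000      # hypothetical sample, far larger than any real one

# Assumption: populations spread log-uniformly between 100 thousand and 1 billion.
population = [10 ** rng.uniform(5, 9) for _ in range(n_countries)]

# Assumption: per capita dog ownership and employment are drawn independently,
# so dogs have no effect on employment in this toy world.
dogs_per_capita = [rng.uniform(0.05, 0.25) for _ in range(n_countries)]
jobs_per_capita = [rng.uniform(0.35, 0.55) for _ in range(n_countries)]

# Country totals are the per capita rates multiplied by population.
dogs = [p * d for p, d in zip(population, dogs_per_capita)]
jobs = [p * e for p, e in zip(population, jobs_per_capita)]

r_totals = pearson(dogs, jobs)
r_per_capita = pearson(dogs_per_capita, jobs_per_capita)

print(f"country totals: r = {r_totals:.2f}")
print(f"per capita:     r = {r_per_capita:.2f}")
```

In this sketch the totals come out strongly correlated purely because both
are multiplied by population, while the per capita figures are close to
uncorrelated.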

Consider now the price/value case. I suspect that when people
talk about the distinction between (industry) aggregate and
"unit" prices or values, they have in mind a paradigm based on
certain sorts of easily individuated commodities that consumers
usually buy singly (cars, washing machines, loaves of bread).
But note three things here. First, not all commodities can be
individuated in this way: Is the "unit" of electricity the
kilowatt-hour or the gigawatt-hour? Second, even when there is
a "natural" unit from the consumer's point of view, this unit
need not have any claim to provide a unique measure of scale:
the drinker thinks of beer by the bottle, but the distributor
may think by the case and the producer by the gallon or litre.
Third, and most important, even if every industry had a unique
and "natural" physical unit of output, these would be
incommensurable and hence useless for assessing the relative
"size" of industries (in the way that we were able to use
population as a measure of country size in relation to the
DOTE). If the US auto industry produces more cars per year than
the German auto industry that gives a reasonable basis for
saying that the US industry is bigger than the German. But if
the US baking industry produces more loaves per year than the US
auto industry produces cars, that clearly does not give any
basis for saying that the baking industry is bigger than the
auto industry in the US.

All right, you say, isn't this labouring the point? Don't we just
need some sort of socio-economic, rather than "natural", measure
of industry size? Well, what are the candidates? Andrew uses
aggregate cost. One might equally well propose total assets,
aggregate employment, or (why not?) the aggregate price or value
of output. But if one uses price or value the correlation in
question is undefined (with one or other "scaled" variable
becoming degenerate). Andrew's approach is to choose a "size"
measure that is very highly correlated with both price and
value, but that does not actually induce degeneracy, and to
"hope for the best" (or rather, the worst).

But we might usefully pause to think here. The fact that
"correcting" for spurious correlation in this case is so
problematic (some apparently reasonable "scalings" automatically
destroy the correlation, regardless of the information in the
data) suggests that one of two things may be wrong.

1. The whole notion of correlation or regression here is deeply
flawed from the start. Not only is there a problem of spurious
correlation, but there's apparently no way to correct it!

2. The idea of spurious correlation does not properly apply here
after all.

I'd like to suggest that people take seriously the second of
these possibilities.

Think again about the DOTE example. Why is it so obvious that a
positive correlation would really provide no support for the
"theory"? Surely it's because (a) we expect that, by and large,
countries with larger populations will have both more people in
employment and more pet dogs, and (b) -- this is crucial but
perhaps less obvious -- we are prepared to take the population
of each country in the sample as a given, as an exogenous
variable unaffected by the level of employment or dog ownership.
We see population as a clear "third factor" that drives both dog
ownership and employment without in turn being driven by either
of those variables. There is a one-way causal street between
the natural "size" variable and the two spuriously correlated
variables.

This condition is not met in the price/value case. In fact it
seems to me that the notion of spurious correlation in this case
rests on a tempting but erroneous conflation of the two notions
of industry size mentioned above: the "natural" (e.g. number of
cars produced by the auto industry) and the socio-economic (e.g.
employment in auto-making, or total cost). Each of these "size"
measures plays only _one_ of the two roles that human population
plays in the DOTE example.

"Natural size" seems suitably independent: given technology, the
aggregate value and price of the output of the US auto industry
are both driven by the number of cars produced. But as we've
seen, this sort of size measure is meaningless in inter-industry
comparisons.

"Socio-economic size" provides a _commensurable_ measure, but
here independence fails badly. Cost of production, to take
Andrew's example, is clearly not an independent influence on
value in the sense of total required labour input. "Unit value"
(e.g. value per car) is primarily determined by technology and
social relations at the point of production, while the aggregate
value of an industry's output depends in addition on the
physical scale of its output, as measured in its own physical
units. In relation to this determination, cost of production is
"epiphenomenal". (It's almost as if a critic of the DOTE were
to offer the following "refutation": Your correlation between
dog ownership and employment is spurious since it's a side
effect of differing scale of the countries in your sample, where
scale is measured by the number of dog collars in each country.)

From this perspective, the absence of a clear and unproblematic
means of correcting a "spurious" correlation is a strong hint
that the correlation is not in fact spurious: there is no "third
factor" whose independent variation is responsible for inducing
a correlation between the other two variables. Industry size
assessed in "natural" units can't play that role because it
provides no way of saying which of any two heterogeneous
industries is "bigger" -- it is simply not something that
"varies across industries" in any meaningful way. Economic size
measures like Andrew's can't do the job either, because they are
not exogenous with respect to the explanatory variable in
question. And you can't stick these two sorts of size measures
together.

One more angle using the DOTE (dis)analogy. I have objected to
Andrew's scaling procedure, saying that it tends to destroy a
valid P,V correlation due to the very high correlation between
his "size" and value itself. What would the corresponding
objection look like in the DOTE case? It would be as if the
DOTE advocate rejected a scaling of his cross-national data set
by population (putting both employment and dog ownership on a
per capita basis), arguing that owing to a very high correlation
between population and dog ownership this procedure would be
likely to destroy the correlation he was looking for. Hmmm.
We wouldn't find that argument very impressive. So why should
you find my defence against Andrew's claims any more persuasive?

Well, in the DOTE case it's natural to suppose that if the
"theory" had any validity it would _have_ to apply on a per
capita basis. But it's not the case that the labour theory of
value _has_ to apply at the level of the "markups" of price and
value over cost. This is because of the lack of independence of
cost from value mentioned above. One channel whereby value gets
to be correlated with price is that labour content governs cost
(e.g. via the wage bill and via non-labour inputs whose prices
in turn reflect labour contents). Therefore "controlling for"
cost of production is controlling for too much. The DOTE
analogy to this would be the claim that dog ownership governs
employment in part by governing the population level, something
we would be unlikely to grant!
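
A toy simulation of "controlling for too much", offered as a sketch under
stated assumptions rather than a reconstruction of Andrew's procedure or
of any real data set. Here price is governed by value by construction, and
cost is assumed to track value closely; the noise levels, the 0.8 cost
factor and the sample size are all invented for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # fixed seed so the run is repeatable
n_industries = 2000     # hypothetical sample size

# Assumption: aggregate values spread log-uniformly over four orders of magnitude.
value = [10 ** rng.uniform(0, 4) for _ in range(n_industries)]

# Assumption: labour content governs cost tightly (2% proportional noise),
# while price follows value with looser proportional deviations (20%).
cost = [0.8 * v * math.exp(rng.gauss(0, 0.02)) for v in value]
price = [v * math.exp(rng.gauss(0, 0.20)) for v in value]

# Correlation of aggregate price with aggregate value.
r_levels = pearson(value, price)

# Correlation after deflating both by cost (the "markups" over cost).
value_markup = [v / c for v, c in zip(value, cost)]
price_markup = [p / c for p, c in zip(price, cost)]
r_markups = pearson(value_markup, price_markup)

print(f"levels:  r = {r_levels:.2f}")
print(f"markups: r = {r_markups:.2f}")
```

Under these assumptions the markup correlation falls far below the level
correlation, even though value determines price in the model. Dividing by
cost removes most of the variation in value, because cost is itself
governed by value.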

Allin Cottrell.