Sunday, March 23, 2008

Sunday Comics


I was told by my attorney that I can't be sued if it's an homage. So, if you never hear from me again, or from Team PJMB after today, blame it on my attorney, and PGS for giving me posting privileges. I hope he doesn't rue that day (and that I used rue correctly).

(Click to make it big.)

--Soon-to-be-Jaded Dissertator

30 comments:

Anonymous said...

I finally understand the one with the sad moon and "VAP" written in the stars.

Pseudonymous Grad Student said...

Uh oh. I know your attorney, and I know his alter ego is the "Wine Fairy." I fully expect to be in a stress position in Bagram Airbase within 36 hours.

Anonymous said...

nah, being told you were wrong on p. 46 is nothing--peanuts, so to speak.

the real 'lucy' treatment is the one that involves your adviser telling you that you have a bright future ahead of you, there are great jobs out there, you could be a star, your ideas show real promise,

except! we're very disappointed in you and perhaps you just aren't cut out for philosophy.

i mean, the football is bigger than just p. 46, you know?

otherwise--great strip. the monocled dino is primo.

Anonymous said...

The wine fairy knows next to nothing about IP law – take the fairy’s card and try to call before you’re drugged and put on a plane. That’s more up the fairy’s alley. Although this should not be considered binding legal advice, it would appear that any references to existing pop culture icons would likely be considered fair use under 17 USC 107: STBJD’s work is not actually an appropriation of someone else’s work, it is displayed on a web page intended for non-commercial (and arguably educational) purposes, and any references to existing cartoons in STBJD’s pieces amount to peanuts in terms of their effect on the market value of those cartoons.

Anonymous said...

Anon 4:56,

You're right, the Lucy treatment is much more devastating than being told that you're wrong on a certain page. Perhaps it's more like being told you're the front-runner for a job, but then some asshole, well-entrenched SC member just blocks your appointment (b/c of page 46), or the funds dry up.

But, all those situations would've required more panels and the current incarnation already took some time to get right, believe it or not.

In the future, I'll try to better capture (read: rip off) that Peanuts vibe.

Anonymous said...

stbjd--
you know, as soon as i posted my 4:56, i asked myself "schmuck, you've never heard of the telling detail? you've never heard of saying more by saying less? you really thought citizen kane was about a sled?"

i apologize: getting called out for p. 46 is as good a microcosm of the whole experience as could be asked for. the fault is not in your presentation, but in my appreciation. i'll try to grow up.

Anonymous said...

Question: Leiter often makes predictions about what impact some department's senior hires will have on their overall ranking on the PGR. (For example) I haven't done any scientific investigation or anything, but based on a small number of predictions concerning my department, it has seemed to me that these predictions never come true. We'll make some senior hire, Leiter will announce it and suggest that it will cause us to move up in the rankings by some specific interval, and then we don't -- we stay exactly where we are. So the question is, how often do these predictions come true?

I ask because it has seemed to me that the overall rankings on the PGR are generally quite stable, even though the actual lists of faculty associated with the various departments tend to be unstable -- people seem to move around an awful lot. This could be explained by a kind of parity between the typical department's losses and acquisitions, but it could also be explained by the PGR being self-reinforcing. So isolating those hires that Leiter thinks will have an impact and then looking to see whether there is any actual impact might help confirm one hypothesis over the other. Thoughts?

Anonymous said...

I'm of the mindset that Leiter's rankings should really be viewed ordinally and not cardinally. So, Princeton, Rutgers, NYU, Harvard, I mean, right: they're all great departments. There are some really interesting people there. Is one person there better than another somewhere else? That is an extraordinarily tough call. Too tough to call, as a matter of fact. So it goes for many schools on the ranking: in some areas they do very well, in other areas they're weak, largely depending on the number of faculty who work in a given area and the extent to which one or the other publication has or has not raised the right kind of buzz. Really, the rankings are effectively reputation measures of individual faculty members as they sit in the penumbra of comparatively good other faculty members both inside and outside their home department.

The Driver and Sorenson move is significant. It certainly makes the Washington University department stronger (though, of course, it was already very strong). Both are great hires, and both are losses for Dartmouth. (Tough winter, New Hampshire???) If I were a grad student or faculty member at Wash U, I'd be stoked. I think that's all that could possibly be meant by Leiter's claim that the department should improve in the rankings.

Again, Leiter's rankings are not any indication of the extent to which graduate students will get support from the faculty, or of the extent to which they will benefit from the program, or even be able to achieve their academic or career objectives. One student may thrive at Madison, another at Northwestern, and another at Stony Brook. If grad students would stop putting so much stock in the rankings and just get to the business of making their dossiers as complete and attractive as possible, this would help search committees find the best candidates among them.

Anonymous said...

How about this for size: Leiter has absolutely no business making "predictions" about how departments will fare in future rankings if he wants to maintain any credible claim that his rankings are an objective poll of what others think, rather than reflecting his own biases.

Does Zogby begin its phone calls with "we think that Obama gave a great speech last night, and we're predicting he will really pull ahead in the polls this week. Having said that: press 1 if you support clinton, press 2 if you support obama."?

Anonymous said...

I don't see how the Leiter rankings could be ordinally meaningful but not cardinally. They are produced by averaging a lot of individual judgments. If those individual judgments aren't meaningful cardinally, then averaging them makes no sense and will not produce a composite that is ordinally meaningful. If the individual ones are cardinally meaningful, then their average should be as well.

Anonymous said...

how about this for size, 8:37: if mr. zero is right, then you are wrong.

what mr. zero says is that leiter keeps making these pronouncements, and *nobody listens*. it doesn't affect the rankings as leiter predicts, because it doesn't affect the opinions of the ratings boards in the least.

it's almost like the people who make up the rankings are capable of independent thought, and are not telepathically controlled by leiter.

or at least that's worth trying on for size.

Anonymous said...

I'm sort of sympathetic with anon 8:37, but the sympathies get much stronger when Leiter starts doing things like this prediction based in (small, perhaps) part on a junior hire. Even if David Baker's the total shizzle (I've got no reason to think he isn't --- he did turn down, after all, NYU and Wisconsin), this just looks like a bad precedent. It looks like Leiter is saying, "David Baker, even though he's a freshly minted Ph.D., carries the same sort of Leiter-ranking-weight as, say, Gordon Belot and Laura Ruetsche." Given the already inordinate amount of power he seems to have in American philosophy, I think he should avoid any appearance of endorsing certain job market candidates --- it will be a sad world when, in order to get a top job, candidates will have to impress not only the SCs but Brian Leiter, too.

Anonymous said...

How about this for size: Leiter has absolutely no business making "predictions" about how departments will fare in future rankings if he wants to maintain any credible claim that his rankings are an objective poll of what others think, rather than reflecting his own biases.

This was exactly my thought when I read the above! Thanks for putting it eloquently.

Anonymous said...

10:04,

Whether people listen or not is beside the point. I happen to think they do, and that mr zero is overstating his case. But what difference does that make? If you are the one making the measurements, then you keep your mouth shut or you sacrifice your objectivity. it's fairly simple scientific methodology.

everyone else who collects opinion data understands this.

Anonymous said...

I guess I just mean it this way:

Okay, is Harvard better for a candidate than EBF University? Probably. Is Pitt better for a candidate than the Snog Academy? Yep, probably. Is Harvard better than Pitt? Hard to say. You can look at any two schools and sorta say that one is better than the other, provided that those schools are significantly far apart in the ranking. If they're not far apart in the ranking, as maybe Harvard and Pitt aren't, it's virtually impossible to claim one over the other. The reason for this boils down to Arrow's paradox: once there are three or more options, there is no way to aggregate several voters' rankings into a single group ranking without getting counterintuitive results in some cases (or something like that). This is partly due to the complexity of criteria used to evaluate the betterness and worseness of departments.
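
If you want to see the flavor of the problem, here is the classic toy case (Condorcet's paradox, which is the engine behind Arrow-style results) worked out in a few lines of Python. The ballots are made up, obviously:

    # Three voters rank three departments, best to worst.
    ballots = [["A", "B", "C"],
               ["B", "C", "A"],
               ["C", "A", "B"]]

    def prefers(ballot, x, y):
        # True if this ballot ranks x above y.
        return ballot.index(x) < ballot.index(y)

    for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
        wins = sum(prefers(b, x, y) for b in ballots)
        print(x, "beats", y, "--", wins, "of 3")

    # A beats B 2-1, B beats C 2-1, C beats A 2-1: majority preference
    # runs in a circle, so there is no stable group ranking to report.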

I think readers of the Philosophical Gourmet should just use the rankings as a rough guide to give them ordinally better and worse senses of average faculty reputation in any given department, and not depend on the ranking as a clear cardinal indication of where a given department stands in relation to the departments immediately above or below it. Maybe use five or so departments up or down as a buffer and say that, on the whole, Dept #7 is better than Dept #25, but not clearly better than Dept #11 and not clearly worse than Dept #3. (I am not singling out Dept #7, whatever it is.)

I suppose that's a sloppy way of thinking, but it's also pretty sloppy to slap cardinal rankings on departments, even if the methodology is pretty rigorous.

Anonymous said...

The PGR rankings are ordinal, not cardinal. The ratings people give in the surveys are probably not on a cardinal scale--it's hard to say for sure because it's hard to say what, specifically, the survey measures. In any case, I don't think that the rankings support even an interval scale of quality. That is, there's no reason to think that one "unit" near the bottom of the scale is equal in size or absolute magnitude to one "unit" near the top. I don't even think there are "units" in the scale at all. And there is definitely no reason to think that, for example, mean rating "4" is twice as good as mean rating "2", or anything like that. There are no valid ratios, so there's no cardinality.

(Also, since it's not clear what the ratings are really measuring, or if ratings from different people measure the same thing, it is also unclear if the averages of these numbers measure anything, or are in any way meaningful.)

I am sympathetic to the view that Leiter should not be making these predictions. Even if nobody pays any attention to him and the predictions have no effect, he should avoid making any statements that could possibly have any effect, or that could have the appearance of influence, if he wants to maintain that the rankings are independent and objective. (Not that I think they are, or anything.)

That said, I still think there's an interesting question there. There are certain senior (and junior) hires that Leiter himself thinks are significant and will have a measurable impact on the rankings. But it seems to me that they don't actually have these effects. I'll investigate soon.

Anonymous said...

Mr. Zero,

The PGR rankings are ordinal, not cardinal.

They are 'rankings', so of course they are ordinal. But the way the scores are used in computing the results assumes that they are cardinal.

Participants in the survey are *not* asked to rank the programs. They are asked to assign to each program a score. If A ranks the programs exactly as B does, but A's numerical scores are spread differently from B's, then their inputs are not the same and do not have the same effect on the output. This means that the inputs are cardinal.

Take a simplified example. Suppose I give these ratings:

Program X: 5
Program Y: 4.5
Program Z: 0.5

And you give them these ratings:

Program X: 3
Program Y: 1
Program Z: 0.5

My input is much better for Program Y than yours is. If the inputs were purely ordinal, then the effects of the two inputs would have to be the same.
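
If it helps, here is the arithmetic in a few throwaway lines of Python (the numbers are just the ones above):

    # Two raters agree on the ordering X > Y > Z, but spread their scores
    # differently. The averages care about the spread, not just the order.
    rater_a = {"X": 5.0, "Y": 4.5, "Z": 0.5}
    rater_b = {"X": 3.0, "Y": 1.0, "Z": 0.5}

    # Same ordering as rater_b, different spacing:
    rater_b2 = {"X": 3.0, "Y": 2.9, "Z": 0.5}

    def average(r1, r2):
        return {p: (r1[p] + r2[p]) / 2 for p in r1}

    print(average(rater_a, rater_b))    # X: 4.0, Y: 2.75, Z: 0.5
    print(average(rater_a, rater_b2))   # X: 4.0, Y: 3.7,  Z: 0.5

Y's composite standing depends entirely on how the second rater spaced his numbers, even though his ranking never changed. So the mechanism is consuming more than ordinal information.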

And there is definitely no reason to think that, for example, mean rating "4" is twice as good as mean rating "2", or anything like that. There are no valid ratios, so there's no cardinality.

Temperature is measured on a cardinal scale, even though a Celsius temperature can't tell you which objects are "twice as hot" as others.

Using the inputs cardinally may assume more structure than really exists; I'm not disputing that. But in their use in the Philosophical Gourmet, they are in fact cardinal.

Anonymous said...

Temperature is measured on a cardinal scale, even though a Celsius temperature can't tell you which objects are "twice as hot" as others.

Sorry, that is wrong. The Celsius scale for temperature is an interval scale, not a cardinal scale. That is, its units carry information about sameness of interval, but not sameness of ratio. That's why, as you point out, Celsius numbers can't tell you when something is "twice as hot" as something else, but can tell you about differences in temperature. The key feature of a cardinal scale is a non-arbitrary, natural zero point. The zero on the Celsius scale does not correspond to a natural zero, though the Kelvin zero does. The zero on the gram scale for mass does, too. Both of those scales are cardinal scales.
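
Here is the difference computed out, if anyone wants it (Python; any numbers would do):

    # Celsius -> Fahrenheit is an affine map: F = 1.8 * C + 32.
    c1, c2, c3 = 5.0, 10.0, 20.0
    f = lambda c: 1.8 * c + 32
    f1, f2, f3 = f(c1), f(c2), f(c3)    # 41.0, 50.0, 68.0

    print(c2 / c1, f2 / f1)             # 2.0 vs ~1.22: value ratios NOT preserved
    print((c3 - c2) / (c2 - c1),
          (f3 - f2) / (f2 - f1))        # 2.0 vs 2.0: interval ratios preserved

    # Grams -> pounds is a pure rescaling (natural zero), so value ratios survive:
    g1, g2 = 5.0, 10.0
    print(g2 / g1, (g2 / 453.6) / (g1 / 453.6))   # 2.0 vs 2.0

That is the whole difference between an interval scale and a ratio scale, in two print statements.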

Furthermore, the fact that Leiter manipulates the numbers generated by the surveys as though they were units on a cardinal scale does not entail that they actually are units on a cardinal scale. This is one of the more serious methodological problems with the PGR: what are those numbers, and what gives him the mathematical right to perform these statistical operations?

In fairness, you could find the mean on a mere interval scale, but what makes you think that those numbers represent genuine intervals? How would you make sure that the distance between 1 and 2 on your scale is the same as the distance between 4 and 5? How would you make sure that my numbers are comparable to yours in any way, even assuming we're trying to assign numbers to the same thing?

You say that your hypothetical rating of the schools is better than mine; I say, you don't have enough information to make that determination. It looks better, but if it's really ordinal, that appearance might be misleading or false. There's controversy about just this sort of thing in lots of areas, from Student Evaluation Forms to your GPA. In these sorts of settings, people appear to be applying statistical operations that are not warranted by the kind of scale they are generated on.

Anonymous said...

Mr. Z.,

Well, I think you are mistaken about the usage of the expression 'cardinal scale'. At least the SEP article on game theory talks about cardinal utilities, and it is clear that only ratios of intervals are significant, not ratios of absolute values. But it's a technical term, and maybe it's used in different ways in different contexts.

Furthermore, the fact that Leiter manipulates the numbers generated by the surveys as though they were units on a cardinal scale does not entail that they actually are units on a cardinal scale.

I'm not sure what you mean. The question is how the numbers are being used, right? It's not as if those numbers had some Platonic property, being cardinal, or failed to have it.

You say that your hypothetical rating of the schools is better than mine; I say, you don't have enough information to make that determination.

I did? Where?
Or was that a rhetorical device?

Let me return to my point, since I'm afraid it has been obscured. The Leiter scores are meaningful cardinally if they are meaningful ordinally. I am not in fact claiming that they are meaningful at all. But if the scores that individuals put into the mechanism are not cardinally meaningful, then averaging them to get an ordinal ranking doesn't make sense. I think you are agreeing with this.

Anonymous said...

I apologize for how long this got to be.

I didn't read the Stanford article you cite. However, I think you may have misunderstood it, at least as it applies to the Celsius scale. This way of classifying scales is due to S. S. Stevens's "On the Theory of Scales of Measurement," Science 103 (1946), pp. 677–80. (The watershed work is Krantz et al., Foundations of Measurement.) According to it, scales are classified according to "permissible transformations." A transformation is permissible iff it leaves the scale "form invariant." A transformation leaves the scale form invariant iff it preserves the key feature measured by the scale.

For ordinal scales, any order-preserving transformation is permissible. So as long as the new scale preserves the order of the parent scale, it's a permissible transformation. This kind of scale can have everything but the order be arbitrary. So if you want to rate school y 4.5 and I want to rate it 78, as long as the other things are in the correct ordinal relation, there's no difference.

For interval scales, any interval-preserving transformation is permissible. This is stronger, since the new scale has to carry, without distortion, the "distances" or differences between the items ranked on it. Here you can have an arbitrary zero and an arbitrary unit, but the intervals must be the same throughout. The Celsius and Fahrenheit scales are interval scales.

For cardinal scales, any ratio-preserving transformation is permissible. This is a very strong condition. In order for this to happen, the unit can be arbitrary but the zero cannot. It's important to recognize that sameness of intervals and sameness of ratios are entirely different things. That is why 10 grams is twice as much as five grams, but 10 degrees C is not twice as much as 5 degrees C.

The "permissible transformations" are important because they determine which statistical operations make sense when applied to numbers on the scales. You need at least an interval scale in order to calculate averages.

The question is how the numbers are being used, right?

No. The question is, which operations does it make sense to perform on these numbers, given the manner in which they were generated and the mathematical features of the scale we're dealing with? If the ratings of the schools are not being done on an interval scale, it does not make sense to use these numbers to calculate averages. Even if the ratings are interval-preserving, but are done on different units, say, my 4 is equivalent to your 4.2, then it still doesn't make sense to use the numbers to calculate an average. The question isn't "how is he using the numbers" but "why should we think he can use these numbers in this manner?"

The Leiter scores are meaningful cardinally if they are meaningful ordinally.

No. Calculating averages cannot be done on an ordinal scale, and you're skipping over interval scales. They can be done on an interval scale, but why think that the ratings are done all on one single interval scale, with the same units and zero point? They're not.

Anonymous said...

I wasn't aware that kicking the football out of the park was a goal of that particular game.

Anonymous said...

Maybe a simple example will help? Suppose profs 1, 2, and 3 are judging an essay competition and doing it by assigning papers number grades. Both 1 and 2 think paper A is the best, awesome, etc., but they come from a school of thought where "100%" is reserved for *perfect* papers, so they give it a 95%. Prof 3 doesn't like it so much, but thinks it's pretty good, so he gives it a 90%. So paper A gets an average of 93.3%.

But prof 3 loves paper B, and gives it a 100% (he's from a school of thought where "100%" is *not* just reserved for perfect papers). Profs 1 and 2 think paper B is pretty good, definitely A material, but only barely, so they each give it a 91%. As a result, paper B gets an average of 94%.

But two profs thought paper A was great and B only barely grade-A-material, whereas only one prof thought paper B was great, and thought paper A was *better*-than-barely-A material. So intuitively, the ordinal ranking should be paper A, then B; but by averaging, you get paper B, then A.
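
If anyone wants to fiddle with the grades, here is the arithmetic in a throwaway Python sketch:

    # The essay-competition example, computed out.
    grades = {"A": [95, 95, 90],    # profs 1, 2, 3
              "B": [91, 91, 100]}

    for paper, gs in grades.items():
        print(paper, sum(gs) / len(gs))     # A: 93.33..., B: 94.0

    # Yet grader-by-grader, A beats B two to one:
    wins = sum(a > b for a, b in zip(grades["A"], grades["B"]))
    print("graders preferring A:", wins, "of 3")   # 2 of 3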

Anonymous said...

Hi Mr. Zero,

Hm. You haven't read the article, but you think I misunderstood it. I love philosophers!
Could I ask you to read it? I even provided a link, and the link goes straight to the relevant section (though I couldn't get a link to the relevant paragraph). It uses 'cardinal' for scales the way I do, not the way you do. I happily admit that other writers may use it your way. Okay? The rest of your first five paragraphs I agree with.
From now on, I'll say 'interval scale', so we're on the same page. The transformation that preserves the property is affine.

I said:
The Leiter scores are meaningful cardinally if they are meaningful ordinally.

You responded:

No. Calculating averages cannot be done on an ordinal scale, and you're skipping over interval scales.

Right, so if the ratings by the individual raters carry only ordinal information, then the actual Leiter rankings are not meaningful ordinally! Which is what I said. So why did you say "No"?
(I was not skipping interval scale; I was using 'cardinal' in the way it is used in the SEP.)

why think that the ratings are done all on one single interval scale, with the same units and zero point? They're not.

Well, they are normalized before they are averaged. If you think this works, then you should agree that the results are *cardinally* meaningful. If you think it doesn't work, then you should think that the results are not even meaningful as a *ranking*. This is why I said:

The Leiter scores are meaningful cardinally if they are meaningful ordinally.
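
To make 'normalized' concrete: the generic move here, and I am not claiming it is Leiter's exact recipe, is to z-score each rater's responses so every rater has the same mean and spread before you average. A quick Python sketch:

    import statistics

    def zscore(ratings):
        # Rescale one rater's scores to mean 0, standard deviation 1.
        mu = statistics.mean(ratings.values())
        sd = statistics.pstdev(ratings.values())
        return {dept: (x - mu) / sd for dept, x in ratings.items()}

    rater_1 = {"X": 5.0, "Y": 4.5, "Z": 0.5}
    rater_2 = {"X": 3.0, "Y": 1.0, "Z": 0.5}
    print(zscore(rater_1))   # both raters now sit on mean-0, sd-1 scales
    print(zscore(rater_2))

Note what this fixes and what it can't: it mods out affine differences between raters (different zero points, different unit sizes), which is why, if you trust it, the output is cardinally meaningful. But no normalization can conjure interval structure out of scores that never had it, which is why, if you don't trust it, you shouldn't trust the ranking either.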

Nothing like a rousing dispute over the structure of Leiterspace to take one's mind off the job market! The consolations of philosophy.

Anonymous said...

Cardinal Meaning guy,

I read some of that SEP article. I found it obscure. This business about whether or not "magnitudes matter" as an account of cardinality won't cut it at all, and is, I think, what led you to think that the Celsius scale was cardinal. (Maybe I'm reading the wrong part. I did a search for 'cardinal' and read any paragraphs in which it was used.)

I initially didn't read it because I am familiar with the literature concerning measurement theory, and so I figured there would be no real need. I just assumed you were confused, because I assumed that the author of the SEP article knows the difference between interval and cardinal scales. Perhaps that was irresponsible of me. Sorry.

You said,

The Leiter scores are meaningful cardinally if they are meaningful ordinally.

And you defend this claim in the following way:

so if the ratings by the individual raters carry only ordinal information, then the actual Leiter rankings are not meaningful ordinally!

Although this last bit is true, I don't see the connection. Your original claim concerns the scores, not the ordinal ranking. The question is whether there is any basis for thinking that the ordinal ranking accurately represents reality.

The claim about the individual ratings (or the average ratings published on the PGR) -- that if they carry ordinal information then they carry cardinal information -- is false. Suppose they carry ordinal information but there is no equality of interval and no natural zero. Then they don't carry any cardinal information.

I couldn't find any information concerning how the scores are "normalized" before they are averaged.

you said,

If you think [the normalization procedure] doesn't work, then you should think that the results are not even meaningful as a *ranking*.

Of course I think they're "meaningful." I think the ranking carries information about the ordinal relations between the various departments. What I wonder about is whether that information is accurate, and whether the methods Leiter employs to derive that information from his data justify believing it. But you're right about this: if it's not the case that the ratings are all done on the same interval scale, then the ordinal ranking that results from averaging them should be regarded with suspicion.

Anonymous said...

Please stop.

Anonymous said...

I think it's funny that Zero takes issue with the SEP article on game theory, written by Don Ross. According to Ross's webpage at the U of Cape Town:

Don Ross obtained his doctorate at the University of Western Ontario in 1990. In 1997 he came to UCT. Since 2003, he is also Professor of Philosophy and Professor of Economics at the University of Alabama at Birmingham. He now divides his time between Cape Town and Birmingham. He consults extensively for South African industry and government.

Professor Ross is author/editor of eleven books and numerous articles. Most recently he is author of Economic Theory and Cognitive Science: Microexplanation (MIT Press, 2005), co-author of Every Thing Must Go: Metaphysics Naturalized (Oxford University Press, 2007), co-editor of Development Dilemmas (Routledge, 2005), co-author of Midbrain Mutiny: The Picoeconomics and Neuroeconomics of Disordered Gambling (MIT Press, 2008), and co-editor of the Oxford Handbook of Philosophy of Economic Science (2008).


I'm sure Zero has read plenty on measurement theory, but I think it's safe to say that Ross probably knows what he's talking about. And check out the acknowledgments, where Ross thanks about twenty people for reading his entry and making suggestions. I guess none of them have read Zero's work on this yet. By the way, SEP articles are refereed by the editorial board before they're made public. None of this makes Zero wrong, of course, but I guess I'd give Ross the benefit of the doubt.

Anonymous said...

Mr. Zero,

On the question of usage, I'm afraid I think the SEP entry is pretty decisive. But anyway, it's just a terminological issue.


The claim concerning the individual ratings, or the average ratings published on the PGR, that if they carry ordinal information then they carry cardinal information, is false. Suppose they carry ordinal information but there is no equality of interval and no natural zero. Then they don't carry any cardinal information.

But they are produced by averaging individual scores.
Suppose those individual scores carry cardinal information. Then the average does too. Suppose the individual scores do not carry cardinal information. Then their average does not carry ordinal information. So, if the composite scores carry ordinal information, they carry cardinal information.

Your view:
Of course I think they're "meaningful." I think the ranking carries information about the ordinal relations between the various departments.

So, you think that the individual scores (that are averaged) carry cardinal information, right?

I couldn't find any information concerning how the scores are "normalized" before they are averaged.

Turns out he's stopped doing it. The explanation is still in the previous report, if you're interested.

Anonymous said...

Not Ross,

I hope I'm not coming off like an obnoxious know-it-all who thinks he's smarter than the author of the SEP article, and that you'll allow me a bit of self-defense:

I didn't say that the author didn't know what he was talking about. I assumed that the author of the SEP article did know what he was talking about, which is why I inferred that the "Cardinal Meaning" guy who read the article misunderstood it.

I did say that the SEP article didn't adequately define cardinal measurement. I stand by this claim. (But maybe I missed it. If so, I hope CM will help me out.) There's an intuitive idea that he's using, and he's not wrong, but it's stated in a confusing and incomplete way. As evidence, I cite the fact that Cardinal Meaning read it and got the idea that the Celsius scale is cardinal, when it's the classic example of a non-cardinal, interval scale.

However, since the article is not about cardinal measurement at all, this isn't really a defect in the article. Though I find it inadequate as a definition, what the author says about cardinal measurement is fine for his purposes, since he only brings it up to point out that it isn't important for his purposes.

Also, I wasn't trying to say that I am the expert, or that I know more than Ross. I'm not and I don't. I'm not even an "epicycles" guy on this stuff. I cited a couple of sources, though.

Cardinal Meaning,

I didn't see anything that I would regard as decisive in the SEP article. If you could paste a sentence or two in here, that would help a lot.

I agree that if the individual scores are on a cardinal scale, then averages of those scores are also on a cardinal scale.

I think that the rankings and the numbers that are used to determine them are "meaningful" in the sense that they are being used to express propositions about the relative degrees of quality of philosophy departments. My worry is that one of the following things has happened: a) the propositions the rankings actually express are different from the ones they seem to express because the information is being presented in a misleading way; b) the propositions they express are false because they have been subject to statistical operations that you can't perform on the kind of scale we're dealing with.

So, the ratings might "carry cardinal information" but be wrong, or might carry merely ordinal information but present it in a cardinal-seeming way. Either thing could happen if Leiter were being methodologically careless. If, for example, he were taking averages of numbers on an ordinal scale, or he were taking averages of numbers on distinct cardinal scales, the resulting rankings would be false or incredibly misleading.

Sorry if I'm being too much of a chatterbox.

Anonymous said...

Mr. Zero,

I didn't say that the author didn't know what he was talking about. I assumed that the author of the SEP article did know what he was talking about, which is why I inferred that the "Cardinal Meaning" guy who read the article misunderstood it.

You didn't understand the article, so you inferred that I had misunderstood it? Here’s another possibility: in fact, linguistic convention among experts does sometimes use ‘cardinal’ for ‘interval’ in talk about scales. This was the whole point of my citing the article.

You want me to include a couple of sentences; okay.

Ross writes:

Later, when we come to seeing how to solve games that involve randomization—our river-crossing game from Part 1 above, for example—we'll need to build cardinal utility functions. The technique for doing this was given by von Neumann & Morgenstern (1947), and was an essential aspect of their invention of game theory.

Von N & M construct an interval scale, of course, not a ratio scale. Nobody in game theory thinks utility is measured on a ratio scale.

As evidence, I cite the fact that Cardinal Meaning read it and got the idea that the Celsius scale is cardinal, when it's the classic example of a non-cardinal, interval scale.

I was never confused about what kind of scale Celsius is. I’ve said several times that the only difference I had with what you were saying was a matter of terminology. I, like Ross, use 'cardinal' for interval scales. You do not. That is a difference in usage.

I don’t know what you mean by saying that the Leiter ratings might carry information but be false. They could carry information about something by tracking it. You put ‘carry meaning’ in quotation marks as if it were somehow a doubtful expression, even though you are the one who introduced it. I said I don’t see how the ratings could be ordinally but not cardinally meaningful. You seem to think that’s wrong. So, you think that they could be ordinally meaningful and not cardinally meaningful. How exactly are we supposed to understand the individual ratings in such a way that their average is ordinally meaningful but not cardinally meaningful?

Anonymous said...

CM,

I'm willing to leave the issue about cardinal/ratio/interval measurement to one side, and to apologize for misinterpreting your remarks. I wasn't trying to be a dick. I see the actual interpretation of the Leiter rankings as being the interesting issue here. So, sorry.

Here's what I'm trying to get at: Leiter sends around all these surveys for people to rate faculty quality on a 1-to-5 scale, which is associated with a certain heuristic. Then he takes the ratings for each department and averages them, generating some sort of an amalgam of opinion concerning the quality of each group of faculty. (Just so we're on the same page.)

My worry is, what do those numbers mean? What information do they actually express, as opposed to merely seeming to? Suppose I rate X = 5, Y = 4, and Z = 3. If my number assignments don't track the intervals between these ratings as well as their order, then it doesn't make any sense to use them to compute averages, since you need an interval scale to calculate a mean.

The interpersonal case is even more problematic. If I rate X = 5 and you rate X = 4, our ratings have to both be on the same interval scale in order for it to make sense to calculate the average in an intuitive way and come up with 4.5. If I mean "on the fifth (and highest) interval" and you just mean "second best" (where 5 is the best), there's no meaningful average of the two numbers.

Furthermore, even if the ratings are done on interval scales, but they're not done on the same one, then it doesn't make sense to average them. That would be like averaging a bunch of numbers representing temperature, some of which are in Celsius and some are in Fahrenheit.
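
In code, the analogy looks like this (Python; toy readings):

    # Averaging temperatures without first agreeing on a scale:
    readings = [20.0, 68.0, 22.0]      # two Celsius readings, one Fahrenheit
    print(sum(readings) / 3)           # 36.67 -- a number, but not a temperature fact

    # Convert to a common scale first, then average:
    celsius = [20.0, (68.0 - 32) / 1.8, 22.0]
    print(sum(celsius) / 3)            # 20.67 C -- now it means something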

If any of that stuff was happening, it might seem as though there was a legitimate, interval-scale rating of the departments, but this would be incorrect. We would be fooled by the fact that we can take averages of these numbers, but the numbers in this case would not actually represent phenomena that can be averaged.

You might think that these ratings really are all on a single interval scale. I say, that would be an astonishing coincidence. Leiter's heuristic doesn't seem to me to provide any evidence that there are equally-spaced intervals between the ratings, and even if there were different people could have different intervals in mind.

You might think that if you get enough ratings these differences will cancel each other out. I say, maybe, but maybe not. A lot of care would have to go into designing the heuristic.

So that's why I think that the ratings and the ordinal ranking supported by them is suspect. Again, sorry we got off on a tangent, and sorry if I was being a dick before.