
Examples of a Common Kind of Fallacy in the Social Sciences
Rick Garlikov


Because it makes the kind of error common in the social sciences, I want to examine a passage from a recent article printed in Hillsdale College’s publication Imprimis.  The article is “Restoring America’s Economic Mobility” by Frank Buckley and is in the September 2016 issue.  It is also online at https://imprimis.hillsdale.edu/restoring-americas-economic-mobility/.  I think there are a number of common kinds of errors in the reasoning stated in the article.  These errors are not errors of form, and are thus ‘informal’ fallacies, or what I think would be better called 'non-formal' fallacies.  However, I don’t believe these are informal fallacies that have names or would be listed in lists of informal fallacies, though I could be mistaken about that.


After I analyze this passage, I will also try to show that some experimental results in psychology make the same kind of error, using as examples some particular research methods that have been interpreted to show that people are not as rational as previously supposed.  I will not be arguing that people are necessarily rational, but that the experiments believed to show they are not rational do not show that.

 

“[Many people today imagine America to still be] a country defined by the promise that whoever you are, you have the same chance as anyone else to rise, with pluck, industry, and talent. But they imagine wrong.  The U.S. today lags behind many of its First World rivals in terms of mobility. A class society has inserted itself within the folds of what was once a classless country, and a dominant New Class—as social critic Christopher Lasch called it—has pulled up the ladder of social advancement behind it.

 

“One can measure these things empirically by comparing the correlation between the earnings of fathers and sons. Pew’s Economic Mobility Project ranks Britain at 0.5, which means that if a father earns £100,000 more than the median, his son will earn £50,000 more than the average member of his cohort. That’s pretty aristocratic. On the other end of the scale, the most economically mobile society is Denmark, with a correlation of 0.15. The U.S. is at 0.47, almost as immobile as Britain.

 

“A complacent Republican establishment denies this change has occurred. If they don’t get it, however, American voters do. For the first time, Americans don’t believe their children will be as well off as they have been.” [1] 

 

Social scientists often claim to be able to empirically and objectively measure a subjective or vague characteristic by equating it with a precise objective criterion they say captures the meaning of the concept or provides necessary and sufficient evidence for the subjective determinations we make.  I think all too often they fail to get it right, if it can be done right at all.  Any and every such claim should be viewed with suspicion and subjected to serious scrutiny.  An easy example is IQ measurement by means of a test score on a certain kind of test, whereby the higher one’s IQ score, the more intelligent one supposedly is.  However, clearly “intelligence” involves characteristics more than, and often different from, scoring high on such a test, particularly if the high score is attributable to coaching more than, or rather than, to some sort of inherent ability.  Intelligence may be about seeing connections others don’t (whether in comedy or in physics) or seeing them much faster, about learning new things quickly, about having deeper understanding and seeing ramifications, about capacity for learning and/or remembering, about having great common sense, being perceptive, etc. in ways a standard IQ test doesn’t measure or pretend to measure.  It might be about combining many different ideas over time to discover or invent something no one has before, which cannot be tested at some given time on a test where the answer is already known.  When one of my daughters was in sixth grade, she auditioned for something where one of the skill tests was sight-reading, but the piece they gave her was one she had played before.  Had she not told them, she would likely have seemed to be a great sight reader, though maybe she would have had to make some slight errors to carry out the pretense successfully.  (Possibly they knew she had studied this piece previously and it was a test of her candor, perhaps considered to be a test of honesty.)


So in the above passage we are given a way that is claimed to measure the notion of economic mobility and whether people have a good chance to succeed in America and “rise above their station of birth” even if they start from lowly beginnings.

 

Notice the formula is something of a complicated measure and is different from what it might seem to be at first.  Nothing in the measure shows whether a child makes more or less than its parent (I am using “parent/child” rather than “father/son”, since the latter seems sexist in this day and age.)  And it is misleading to claim, as it does, that this is about “comparing the correlation between the earnings of fathers and sons.” It is not that the child in the example makes half what its parent did, but that if you subtract the average income of the child’s generation in the country from the child’s income, it will be half the number you get from subtracting the average income of the parent’s generation in the country from the parent’s income.  It is the ratio of the child’s income difference from the child's peers to the parent’s income difference from the parent’s peers.   It is not easy to see what sort of “thing” that number represents.  And though in some cases it might coincide with economic mobility, or with signs of economic mobility, it need not.  It is not necessarily, and perhaps even not at all, the same thing as what is meant by economic mobility or as economic opportunity to rise economically or to be economically successful.

 

To see what it means or doesn’t mean, consider this scenario: suppose your mother made $100,000 a year while the median income of people her age was $40,000.  That would put her $60,000 above the median, so she made quite a bit more money than most of her contemporaries.  She made two and a half times what the median person her age did, and she was in a much higher ‘class’ than they were (if we equate class with income).  Now suppose that, through some economic boom, the typical contemporary of yours makes $1 million in annual income, and you make $1,001,500.  You are making more than ten times what your mother made, but you are making ‘only’ $1,500 more than your fellow citizens, so the comparison gives the number 1,500/60,000 = .025, which would make your society highly mobile by this measure, as it indeed seems to be.  Lots of people went from their parents’ being relatively poor to themselves being quite rich, and you went from your mother’s upper class to being even richer (ten times richer) than she was, though you did not rise as much as your peers, who became 25 times richer than their parents.  And your position relative to your peers declined quite a bit compared with your mother’s position relative to her contemporaries.  So on this measure you were downwardly mobile relative to your mother’s class, even though you make more than ten times what she made.

 

Now suppose that, instead of your income being $1,001,500, your income is $1,030,000, while everyone else’s is $1,000,000.  That puts the correlation at 30,000/60,000, which is .5 and supposedly indicates a society that is not very economically mobile, even though now everyone is rich, when only a few were before.  Everyone else’s income grew by the same amount as in the first case, and you earned a lot more too, though your income did not grow proportionately as much as theirs, and you are not as far above your peers as your mother was above hers.

 

Conversely, if the country were to fall on really desperate economic times and your income were $10,000 and everyone else’s were $5,000, the correlation would then be 5,000/60,000 = .083, making your society almost twice as “economically mobile” as Denmark.  But notice the “mobility,” if anything, is downward, not upward.  So this measure does not measure a good thing, and insofar as it is a measure of economic mobility, economic mobility is then not a good thing, and certainly not something that shows you have a better chance of being better off than your mother was.
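
The three scenarios above can be checked with a few lines of arithmetic.  The sketch below is only an illustration: the function name `gap_ratio` and the scenario labels are made up for this example, and the function computes the ratio the way the examples above do (the child's gap from the peer median divided by the parent's gap), which is not necessarily the statistical procedure Pew actually uses.

```python
def gap_ratio(child_income, child_median, parent_income, parent_median):
    """Child's income gap from peers divided by the parent's gap from peers."""
    return (child_income - child_median) / (parent_income - parent_median)


# The parent from the examples above: $100,000 against a $40,000 median.
PARENT_INCOME, PARENT_MEDIAN = 100_000, 40_000

# (child's income, median income of the child's generation) per scenario.
scenarios = {
    "boom, child slightly ahead": (1_001_500, 1_000_000),
    "boom, child further ahead": (1_030_000, 1_000_000),
    "bust": (10_000, 5_000),
}

ratios = {
    label: gap_ratio(income, median, PARENT_INCOME, PARENT_MEDIAN)
    for label, (income, median) in scenarios.items()
}

for label, ratio in ratios.items():
    income, _ = scenarios[label]
    # Show the ratio next to the child's absolute income relative to the parent's.
    print(f"{label}: ratio = {ratio:.3f}, "
          f"child earns {income / PARENT_INCOME:.2f}x the parent's income")
```

The first and third scenarios both produce a low ratio, even though the child earns about ten times the parent's income in one and a tenth of it in the other, which is the point of the examples: the ratio by itself says nothing about whether anyone is better or worse off.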

 

Even if you find all this hard to follow, you can see the objective measure is not something that reflects what we would consider to be economic mobility, which should have more to do with whether your income is enough to meet or exceed your needs and whether you have resources left over to purchase conveniences and luxuries.  How you stack up against your peers (which may determine your relative ‘class’ rung on the ladder) is not as much a sign of upward mobility as being able to live a better lifestyle than your parents did in terms of your access to more security, conveniences, and luxuries.  In fact, the lament at the end of the quoted passage that “For the first time, Americans don’t believe their children will be as well off as they have been” describes a serious problem, but it is a problem of economic progress, not of economic “mobility”.  Such a decline would actually even contribute to ‘mobility’ on the measure used by the Pew Economic Mobility Project if it leveled out people’s incomes or just substantially lowered the incomes of the children of wealthier parents.

 

Finally, consider what it means to be “a country defined by the promise that whoever you are, you have the same chance as anyone else to rise, with pluck, industry, and talent”.  You are not kept down by your original class.  At first blush that would seem to be a good measure of an economic system’s mobility, and one of the major conservative economists of the last century, Friedrich Hayek, thought that an economic system’s fairness was signified by something of this sort: that everyone had the same chance to become wealthy as anyone else.  But 1) if no one has much chance to rise, that meets the criterion but is a hollow promise or empty enterprise, and 2) notice that everyone has the same chance of becoming wealthy by winning the lottery, but that would not make an economic system based on a lottery, or on any other sort of gambling or unproductive distribution of existing wealth, a fair or good system, even if there were numerous lotteries.

 

I don’t think that mobility of the sort that is desirable has to do with ratios of poor to rich or with relative changes in class among different generations of rich and poor.  What is desirable is that everyone have a fair and reasonable opportunity to apply themselves and work hard at something right or good to do (not something like crime) and will likely succeed because of that, not that they have an equal chance (which could be zero or 1 in a billion) as everyone else to succeed.  And by ‘succeed’ I mean be able to have a decent life and one that fairly rewards people proportionately to the contribution they make toward the total bounty of available goods and services. What you want is that if everyone who is able to, works and has a fair opportunity to work, they all together produce enough for a good life for all and that each person receives his/her fair share of what they all produce.[2]   Upward economic mobility is about the reasonably good opportunity to have more goods and services than your parents did, particularly if they had relatively little access to security, necessities and conveniences, and you have great access to them and to at least some luxuries.  Upward mobility from everyone's being millionaires to being billionaires is not nearly as important as upward mobility from starving or barely getting by to being reasonably comfortable and secure.


Experimental Results in Psychology Making the Same Kind of Error: using particular objective measures to confirm or deny claims about subjective or abstruse characteristics

The following passages are quoted from different sections of "Rethinking Rationality: From Bleak Implications to Darwinian Modules" by Richard Samuels, Stephen Stich, and Patrice D. Tremoulet.  My analysis of them is interspersed in the appropriate places:
About thirty years ago, Amos Tversky, Daniel Kahneman and a number of other psychologists began reporting findings suggesting much deeper problems with the traditional idea that human beings are intrinsically rational animals. What these studies demonstrated is that even under quite ordinary circumstances where fatigue, drugs and strong emotions are not factors, people reason and make judgments in ways that systematically violate familiar canons of rationality on a wide array of problems. Those first surprising studies sparked the growth of a major research tradition whose impact has been felt in economics, political theory, medicine and other areas far removed from cognitive science.
...

"The Selection Task:
In 1966, Peter Wason reported the first experiments using a cluster of reasoning problems that came to be called the Selection Task. A recent textbook on reasoning has described that task as "the most intensively researched single problem in the history of the psychology of reasoning." (Evans, Newstead & Byrne, 1993, p. 99) A typical example of a Selection Task problem looks like this:

"What Wason and numerous other investigators have found is that subjects typically do very poorly on questions like this. Most subjects respond, correctly, that the E card must be turned over, but many also judge that the 5 card must be turned over, despite the fact that the 5 card could not falsify the claim no matter what is on the other side. Also, a large majority of subjects judge that the 4 card need not be turned over, though without turning it over there is no way of knowing whether it has a vowel on the other side. And, of course, if it does have a vowel on the other side then the claim is not true. It is not the case that subjects do poorly on all selection task problems, however. A wide range of variations on the basic pattern have been tried, and on some versions of the problem a much larger percentage of subjects answer correctly. These results form a bewildering pattern, since there is no obvious feature or cluster of features that separates versions on which subjects do well from those on which they do poorly.
...

"The Conjunction Fallacy
Ronald Reagan was elected President of the United States in November 1980. The following month, Amos Tversky and Daniel Kahneman administered a questionnaire to 93 subjects who had had no formal training in statistics. The instructions on the questionnaire were as follows:

In this questionnaire you are asked to evaluate the probability of various events that may occur during 1981. Each problem includes four possible events. Your task is to rank order these events by probability, using 1 for the most probable event, 2 for the second, 3 for the third and 4 for the least probable event.

Here is one of the questions presented to the subjects:
Please rank order the following events by their probability of occurrence in 1981:
(a) Reagan will cut federal support to local government.
(b) Reagan will provide federal support for unwed mothers.
(c) Reagan will increase the defense budget by less than 5%.
(d) Reagan will provide federal support for unwed mothers and cut federal support to local governments.
The unsettling outcome was that 68% of the subjects rated (d) as more probable than (b), despite the fact that (d) could not happen unless (b) did (Tversky & Kahneman, 1982). In another experiment, which has since become quite famous, Tversky and Kahneman (1982) presented subjects with the following task:
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Please rank the following statements by their probability, using 1 for the most probable and 8 for the least probable.
(a) Linda is a teacher in elementary school.
(b) Linda works in a bookstore and takes Yoga classes.
(c) Linda is active in the feminist movement.
(d) Linda is a psychiatric social worker.
(e) Linda is a member of the League of Women Voters.
(f) Linda is a bank teller.
(g) Linda is an insurance sales person.
(h) Linda is a bank teller and is active in the feminist movement.
In a group of naive subjects with no background in probability and statistics, 89% judged that statement (h) was more probable than statement (f). When the same question was presented to statistically sophisticated subjects -- graduate students in the decision science program of the Stanford Business School -- 85% made the same judgment! Results of this sort, in which subjects judge that a compound event or state of affairs is more probable than one of the components of the compound, have been found repeatedly since Kahneman and Tversky's pioneering studies.
 
Base-Rate Neglect
On the familiar Bayesian account, the probability of an hypothesis on a given body of evidence depends, in part, on the prior probability of the hypothesis. However, in a series of elegant experiments, Kahneman and Tversky (1973) showed that subjects often seriously undervalue the importance of prior probabilities. One of these experiments presented half of the subjects with the following "cover story."

A panel of psychologists have interviewed and administered personality tests to 30 engineers and 70 lawyers, all successful in their respective fields. On the basis of this information, thumbnail descriptions of the 30 engineers and 70 lawyers have been written. You will find on your forms five descriptions, chosen at random from the 100 available descriptions. For each description, please indicate your probability that the person described is an engineer, on a scale from 0 to 100.

The other half of the subjects were presented with the same text, except the "base-rates" were reversed. They were told that the personality tests had been administered to 70 engineers and 30 lawyers. Some of the descriptions that were provided were designed to be compatible with the subjects' stereotypes of engineers, though not with their stereotypes of lawyers. Others were designed to fit the lawyer stereotype, but not the engineer stereotype. And one was intended to be quite neutral, giving subjects no information at all that would be of use in making their decision. Here are two examples, the first intended to sound like an engineer, the second intended to sound neutral:

Jack is a 45-year-old man. He is married and has four children. He is generally conservative, careful and ambitious. He shows no interest in political and social issues and spends most of his free time on his many hobbies which include home carpentry, sailing, and mathematical puzzles.

Dick is a 30-year-old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues.

As expected, subjects in both groups thought that the probability that Jack is an engineer is quite high. Moreover, in what seems to be a clear violation of Bayesian principles, the difference in cover stories between the two groups of subjects had almost no effect at all. The neglect of base-rate information was even more striking in the case of Dick. That description was constructed to be totally uninformative with regard to Dick's profession. Thus the only useful information that subjects had was the base-rate information provided in the cover story. But that information was entirely ignored. The median probability estimate in both groups of subjects was 50%. Kahneman and Tversky's subjects were not, however, completely insensitive to base-rate information. Following the five descriptions on their form, subjects found the following "null" description:

Suppose now that you are given no information whatsoever about an individual chosen at random from the sample.

The probability that this man is one of the 30 engineers [or, for the other group of subjects: one of the 70 engineers] in the sample of 100 is ____%.

In this case subjects relied entirely on the base-rate; the median estimate was 30% for the first group of subjects and 70% for the second. In their discussion of these experiments, Nisbett and Ross offer this interpretation.

The implication of this contrast between the "no information" and "totally nondiagnostic information" conditions seems clear. When no specific evidence about the target case is provided, prior probabilities are utilized appropriately; when worthless specific evidence is given, prior probabilities may be largely ignored, and people respond as if there were no basis for assuming differences in relative likelihoods. People's grasp of the relevance of base-rate information must be very weak if they could be distracted from using it by exposure to useless target case information. (Nisbett & Ross, 1980, pp. 145-6)"
These analyses are incorrect.  In the Reagan and Linda cases, it is presumed that the test subjects think of the combined cases as independent, when in fact they likely do not -- particularly as shown by the answers they give.  And it is perfectly reasonable to think that in some cases a combined event is more likely to occur than either event individually without the other insofar as one believes that they are somehow related at least probabilistically.  In the Reagan case, for example, where respondents ranked "(d) Reagan will provide federal support for unwed mothers and cut federal support to local governments" more probable than "(b) Reagan will provide federal support for unwed mothers" it seems to me plausible to think they believed that if he were to be able to provide the support for unwed mothers, he would also cut support to local governments, as perhaps one way to keep a budget in balance, and that if he couldn't do both (or more precisely, if he couldn't find a way to keep the budget in balance and get those funds from somewhere else without raising taxes), he would be less likely to provide support for unwed mothers.
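
The two readings of option (b) can be put side by side with some made-up numbers.  This is only a sketch: the three probabilities below are hypothetical degrees of belief chosen for illustration, not anything reported by Tversky and Kahneman, and the variable names are my own.

```python
# Hypothetical degrees of belief, chosen only for illustration.
p_cut = 0.8                     # P(a): Reagan cuts support to local government
p_support_given_cut = 0.3       # P(b | a): support for unwed mothers if he cuts
p_support_given_no_cut = 0.05   # P(b | not a): support if he does not cut

# Option (d): support for unwed mothers AND the cut.
p_both = p_cut * p_support_given_cut

# Option (b) read as "support WITHOUT the cut".
p_support_only = (1 - p_cut) * p_support_given_no_cut

# Option (b) read literally: support, with or without the cut.
p_support = p_both + p_support_only

print(f"(d) both together:        {p_both:.2f}")
print(f"(b) read as 'b alone':    {p_support_only:.2f}")
print(f"(b) read literally:       {p_support:.2f}")
```

With these beliefs, (d) comes out far more probable than (b) when (b) is read as "support without the cut", yet still slightly less probable than (b) read literally.  Which ranking counts as correct therefore depends on which reading the respondent took the question to intend.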

Or suppose you had to order the following in terms of probability in regard to some random person P, where P could be anyone anywhere in the world at any time:
       P died in New York City on 9/11 in the collapse of the World Trade Center towers.
       P died in New York City.
       P died on September 11.
There is a perfectly good sense, though it is difficult to articulate, in which the first statement is more likely true than either of the others, because there is only a 1 in 365.25 chance of dying on September 11, and a small chance of anyone’s dying in New York City (unless you know they live there or work there) -- just a little more than 1 chance in a thousand (based on a population of 8.5 million in New York City and a world population of 7.4 billion, in statistics from 2015) -- but there is a big chance that someone who died in New York City on September 11 was a victim in the terrorist attack on the Trade Towers.  Surely more people died in that attack than otherwise died in New York City on that or most other days, since the average daily number of deaths in all of New York State in 2014 was 404, which was exactly the same as in 2008, 2012, and 2013, and was 403 in 2011, and 397 in each of 2009 and 2010.  If the second statement were “P died in London” and/or the third statement were “P died on April 3”, people would probably still be inclined to pick the first statement as more likely, not only in spite of the combination but because of it.
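
The two small chances mentioned above follow directly from the figures given.  A minimal check, assuming (as the paragraph does) that New York City's share of world population can stand in for the chance that a random person dies there; the variable names are my own:

```python
days_per_year = 365.25
nyc_population = 8.5e6      # 2015 figure cited above
world_population = 7.4e9    # 2015 figure cited above

# Chance that a death falls on one particular calendar date.
p_sept_11 = 1 / days_per_year

# Population share used as a rough proxy for the chance of dying in New York City.
p_nyc = nyc_population / world_population

print(f"chance of a given date:        {p_sept_11:.4f}")
print(f"NYC share of world population: {p_nyc:.5f} (about 1 in {1 / p_nyc:.0f})")
```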
 
So the question is ambiguous or misleading as to what probability is being sought.
 
And in the Jack and Dick, lawyer/engineer base-rate questions, it seems pretty obvious to me that respondents believed they picked up on personality or other behavioral cues that the test creators did not think existed in the descriptions.  When it is said that
Some of the descriptions that were provided were designed to be compatible with the subjects' stereotypes of engineers, though not with their stereotypes of lawyers. Others were designed to fit the lawyer stereotype, but not the engineer stereotype. And one was intended to be quite neutral, giving subjects no information at all that would be of use in making their decision
it seems clear to me that the design and intentions were not successful, and that Nisbett and Ross, for example, are mistaken in assuming that what they call “useless target case information”, “totally nondiagnostic information”, and “worthless specific evidence” is perceived in that way by respondents who, if Nisbett and Ross were right, should ignore it and give the same answers they would give with only the base-rate information.  But clearly, respondents think that the 70/30 ratio between engineers and lawyers, or between lawyers and engineers, is insufficient to make them question their ability to discern traits they think apply to one or the other, whether psychologists think those traits in the descriptions are neutral or not.  Just because Kahneman and Tversky believed they had descriptions neutral in regard to stereotypes of lawyers or engineers did not mean their test subjects perceived those descriptions that way.  And apparently they did not.  They perceived something they thought overrode the ratios given, because when there were no descriptions given, they went simply by the ratios.  And the fact you know that only 30% of the group in question is lawyers is not going to deter you from guessing someone is a lawyer if you believe you perceive cues you think more likely indicative of a lawyer than an engineer.  Now whether the cues you think indicate someone's being more likely a lawyer than an engineer are reasonable or not is perhaps open to conjecture, but insofar as one does have such a cue in mind, the base rate information is not relevant, and it is reasonable to ignore it.  While it may not be reasonable to believe x implies y, if you do believe it, then it is reasonable to believe y is true when you perceive x.
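
For reference, the Bayesian calculation that the experimenters treat as the norm can be sketched in its odds form.  The function name `posterior` is my own, and the likelihood ratios other than 1 are hypothetical stand-ins for whatever cues a respondent believes a description contains; the experiments as quoted here report no such numbers.

```python
def posterior(base_rate, likelihood_ratio):
    """Posterior probability of 'engineer' via the odds form of Bayes' rule.

    likelihood_ratio = P(description | engineer) / P(description | lawyer).
    """
    prior_odds = base_rate / (1 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# 1.0 corresponds to a description the respondent really does find neutral;
# 3.0 and 10.0 are hypothetical strengths of a perceived "engineer" cue.
for lr in (1.0, 3.0, 10.0):
    low = posterior(0.30, lr)    # group told there were 30 engineers
    high = posterior(0.70, lr)   # group told there were 70 engineers
    print(f"likelihood ratio {lr:>4}: 30% group -> {low:.2f}, 70% group -> {high:.2f}")
```

On this calculation the answer equals the base rate only when the description is taken to carry no information at all.  The stronger the cue a respondent believes he has perceived, the closer together the two groups' answers come, so what respondents took the descriptions to convey matters a great deal to what answer would count as reasonable for them.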

All kinds of information can seem to indicate or stereotype people and professions.  I was in a supermarket checkout lane one time behind a woman who was buying only M&Ms and a bottle of wine.  I commented that at least she had two of the major food groups even though they didn’t go together.  She said that her husband actually liked them together and that every night he had a glass of wine with nine M&Ms.  I joked that it seemed seven or ten would be better with the wine, and she said “No, he always has precisely nine.”  So I asked whether he was an accountant, and she said “No, he is an engineer.”  I should have guessed that first because I had heard countless stories of engineers who were totally rigid in their thinking and had seen it first hand with some myself.  And in fact, one engineer (who did not fit the stereotype) told me one time that “engineers are people who are good at math but who don’t have enough personality to be an accountant”.  And another engineer told me that one joke popular among engineers is “The way you tell the difference between an engineer who is an introvert and one who is an extrovert is that the extrovert engineer looks at the shoes of the person he is talking to instead of his own.”  And who would have thought that "looking at shoes when talking" is a stereotypical sign of an engineer?  There are probably all kinds of elements in Kahneman and Tversky’s descriptions that respondents take to be more likely indicative of either lawyers or of engineers than Kahneman and Tversky or other psychologists recognize.  If I am right, then if they had asked the respondents why they chose the answers they did which went against the given base-rate odds, they would have seen what the perceived cues were and that they existed in the minds of the respondents.
As I point out later about Piaget’s experiments with children, normally one should at least ask respondents why they give the answers they do before jumping to speculative conclusions about why they did.  I was giving a talk to a group of retired highly professional senior citizens one time, and to illustrate some point I was making, I gave them a progression of numbers and asked what the next number should be in the progression.  All but two gave the answer I expected, but two of them gave an answer that just seemed screwy to me.  I started to just say the majority answer was the right one, but instead of doing that, I asked why the two people gave the number they did.  It turned out they saw a different formula that also accounted for the progression of the numbers already given, and their formula generated a different next number than the formula the rest of us saw.  That happily illustrated the main point of my talk to the group: that it is unfair to grade students on their answers to most kinds of tests unless you at least also find out the reasons the students gave those answers.
"Before leaving the topic of base-rate neglect, we want to offer one further example illustrating the way in which the phenomenon might well have serious practical consequences. Here is a problem that Casscells et al. (1978) presented to a group of faculty, staff and fourth-year students at Harvard Medical School.

"If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming that you know nothing about the person's symptoms or signs? ____%

"Under the most plausible interpretation of the problem, the correct Bayesian answer is 2%. But only eighteen percent of the Harvard audience gave an answer close to 2%. Forty-five percent of this distinguished group completely ignored the base-rate information and said that the answer was 95%.

...
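
The 2% figure quoted above can be reproduced with Bayes' rule.  The sketch below assumes the test never misses a real case (a sensitivity of 100%), which the problem as quoted does not state; that assumption is presumably what "the most plausible interpretation" refers to.  The variable names are my own.

```python
prevalence = 1 / 1000          # given in the problem
false_positive_rate = 0.05     # given in the problem
sensitivity = 1.0              # ASSUMPTION: not stated in the problem

# Overall chance of a positive result: true positives plus false positives.
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# Bayes' rule: chance of disease given a positive result.
p_disease_given_positive = sensitivity * prevalence / p_positive

# The common answer that ignores prevalence altogether.
answer_ignoring_base_rate = 1 - false_positive_rate

print(f"Bayesian answer:           {p_disease_given_positive:.1%}")
print(f"Answer ignoring base rate: {answer_ignoring_base_rate:.0%}")
```

Out of 1,000 people tested, roughly one has the disease and about fifty healthy people test positive anyway, so only about one positive result in fifty-one belongs to someone who is actually ill.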

"...there are some versions on which performance improves dramatically. Here is an example from Griggs and Cox (1982). 


"From a logical point of view this problem is structurally identical to the [Letter/Number problem above], but the content of the problems clearly has a major effect on how well people perform. About 75% of college student subjects get the right answer on this version of the selection task, while only 25% get the right answer on the other version. Though there have been dozens of studies exploring this "content effect" in the selection task, the results have been, and continue to be, rather puzzling since there is no obvious property or set of properties shared by those versions of the task on which people perform well."
But the content makes a difference because it is easier for people to keep in mind what the connection is between the opposite sides of the cards, and thus apply the logic to it, in the drink/drinking age test than in the letter/number test.  People know the drinking age in many places is 21.  So it is easy to know that if you are 25, you can drink or not drink, and that if you are drinking Coke, it doesn't matter what age you are.  So the only two important cards are the drinking beer one and the 16 year old one, for if someone is drinking beer, the other side had better show they are over 20, and if someone is 16, the other side of the card had better not be that they are drinking beer.  But the relationship between one side being a letter and the other side being a number is not as easy to keep in mind or reason about.

The tests about which cards to turn over to test whether underage people are drinking or whether cards with vowels on one side have an odd number on the other, are tests of whether people correctly understand "if, then" statements and whether they understand what confirms the four basic kinds of arguments below that utilize them.  It is difficult for most students to know the difference among the four "if, then" argument forms when presented abstractly just using letters to represent statements.  Whenever you have a statement of the form "If statement A is true then statement B is true", A is called the antecedent and B is called the consequent, and the "if, then" statement says that "A implies B" or that "A's being true lets you know that B is also true".  One rough way to think of that is to interpret it as "A leads to B".  This is sometimes represented as "A > B".  The important thing to notice is that the implication is stated to be in only one direction: from A to B, not from B to A.  If the implication happens to be one that can go in both directions, then that would need to be stated separately as both "A > B" and "B > A"; "if A is true, then B is true and if B is true, then A is true."  There are some relationships which do go in both directions, but most only go in one direction.  Being a triangle with all equal angles and being a triangle with all equal sides imply each other.  But being a square and being a rectangle do not imply each other; being a square does imply being a rectangle but being a rectangle does not imply being a square.  In some cases the implication is easy to see and to remember; in others it is not, and this will be an important point in regard to the different results in the above two tests of supposedly the same reasoning skill, which is essentially the following.
There are four argument forms that start with the premise, 'if A, then B', depending on whether the second premise affirms or denies the truth of the antecedent, or affirms or denies the truth of the consequent.  Two of the forms are always valid, giving good deductions; and two are always invalid, giving flawed deductions -- fallacies:
     valid form: affirming the antecedent to derive the consequent; i.e., knowing the consequent is true because the antecedent is; this is always valid:
1) if A, then B.
2) A. 
Therefore 3) B.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is a dog.
Therefore 3) Annie is an animal.
     valid form: denying the consequent to derive that the antecedent is also not true; i.e., knowing the antecedent must be false because the consequent is; this is always valid:
1) if A, then B.
2) not B. 
Therefore 3) not A.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is not an animal.
Therefore 3) Annie is not a dog.
     fallacy of affirming the consequent (mistakenly believing the antecedent must be true because the consequent is; no argument in this form is valid):
1) if A, then B. 
2) B. 
Therefore 3) A.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is an animal.
Therefore 3) Annie is a dog.
     fallacy of denying the antecedent (mistakenly believing the consequent must be false because the antecedent is; no argument in this form is valid):
1) if A, then B. 
2) not A. 
Therefore 3) not B.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is not a dog.
Therefore 3) Annie is not an animal.
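
The four forms above can be checked mechanically.  The following is only a minimal sketch, and it assumes the usual textbook reading of "if A, then B" as false only when A is true and B is false; the function names are my own, made up for illustration.  It tries every combination of true and false for A and B and reports a form as invalid if any combination makes both premises true and the conclusion false.

```python
from itertools import product

def implies(a, b):
    """'If a, then b' (assumed reading): false only when a is true and b is false."""
    return (not a) or b

def is_valid(second_premise, conclusion):
    """A form is valid if no case makes both premises true and the conclusion false."""
    for a, b in product([True, False], repeat=2):
        premises_true = implies(a, b) and second_premise(a, b)
        if premises_true and not conclusion(a, b):
            return False  # found a counterexample
    return True

# Each form: (second premise, conclusion), both as functions of A and B.
forms = {
    "affirming the antecedent": (lambda a, b: a, lambda a, b: b),
    "denying the consequent": (lambda a, b: not b, lambda a, b: not a),
    "affirming the consequent": (lambda a, b: b, lambda a, b: a),
    "denying the antecedent": (lambda a, b: not a, lambda a, b: not b),
}

results = {name: is_valid(p, c) for name, (p, c) in forms.items()}
for name, valid in results.items():
    print(f"{name}: {'valid' if valid else 'invalid'}")
```

The counterexample for both fallacies is the same case: A false and B true (Annie is an animal but not a dog).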
In the abstract without an example, this is somewhat difficult for people to understand, but it is particularly difficult for most people to apply when they are not familiar with the content because it is difficult for them to keep in mind what is related to what and in 'which direction'. 
(As an aside, though an important one, the logic of this, as one-way implications, is also difficult for people to apply when they are very familiar with the content and they know or believe that the particular content goes in both directions, so that not only "if A, then B", but also "if B, then A", which is the same as "A if and only if B", and which comes out to be the same as "if A, then B, and if not A, then not B".  For example in the test for whether a cake is finished baking or not: if a toothpick stuck into the center comes out clean, the cake is done, and if it comes out with particles stuck to it -- i.e., not clean -- the cake is not done.  The toothpick comes out clean if and only if the cake is done and it comes out dirty if and only if the cake is not done.  However, that is not the problem for the letter/number test and the drink/drinking age test.  The difference between those two test results can be explained by the familiarity/unfamiliarity distinction.)
For a different, more problematic, example, if your spouse says "I may go to the store later", that clearly lets you know that if they go to the store, then they won't be home (at that time).  But it also can suggest that if they are not at home, they will be at the store, even though it does not definitively imply it, because they are not saying that they won't go out otherwise or go somewhere else too.  So it may be understandable, if you come home and your spouse is not there, to assume they went to the store.  And it is also the case that if you guess wrong, your spouse will imply it was your fault.  E.g., if you ask whether they went to the store or not and they say "Well, I told you that I might go out to the store, and I was out, wasn't I!", that means they thought they had said they were only going out to the store if they went out.  And if they say "I never said I was only going to the store if I went out", that means they didn't mean it both ways, and it may even imply they didn't go to the store at all, but went somewhere else instead.

However, most people do not have problems with the above four argument forms when they are knowledgeable about and familiar with the relationship between the truth of A and the truth of B, and know that it only goes in one direction.  For example, you know that if someone was murdered, they are dead, and you know that people can be dead without having been murdered.  Now there are four possibilities that you find out about Jones:
     Jones was murdered (it is on the news).  In this case, you know Jones is dead.

     Jones is not dead (you see him working in his yard).  In this case, you know Jones was not murdered.

     Jones is dead (you find his body or his tombstone).  In this case if you think that means Jones was murdered, you can't really know that for sure, even if Jones seemed to be perfectly healthy earlier.

     Jones was not murdered.  In this case, if you think that means Jones is not dead, then you may be wrong or you may be right, but it is a mistake to think it must mean Jones is not dead.
But if you are told "If the card has a vowel on one side, it will have an odd number on the other" it is difficult to connect vowels with odd numbers, because that is not any kind of natural or familiar relationship and it easily could have gone the other way.  So it is easy to get confused about which card is necessary to turn over to disconfirm the statement that "If the card has a vowel on one side, it will have an odd number on the other".  That is a fairly convoluted problem, and failure to get it right doesn't show lack of logic any more than it shows you just got confused about which kinds of letters were lined up with which kinds of numbers.

On the other hand, knowing that Jones is dead because you found out he was murdered is not necessarily a result of your having made a deduction of the above sort either.  So even if you were given cards that on one side say "Jones is dead" or "Jones is not dead" and on the other side say "Jones was murdered" or "Jones was not murdered", and you were asked which cards you would need to turn over to check that what is on the hidden side correctly matches what is on the side you see, and you know which ones to turn over, that doesn't mean you are doing it from reasoning alone.  You can easily know that turning over a card that says "Jones is dead" won't tell you anything, because even if the cards are done correctly the other side can say he was murdered or that he was not; and neither will turning over a card that says "Jones was not murdered", because he could be dead or not, and you can't tell the card is wrong either way.  The only cards that would show the cards are not labeled correctly are cards that say "Jones was murdered" on one side and "Jones is not dead" on the other.  So cards that show either of those two things are the only cards you need to turn over to see what is on the other side.
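
The choice of which Jones cards to turn over can also be spelled out as a small procedure.  This is just a sketch under the assumption that every card has a murder statement on one side and a death statement on the other; the face labels and function names are hypothetical.  A card is worth turning over only if something that could be on its hidden side would break the rule "if Jones was murdered, then Jones is dead".

```python
MURDER_FACES = ["murdered", "not murdered"]
DEATH_FACES = ["dead", "not dead"]

def violates(murder_side, death_side):
    """The rule is broken only by a card pairing 'murdered' with 'not dead'."""
    return murder_side == "murdered" and death_side == "not dead"

def must_turn(visible):
    """True if some possible hidden face would show the card is mislabeled."""
    if visible in MURDER_FACES:
        return any(violates(visible, hidden) for hidden in DEATH_FACES)
    return any(violates(hidden, visible) for hidden in MURDER_FACES)

to_turn = [face for face in MURDER_FACES + DEATH_FACES if must_turn(face)]
print(to_turn)
```

The same procedure applies to the beer/age cards and to the vowel/number cards; only the labels change, which is the sense in which the tasks are "structurally identical" even though they are not equally easy.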

There are many things we tend to know more by familiarity than by reasoning.  E.g., the understanding of “place value” in arithmetic.  Children have a difficult time learning it (see Understanding and Teaching Place-Value) but most adults can work with it pretty well even though it is unlikely they really understand its logic, which I explain in the essay The Socratic Method: Teaching by Questions and in response to which I have received emails from elementary school math teachers saying that reading it let them understand place value for the first time.  One can work with things and answer questions about them -- even teach them -- without understanding their logic or basis.  Another example of that is that many students can do arithmetic but have great difficulty with algebra.  They can work with familiar concrete manipulations but do not really understand numerical relationships they have not had to work with or have not discovered on their own. 

In regard to the test about the false positives above, the concepts involved in the question are particularly unfamiliar (i.e., a 5% erroneous test about a .1% occurring condition).  That, combined with the ambiguity of the statement "The test gives 5% false positives", which can mean either that "the test is wrong 5% of the time when it says the condition is present" or (as the researchers take it to mean) that "the test will say positive in 5% of negative cases", is highly likely to make people give the wrong answer, regardless of their reasoning ability -- if they cannot ask for clarification of the question and have no reason to believe there is some computation involved in what you are asking, or that the .1% frequency of the disease is relevant to what is being asked.  E.g., if 5% of the tests are false positives, then 95% are not -- meaning that they either are positive and not false, or they are not positive.  And in fact, it is really difficult to see what it even means to ask "What is the chance of being right with a 5% false positive test for a .1% condition?"  That someone cannot work that out while answering, in a short amount of time, a set of questions they probably don't care about doesn't show they have difficulty reasoning.  Taking it to mean they do gives false positives itself.
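
To see how much turns on the interpretation, here is a rough sketch of the arithmetic under each reading.  The function name is made up, and the first reading needs an assumption the problem never states: that the test detects every real case of the disease (a sensitivity of 100%).

```python
def posterior_given_positive(prevalence, false_positive_rate, sensitivity=1.0):
    """Chance of disease given a positive result, by Bayes' rule.

    sensitivity=1.0 (the test never misses a real case) is an assumption;
    the problem as quoted does not say so.
    """
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Reading A (the researchers'): 5% of healthy people test positive.
reading_a = posterior_given_positive(prevalence=1 / 1000, false_positive_rate=0.05)

# Reading B: 5% of positive results are wrong, so 95% of them are right.
reading_b = 1 - 0.05

print(f"Reading A: {reading_a:.1%}")
print(f"Reading B: {reading_b:.0%}")
```

Under reading A the answer comes out at about 2%, and under reading B it is 95%, which are the two answers reported in the study.  Someone giving 95% may have answered a different question correctly rather than ignored the base rate.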

In general if you give people who should be reasonably knowledgeable questions many of them miss, it is perhaps a sign there is something wrong with the question and how you are interpreting the results you get from using it.  You should normally ask people why they chose the answer they did before just speculating from their answer.
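
The number progression from my talk shows how an unexpected answer can come from a perfectly good rule.  I am not reproducing the actual numbers I used; the progression 1, 2, 4 and both rules below are made up purely for illustration.

```python
def doubling(n):
    """Rule A: each term is twice the previous one."""
    return 2 ** n

def growing_gaps(n):
    """Rule B: the gap between terms grows by one each step."""
    return 1 + n * (n + 1) // 2

shown = [1, 2, 4]  # hypothetical progression given to the audience
next_terms = {}
for rule in (doubling, growing_gaps):
    terms = [rule(n) for n in range(4)]
    assert terms[:3] == shown  # both rules fit every number shown
    next_terms[rule.__name__] = terms[3]

print(next_terms)
```

Both rules account for everything shown, yet one continues with 8 and the other with 7.  Marking either answer wrong without asking for the reasoning behind it would misjudge the person who gave it.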

Plus, not being able to solve a logic problem at a particular time may show lack of the right creative inspiration, not lack of reasoning ability.  There are plenty of “brain teaser” logic puzzles that are extremely difficult to discover right answers about, but very easy to see when the right answer is pointed out (and explained, if necessary).  E.g., “the father of your father is not your father; and the cousin of your cousin may or may not be your cousin also.  Is the sibling of your sibling also your sibling?”  Most people would say "yes," but the correct answer is “not necessarily.”  The sibling of your sibling may be your sibling, but the sibling of your sibling may also be you; and you are indeed the only sibling of your sibling if your parents only had two children.  And, obviously, you are not your sibling.

Or consider this: A young boy, his father, and his grandfather are riding in a car that gets into a horrific accident.  The father and grandfather are killed immediately, and the boy is in bad shape but still alive; he needs surgery.  He is transported immediately to a nearby trauma center that has one of the country’s best surgeons on duty.  But upon seeing the patient, the surgeon says “I cannot operate on this patient because he is my son.”  How can that be?  Most people cannot figure out an answer to this, but everyone immediately recognizes how the answer  makes perfectly good sense -- which illustrates once again the difference between the reasoning and the creativity of problem solving.  That one is not creative enough to find the logical solution does not mean one does not have the reasoning ability necessary to solve it.  The answer is that the surgeon is the boy's mother.  I have had even ardent feminists not be able to answer it and then, upon hearing the answer, be immediately really upset at themselves for being unconsciously biased.

Moreover, some problems have more than one answer, depending on how one understands them.  One such question was raised by Martin Gardner in Scientific American (see https://en.wikipedia.org/wiki/Boy_or_Girl_paradox), where his answer was challenged.  The question is “If a family has two children and you know one of the two is a boy, what are the odds that both children are boys?”  Many people responded the odds were 50/50, since there is a 50/50 chance the other child is a boy.  But Gardner’s answer was there is a 1 out of 3 chance.  It turns out that if you are talking about a particular family where you know one child is a boy, the odds are 50/50 that the other child is also.  But if you are just asking about families with two children in general that have one boy, 1/3 of all such families will have two boys because there are three equal possibilities: the older child is a boy and the younger one is a boy; the older child is a boy and the younger one is a girl; the younger child is a boy and the older one is a girl.  There are roughly two families with one boy and one girl for every family with two boys.  There are roughly equal numbers of two children families which have a) two girls, which have b) two boys, which have c) older girl younger boy, and which have d) older boy, younger girl.  But since the question precludes families with both girls, that leaves twice the number with one girl and one boy as the number with both boys.   So if a psychologist asks a question with only one interpretation and answer in mind, but the subject has a different interpretation and gives a correct answer to it, the psychologist will misread the knowledge or reasoning ability of the subject. 
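
Both answers can be checked by listing the possible families.  This is a minimal sketch that assumes boys and girls are equally likely and that the two births are independent.  It also models the "particular family" reading, as an assumption, by fixing which child is known to be a boy (here the older one).

```python
from fractions import Fraction
from itertools import product

# All equally likely two-child families, written as (older, younger).
families = list(product("BG", repeat=2))

def chance_both_boys(condition):
    """Fraction of the families meeting the condition that have two boys."""
    matching = [f for f in families if condition(f)]
    both_boys = [f for f in matching if f == ("B", "B")]
    return Fraction(len(both_boys), len(matching))

# Reading 1: all we know is that at least one of the two children is a boy.
at_least_one_boy = chance_both_boys(lambda f: "B" in f)

# Reading 2: a specific child (the older one, say) is known to be a boy.
older_is_boy = chance_both_boys(lambda f: f[0] == "B")

print(at_least_one_boy, older_is_boy)
```

The first reading gives 1/3 and the second gives 1/2, so each answer is correct for the question its author had in mind.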

 

Some of Piaget’s questions of children to see whether they understood concepts at different ages are examples of this, because he seemed to have particular interpretations of the problems that were possibly not the interpretations of the children, and it is entirely possible that they did not have his interpretations because their experience was limited, not because their conceptual ability was undeveloped.

One such question is which of two glasses will hold more water, showing the child a tall glass and a shorter one that is fatter (i.e., has a larger circumference).  Younger children tend to say the taller glass.  But that does not show they do not have a concept of volume.  It may show that they think the question is about which will hold the taller amount of water, or it may mean they don’t realize that increasing the diameter of a cylinder by some proportion adds more volume to it than increasing its height by the same proportion does.  Most adults probably don't even know that, and don't know it is because the volume of a cylinder is V = πr²h, so that whatever factor you increase the radius by is squared but the factor you increase the height by is not.  Obviously children do not know that, but children also do not generally even have the experience of pouring liquids from one shape glass (or container) into another of a different shape but apparently similar size and being surprised that, or which, one holds more.  The fact children have no reason to know the taller glass might hold less does not mean they do not know what "less" means or that they do not have the concept of (greater) volume.  They simply could be thinking height matters more than width, since they have no experience to believe otherwise and height tends to be more noticeable or enticing.  The fact that most adults cannot tell you how many jelly beans are in a large jar, no matter what shape the jar, and the fact that guesses will probably be in a wide range among different adults, does not mean those adults do not know what volume means or what "number of jelly beans" means, or that they do not have the concept of either.
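
A quick calculation shows how unintuitive this can be.  The glass dimensions below are hypothetical, chosen only to illustrate the formula for the volume of a cylinder, V = πr²h.

```python
import math

def cylinder_volume(radius, height):
    """Volume of a cylinder: pi times the radius squared times the height."""
    return math.pi * radius ** 2 * height

# Hypothetical glasses, measured in centimeters.
tall_narrow = cylinder_volume(radius=3, height=15)
short_wide = cylinder_volume(radius=4, height=10)
print(f"tall and narrow: {tall_narrow:.0f} cubic cm")
print(f"short and wide:  {short_wide:.0f} cubic cm")

# Doubling the radius quadruples the volume; doubling the height only doubles it.
radius_effect = cylinder_volume(2, 1) / cylinder_volume(1, 1)
height_effect = cylinder_volume(1, 2) / cylinder_volume(1, 1)
```

Here the shorter glass holds more even though it is a third shorter and only one centimeter larger in radius, which is not something a child (or most adults) would be likely to guess by looking.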

Similarly showing a child two strings of equal length -- one stretched out straight and the other in a serpentine configuration -- and asking which is longer, does not show that the children who pick the stretched out one do not have a concept of length.  It may show they don’t realize how much longer a curved line will look if straightened out.  You cannot likely choose shoe laces for your shoes on the basis of their length; typically you have to be told which lengths fit shoes with which number of eyelets.  And most of us could not even choose a belt that would fit us or someone else just by looking at different length belts without knowing their sizes in inches, and it is extremely difficult to guess the correct length of various curved lines. 

Or it may be that the child thinks “longer” means “further from the starting point” (as in ‘as the crow flies’), in which case the stretched out straight string is longer than the serpentine one (no matter how long the serpentine one would be if stretched out straight) whose other end does not extend as far from the starting point.  And the fact adults may not mean it that way in a case like this is something one learns simply from experience with adults, not necessarily from the acquisition of an intellectual or conceptual ability one didn't have.  Without understanding the child's reasoning, which may or may not be easy to ascertain, it is likely a mistake to assume it is deficient in some way.


One should be very skeptical of any claim by a social scientist that some precise objective test captures what is represented or meant by any subjective quality and judgment, or that it measures how we make the judgment and whether we do it reliably or reasonably or consistently or not.

[1] He then goes on to explain why so many people will vote for Donald Trump because of this, in order to protest against and thwart both the Democratic and the Republican Establishment, because Trump seems to these voters to demonstrate understanding of the people’s frustration with lack of opportunity.


[2] This sets aside for now how it is right to treat or take care of those who cannot contribute much because of youth, old age, disability, illness, etc.  Those are important issues, but separate from what it means to be an economically mobile society.

