DEPENDENCE OF THE SCALE'S RELIABILITY ON THE TRAINING OF THE EXAMINER.
On this point two radically different opinions have been urged. On the one hand, some have insisted that the results of a test made by other than a thoroughly trained psychologist are absolutely worthless. At the opposite extreme are a few who seem to think that any teacher or physician can secure perfectly valid results after a few hours' acquaintance with the tests.
The dispute is one which cannot be settled by the assertion of opinion, and, unfortunately, thoroughgoing investigations have not yet been made as to the frequency and extent of errors made by untrained or partially trained examiners. The only study of this kind which has so far been reported is the following:--[37]
[37] Samuel C. Kohs: "The Binet Test and the Training of Teachers," in _The Training School Bulletin_ (1914), pp. 113-17.
Dr. Kohs gives the results of tests made by 58 inexperienced teachers who were taking a summer course in the Training School at Vineland. The class met three times a week for instruction in the use of the Binet scale. During the first week the students listened to three lectures by Dr. Goddard. The second week was given over to demonstration testing. Each student saw four children tested, and attended two discussion periods of an hour each. During the third, fourth, and fifth weeks each student tested one child per week, and observed the testing of two others. The student was allowed to carry the test through in his own way, but received criticism after it was finished. Twice a week Dr. Goddard spent an hour with the class, discussing experimental procedure. The subjects tested were feeble-minded children whose exact mental ages were already known, and for this reason it was possible to check up the accuracy of each student's work.
Kohs's table of results for the trial testing of the 174 children showed:--
(1) That 50 per cent of the work was as exact as any one in the laboratory could make it;
(2) That in an additional 38 per cent the results were within three fifths of a year of being exact;
(3) That nearly 90 per cent of the work of the summer students was sufficiently accurate for all practical purposes;
(4) That the records improved during the brief training so that during the third week only one test missed the real mental age by as much as a year.
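The figures above can be cross-checked with a little arithmetic. The sketch below, in Python, assumes that the 174 trial tests are simply the three practice tests given by each of the 58 students (one per week in the third, fourth, and fifth weeks), and that item (3) is the sum of items (1) and (2). The variable names are ours and do not come from Kohs's report.

```python
# Illustrative cross-check of the Kohs figures; all names are hypothetical.
STUDENTS = 58
TESTS_PER_STUDENT = 3  # one child per week in weeks three, four, and five

# Total number of trial tests implied by the course schedule.
total_tests = STUDENTS * TESTS_PER_STUDENT

exact_pct = 50       # item (1): as exact as the laboratory could make it
near_exact_pct = 38  # item (2): within three fifths of a year of being exact

# Share of the work counted as accurate enough for practical purposes.
usable_pct = exact_pct + near_exact_pct

print(total_tests)  # 174
print(usable_pct)   # 88
```

On these assumptions the schedule accounts for all 174 tests, and the "nearly 90 per cent" of item (3) corresponds to 88 per cent.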
Since hardly any of these students had had any previous experience with the Binet tests, Dr. Kohs seems to be entirely justified in his conclusion that it is possible, in the brief period of six weeks, to teach people to use the tests with a reasonable degree of accuracy.
What shall we say of the teacher or of the physician who has not even had this amount of instruction? The writer's experience forces him to agree with Binet and with Dr. Goddard, that any one with intelligence enough to be a teacher, and who is willing to devote conscientious study to the mastery of the technique, can use the scale accurately enough to get a better idea of a child's mental endowment than he could possibly get in any other way. It is necessary, however, for the untrained person to recognize his own lack of experience, and in no case would it be justifiable to base important action or scientific conclusions upon the results of the inexpert examiner. As Binet himself repeatedly insisted, the method is not absolutely mechanical, and cannot be made so by elaboration of instructions.
It is sometimes held that the examination and classification of backward children for special instruction should be carried out by the school physicians. The fact is, however, that there is nothing in the physician's training to give him any advantage over the ordinary teacher in the use of the Binet tests. Because of her more intimate knowledge of children and because of her superior tact and adaptability, the average teacher is perhaps better equipped than the average physician to give intelligence tests.
Finally, it should be emphasized that whatever the previous training or experience of the examiner may have been, his ability to adjust to the child's personality and his willingness to follow conscientiously the directions for giving the tests are important factors in his equipment.
INFLUENCE OF THE SUBJECT'S ATTITUDE. One continually meets such queries as, "How do you know the subject did his best?" "Possibly the child was nervous or frightened," or, "Perhaps incorrect answers were purposely given." All such objections may be disposed of by saying that the competent examiner can easily control the experiment in such a way that embarrassment is soon replaced by self-confidence, and in such a way that effort is kept at its maximum. As for mischievous deception, it would be a poor clinicist who could not recognize and deal with the little that is likely to arise.
Cautions regarding embarrassment, fatigue, fright, illness, etc., are given in Chapter IX. Most of the errors which have been reported along this line are such as can nearly always be avoided by ordinary prudence, coupled with a little power of observation.[38] We must not charge the mistakes of untrained and indiscreet examiners against the validity of the method itself.
[38] See, for example, the rather ludicrous "errors" of the Binet method reported in _The Psychological Clinic_ for 1915, pp. 140 _ff._ and 167 _ff._
It is possibly true that even if the examiner is tactful and prudent an unfavorable attitude on the part of the subject may occasionally affect the results of a test to some extent, but it ought not seriously to invalidate one examination out of five hundred. The greatest danger is in the case of a young subject who has been recently arrested and brought before a court. Even here a little common sense and scientific insight should enable one to guard against a mistaken diagnosis.
THE INFLUENCE OF COACHING. It might be supposed that after the intelligence scale had been used with a few pupils in a given school all of their fellows would soon be apprised of the nature of the tests, and so learn the correct responses. Experience shows, however, that there is little likelihood of such influence except in the case of a small minority of the tests. Experiments in the psychology of testimony have demonstrated that children's ability to report upon a complex set of experiences is astonishingly weak. In testing with the Stanford revision a child is ordinarily given from twenty-four to thirty different tests, many of which are made up of three or more items. Of the total forty to fifty items the child is ordinarily able to report but few, and these not always correctly.
Such tests as memory for sentences and digits, drawing the square and diamond, reproducing the designs from memory, comparing weights and lines, describing and interpreting pictures, aesthetic comparison, vocabulary, dissected sentences, fables, reading for memories, finding differences and similarities, arithmetical reasoning, and the form-board test, are hardly subject to report at all. While almost any of the other tests might, theoretically, be communicated, there is little danger that many of them will be. It is assumed, of course, that the examiner will take proper precautions to prevent any of his blanks or other materials from falling into the hands of those who are to be examined.
The following tests are the ones most subject to the influence of coaching: Ball and field, giving date, naming sixty words, finding rhymes, changing hands of clock, comprehension of physical relations, "induction test," and "ingenuity test."
In several instances we have interviewed children an hour or two after they had taken the examination, in order to find out how many of the tests they could recall. A boy of 4 years, after repeated questioning, could only say: "He showed me some pictures. He had a knife and a penny. He told me to shut the door." A girl of 3 years could recall nothing whatever that was intelligible.
An 8-year-old boy said: "He made me tie a knot. He asked me about a ship and an auto. He wanted me to count backwards. He made me say over some things, numbers and things."
A boy of 12 years said: "He told me to say all the words I could think of. He said some foolish things and asked what was foolish [he could not repeat a single absurdity]. I had to put some blocks together. I had to do some problems in arithmetic [he could not repeat a single problem]. He read some fables to me. [Asked about the fables he was able to recall only part of one, that of the fox and the crow.] He showed me the picture of a field and wanted to know how to find a ball."
It is evident from the above samples of report that the danger of coaching increases considerably with the age of the children concerned. With young subjects the danger is hardly present at all; with children of the upper-grammar grades, in the high school, and most of all in prisons and reformatories, it must be taken into account. Alternative tests may sometimes be used to advantage when there is evidence of coaching on any of the regular tests. It would be desirable to have two or three additional scales which could be used interchangeably with the Binet-Simon.
RELIABILITY OF REPEATED TESTS. Will the same tests give consistent results when used repeatedly with the same subject? In general we may say that they do. Something depends, however, on the age and intelligence of the subject and on the time interval between the examinations.
Goddard proves that feeble-minded individuals whose intelligence has reached its full development continue to test at exactly the same mental age by the Binet scale, year after year. In their case, familiarity with the tests does not in the least improve the responses. At each retesting the responses given at previous examinations are repeated with only the most trivial variations. Of 352 feeble-minded children tested at Vineland, three years in succession, 109 gave absolutely no variation, 232 showed a variation of not more than two fifths of a year, while 22 gained as much as one year in the three tests. The latter, presumably, were younger children whose intelligence was still developing.
Goddard has also tested 464 public-school children for three successive years. Approximately half of these showed normal progress or more in mental age, while most of the remainder showed somewhat less than normal progress.
Bobertag's retesting of 83 normal children after an interval of a year gave results entirely in harmony with those of Goddard. The reapplication of the tests showed absolutely no influence of familiarity, the correlation of the two tests being almost perfect (.95). Those who tested "at age" in the first test had advanced, on the average, exactly one year. Those who tested _plus_ in the first test advanced in the twelve months about a year and a quarter, as we should expect those to do whose mental development is accelerated. Correspondingly, those who tested _minus_ at the first test advanced only about three fourths of a year in mental age during the interval.[39]
[39] Otto Bobertag: "Ueber Intelligenz Prüfungen," in _Zeitsch. f. Angew. Psychol._ (1912), p. 521 _ff._
Our own results with a mixed group of normal, superior, dull and feeble-minded children agree fully with the above findings. In this case the two tests were separated by an interval of two to four years, and the correlation between their results was practically perfect. The average difference between the I Q obtained in the second test and that obtained in the first was only 4 per cent, and the greatest difference found was only 8 per cent.[40]
[40] See _The Stanford Revision and Extension of the Binet-Simon Scale for Measuring Intelligence_. (Warwick and York, 1916.)
The repetition of the test at shorter intervals will perhaps affect the result somewhat more, but the influence is much less than one might expect. The writer has tested, at intervals of only a few days to a few weeks, 14 backward children of 12 to 18 years, and 8 normal children of 5 to 13 years. The backward children showed an average improvement in the second test of about two months in mental age, the normal children an average improvement of little more than three months. No child varied in the second test more than half a year from the mental age first secured. On the whole, normal children profit more from the experience of a previous test than do the backward and feeble-minded.
Berry tested 45 normal children and 50 defectives with the Binet 1908 and 1911 scales at brief intervals. The author does not state which scale was applied first, but the mental ages secured by the two scales were practically the same when allowance was made for the slightly greater difficulty of the 1911 series of tests.[41]
[41] Charles Scott Berry: "A Comparison of the Binet Tests of 1908 and 1911," in _Journal of Educational Psychology_ (1912), pp. 444-51.
We may conclude, therefore, that while it would probably be desirable to have one or more additional scales for alternative use in testing the same children at very brief intervals, the same scale may be used for repeated tests at intervals of a year or more with little danger of serious inaccuracy. Moreover, results like those set forth above are important evidence as to the validity of the test method.
INFLUENCE OF SOCIAL AND EDUCATIONAL ADVANTAGES. The criticism has often been made that the responses to many of the tests are so much subject to the influence of school and home environment as seriously to invalidate the scale as a whole. Some of the tests most often named in this connection are the following: Giving age and sex; naming common objects, colors, and coins; giving the value of stamps; giving date; naming the months of the year and the days of the week; distinguishing forenoon and afternoon; counting; making change; reading for memories; naming sixty words; giving definitions; finding rhymes; and constructing a sentence containing three given words.
It has in fact been found wherever comparisons have been made that children of superior social status yield a higher average mental age than children of the laboring classes. The results of Decroly and Degand and of Meumann, Stern, and Binet himself may be referred to in this connection. In the case of the Stanford investigation, also, it was found that when the unselected school children were grouped in three classes according to social status (superior, average, and inferior), the average I Q for the superior social group was 107, and that of the inferior social group 93. This is equivalent to a difference of one year in mental age with 7-year-olds, and to a difference of two years with 14-year-olds.
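The conversion from an I Q difference to a difference in mental age can be sketched as follows. The example is in Python and assumes the usual definition of the I Q as mental age divided by chronological age, multiplied by 100; the function names are ours and are offered only as an illustration.

```python
def mental_age(iq, chronological_age):
    """Mental age in years implied by an I Q at a given chronological age.

    Assumes I Q = 100 * mental age / chronological age.
    """
    return iq * chronological_age / 100.0


def mental_age_gap(iq_high, iq_low, chronological_age):
    """Difference in mental age between two I Q levels at the same age."""
    return (mental_age(iq_high, chronological_age)
            - mental_age(iq_low, chronological_age))


# Average I Q of the superior (107) and inferior (93) social groups.
for age in (7, 14):
    gap = mental_age_gap(107, 93, age)
    print(f"age {age}: gap of {gap:.2f} years")
# age 7: gap of 0.98 years
# age 14: gap of 1.96 years
```

A difference of 14 points thus comes to very nearly one year of mental age at 7 and two years at 14, in agreement with the figures given above.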
However, the common opinion that the child from a cultured home does better in tests solely by reason of his superior home advantages is an entirely gratuitous assumption. Practically all of the investigations which have been made of the influence of nature and nurture on mental performance agree in attributing far more to original endowment than to environment. Common observation would itself suggest that the social class to which the family belongs depends less on chance than on the parents' native qualities of intellect and character.
The results of five separate and distinct lines of inquiry based on the Stanford data agree in supporting the conclusion that the children of successful and cultured parents test higher than children from wretched and ignorant homes for the simple reason that their heredity is better. The results of this investigation are set forth in full elsewhere.[42]
[42] See _The Stanford Revision and Extension of the Binet-Simon Scale for Measuring Intelligence_. (Warwick and York, 1916.)
It would, of course, be going too far to deny all possibility of environmental conditions affecting the result of an intelligence test. Certainly no one would expect that a child reared in a cage and denied all intercourse with other human beings could by any system of mental measurement test up to the level of normal children. There is, however, no reason to believe that _ordinary_ differences in social environment (apart from heredity), differences such as those obtaining among unselected children attending approximately the same general type of school in a civilized community, affect to any great extent the validity of the scale.
A crucial experiment would be to take a large number of very young children of the lower classes and, after placing them in the most favorable environment obtainable, to compare their later mental development with that of children born into the best homes. No extensive study of this kind has been made, but the writer has tested twenty orphanage children who, for the most part, had come from very inferior homes. They had been in a well-conducted orphanage for from two to several years, and had enjoyed during that time the advantages of an excellent village school. Nevertheless, all but three tested below average, ranging from 75 to 90 I Q.
The impotence of school instruction to neutralize individual differences in native endowment will be evident to any one who follows the school career of backward children. The children who are seriously retarded in school are not normal, and cannot be made normal by any refinement of educational method. As a rule, the longer the inferior child attends school, the more evident his inferiority becomes. It would hardly be reasonable, therefore, to expect that a little incidental instruction in the home would weigh very heavily against these same native differences in endowment. Cases like the following show conclusively that it does not:--
X is the son of unusually intelligent and well-educated parents. The home is everything one would expect of people of scholarly pursuits and cultivated tastes. But X has always been irresponsible, troublesome, childish, and queer. He learned to walk at 2 years, to talk at 3, and has always been delicate and nervous. When brought for examination he was 8 years old. He had twice attempted school work, but could accomplish nothing and was withdrawn. His play-life was not normal, and other children, younger than himself, abused and tormented him. The Binet tests gave an I Q of approximately 75; that is, the retardation amounted to about two years. The child was examined again three years later. At that time, after attending school two years, he had recently completed the first grade. This time the I Q was 73. Strange to say, the mother is encouraged and hopeful because she sees that her boy is learning to read. She does not seem to realize that at his age he ought to be within three years of entering high school.
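The arithmetic of this case may be sketched in Python. The sketch assumes that the I Q is 100 times the mental age divided by the chronological age, and that X was 11 years old at the second examination (8 years plus the three-year interval); the function name is ours.

```python
def retardation_years(iq, chronological_age):
    """Years by which mental age falls short of chronological age.

    Assumes I Q = 100 * mental age / chronological age.
    """
    mental_age = iq * chronological_age / 100.0
    return chronological_age - mental_age


first = retardation_years(75, 8)    # first examination, at 8 years
second = retardation_years(73, 11)  # second examination, assumed at 11 years

print(f"first examination: {first:.1f} years behind")
print(f"second examination: {second:.1f} years behind")
# first examination: 2.0 years behind
# second examination: 3.0 years behind
```

On these assumptions the I Q is nearly unchanged while the absolute retardation has grown from about two years to about three, which is the sense in which the inferiority becomes more evident the longer the child attends school.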