87. One's ........ in life ........ upon so ........ factors ........ it is not ........ to state any single ........ for ........ failure.
89. The future ........ of the stars and the facts of ........ history are ........ now once for all, ........ I like them ........ not.
Other standard tests and scales of measurement have been derived and are being developed. The examples given above will, however, suffice to make clear the distinction between the ordinary type of examination and the more careful study of the achievements of children which may be accomplished by using these measuring sticks. It is important for any one who would attempt to apply these tests to know something of the technique of recording results.
In the first place, the measurement of a group is not expressed satisfactorily by giving the average score or rate of achievement of the class. It is true that this is one measure, but it is not one which tells enough, and it is not the one which is most significant for the teacher. It is important whenever we measure children to get as clear a view as we can of the whole situation. For this purpose we want not primarily to know what the average performance is, but, rather, how many children there are at each level of achievement. In arithmetic, for example, we want to know how many there are who can do none of the Courtis problems in addition, or how many there are who can do the first six on the Woody test, how many can do seven, eight, and so on. In penmanship we want to know how many children there are who write quality eight, or nine, or ten, or sixteen, or seventeen, as the case may be.
The work of the teacher can never be accomplished economically except as he gives more attention to those who are less proficient, and provides more and harder work for those who are capable, or else relieves the able members of the class from further work in the field. It will be well, therefore, to prepare, for the sake of comparing grades within the same school or school system, or for the sake of comparing the work of a class at two different times during the year, a table which shows just how many children there are in the group who have reached each level of achievement. Such tables for work in composition for a class at two different times, six months apart, appear as follows:
DISTRIBUTION OF COMPOSITION SCORES FOR A SEVENTH GRADE
======================================
              |  NUMBER OF CHILDREN
              +-----------+-----------
              | NOVEMBER  | FEBRUARY
--------------+-----------+-----------
Rated at 0    |     0     |     0
         1.83 |     1     |     1
         2.60 |     6     |     4
         3.69 |    12     |     6
         4.74 |     8     |    11
         5.85 |     3     |     4
         6.75 |     1     |     3
         7.72 |     1     |     2
         8.38 |     0     |     1
         9.37 |     0     |     0
======================================
A study of such a distribution would show not only that the average performance of the class has been raised, but also that those in the lower levels have, in considerable measure, been brought up; that is, that the teacher has been working with those who showed less ability, and not simply pushing ahead a few who had more than ordinary capacity.
It would be possible to increase the average performance by working wholly with the upper half of the class while neglecting those who showed less ability. From a complete distribution, such as has been given above, it becomes evident that this has not been the method of the teacher. He has sought apparently to do everything that he could to improve the quality of work upon the part of all of the children in the class.
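The contrast between the average and the full distribution may be checked with a short calculation. The following is only a sketch: it copies the seventh-grade figures from the table above, and it assumes (for illustration only) that each paper may be treated as scoring exactly the value of the sample at which it was rated. The function names `average_score` and `number_at_or_below` are hypothetical helpers, not part of any published procedure.

```python
# Scale values and the number of children at each level, copied from the
# seventh-grade composition table above.
scale_values = [0, 1.83, 2.60, 3.69, 4.74, 5.85, 6.75, 7.72, 8.38, 9.37]
november = [0, 1, 6, 12, 8, 3, 1, 1, 0, 0]
february = [0, 1, 4, 6, 11, 4, 3, 2, 1, 0]


def average_score(values, counts):
    """Average of the class, treating each paper as scoring its sample value
    (an assumption made only for this illustration)."""
    return sum(v * c for v, c in zip(values, counts)) / sum(counts)


def number_at_or_below(values, counts, level):
    """How many children are rated at the given level or lower."""
    return sum(c for v, c in zip(values, counts) if v <= level)


for label, counts in (("November", november), ("February", february)):
    print(
        label,
        "average:", round(average_score(scale_values, counts), 2),
        "rated 3.69 or lower:", number_at_or_below(scale_values, counts, 3.69),
    )
```

Under these assumptions the average rises between the two dates, and the number of children rated at 3.69 or lower falls, which is the point the distribution makes and the average alone cannot.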
It is very interesting to note, when such complete distributions are given, how the achievement of children in various classes overlaps. For example, the distribution of the number of examples on the Courtis tests, correctly finished in a given time by pupils in the several grades, makes it clear that there are children in the fifth grade who do better than many in the eighth.
THE DISTRIBUTION OF THE NUMBER OF EXAMPLES CORRECTLY FINISHED IN THE GIVEN TIME BY PUPILS IN THE SEVERAL GRADES
===================================================================
            ADDITION             |           SUBTRACTION
---------------------------------+---------------------------------
 No. OF  |        GRADES         |  No. OF  |        GRADES
EXAMPLES +-----+-----+-----+-----+ EXAMPLES +-----+-----+-----+----
FINISHED |  5  |  6  |  7  |  8  | FINISHED |  5  |  6  |  7  |  8
---------+-----+-----+-----+-----+----------+-----+-----+-----+----
    0    |  12 |  15 |   5 |   4 |     0    |   6 |   2 |   2 |  --
    1    |  26 |  23 |  14 |   9 |     1    |   5 |   6 |   2 |   1
    2    |  27 |  31 |   8 |   6 |     2    |   7 |   8 |   1 |  --
    3    |  31 |  27 |  27 |   9 |     3    |  13 |  21 |   3 |   1
    4    |  25 |  28 |  19 |  16 |     4    |  21 |  18 |  13 |   2
    5    |  16 |  23 |  16 |  15 |     5    |  26 |  30 |  12 |   7
    6    |  15 |  22 |  12 |  12 |     6    |  17 |  27 |  15 |   9
    7    |   1 |  11 |   8 |   9 |     7    |  15 |  27 |  18 |   9
    8    |   3 |   4 |   6 |  11 |     8    |  15 |  20 |  12 |  12
    9    |   1 |   2 |   3 |   8 |     9    |  10 |  13 |   9 |  12
   10    |  -- |  -- |  -- |   6 |    10    |   8 |   6 |  13 |  11
   11    |  -- |  -- |   1 |  -- |    11    |   6 |   2 |   3 |  12
   12    |  -- |  -- |   1 |   2 |    12    |   3 |   1 |   7 |   9
   13    |  -- |  -- |  -- |  -- |    13    |   2 |   2 |   3 |   5
   14    |  -- |  -- |  -- |  -- |    14    |   1 |   1 |   3 |   7
   15    |  -- |  -- |  -- |   2 |    15    |  -- |  -- |   2 |   3
   16    |  -- |  -- |  -- |   1 |    16    |  -- |  -- |   1 |   2
   17    |  -- |  -- |  -- |  -- |    17    |  -- |   1 |  -- |   1
   18    |  -- |  -- |  -- |  -- |    18    |  -- |  -- |  -- |   1
   19    |  -- |  -- |  -- |  -- |    19    |  -- |  -- |  -- |   4
   20    |  -- |  -- |  -- |  -- |    20    |  -- |  -- |  -- |   2
   21    |  -- |  -- |  -- |  -- |    21    |  -- |  -- |  -- |   1
   22    |  -- |  -- |  -- |  -- |    22    |  -- |  -- |  -- |  --
---------+-----+-----+-----+-----+----------+-----+-----+-----+----
  Total  |     |     |     |     |          |     |     |     |
 papers  | 157 | 186 | 119 | 111 |          | 155 | 185 | 119 | 111
===================================================================
THE DISTRIBUTION OF THE NUMBER OF EXAMPLES CORRECTLY FINISHED IN THE GIVEN TIME BY PUPILS IN THE SEVERAL GRADES
===================================================================
         MULTIPLICATION          |            DIVISION
---------------------------------+---------------------------------
 No. of  |        GRADES         |  No. of  |        GRADES
Examples +-----+-----+-----+-----+ Examples +-----+-----+-----+----
Finished |  5  |  6  |  7  |  8  | Finished |  5  |  6  |  7  |  8
---------+-----+-----+-----+-----+----------+-----+-----+-----+----
    0    |  10 |   4 |  -- |  -- |     0    |  17 |   7 |   1 |  --
    1    |  10 |   4 |   3 |  -- |     1    |  19 |  17 |   2 |   1
    2    |  19 |  20 |   5 |   1 |     2    |  18 |  22 |   8 |   4
    3    |  21 |  17 |  11 |   5 |     3    |  21 |  26 |   6 |   2
    4    |  28 |  31 |  16 |   3 |     4    |  25 |  27 |   8 |   6
    5    |  26 |  34 |  12 |  13 |     5    |  21 |  27 |  11 |   7
    6    |  24 |  27 |  13 |  13 |     6    |   9 |  15 |  12 |   4
    7    |   9 |  20 |  16 |  10 |     7    |  10 |  15 |  16 |  18
    8    |   5 |  14 |  21 |  19 |     8    |   6 |   7 |  20 |   9
    9    |   3 |   9 |  11 |  13 |     9    |   4 |   7 |  11 |   6
   10    |  -- |   4 |   6 |  10 |    10    |   4 |   9 |   7 |  13
   11    |   1 |  -- |   2 |   9 |    11    |   1 |   3 |   3 |   7
   12    |  -- |  -- |   2 |   6 |    12    |  -- |   2 |  10 |  10
   13    |  -- |  -- |   1 |   3 |    13    |  -- |   2 |  -- |  10
   14    |  -- |  -- |  -- |   3 |    14    |   1 |  -- |   1 |   4
   15    |  -- |  -- |  -- |  -- |    15    |  -- |   1 |   2 |   9
   16    |  -- |  -- |  -- |   1 |    16    |  -- |  -- |  -- |   2
   17    |  -- |  -- |  -- |  -- |    17    |  -- |  -- |  -- |   4
   18    |  -- |  -- |  -- |   1 |    18    |  -- |  -- |  -- |   2
   19    |  -- |  -- |  -- |   1 |    19    |  -- |  -- |  -- |   1
   20    |  -- |  -- |  -- |  -- |    20    |  -- |  -- |  -- |   1
   21    |  -- |  -- |  -- |  -- |    21    |  -- |  -- |  -- |   1
   22    |  -- |  -- |  -- |  -- |    22    |  -- |  -- |  -- |  --
---------+-----+-----+-----+-----+----------+-----+-----+-----+----
  Total  |     |     |     |     |          |     |     |     |
 Papers  | 156 | 184 | 119 | 111 |          | 156 | 187 | 118 | 111
===================================================================
If the tests had been given in the fourth or the third grade, it would have been found that there were children, even as low as the third grade, who could do as well as or better than some of the children in the eighth grade. Such comparisons of achievements among children in various subjects ought to lead at times to reorganizations of classes, to the grouping of children for special instruction, and to the rapid promotion of the more capable pupils.
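The overlap between grades may be made concrete by counting pupils on either side of some level of performance. The sketch below uses only the addition columns for grades 5 and 8 as printed in the table above, with the blank ("--") entries entered as zero. The cut-off of five examples and the helper names are hypothetical, chosen merely for illustration.

```python
# Addition columns for grades 5 and 8, copied from the table above.
# List index = number of examples finished; "--" entries are entered as 0.
grade5_addition = [12, 26, 27, 31, 25, 16, 15, 1, 3, 1]
grade8_addition = [4, 9, 6, 9, 16, 15, 12, 9, 11, 8, 6, 0, 2, 0, 0, 2, 1]


def count_at_least(counts, threshold):
    """Pupils who finished `threshold` examples or more."""
    return sum(counts[threshold:])


def count_below(counts, threshold):
    """Pupils who finished fewer than `threshold` examples."""
    return sum(counts[:threshold])


threshold = 5  # hypothetical cut-off, chosen only for illustration
fifth_at_or_above = count_at_least(grade5_addition, threshold)
eighth_below = count_below(grade8_addition, threshold)

print("Fifth-grade pupils finishing", threshold, "or more:", fifth_at_or_above)
print("Eighth-grade pupils finishing fewer than", threshold, ":", eighth_below)
```

Any fifth-grade pupil in the first count did better in addition than every eighth-grade pupil in the second, which is the overlap the tables are meant to show.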
In many of these measurements it will be found helpful to describe the group by naming the point above and below which half of the cases fall. This is called the median. Because of the very common use of this measure in the current literature of education, it may be worth while to discuss carefully the method of its derivation.[30]
[31]The _median point_ of any distribution of measures is that point on the scale which divides the distribution into two exactly equal parts, one half of the measures being greater than this point on the scale, and the other half being smaller. When the scales are very crude, or when small numbers of measurements are being considered, it is not worth while to locate this median point any more accurately than by indicating on what step of the scale it falls. If the measuring instrument has been carefully derived and accurately scaled, however, it is often desirable, especially where the group being considered is reasonably large, to locate the exact point within the step on which the median falls. If the unit of the scale is some measure of the variability of a defined group, as it is in the majority of our present educational scales, this median point may well be calculated to the nearest tenth of a unit, or, if there are two hundred or more individual measurements in the distribution, it may be found interesting to calculate the median point to the nearest hundredth of a scale unit. Very seldom will anything be gained by carrying the calculation beyond the second decimal place.
The best rule for locating the median point of a distribution is to _take as the median that point on the scale which is reached by counting out one half of the measures_, the measures being taken in the order of their magnitude. If we let _n_ stand for the number of measures in the distribution, we may express the rule as follows: Count into the distribution, from either end of the scale, a distance covered by _n_/2 measures. For example, if the distribution contains 20 measures, the median is that point on the scale which marks the end of the 10th and the beginning of the 11th measure. If there are 39 measures in the distribution, the median point is reached by counting out 19-1/2 of the measures; in other words, the median of such a distribution is at the mid-point of that fraction of the scale assigned to the 20th measure.
The _median step_ of a distribution is the step which contains within it the median point. Similarly, the _median measure_ in any distribution is the measure which contains the median point. In a distribution containing 25 measures, the 13th measure is the median measure, because 12 measures are greater and 12 are less than the 13th, while the 13th measure is itself divided into halves by the median point. Where a distribution contains an even number of measures, there is in reality no median measure but only a median point between the two halves of the distribution. Where a distribution contains an uneven number of measures, the median measure is the (_n_+1)/2 measurement, at the mid-point of which measure is the median point of the distribution.
Much inaccurate calculation has resulted from misguided attempts to secure a _median point_ with the formula just given, which is applicable only to the location of the _median measure_. It will be found much more advantageous in dealing with educational statistics to consider only the median point, and to use only the _n_/2 formula given in a previous paragraph, for practically all educational scales are or may be thought of as continuous scales rather than scales composed of discrete steps.
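The difference between the two formulas may be seen by applying both to the cases mentioned above. This is a minimal sketch; the function names are hypothetical, and returning `None` for an even number of measures is simply one way of expressing that no median measure exists in that case.

```python
def median_point_position(n):
    """Number of measures to count out to reach the median point: n/2."""
    return n / 2


def median_measure_rank(n):
    """Rank of the median measure, (n+1)/2, which exists only when n is odd."""
    if n % 2 == 0:
        return None  # an even distribution has a median point but no median measure
    return (n + 1) // 2


for n in (20, 25, 39):
    print(n, median_point_position(n), median_measure_rank(n))
```

For 20 measures the median point lies after the 10th measure and there is no median measure; for 39 it lies 19-1/2 measures in, at the middle of the 20th measure.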
The greatest danger to be guarded against in considering all scales as continuous rather than discrete, is that careless thinkers may refine their calculations far beyond the accuracy which their original measurements would warrant. One should be very careful not to make such unjustifiable refinements in his statement of results as are often made by young pupils when they multiply the diameter of a circle, which has been measured only to the nearest inch, by 3.1416 in order to find the circumference. Even in the ordinary calculation of the average point of a series of measures of length, the amateur is sometimes tempted, when the number of measures in the series is not contained an even number of times in the sum of their values, to carry the quotient out to a larger number of decimal places than the original measures would justify. Final results should usually not be refined far beyond the accuracy of the original measures.
It is of utmost importance in calculating medians and other measures of a distribution to keep constantly in mind the significance of each step on the scale. If the scale consists of tasks to be done or problems to be solved, then "doing 1 task correctly" means, when considered as part of a continuous scale, anywhere from doing 1.0 up to doing 2.0 tasks. A child receives credit for "2 problems correct" whether he has just barely solved 2.0 problems or has just barely fallen short of solving 3.0 problems. If, however, the scale consists of a series of productions graduated in quality from very poor to very good, with which series other productions of the same sort are to be compared, then each sample on the scale stands at the middle of its "step" rather than at the beginning.
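For a scale of tasks, then, the median point may be found by counting out _n_/2 measures and passing into the step where the count ends. The sketch below assumes that each step spans one whole unit (a score of 3 covering 3.0 up to 4.0) and that the pupils on a step are spread evenly over it. The data are the fifth-grade addition figures from the Courtis table given earlier; the function name is hypothetical.

```python
def median_point_magnitude(counts):
    """Median point for a task scale.

    counts[k] is the number of pupils credited with k tasks; step k is taken
    to span k up to k + 1, with pupils spread evenly across the step.
    """
    half = sum(counts) / 2          # n/2 measures to be counted out
    passed = 0                      # measures counted on lower steps
    for step, count in enumerate(counts):
        if count and passed + count >= half:
            fraction = (half - passed) / count   # part of this step covered
            return step + fraction
        passed += count
    raise ValueError("empty distribution")


# Fifth-grade addition: number of pupils finishing 0, 1, 2, ... 9 examples.
grade5_addition = [12, 26, 27, 31, 25, 16, 15, 1, 3, 1]
median_point = median_point_magnitude(grade5_addition)
print(round(median_point, 2))
```

Here 78-1/2 of the 157 measures are to be counted out; 65 lie below step 3, so the median point falls 13-1/2 thirty-firsts of the way through that step.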
The second kind of scale described in the foregoing paragraph may be designated as "scales for the _quality_ of products," while the other variety may be called "scales for _magnitude_ of achievement." In the one case, the child makes the best production he can and measures its quality by comparing it with similar products of known quality on the scale. Composition, handwriting, and drawing scales are good examples of scales for quality of products. In the other case, the scales are placed in the hands of the child at the very beginning, and the magnitude of his achievement is measured by the difficulty or number of tasks accomplished successfully in a given time. Spelling, arithmetic, reading, language, geography, and history tests are examples of scales for quantity of achievement.
Scores tend to be more accurate on the scales for magnitude of achievement, because the judgment of the examiner is likely to be more accurate in deciding whether a response is correct or incorrect than it is in deciding how much quality a given product contains. This does not furnish an excuse for failing to employ the quality-of-products scales, however, for the qualities they measure are not measurable in terms of the magnitude of tasks performed. The fact appears, however, that the method of employing the quality-of-products scales is "by comparison" (of child's production with samples reproduced on the scale), while the method of employing the magnitude-of-achievement scales is "by performance" (of child on tasks of known difficulty).
In this connection it may be well to take one of the scales for quality of products and outline the steps to be followed in assigning scores, making tabulations, and finding the medians of distributions of scores.
When the Hillegas scale is employed in measuring the quality of English composition, it will be advisable to assign to each composition the score of that sample on the scale to which it is nearest in merit or quality. While some individuals may feel able to assign values intermediate to those appearing on the Hillegas scale, the majority of those persons who use this scale will not thereby obtain a more accurate result, and the assignment of such intermediate values will make it extremely difficult for any other person to make accurate use of the results. To be exactly comparable, values should be assigned in exactly the same manner.
The best result will probably be obtained by having each composition rated several times, and if possible, by a number of different judges, the paper being given each time that value on the Hillegas scale to which it seems nearest in quality. The final mark for the paper should be the median score or step (not the median point or the average point) of all the scores assigned. For example, if a paper is rated five times, once as in step number five (5.85), twice as in step number six (6.75), and twice as in step number seven (7.72), it should be given a final mark indicating that it is a number six (6.75) paper.
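The rule for the final mark may be sketched as follows. The step values are those of the Hillegas scale as used in this chapter; the function name `final_mark` is hypothetical, and the sketch assumes an odd number of ratings, since the text does not say how a tie between two middle steps should be settled.

```python
# Value of the sample standing at each step of the Hillegas scale.
HILLEGAS_VALUES = {
    0: 0, 1: 1.83, 2: 2.60, 3: 3.69, 4: 4.74,
    5: 5.85, 6: 6.75, 7: 7.72, 8: 8.38, 9: 9.37,
}


def final_mark(step_ratings):
    """Return (median step, sample value) for one paper's several ratings.

    Assumes an odd number of ratings, so that a single middle rating exists.
    """
    if len(step_ratings) % 2 == 0:
        raise ValueError("this sketch handles only an odd number of ratings")
    ordered = sorted(step_ratings)
    step = ordered[len(ordered) // 2]   # the middle rating is the median step
    return step, HILLEGAS_VALUES[step]


# The example from the text: one rating at step 5, two at 6, two at 7.
print(final_mark([5, 6, 6, 7, 7]))
```

Taking the middle rating, rather than averaging the values, keeps every final mark on one of the printed steps of the scale, so that marks given by different persons remain comparable.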
After each composition has been assigned a final mark indicating to what sample on the Hillegas scale it is most nearly equal in quality, proceed as follows:
Make a distribution of the final marks given to the individual papers, showing how many papers were assigned to the zero step on the scale, how many to step number one, how many to step number two, and so on for each step of the scale. We may take as an example the distribution of scores made by the pupils of the eighth grade at Butte, Montana, in May, 1914.
No. of papers     1    9   32   39   43   22    6    2
Rated at          0    1    2    3    4    5    6    7    8    9
All together there were 154 papers from the eighth grade, so that if they were arranged in order according to their merit we might begin at the poorest and count through 77 of them (n/2 = 154/2 = 77) to find the median point, which would lie between the 77th and the 78th in quality.
If we begin with the 1 composition rated at 0 and count up through the 9 rated at 1 and the 32 rated at 2 in the above distribution, we shall have counted 42. In order to count out 77 cases, then, it will be necessary to count out 35 of the 39 cases rated at 3.
Now we know (if the instructions given above have been followed) that the compositions rated at 3 were so rated by virtue of the fact that the judges considered them nearer in quality to the sample valued at 3.69 than to any other sample on the scale. We should expect, then, to find that some of those rated at 3 were only slightly nearer to the sample valued at 3.69 than they were to the sample valued at 2.60, while others were only slightly nearer to 3.69 than they were to 4.74. Just how the 39 compositions rated at 3 were distributed between these two extremes we do not know, but the best single assumption to make is that they are distributed at equal intervals on step 3. Assuming, then, that the papers rated at 3 are distributed evenly over that step, we shall have covered .90 (35/39 = .897 = .90) of the entire step 3 by the time we have counted out 35 of the 39 papers falling on this step.
It now becomes necessary to examine more closely just what are the limits of step 3. It is evident from what has been said above that 3.69 is the middle of step 3 and that step 3 extends downward from 3.69 halfway to 2.60, and upward from 3.69 halfway to 4.74. The table given below shows the range and the length of each step in the Hillegas Scale for English Composition.
THE HILLEGAS SCALE FOR ENGLISH COMPOSITION
======================================================
STEP No.|VALUE OF SAMPLE|RANGE OF STEP |LENGTH OF STEP
--------+---------------+--------------+--------------
   0    |      0        |   0- .91[32] |      .91
   1    |     1.83      | .92-2.21     |     1.30
   2    |     2.60      |2.22-3.14     |      .93
   3    |     3.69      |3.15-4.21     |     1.07
   4    |     4.74      |4.22-5.29     |     1.08
   5    |     5.85      |5.30-6.30     |     1.00
   6    |     6.75      |6.30-7.23     |      .93
   7    |     7.72      |7.24-8.05     |      .81
   8    |     8.38      |8.05-8.87     |      .82
   9    |     9.37      |8.88-         |
======================================================
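The ranges in this table follow from the rule that each step reaches halfway to the neighboring samples. A sketch of that derivation is given below; the function name `step_limits` is hypothetical, the lowest step is assumed to begin at 0, and the highest step is left open above, as in the table. The printed table rounds the limits to two decimal places, so the figures computed here (3.145, for instance) agree with it only to that rounding.

```python
# Values of the samples on the Hillegas scale, steps 0 through 9.
sample_values = [0, 1.83, 2.60, 3.69, 4.74, 5.85, 6.75, 7.72, 8.38, 9.37]


def step_limits(values):
    """Return (lower, upper) limits for each step.

    Each limit lies halfway between a sample and its neighbor. The lowest
    step is assumed to start at 0; the highest has no upper limit (None).
    """
    limits = []
    for i, value in enumerate(values):
        lower = 0.0 if i == 0 else (values[i - 1] + value) / 2
        upper = None if i == len(values) - 1 else (value + values[i + 1]) / 2
        limits.append((lower, upper))
    return limits


limits = step_limits(sample_values)
for step, (lower, upper) in enumerate(limits):
    length = None if upper is None else upper - lower
    print(step, lower, upper, length)
```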
From the above table we find that step 3 has a length of 1.07 units. If we count out 35 of the 39 papers, or, in other words, if we pass upward into the step .90 of the total distance (1.07 units), we shall arrive at a point .96 units (.90 × 1.07 = .96) above the lower limit of step 3, which we find from the table is 3.15. Adding .96 to 3.15 gives 4.11 as the median point of this eighth grade distribution.
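The whole calculation may be retraced in a few lines. The sketch below takes the lower limits and lengths of the steps as printed in the table, and the eighth-grade distribution of 154 papers; steps 8 and 9 are entered as zero on the assumption that no papers were rated there, since the counts given already total 154. The function name is hypothetical.

```python
# Eighth-grade distribution: papers rated at steps 0 through 9.
# Steps 8 and 9 are assumed empty (the listed counts already total 154).
papers = [1, 9, 32, 39, 43, 22, 6, 2, 0, 0]

# Lower limit and length of each step, as printed in the Hillegas table.
lower_limits = [0, 0.92, 2.22, 3.15, 4.22, 5.30, 6.30, 7.24, 8.05, 8.88]
step_lengths = [0.91, 1.30, 0.93, 1.07, 1.08, 1.00, 0.93, 0.81, 0.82, None]


def median_point_quality(counts, lowers, lengths):
    """Median point on a quality scale, papers spread evenly over each step."""
    half = sum(counts) / 2          # n/2 papers to be counted out
    passed = 0                      # papers counted on lower steps
    for count, lower, length in zip(counts, lowers, lengths):
        if count and passed + count >= half:
            if length is None:
                raise ValueError("median falls on the open-ended top step")
            return lower + (half - passed) / count * length
        passed += count
    raise ValueError("empty distribution")


median_point = median_point_quality(papers, lower_limits, step_lengths)
print(round(median_point, 2))
```

The 42 papers on steps 0 to 2 are passed first; 35 of the 39 papers on step 3 are then counted out, carrying the median about .96 of a unit above 3.15.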
The median and the percentiles of any distribution of scores on the Hillegas scale may be determined in a manner similar to that illustrated above, if the scores are assigned to the individual papers according to the directions outlined above.
A similar method of calculation is employed in discovering the limits within which the middle fifty per cent of the cases fall. It often seems fairer to ask, after the upper twenty-five per cent of the children who would probably do successful work even without very adequate teaching have been eliminated, and the lower twenty-five per cent who are possibly so lacking in capacity that teaching may not be thought to affect them very largely have been left out of consideration, what is the achievement of the middle fifty per cent. To measure this achievement it is necessary to have the whole distribution and to count off twenty-five per cent, counting in from the upper end, and then twenty-five per cent, counting in from the lower end of the distribution. The points found can then be used in a statement of the limits within which the middle fifty per cent of the cases fall.
Using the same figures that are given above for scores in English composition, the lower limit is 2.64 and the limit which marks the point above which the upper twenty-five per cent of the cases are to be found is 5.08. The limits, therefore, within which the middle fifty per cent of the cases fall are from 2.64 to 5.08.
It is desirable to measure the relationship existing between the achievements (or other traits) of groups. In order to express such relationship in a single figure the coefficient of correlation is used. This measure appears frequently in the literature of education and will be briefly explained. The formula for finding the coefficient of correlation can be understood from examples of its application.
Let us suppose a group of seven individuals whose scores in terms of problems solved correctly and of words spelled correctly are as follows:[33]
==================================================
INDIVIDUALS | No. OF PROBLEMS  | No. OF WORDS
MEASURED    | SOLVED CORRECTLY | SPELLED CORRECTLY
------------+------------------+------------------
     A      |        1         |        2
     B      |        2         |        4
     C      |        3         |        6
     D      |        4         |        8
     E      |        5         |       10
     F      |        6         |       12
     G      |        7         |       14
==================================================
From such distributions it would appear that as individuals increase in achievement in one field they increase correspondingly in the other. If one is below or above the average in achievement in one field, he is below or above and in the same degree in the other field. This sort of positive relationship (going together) is expressed by a coefficient of +1. The formula is expressed as follows:
$$ r = \frac{\sum xy}{\sqrt{\sum x^{2}}\,\sqrt{\sum y^{2}}} $$
Here _r_ = coefficient of correlation.
_x_ = deviations from average score in arithmetic (or difference between score made and average score).
_y_ = deviations from average score in spelling.
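Applied to the seven individuals in the table above, the formula may be worked out as in the following sketch. The function name `correlation` is hypothetical; the sketch assumes the two lists are of equal length and that neither set of scores is constant, since the denominator would otherwise be zero.

```python
import math


def correlation(first_scores, second_scores):
    """Coefficient of correlation r = Sum(xy) / (sqrt(Sum x^2) * sqrt(Sum y^2)),
    where x and y are deviations of each score from its own average."""
    mean_first = sum(first_scores) / len(first_scores)
    mean_second = sum(second_scores) / len(second_scores)
    x = [score - mean_first for score in first_scores]      # deviations in arithmetic
    y = [score - mean_second for score in second_scores]    # deviations in spelling
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    return sum_xy / (math.sqrt(sum_x2) * math.sqrt(sum_y2))


problems = [1, 2, 3, 4, 5, 6, 7]        # individuals A to G, problems solved
words = [2, 4, 6, 8, 10, 12, 14]        # individuals A to G, words spelled

r = correlation(problems, words)
print(round(r, 4))
```

Here the sum of the products of the deviations is 56, the sum of the squared deviations is 28 for arithmetic and 112 for spelling, and the square root of 28 × 112 is 56, so that _r_ comes out at +1.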