Journal of Neuro-Ophthalmology 17(1): 1-6, 1997. © 1997 Lippincott-Raven Publishers, Philadelphia

Evaluation of a Significantly Shorter Version of the Farnsworth-Munsell 100-Hue Test in Patients with Three Different Optic Neuropathies

Brian E. Nichols, M.D., Ph.D., H. Stanley Thompson, M.D., and Edwin M. Stone, M.D., Ph.D.

We tested the hypothesis that a subset of the Farnsworth-Munsell 100-hue test (FM-100) would be a sensitive, specific, and practical means of monitoring color vision in patients with chronic optic nerve disorders. We retrospectively analyzed the records of 1,113 patients affected with optic neuritis (ON), Graves' ophthalmopathy with suspected optic neuropathy, or idiopathic intracranial hypertension with suspected optic neuropathy (IIH). One hundred six patient records showed that an FM-100 had been performed (23 ON, 46 Graves', 37 IIH). Forty additional patients were studied prospectively (11 ON, 17 Graves', 12 IIH). The sensitivity and specificity of all possible 21-chip subtests were compared with the same statistics for the entire test. We found that for these three optic nerve disorders, a test consisting of chips 22-42 had nearly the same sensitivity and specificity as the entire test when compared with the clinical diagnosis. At 90% specificity, the sensitivities of the short version versus the original version of the test were IIH, 53%/45%; optic neuritis, 85%/79%; and Graves', 67%/70%. The majority of the clinical value of the test can be achieved in one fourth of the original examination time.

Color vision is affected early in the course of many disorders that affect the optic nerve (1-4), and, as a result, color vision tests are often used in a neuro-ophthalmology clinic to diagnose and follow patients with optic neuropathies.
The Farnsworth-Munsell 100-hue test (FM-100) (5,6) is a test of fine hue discrimination that consists of 83-84 moveable colored chips, presented to the subject (one eye at a time) as four separate 19- to 21-chip boxes. The subject is asked to arrange the chips in consecutive color order between fixed end chips. In the common hereditary dyschromatopsias, the test reveals most errors to be aligned along a specific "axis" that is diagnostic for the affected cone system. For example, patients with deuteranopia lack green-sensitive photoreceptors and have difficulty distinguishing hues that differ from one another in their green content. These hues are found in two opposing zones of the color wheel, and when a deuteranope's errors are plotted on a polar graph, an axis of the errors is evident (Fig. 1A).

In contrast, patients with an acquired color vision defect usually have a relatively panchromatic dyschromatopsia. Although some optic neuropathies may show a preponderance of red-green defects (Köllner's rule; 6,7) and some macular disorders may cause a relative blue-yellow deficiency, the data from a single FM-100 test are so "noisy" that distinguishing a diagnostic axis in a given patient is usually impossible. Figure 1B shows the FM-100 result from a patient with Graves' ophthalmopathy in whom no discernible axis is present despite a large number of errors on the test. Thus, we have found the total error score on a given test to be the most clinically useful parameter in these patients.

Manuscript received April 17, 1996; accepted July 4, 1996. From the Department of Ophthalmology, University of Iowa College of Medicine, Iowa City, Iowa, U.S.A. Address correspondence and reprint requests to Dr. E. M. Stone, Department of Ophthalmology, University of Iowa College of Medicine, Iowa City, IA 52242, U.S.A.
In our clinic, the principal use of the FM-100 has been to follow the progress of patients affected with one of three disorders: Graves' orbitopathy with optic neuropathy, idiopathic intracranial hypertension (IIH) (pseudotumor cerebri) with optic neuropathy, and optic neuritis (ON). The usefulness of the test is limited by the length of time required to administer it. Since the distribution of errors (axis) is of less value than the total error score when monitoring optic neuropathies, we examined both prospective and retrospective FM-100 data from optic neuropathy patients to determine whether a subset of the original test could be used for routine monitoring.

METHODS

A 10-year subset of the computerized records of the University of Iowa Hospitals and Clinics was searched for patients with one of three diagnoses: Graves' orbitopathy with suspected optic neuropathy, IIH with suspected optic neuropathy, and ON. This search returned more than 1,100 charts. Charts were excluded for lack of an FM-100; multiple diagnoses (e.g., Graves' and demyelinating optic neuritis in the same patient); misdiagnosis; inherited dyschromatopsia; diabetes; glaucoma; visually significant cataracts; visually significant corneal pathology; compressive lesions, such as pituitary tumors; ophthalmoscopically evident retinal lesions, such as age-related macular degeneration or retinal detachment; ophthalmic vascular disease (anterior ischemic optic neuropathy, vein occlusion, artery occlusion); uveitis; collagen vascular disease; optic nerve drusen; previous orbital surgery; nystagmus; and cycloplegia at the time of the test.

FIG. 1. Polar graphs of FM-100 error score distributions. A: Deuteranopia. An "axis" of error scores is found to be centered on a line between chips 16 and 58. B: Graves' optic neuropathy. Large errors are found in several regions of the color wheel; no axis is evident.
After these exclusions, 106 charts remained for analysis (37 IIH, 23 ON, and 46 Graves'). The average ages of the patients in each diagnostic category at the time the tests were performed were 29, 32, and 47 years, respectively. In addition, data were collected prospectively from 12 patients with pseudotumor (IIH), 11 with optic neuritis, and 17 with Graves'. The average ages of these patients were 46, 36, and 54 years, respectively. Last, 74 tests were administered to normal volunteers (average age, 25 years) for the purposes of evaluating the sensitivity and specificity of the FM-100 in identifying optic neuropathy. The results from patients studied retrospectively (73%) and prospectively (27%) did not differ noticeably and were pooled to simplify the figures given in this article.

The test was administered to the prospectively studied patients in a booth lined with black velvet to minimize the effects of stray light on retinal adaptation. The chips were illuminated by two 150-watt quartz halogen lamps filtered with Macbeth blue filters to provide an "illuminant C" color temperature. These filtered lights provided 500 lux evenly distributed across the testing surface. The testing booth and lights were custom-made and are not commercially available. However, a similar testing environment can be inexpensively created with two 40-watt "daylight" fluorescent lights (Design 50; Osram Sylvania, Inc., Versailles, Kentucky) suspended 53 inches over the testing surface. The patients analyzed retrospectively took the test under "daylight" fluorescent lights with an intensity of 1,000 lux at the testing surface. Patients from both groups were allowed as much time as they wished to complete the test.

A computer bar code was affixed to the base of each chip, and the test was scored with a computerized bar code reader (6). An error score was calculated for each chip, as outlined by Farnsworth (9).
The raw error score of a chip is defined as the sum of the absolute differences between the chip's number and the numbers of its two adjacent chips. The final chip error score is this sum minus two. The error score for a particular box is the sum of the box's individual chip scores. The total score for the test is defined as the sum of the box error scores.

To make the shortened test as practical as possible for clinical ophthalmologists, we also developed a calculator-based scoring system (the calculator program is available on request from the authors). When executed on an inexpensive programmable calculator (Casio fx-7700G), the program prompts the investigator for the number of each chip in the array. The calculator can then provide a graphic output of the error score distribution, numerical chip-by-chip error scores, and individual box scores (including the recently described reliability box [6]), with and without the square-root transformation.

The individual chip error scores from all patients in each diagnostic category were averaged to give an average error distribution for patients with each of the three disorders. The individual chip error scores from all patients were also averaged to give an average error distribution for patients with these three optic neuropathies.

A subtest of 21 chips (one box) was arbitrarily chosen as a practical size for routine follow-up use. All 84 possible 21-chip subtests were then evaluated by calculating the average error score of all affected patients for chips n to n + 20 for all n values from 1 to 84. The potential diagnostic power of each 21-chip subtest was estimated by calculating the "standard difference" between the error scores of patients and control subjects.
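The chip scoring rule and the enumeration of 21-chip subtests described above can be illustrated with a short Python sketch. It is not the authors' calculator program; the function names, the convention of passing the fixed end chips as the first and last entries, and the use of plain (non-wrapping) differences within a box are all assumptions made for illustration.

```python
def chip_error_scores(arrangement):
    """Return {chip number: error score} for the movable chips of one box.

    `arrangement` is the subject's ordering, with the fixed end chips included
    as the first and last entries (an assumption of this sketch). Chip numbers
    are assumed not to wrap around the color wheel within the box.
    """
    scores = {}
    for i in range(1, len(arrangement) - 1):
        left, chip, right = arrangement[i - 1], arrangement[i], arrangement[i + 1]
        raw = abs(chip - left) + abs(chip - right)  # raw error score
        scores[chip] = raw - 2                      # a correctly placed chip scores 0
    return scores


def subtest_scores(per_chip_scores, width=21):
    """Sum per-chip scores over every `width`-chip window, wrapping around the wheel.

    Entry k of the result is the score of the subtest whose first chip is
    the (k + 1)-th chip of `per_chip_scores`.
    """
    n = len(per_chip_scores)
    return [
        sum(per_chip_scores[(start + k) % n] for k in range(width))
        for start in range(n)
    ]


# Hypothetical example: chips 22 and 23 swapped between fixed end chips 21 and 25.
example = chip_error_scores([21, 23, 22, 24, 25])
print(example)                # {23: 1, 22: 1, 24: 1}
print(sum(example.values()))  # 3
```

Applying `subtest_scores` to a list of 84 averaged chip scores yields one value per possible starting chip, which corresponds to the kind of curve plotted against "first chip" in the figures.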
The standard difference is defined as the difference between the upper boundary of the 95% confidence interval for control subjects and the lower boundary of the 95% confidence interval for the patients, divided by the upper boundary of the 95% confidence interval for the controls.

Sensitivity and specificity figures were calculated for three different tests (the entire FM-100, the chip 22-42 subtest, and the chip 70-6 subtest) for all three diseases. The threshold between normal and abnormal was varied, and sensitivities were calculated at specificities ranging from 50 to 100%. Sensitivity was then plotted versus specificity to generate an operating characteristic curve (10) for all three tests in all three diseases.

RESULTS

Figure 2 shows that the average error scores (at each chip) of patients affected with any of the three diseases studied were greater than the corresponding scores of normal controls. The differences between patients and controls were least for idiopathic intracranial hypertension patients and greatest for those with optic neuritis, which simply reflects the greater average visual dysfunction in patients with optic neuritis compared with those with idiopathic intracranial hypertension.

For some hues, there was a greater difference between patients' and controls' scores than for other hues. This finding suggests that subtests containing these "highly diagnostic" hues might be more powerful than subtests containing the "poorly diagnostic" hues. Examples of the latter are the chips around number 47. Patients and control subjects both have difficulty arranging these chips properly, and thus these chips have relatively poor diagnostic power (6).

Figure 3 shows the average error score of all patients and all control subjects calculated for every possible 21-chip subtest. Again, the error scores of the patients are substantially higher than those of controls for every 21-chip subtest.
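As an illustration of the two statistics defined in Methods (the standard difference and the sensitivity at a chosen specificity), the following Python sketch may help. Several details are assumptions not stated in the article: the 95% confidence interval is taken as the mean ± 1.96 standard errors, the sign of the standard difference is chosen so that a larger positive value means better separation, and a score is called abnormal when it exceeds the threshold. All function names are hypothetical.

```python
import math
import statistics


def ci95(values):
    """Normal-approximation 95% CI of the mean (assumed: mean +/- 1.96 SE)."""
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean - 1.96 * se, mean + 1.96 * se


def standard_difference(patient_scores, control_scores):
    """(Patient lower bound - control upper bound) / control upper bound."""
    patient_lower, _ = ci95(patient_scores)
    _, control_upper = ci95(control_scores)
    return (patient_lower - control_upper) / control_upper


def sensitivity_at_specificity(patient_scores, control_scores, specificity):
    """Sensitivity at the lowest threshold giving at least `specificity`.

    The threshold is the control score below or at which the required
    fraction of controls falls; patients above it count as detected.
    """
    ordered = sorted(control_scores)
    # Number of controls that must be classified as normal; the small
    # epsilon guards against floating-point error in the product.
    needed = math.ceil(specificity * len(ordered) - 1e-9)
    threshold = ordered[needed - 1] if needed > 0 else float("-inf")
    detected = sum(score > threshold for score in patient_scores)
    return detected / len(patient_scores)


# Hypothetical scores, not data from the study.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
patients = [5, 9, 10, 12, 20]
print(sensitivity_at_specificity(patients, controls, 0.90))  # 0.6
```

Evaluating `sensitivity_at_specificity` over a range of specificities (for example 0.50 to 1.00) gives the points of an operating characteristic curve for one test in one patient group.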
Although the greatest arithmetic difference between the patients' curves and the control curves occurs at chip 34 (corresponding to the error score of a 21-chip test consisting of chips 34-54 inclusive), the diagnostic power of such a subtest is dependent on the number of patients whose scores overlap those of controls and hence is more likely to be related to the ratio of patients' and controls' values than to the simple difference.

FIG. 2. Average error score versus chip number for patients with three different disorders with potential for optic nerve dysfunction. Data from 74 normal controls are shown with open symbols, and patients' data are depicted with closed symbols. A: Graves' disease (n = 63). B: Optic neuritis (n = 34). C: Idiopathic intracranial hypertension (n = 49). The data were smoothed over five points.

FIG. 3. Average error scores for all possible 21-chip tests. The data point corresponding to each 21-chip subset is plotted according to the number of the first chip in the subset. Data from patients (all three disorders, n = 146) are shown as closed symbols, and data from normal volunteers (n = 74) are given by open symbols. Error bars denote the standard error of the means.

To identify the subtests with the highest ratio of affected-to-control scores, the standard difference was calculated for each of the 84 subtests (see Methods) and is shown in Fig. 4. There are two peaks in the resulting curves, one corresponding to a 21-chip test beginning at chip 22, and the other to a test beginning at chip 70. Sensitivity and specificity calculations were performed at several different "thresholds" to generate operating characteristic curves for the entire FM-100, the 22-42 subtest, and the 70-6 subtest for all three diseases as well as for all the patients averaged together (Fig.
5). All of these curves illustrate the phenomenon of decreasing sensitivity of a test as the specificity is increased (by progressively altering the threshold for an "abnormal" result).

DISCUSSION

The data presented herein suggest that either of two 21-chip regions of the FM-100 is as capable of detecting optic neuropathy associated with three common disorders as the entire test. This result occurs because some chips in the full FM-100 are too difficult for control subjects to arrange correctly, and hence the differences between controls and affected patients are minimized in these areas. It may be that removal of two or more "difficult chips" from the region surrounding chip 47 will increase the power of the full FM-100. This possibility is currently being evaluated in our clinic.

The most significant finding is that the operating characteristic curve for the entire FM-100 is nearly identical to the curves for two 21-chip subtests (chips 22-42 and chips 70-6; see Fig. 5). That is, at any given specificity, the sensitivity of either of these subtests is almost the same as (and in some cases greater than) that of the entire test. Inasmuch as the region between chips 22 and 42 has essentially the same diagnostic power as the region from chip 70 to 6, we have chosen the former as our test for routine follow-up of optic neuropathy patients because this region fortuitously corresponds exactly to the second box of the FM-100 as it is currently supplied by the manufacturer. Thus, one can easily use the shortened version of the test in selected clinical situations without physically altering the full test.

Figure 2 is interesting in that it completely fails to support Köllner's rule. That is, there is no evidence for a "red-green axis" in any of these disorders (such an axis would appear as two peaks centered near chips 18 and 60).
The most dominant "axis" in all of these disorders is the monopolar peak near chip 47, which is an artifact of the test that is also present in controls (6).

The test evaluated in this study differs from the D-15 (another widely used subset of the FM-100) in a significant way. In the D-15, the entire color wheel is represented with 15 chips. When arranged properly, the adjacent chips differ in hue approximately six times more than the chips in the FM-100. The purpose of the D-15 is to more rapidly detect a defect along one axis by sacrificing some of the sensitivity associated with the fine hue differences present in the original test. In contrast, the test evaluated in this study is a subset of the 100-hue test in which the ability to detect an axis is sacrificed (only one fourth of the color wheel is tested) while still requiring the subject to exhibit the same degree of fine hue discrimination that is required by the original FM-100.

FIG. 4. Standard differences between 95% confidence intervals of 21-chip error scores of affected patients and normal volunteers. The data point corresponding to each 21-chip subset is plotted according to the number of the first chip in the subset. Data from patients (all three disorders, n = 146) are shown as closed symbols, and data from normal volunteers (n = 74) are given by open symbols. The standard difference is defined as the difference between the upper boundary of the 95% confidence interval for control subjects and the lower boundary of the 95% confidence interval for the patients, divided by the upper boundary of the 95% confidence interval for the controls.

FIG. 5. Sensitivity as a function of specificity for three versions of the FM-100 in three different disorders. A: Graves' disease (n = 63). B: Optic neuritis (n = 34). C: Idiopathic intracranial hypertension (n = 49). D: All patients (n = 146). In all cases, the data from the entire FM-100 are given as open symbols, and the data from the two 21-chip subtests are given as closed symbols (22-42, circles; 70-6, triangles).

There are several limitations in the data set that is presented in this article. First, the data in the retrospective portion of the study (73%) were obtained under different testing conditions than those of the prospective portion (27%) and the controls. The most significant testing difference was in the level of illumination, which was 500 lux for the prospective arm and 1,000 lux in the retrospective arm. However, this difference should not adversely affect the results presented here because the dimmer illumination was used for all the controls and tended to give them slightly higher error scores, thereby minimizing the difference between the patients and control group rather than artificially augmenting it.

A second problem is that the number of patients in each of the diagnostic categories is not equal, such that when these groups are averaged, the result is weighted toward Graves' and ON and away from IIH. However, this weighting parallels the relative incidences of these diseases among patients followed with the FM-100 in our clinic during a 10-year period, and thus biases the results toward the diagnosis for which the test is most likely to be used.

A significant problem is that the normal volunteers were not age matched with the clinic patients. The patients were significantly older than the controls, which would tend to artifactually increase the difference between the scores of the two groups.
This mismatch would affect the magnitude of the sensitivity and specificity values reported here but would not alter the comparisons between the various tests unless a significant age-related decrease in color discrimination existed that affected a specific portion of the color wheel. Such age-related changes in performance on the FM-100 have been reported (6,11), but the magnitude of such changes is small. In a previous age-stratified study of more than 200 individuals (6), the average error score of individuals aged 56-65 years was only 0.15 per chip higher than that of individuals aged 16-25 years. Thus, although it is certainly possible that the true sensitivity of the 21-chip test in detecting optic neuritis is < 85% (at 90% specificity), the finding that the entire FM-100 is no more sensitive than box 2 is unlikely to be altered by using age-matched controls with higher average error scores.

In conclusion, this study shows that the second box of the four-box Farnsworth-Munsell 100-hue test is as sensitive and specific as the entire test for detecting optic neuropathy associated with three common disorders. Although this single-box test is incapable of detecting an "axis" of dyschromatopsia, it can be given in 5 min per eye and is much more practical than the entire FM-100 for frequent retesting on follow-up visits. The availability of a reasonably priced approximation of "illuminant C" as well as a calculator-based scoring system should make the FM-100 and the subtest described in this article accessible to all ophthalmologists.

Acknowledgment: This work was supported in part by the Foundation Fighting Blindness, the George Gund Foundation, the Grousbeck Family Foundation, and Public Health Service Research grants EY10539 and EY10564. Dr. Stone is a Research to Prevent Blindness Dolly Green Scholar.

REFERENCES

1. Linksz A. The clinical characteristics of acquired color vision defects.
In: Straatsma BR, Hall MO, Allen RA, eds. The retina: morphology, function, and clinical characteristics. Berkeley: University of California Press, 1969:583-92.
2. Verriest G. Further studies on acquired deficiency of color discrimination. J Opt Soc Am 1963;53:185.
3. Krill A, Fishman G. Acquired color vision defects. Trans Am Acad Ophthalmol Otolaryngol 1971;75:1095.
4. Griffin JF, Wray SH. Acquired color vision defects in retrobulbar neuritis. Am J Ophthalmol 1978;86:193-201.
5. Farnsworth D. The Farnsworth-Munsell 100-hue and dichotomous tests for color vision. J Opt Soc Am 1943;33:568-78.
6. Stone EM, Nichols BE, Wolken MS, Montague PR, Thompson HS. New normative data for the Farnsworth-Munsell 100-hue test. In: Drum B, ed. Colour vision deficiencies, vol 11. Dordrecht, the Netherlands: Kluwer Academic Publishers, 1993. (Documenta Ophthalmologica Proceedings series 56.)
7. Köllner H. Die Störungen des Farbensinnes: ihre klinische Bedeutung und ihre Diagnose. Berlin: Karger, 1912.
8. Hart WM. Acquired dyschromatopsias. Surv Ophthalmol 1987;32:10-31.
9. Farnsworth D. The Farnsworth-Munsell 100-hue test manual. Baltimore: Munsell Color Company, 1957.
10. Lusted LB. Introduction to medical decision making. Springfield, Ill.: Charles C. Thomas, 1968.
11. Pinckers A. Color vision and age. Ophthalmologica 1980;181:23-30.