Description |
I tested a group of frames intended for a medical diagnosis system called Iliad, a microcomputer-based (Macintosh) Bayesian medical expert system. The Iliad system contains a knowledge base, a data dictionary, application programs, and recent medical literature. It performs two functions for medical students: consultation and simulation. Accuracy and reliability are major concerns in the development of a Bayesian expert system. The sequential Bayesian model assumes that findings are conditionally independent. However, many disease findings are interrelated and tend to co-occur. Some of these co-occurring findings describe pathophysiological concepts, such as “lung consolidation.” To handle co-occurring findings, a new type of decision frame, called a “cluster,” has been included in the Iliad system. Clusters are rule-based decision frames that contain the conditionally dependent findings. Clusters are used as findings in Bayesian frames, and thereby reduce the overconfidence that would result from including the conditionally dependent findings directly. I hypothesized that the clusters would significantly improve Iliad’s diagnostic accuracy and reliability compared to a non-clustered system. I tested the hypothesis by measuring the reliability of pairs of clustered and non-clustered frames using real patient data. The null hypothesis assumed there was no difference between the clustered system of frames and the non-clustered system. This hypothesis was tested under two conditions: the first used estimated probabilities for the frames, and the second used actual probabilities measured from the database. Testing both conditions allowed me to determine whether inaccurate statistical estimates might partly explain any unreliability, or whether all unreliability was a result of conditionally dependent findings. The test of frame reliability was developed by Hilden et al. 
I found that the results generated by the clustered system were significantly more reliable than those generated by the non-clustered system. Expert probability estimates were found to be inaccurate compared to actual measurements from the patient data. However, this inaccuracy did not explain the unreliability I found, which was instead due to conditionally dependent findings. Some clustered frames remained unreliable on initial testing; when modified by re-clustering, these frames proved reliable. Reliability testing could be used during the knowledge engineering process to validate prototype frames. |
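
The overconfidence that clusters are meant to reduce can be illustrated with a minimal sketch of sequential Bayesian updating. Everything in it is hypothetical: the prior, the likelihood values, and the helper `sequential_posterior` are invented for illustration and are not taken from Iliad or from the patient data. The cluster is modeled simply as a single finding with its own likelihoods, which is an assumption about how a cluster enters a Bayesian frame.

```python
def sequential_posterior(prior, likelihoods):
    """Update P(disease) one finding at a time.

    Each update treats the finding as conditionally independent of the
    others given the disease state (the sequential Bayesian assumption).
    likelihoods: list of (P(finding | disease), P(finding | no disease)).
    """
    p = prior
    for p_given_d, p_given_not_d in likelihoods:
        numerator = p_given_d * p
        p = numerator / (numerator + p_given_not_d * (1.0 - p))
    return p


# Hypothetical numbers, for illustration only.
prior = 0.10

# Three findings that all reflect one underlying concept (for example,
# lung consolidation) and therefore tend to co-occur. Entering them
# separately counts the same evidence three times.
co_occurring = [(0.8, 0.2), (0.8, 0.2), (0.8, 0.2)]
non_clustered = sequential_posterior(prior, co_occurring)

# The same findings grouped into one cluster, which is then entered
# into the Bayesian frame as a single finding (assumed likelihoods).
clustered = sequential_posterior(prior, [(0.8, 0.2)])

print(round(non_clustered, 3))  # 0.877
print(round(clustered, 3))      # 0.308
```

With these made-up values, treating the three co-occurring findings as independent pushes the posterior from 0.10 to about 0.88, whereas entering them once as a cluster gives about 0.31. The gap between the two is the kind of overconfidence that conditionally dependent findings introduce.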