Study of inter-rater reliability using various patient categorization methods

Title	Study of inter-rater reliability using various patient categorization methods
Publication Type	thesis
School or College	College of Nursing
Department	Nursing
Author	Warnick, Myrna Loy Williams
Date	1973-08
Description	Nurses have determined staffing patterns in the hospital environment for years by patient census at its historical peak. Some nursing administrators question this as being the most appropriate way to manage the workload and have suggested identifying patients' needs for care and placing the patients into categories based on those needs. Once the category or classification has been determined, the assigned category for each patient is converted to pre-determined hours of care. This system then is dependent upon the reliability of nurses using the tools for categorization and the validity of the tools themselves. This study was designed to look at the inter-rater reliability of staff nurses using various types of categorization methods. The tools in the study ranged from a subjective, intuitive nursing assessment, to a more structured format with suggested criteria (Modified Georgette), to a checklist format (Pardee Checklist), to a point system multiple dimension tool with specific criteria (Poland Point System). These methods were used by six registered nurse raters in a general hospital assigned to two medical divisions. Each nurse rated a randomly designated patient using all four of the methods. Pearson product-moment correlations were calculated to determine the inter-rater reliability. Of the four tools, the intuitive, subjective method of identifying patient needs was the least reliable with correlations of .18 to .85 and an overall mean correlation of .57. The reliability coefficients for the Modified Georgette ranged from .00 to 1.00 with an overall mean correlation of .73 and the Pardee Checklist had correlations between .09 and 1.00 with an overall mean correlation of .70. The Poland Point System, which is the most structured tool and has the greatest specificity of patient needs, was used most reliably as indicated by correlations of .49 to .99 and an overall mean correlation of .83.
Experience seemed to be one factor that influenced the rater's ability to use the various tools. For example, nurses with less than one year of experience were much more consistent with other raters and themselves on the more structured tool, the Poland Point System. Raters with experience of three to twenty-five years and raters with less than one year of experience were not consistent with each other. Further study on the role of experience in assessing patient needs is indicated and deemed necessary to fully understand these findings. Nurses who assumed administrative roles in addition to their clinical assignment were not as consistent in their use of the various tools with other raters as were those nurses with primary clinical assignments. Further study is suggested on the time involved and the influence of administrative tasks on clinical management of care. Raters intra-divisionally had high reliability coefficients. This study suggested high agreement with other raters on a single division or unit but that low consistency existed between raters inter-divisionally. This certainly suggests the influence of leadership and peer group suggestion on assessment of patient needs. The scores do not suggest whether the influence was positive or not, only that raters on a unit tended to assess a patient in a similar way.
Type	Text
Publisher	University of Utah
Subject	Comparison Study
Subject MESH	Nursing Care; Patient Care
Dissertation Institution	University of Utah
Dissertation Name	MS
Language	eng
Relation is Version of	Digital reproduction of "A study of inter-rater reliability using various patient categorization methods." Spencer S. Eccles Health Sciences Library. Print version of "A study of inter-rater reliability using various patient categorization methods." available at J. Willard Marriott Library Special Collection. RT2.5 1973 .W35.
Rights Management	© Myrna Loy Williams Warnick.
Format	application/pdf
Format Medium	application/pdf
Format Extent	657,811 bytes
Identifier	undthes,5035
Source	Original: University of Utah Spencer S. Eccles Health Sciences Library (no longer available).
Master File Extent	657,838 bytes
ARK	ark:/87278/s6wh2rvt
DOI	https://doi.org/doi:10.26053/0H-Y1FC-P100
Setname	ir_etd
ID	191374
OCR Text	A STUDY OF INTER-RATER RELIABILITY USING VARIOUS PATIENT CATEGORIZATION METHODS

by Myrna Loy Williams Warnick

A thesis submitted to the faculty of the University of Utah in partial fulfillment of the requirements for the degree of Master of Science

College of Nursing
University of Utah
August 1973

This thesis for the Master of Science Degree by Myrna Loy Williams Warnick was approved August 1973.

[Signature lines, illegible in the scan: Chairman, Supervisory Committee; Member; Dean, College of Nursing; Dean of Graduate School]

ACKNOWLEDGEMENTS

This writer wishes to extend grateful appreciation to the numerous people who have continually supported and encouraged her during this study: Dr. Marie Holley, Margaret Adamson, Ethel Saunders, and Gwen Luke. Appreciation is also expressed to Mrs. Verla Collins and the Latter-day Saint Hospital Nursing Administration Staff; Mrs. Esther Sparks and the University of Utah Nursing Administration Staff; and Dr. F. R. Woolley for their suggestions and help. A special thanks goes to those who sacrificed the most to have this study completed: my parents, Mr. and Mrs. George Williams; my children, Chris, Mark, Marla, Lelen, Curtis, and Travis. A special thanks is extended to my husband, J. LaMar Warnick, who never wavered in his support and understanding and gave untold hours to this study.

TABLE OF CONTENTS

Acknowledgements
List of Tables
List of Figures
Abstract
Chapter
I. Introduction
II. Method
III. Results and Discussion
IV. Summary, Conclusions and Recommendations
References
Appendices
Vita

LIST OF TABLES

Table
1. Rater's Profile
2. Summary of Characteristics of Each Method
3. Pearson Product-Moment Correlations Among Six Raters Using the Non-Structured Method
4.
Weighted Average Correlations Between Raters
5. Pearson Product-Moment Correlation Among Six Raters Using the Modified Georgette System
6. Pearson Product-Moment Correlation Among Six Raters Using the Pardee Checklist
7. Pearson Product-Moment Correlation Among Six Raters Using the Poland Point System

LIST OF FIGURES

Figure
1. Inter-Ward and Intra-Ward Comparison of Rater Reliabilities

ABSTRACT

Nurses have determined staffing patterns in the hospital environment for years by patient census at its historical peak. Some nursing administrators question this as being the most appropriate way to manage the workload and have suggested identifying patients' needs for care and placing the patients into categories based on those needs. Once the category or classification has been determined, the assigned category for each patient is converted to pre-determined hours of care. This system then is dependent upon the reliability of nurses using the tools for categorization and the validity of the tools themselves. This study was designed to look at the inter-rater reliability of staff nurses using various types of categorization methods. The tools in the study ranged from a subjective, intuitive nursing assessment, to a more structured format with suggested criteria (Modified Georgette), to a checklist format (Pardee Checklist), to a point system multiple dimension tool with specific criteria (Poland Point System). These methods were used by six registered nurse raters in a general hospital assigned to two medical divisions. Each nurse rated a randomly designated patient using all four of the methods. Pearson product-moment correlations were calculated to determine the inter-rater reliability. Of the four tools, the intuitive, subjective method of identifying patient needs was the least reliable with correlations of .18 to .85 and an overall mean correlation of .57.
The reliability coefficients for the Modified Georgette ranged from .00 to 1.00 with an overall mean correlation of .73 and the Pardee Checklist had correlations between .09 and 1.00 with an overall mean correlation of .70. The Poland Point System, which is the most structured tool and has the greatest specificity of patient needs, was used most reliably as indicated by correlations of .49 to .99 and an overall mean correlation of .83.

Experience seemed to be one factor that influenced the rater's ability to use the various tools. For example, nurses with less than one year of experience were much more consistent with other raters and themselves on the more structured tool, the Poland Point System. Raters with experience of three to twenty-five years and raters with less than one year of experience were not consistent with each other. Further study on the role of experience in assessing patient needs is indicated and deemed necessary to fully understand these findings. Nurses who assumed administrative roles in addition to their clinical assignment were not as consistent in their use of the various tools with other raters as were those nurses with primary clinical assignments. Further study is suggested on the time involved and influence of administrative tasks on clinical management of care. Raters intra-divisionally had high reliability coefficients. This study suggested high agreement with other raters on a single division or unit but that low consistency existed between raters inter-divisionally. This certainly suggests the influence of leadership and peer group suggestion on assessment of patient needs. The scores do not suggest whether the influence was positive or not, only that raters on a unit tended to assess a patient in a similar way.

CHAPTER I

INTRODUCTION

Hospital staffing patterns and the utilization of nursing personnel have traditionally been determined by patient census alone.
These patterns are usually determined without regard to other important factors which can have an effect on manpower needs. The inpatient census, or the number of patients occupying hospital beds, is highly variable, which makes it difficult to predict. Nursing staff has, therefore, been allocated according to historical peak need as perceived by administrators. Many factors, including the rising cost of care and cries for accountability, have caused many nursing administrators to question the value of this method as an adequate guideline for staffing control. Poland et al. (1970) reported that they were using a new method for measuring patient care based on physical needs, replacing the bed as the basis for planning patient care. Economically, hospital administrators are no longer willing to accept the inpatient census as the gauge for allocation of nursing personnel. They are seeking an accurate, objective measure that will reflect patients' needs in order to make adjustments in the workload. Georgette (1970) suggested that responsible nursing administrators could benefit by turning to industrial management methodology for sound and scientific control of the workload. Cost control and efficiency techniques of industry have greatly improved manpower allocation. These techniques consider not only numbers, but also necessary skills to meet production needs. The need for assessing along multiple dimensions in determining nursing staff needs has been acclaimed by Young (1968, p. 85) in his proposal that:

Traditionally, most nursing unit staffing in short-term hospitals has been guided by rules of thumb that provide fixed amounts of nursing hours per patient day, based on historical measures of peak need as implied by average bed occupancy.
Such procedures usually have not attempted to respond directly to the highly variable demand for care; they were frequently based on relatively long-term estimates of the number of patients to be cared for rather than an actual and immediate aggregate nursing care required by individual patients. A more effective procedure, and one that can be shown to require fewer total nursing hours when confronted by stochastic demand, is to detect and respond cybernetically to increases and decreases in the demand for patient care when and where it occurs within the hospital system.

If staffing needs are to be efficiently determined, then there must be some logical system developed which will render quality care to the patient. A system of allocating nursing personnel can only be as effective as the information received by nursing administration in determining workload management. The number of staff and needed skills must be accurately predicted based on the type and number of tasks which need to be performed for each patient during a given time. If the information provided by nursing is not reliable and does not accurately predict the scope of patients' needs, administrators will look for other ways to analyze the workload. A major problem in determining staff needs is finding instruments, or methods, for nurses to use which are reliable and valid.

A review of the literature reveals that several types of categorization systems have been reported. Many of these systems attempt to give a numerical value to patients' needs. Methods range from short, subjective, intuitive need assessments made by nurses to elaborate checklists of the tasks involved in providing care to the patient. Wolfe and Young (1965) developed a categorization system which classifies patients into self-care, intermediate care, and intensive or total care categories. The criteria for placement into these three categories are primarily based on the patient's ability to care for himself.
With this method, nursing staff identified the primary factors which would best indicate self-sufficiency, namely ambulation, feeding, bathing, and major therapy.

A patient who is mobile, who can feed and bathe himself and who can generally take care of his personal needs without assistance is also not likely to be very ill (although in some cases "illness" may not be apparent). He will tend to require somewhat less direct care nursing time than a patient who is bedfast, and who must be assisted in feeding, roughly on the basis of self-sufficiency and avoid the many difficulties associated with attempts to determine just how ill is ill. (Wolfe and Young, 1965, p. 5)

Historically, the work of Wolfe and Young is among the earliest attempts to define staff needs based on criteria other than occupied beds. Many of the categorization methods discussed hereafter are based on their efforts.

The categorization system developed by Georgette (1970) elaborated on the self-sufficiency concept. Observable nursing behaviors necessary to provide adequate care were the parameters assessed in assigning patient categories. Five parameters are included in each evaluation to determine placement of a patient into a particular category: activities of daily living (ADL: grooming, eating, excretion and comfort); general health; treatments; medications; and teaching and emotional support. Criteria have been established for each parameter to help nurses evaluate the patient with less subjectivity. When all the parameters have been assessed, the patient is given a number that is consistent with the category in which he is placed most often, i.e., the nearly self-sufficient patient is placed in Category I, with more intensive types of nursing needs indicated in progressive categories II, III, and IV. With categorization complete, the total amount of nursing time can be approximated based on time standards for each category of patient.
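The category assignment just described (rate each parameter, take the category chosen most often, then apply a time standard) can be sketched briefly. This is only an illustration, not the Georgette protocol itself: the tie rule (a tie goes to the higher category) and the hours in `HOURS_PER_CATEGORY` are assumptions made for the example, since neither is stated here, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical time standards (hours of care per category).
# These figures are placeholders, not values from the thesis.
HOURS_PER_CATEGORY = {1: 2.0, 2: 4.0, 3: 6.0, 4: 8.0}


def overall_category(parameter_ratings):
    """Return the category (1-4) assigned most often across the parameters.

    Assumption: when two categories are tied, the higher one is used.
    """
    counts = Counter(parameter_ratings.values())
    most = max(counts.values())
    return max(cat for cat, n in counts.items() if n == most)


def hours_of_care(patients):
    """Sum the time standard for each patient's overall category."""
    return sum(HOURS_PER_CATEGORY[overall_category(p)] for p in patients)


# One patient rated on the five parameters named in the text.
patient = {
    "activities of daily living": 2,
    "general health": 2,
    "treatments": 3,
    "medications": 2,
    "teaching and emotional support": 1,
}

print(overall_category(patient))   # 2
print(hours_of_care([patient]))    # 4.0
```

Taking the most frequent category rather than an average keeps the result on the same I to IV scale the nurses use, at the cost of ignoring how far the minority ratings depart from it.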
The Salt Lake City Latter-day Saints (LDS) Hospital Nursing Administrative Staff (1971) modified the categorization system of Georgette (See Appendix A). Some of the parameters were changed to more closely identify the needs of patients within their hospital. The areas which they included were: activities of daily living, e.g., personal hygiene, diet, turning and/or assisted activity and excretion; diagnostic evaluation; medications; treatments and emotional support; and teaching. As with the Georgette System, detailed criteria were established for inclusion of patients into certain categories. The nurses assess the patient's needs and assign the patient to one of categories I through IV, with category I indicating self-sufficient patients and category IV indicating patients needing complete nursing care. Again, the number of hours of care allotted to each patient is based on the category in which the patient was placed. As applied by LDS Hospital, care needs are reassessed prior to each oncoming shift. Personnel allocation is completed after the information used to classify patients' needs is converted by nursing administration into hours of nursing care needed by each nursing unit.

White, Quade and White (1967) completed a study using self-care, intermediate care, and intensive care (strict and moderate) categories. Physician estimates of patient care needs were included in this study as an added measure. White noted that there was agreement between physicians and nurses on patient classification in the intensive care category, but there was a lack of agreement in nurse-physician classifications of intermediate care and self-care classes. These differences were explained as follows:

Nurses may be placing a large proportion of patients in intermediate care because of present checklist criteria, which are based on objective nursing tasks.
In contrast, the classification by physicians is based largely on subjective definitions which perhaps relate more closely to a general impression of the patient's degree of mobility. (White, 1967, p. 3)

Since nurses and physicians found that they were unable to agree on specific patient problems when they were evaluated separately, attention was turned to overall implications of the nursing criteria being used. In doing so, specific nursing criteria were identified as being of greatest significance in classification and these were chosen for analysis. For example, the bath was singled out as the best single criterion for distinguishing self-care patients from others. To produce a workable list of measures, five elements were extracted from the original list. These were: temperature, pulse, respiration and/or blood pressure; oxygen therapy; suction; cleanliness; and dietary.

Poland (1970) further developed these five criteria, resulting in two additional dimensions: turning and/or assisted activity and toileting-output (See Appendix B). In order to obtain greater precision in determining staffing needs, nurses and assistants were timed while they performed the various tasks listed under the seven categories. Using the observations of many nursing personnel over several days, average time periods were obtained for each task. The seven levels of patient care needs were broken down into subcategories with point values. The number of points a patient receives may then be converted to nursing care hours.

Pardee (1970) has also established criteria to assess patients' needs utilizing a checklist format. The criteria used are virtually the same as White's, namely pulse and respiration and/or blood pressure, personal hygiene, activity, diet, oxygen therapy and medical treatments. Nursing divisions adjust the criteria to meet their own particular needs.
The criteria used on a surgical unit are different from those on the neurology unit, which are, in turn, different from those on the medical unit (See Appendix C). The ward clerks are responsible for categorizing patients every eight hours, basing their data on information obtained from the nursing care plan. Because of the clerks' limited clinical background, criteria used with clerical personnel were stated as simply as possible. Categorization is begun with the clerk marking on a scale of one to three for each of twenty-one need areas. A category is then assigned to the patient according to the level with the greatest number of checks. An unusual feature of the Pardee Checklist was the predetermination of care for certain patient conditions. For example, on a surgery day a patient is automatically given a level three check by the clerk; if he is being suctioned, a level two or three check is required.

Nearly every system or method reported in the literature uses some method of quantifying patient needs on which to predict nursing workload. This figure must be based on a realistic evaluation of patient care needs rather than patient census. Identifying a system, and a way to utilize that system, that is both economical and valid and that can be used reliably by registered nurses is of prime importance. Hospital nursing administrations cannot count on information from any system until it has been shown to be reliable as well as valid in its use. It is assumed that all of the above mentioned methods, if properly applied, have some validity and represent substantial improvements over determining staffing needs using traditional patient census methods. A question of prime importance is which of the methods can be applied most reliably and efficiently with the existing variability among the nursing staff who collect the data.
Thus, the purpose of this study is to determine the inter-rater reliability among registered nurses using four categorization methods: the Modified Georgette, the Poland, the Pardee, and an intuitive nursing assessment which approximates current practice.

CHAPTER II

METHOD

Study Setting. This study was conducted in a 284-bed university hospital located in Salt Lake City, Utah. The hospital is comprised of four intensive care units, two medical units, two surgical units and six additional specialty areas. The hospital, associated with the University of Utah College of Medicine, has an atypical patient population for several reasons. The factors which influence the number and type of patients encountered include: teaching needs of the University, serving as a regional hospital to a large portion of the Intermountain West, providing specialty services not available elsewhere, and serving as a major provider of medical services in the Salt Lake City area. These factors often create special problems in anticipating and providing adequate nursing coverage.

Administrative personnel in the nursing service at University Hospital also hold faculty appointments with the University of Utah College of Nursing. This affiliation brings an unusual added dimension not found in many hospitals. These nursing administrators are well aware of current thinking and trends within academic circles as well as the practical application of new ideas to patient care. The result is a dynamic staff who are aware of the necessity for research and evaluation and are willing to accept responsibilities in these areas. Excellent cooperation in performing this study was obtained from Nursing Administration staff members and patients.

The study was conducted during the month of July, 1972, on two divisions caring for medical patients, with a total of 60 beds. A heavy patient load was encountered during the period of the study with census running at over 95%.
The six members of nursing staff used as raters in this study were permanently assigned to one of these divisions. The head nurse on each division was responsible for assigning personnel and completing work schedules. If the division was short of personnel, either because of illness or needs of patients exceeding what the assigned staff could provide, a manpower resource pool from the nursing office supplied additional help. The head nurse or in some cases a staff nurse could request help from the resource pool. These requests were based on experience and intuitive assessment of the staff needed to provide adequate care in that situation. Although the nursing staff on the medical divisions had been oriented to a categorization similar to Georgette's method for determining staffing needs, it was not being used at the time of the study.

Raters. Nine registered nurses were selected to participate in the study. The initial group of raters comprised all the nurses on the two divisions. However, only six nurses could meet the criteria below. Minimum criteria for the raters were established as follows:

1) Graduate of either a baccalaureate, associate or diploma nursing program.
2) Licensure as a registered nurse (R.N.).
3) Commitment to complete data collection using all four methods.
4) Time available to attend orientation classes.
5) Scheduled on either day or afternoon shifts during the period of the study.

Table 1 presents a summary of the education, experience, position and shift assignment for each of the six raters completing the study.

TABLE 1
Rater's Profile

Rater  Education        Years Exper.  Position     Normal Shift  Completed Observations By Raters
1      B.S.             10            Head Nurse   Days          16
2      3-yr. Diploma    25            Team Leader  Days          9
3      2-yr. Associate  3             Team Leader  Rotates       11
4      2-yr. Associate  [illegible]   Team Leader  Afternoons    10
5      B.S.             5             Head Nurse   Days          10
6      B.S.             1             Team Leader  Days          7

It can be seen that half of the group held bachelor's degrees, and that the raters' experience ranged from about six weeks to over twenty-five years. The average for the group was approximately 14.5 years. Most of the data were collected during day shift.

Methods for Categorization. Four different categorization methods were used by each of the raters in the study. A complete copy of each method or protocol is shown in Appendices A, B, and C. Methods two, three, and four correspond to the Modified Georgette (developed at Salt Lake LDS Hospital), the Pardee Checklist, and the Poland Point System, respectively. Method one was virtually unstructured, using the nurse's overall intuitive judgement to rate patient care needs on a one-to-three scale. A fourth level of need was available to describe long-term or chronic patients. This level was used on a limited basis, and does not represent a true end point of the continuum represented by levels one, two, and three. Method one was an attempt to quantify current practice then in effect at University Hospital. In practice, the patients were rated on an alphabetic scale (A, B, or C) which was later converted to numerics for analysis. Table 2 summarizes the major features of each of the four methods used in the study.

Procedure. Each day, one patient was selected according to a table of random numbers to be categorized on each of the four tools. The ratings were completed at the change of shift, when the greatest number of registered nurses could evaluate the patient within a reasonable period of time. The time lapse from the first observation to the last was held to less than two hours. Each rater evaluated the patient independently utilizing all four tools. The number of raters varied each day according to the number of nurses scheduled between the two divisions.
After the first series of observations was made, any patient who had a sudden change in condition, such as vital signs becoming unstable or the nursing or medical plan of care being altered, was eliminated from the study. If, however, the patient's condition was unstable at the outset and remained so, his data were retained. Four patients were eliminated because of inadequate data collection or unstable patient conditions. A total of sixteen patients were assessed and categorized. Categorization of the patients was completed during a one-month period excluding weekends, when there was neither a representative staffing pattern nor patient load.

TABLE 2
Summary of Characteristics of Each Method

1. Unstructured
   Number of category levels: 3 plus 1
   Method for determining staff needs: Estimated from experience
   Number of care areas evaluated: Unspecified, essentially 1 overall
   Major areas reviewed: Unspecified
   Criteria for categorization: None
   Method of reporting data: Overall impression (A, B, C, D)
   Protocol form used: None

2. Modified Georgette
   Number of category levels: 4
   Method for determining staff needs: Calculated from overall category level
   Number of care areas evaluated: 7
   Major areas reviewed: Hygiene, diet, turning and/or assisted activity, excretion, diagnostic evaluations, medications, teaching and emotional support
   Criteria for categorization: Extensive
   Method of reporting data: Average level of care (I, II, III, IV)
   Protocol form used: None (referred to list)

3. Pardee Checklist
   Number of category levels: 3
   Method for determining staff needs: Calculated from overall category level
   Number of care areas evaluated: Up to 21
   Major areas reviewed: Diet, activity, vital signs, IPPB, urine analysis, monitor, dressings and drains, suction, isolation, incontinence, admit, dialysis, surgery or major exam
   Criteria for categorization: Minimal
   Method of reporting data: Average level of care (I, II, III)
   Protocol form used: Used special medical form (new form for each patient)

4. Poland Point System
   Number of category levels: Varies from 1-12 points
   Method for determining staff needs: Minutes of care needed calculated from points received
   Number of care areas evaluated: 7
   Major areas reviewed: Respiratory aids, suction, cleanliness, turning and/or assisted activity, diet, toilet-output, vital signs and measurements
   Criteria for categorization: Moderate
   Method of reporting data: Total number of points
   Protocol form used: Circled points on checklist (new form for each patient)

CHAPTER III

RESULTS AND DISCUSSION

Analyses of Data.
The data were analyzed at the University of Utah Computer Center (UU/CC) using a Univac 1108 computer and the UU/CC library program CORREL, written by Edward Sharp. Pearson product-moment correlations were calculated between each pair of the six raters on a given method. The results are four six-by-six square inter-correlation matrices, representing the four methods. Since all matrices obtained are symmetrical, only the lower half is displayed. Pearson product-moment correlations between the six raters on tool number one, the unstructured intuitive nursing assessment, are displayed in Table 3.

TABLE 3
Pearson Product-moment Correlations among six Raters using the Non-Structured Method

Rater  1        2        3        4        5
2      .65(9)
3      .51(11)  .85(4)
4      .63(10)  .85(4)   .76(8)
5      .18(9)   .65(5)   .73(7)   .39(6)
6      .26(7)   .58(6)   .50(3)   .00(4)   .85(5)

( ) = Number of patient assessments used to calculate correlation

The correlations ranged from .00 between raters four and six to .85 between raters two and three, three and four, and five and six. These correlations indicated a high degree of variability among raters. Since the dimensions of patient care assessed by tool one were not well-defined with a particular format, the categorizations were probably based on estimates from previous experiences, which also varied considerably. In looking at the correlations between the raters using tool one, raters two and three, with more than three years of experience, used this method most reliably or consistently. The two raters with less than one year of experience used this intuitive method least reliably, with a correlation of .00. The correlations suggest a marked increase in reliability with increased experience.

Weighted mean correlations were calculated on each of the four tools to provide an estimate of the overall agreement between the six raters using each method.
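Each pairwise correlation in Table 3 rests only on the patients that both raters assessed, which is why the counts in parentheses differ from pair to pair. A minimal sketch of that calculation follows. It is not the UU/CC library program used in the study; the ratings are invented for illustration, the function names are hypothetical, and returning NaN when one rater's scores do not vary is an assumption about how to handle an undefined correlation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption: report NaN when a rater gave every patient the same score.
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def pairwise_r(ratings_a, ratings_b):
    """Correlate two raters over the patients both of them assessed.

    Each argument maps a patient id to the category that rater assigned.
    Returns (r, number of shared patients).
    """
    shared = sorted(set(ratings_a) & set(ratings_b))
    r = pearson_r([ratings_a[p] for p in shared],
                  [ratings_b[p] for p in shared])
    return r, len(shared)


# Invented ratings: the two raters overlap on patients 2, 3 and 4 only.
rater_a = {1: 1, 2: 2, 3: 3, 4: 2}
rater_b = {2: 2, 3: 3, 4: 3, 5: 1}

r, n = pairwise_r(rater_a, rater_b)
print(f"{r:.2f}({n})")  # 0.50(3)
```

With overlaps as small as three or four patients, a single differing rating moves the coefficient a long way, which is worth keeping in mind when reading the extreme values in the tables.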
The weighted means were calculated by multiplying each correlation coefficient by the number of observations used to determine it, then adding the products and dividing by the total number of observations. The resultant numbers are shown in Table 4, which provides a simultaneous reliability score among raters on all of the methods.

TABLE 4
Individual and Overall Weighted Mean Correlations on Four Categorization Tools

       Rater                                      Overall
Tool   1      2      3      4      5      6      Weighted Mean
1      .46    .66    .69    .57    .52    .44    .57
2      .68    .80    .87    .72    .77    .49    .73
3      .72    .67    .88    .66    .83    .37    .70
4      .82    .90    .89    .90    .69    .82    .83

As indicated in Table 4, there is an increase in the overall weighted mean from .57 on tool one to .73 on tool two. The increased structure of tool two apparently facilitated consistency among raters. It is of importance to note that the two raters with the least experience did no better with this tool than on tool one. Only those nurses with experience attained greater consistency with this tool.

Table 5 displays the intercorrelations between raters on tool two, the Modified Georgette. The range of correlations from .00, raters four and six, to 1.00, raters three and six, indicates great variability in the overall use of the tool by all six raters. The two raters with the least experience, four and six, had the correlation of .00.

TABLE 5
Pearson Product-moment Correlations among six Raters using the Modified Georgette Method

Rater  1        2        3        4        5
2      .83(9)
3      .84(11)  .89(4)
4      .81(10)  .94(4)   .81(8)
5      .50(9)   .97(5)   .90(7)   .78(6)
6      .32(7)   .45(6)   1.00(3)  .00(4)   .84(5)

( ) = Number of patients assessed to calculate correlation

Correlations between the six raters on tool three, the Pardee Checklist, extend from -.09, raters one and six, to 1.00, raters three and six. The weighted mean correlation for this tool was .70, which is about the same as for tool two (.73).
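The weighted mean used for Table 4 (each correlation multiplied by its number of observations, the products summed, and the sum divided by the total number of observations) can be checked with a short sketch. The function name is hypothetical; the input pairs are rater one's correlations with raters two through six on the non-structured method, as read from Table 3.

```python
def weighted_mean_r(pairs):
    """Weighted mean of correlations, weighting each r by its observation count.

    pairs: sequence of (r, n) tuples, where n is the number of patient
    assessments behind the correlation r.
    """
    pairs = list(pairs)
    total_n = sum(n for _, n in pairs)
    if total_n == 0:
        raise ValueError("no observations to weight by")
    return sum(r * n for r, n in pairs) / total_n


# Rater one versus raters two to six, tool one (values as read from Table 3).
rater_one = [(0.65, 9), (0.51, 11), (0.63, 10), (0.18, 9), (0.26, 7)]

print(round(weighted_mean_r(rater_one), 2))  # 0.46
```

The result agrees with the .46 listed for rater one on tool one in Table 4. Weighting by the number of observations keeps a correlation based on three or four shared patients from counting as much as one based on ten or eleven.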
However, there was greater variability between raters, as indicated by the wider range in correlations shown in Table 6.

TABLE 6
Pearson Product-moment Correlations among Six Raters Using the Pardee Checklist

Rater    1          2          3          4          5
2        .89(9)
3        .90(11)    1.00(4)
4        .76(10)    .33(4)     .73
5        .92(9)     .87(5)     .92(7)     .71(6)
6        -.09(7)    .20(6)     1.00(3)    .58(4)     .67(5)

( ) Number of patient assessments used to calculate correlation

The two nurses with minimal experience improved their correlations from .00 on tools one and two to .58 on tool three. According to these statistics, nurses with less experience do better on a highly structured tool. Those nurses with three to 25 years of experience also used the tool consistently, with correlations between .87 and 1.00. The low reliability coefficients occurred between those nurses with little experience and the nurses with over three years of experience. They seemed to view patients' needs differently.

Table 7 shows the inter-rater correlations obtained with tool four, the Poland Point System. These correlations ranged from .49 to .99. The overall mean for this particular tool showed an increase from .57 on tool one, .73 on tool two, and .70 on tool three to a mean of .83 (see Table 4).

TABLE 7
Pearson Product-moment Correlations among Six Raters Using the Poland Point System

Rater    1          2          3          4          5
2        .81(9)
3        .93(11)    .99(4)
4        .87(10)    .98(4)     .94(8)
5        .62(9)     .85(5)     .67(7)     .83(6)
6        .84(7)     .95(6)     .97(3)     .91(4)     .49(5)

( ) Number of patient assessments used to calculate correlation

There is greater specificity, or structure, written into this particular tool for identifying patient needs. The other methods had restricted ranges of patient assessment criteria, which must be considered as a possible reason for more variability occurring with them than with tool four.
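The weighted means reported in Table 4 can be checked against the pairwise correlations. The sketch below applies the stated rule (multiply each correlation by its number of observations, add the products, divide by the total number of observations) to the Table 7 values for tool four. The names `TABLE_7`, `weighted_mean`, `rater_mean`, and `overall_mean` are illustrative and are not part of the original analysis, which was run with the UU/CC library program. Treating the overall mean as a weighted mean over all fifteen rater pairs is an assumption.

```python
# Pairwise correlations for tool four (Poland Point System), copied from Table 7.
# Keys are rater pairs; values are (r, n), where n is the number of patient
# assessments used to calculate that correlation.
TABLE_7 = {
    (1, 2): (0.81, 9), (1, 3): (0.93, 11), (1, 4): (0.87, 10),
    (1, 5): (0.62, 9), (1, 6): (0.84, 7),
    (2, 3): (0.99, 4), (2, 4): (0.98, 4), (2, 5): (0.85, 5), (2, 6): (0.95, 6),
    (3, 4): (0.94, 8), (3, 5): (0.67, 7), (3, 6): (0.97, 3),
    (4, 5): (0.83, 6), (4, 6): (0.91, 4),
    (5, 6): (0.49, 5),
}


def weighted_mean(pairs):
    """Return sum(r * n) / sum(n) for an iterable of (r, n) tuples."""
    pairs = list(pairs)
    total_n = sum(n for _, n in pairs)
    return sum(r * n for r, n in pairs) / total_n


def rater_mean(table, rater):
    """Weighted mean of the correlations that involve one rater."""
    return weighted_mean(value for pair, value in table.items() if rater in pair)


def overall_mean(table):
    """Weighted mean over every rater pair (assumed definition of the overall mean)."""
    return weighted_mean(table.values())


if __name__ == "__main__":
    for rater in range(1, 7):
        print(f"rater {rater}: {rater_mean(TABLE_7, rater):.2f}")
    print(f"overall: {overall_mean(TABLE_7):.2f}")
```

Under these assumptions the sketch gives .82 for rater one and .83 overall, which agrees with the tool four row of Table 4.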
The two nurses with one year or less of experience were able to use this method with a correlation of .91, suggesting that the more structure and specificity a tool has, the more reliably it can be used by those with minimal experience. The nurses with experience used this tool consistently, with correlations ranging from .62 to .99. This particular tool also seemed to lend itself to better agreement between the more experienced nurses and those with minimal experience, as indicated by the decreased range of the correlations on tool four.

To look at an individual's overall performance with each of the four methods, an estimate of each rater's overall reliability was calculated as a weighted mean score. This was done by multiplying each of the rater's correlation coefficients by its number of observations, adding the products, and dividing by the total number of observations. The results are shown in Table 4. Rater six, a baccalaureate graduate nurse with one year of experience, lacked consistency with all other raters, as demonstrated by her overall mean correlations on the tools of .44, .49, .37, and .82. This study did not attempt to identify individual motivation, attitude toward the study, or clinical assessment skills, any of which could have been the basis for the low reliability coefficients of this rater.

There were high correlations, .73, .85, .75 and .87, among all nurses within each division, as indicated in Figure 1. This consistency within divisions seemed to indicate that nurses of one division assess patient needs and categorize similarly. Whether they all categorize the patients' needs adequately was not determined in this study. They have simply assessed the needs consistently with each other. Inter-division scores, however, indicated a great lack of consistency, with overall mean correlations of .41, .62, .64 and .85.
Nursing administrations using patient assessment tools as a means of providing adequate nursing staff to units must have information about patient care needs that is obtained consistently from each unit, so that it is comparable throughout the institution. Otherwise, a nursing administrator's ability to establish realistic staffing priorities will be less than desirable.

[Figure 1. Inter-ward and intra-ward comparison of overall rater reliabilities. Vertical axis: correlations (.40 to .85). Horizontal axis: categorization tools (1 to 4). Lines plotted: mean intra-ward reliability, mean inter-ward reliability, mean average reliability.]

CHAPTER IV

SUMMARY, CONCLUSIONS AND RECOMMENDATIONS

Nurses have determined staffing patterns in the hospital environment for years by patient census at its historical peak. Some nursing administrators question whether this is the most appropriate way to manage the workload and have suggested identifying patients' needs for care and placing the patients into categories based on those needs. Once the category or classification has been determined, the assigned category for each patient is converted to pre-determined hours of care. This system is therefore dependent upon the reliability of nurses using the tools for categorization and the validity of the tools themselves.

This study was designed to look at the inter-rater reliability of staff nurses using various types of categorization methods. The tools in the study ranged from a subjective, intuitive nursing assessment, to a more structured format with suggested criteria (Modified Georgette), to a checklist format (Pardee Checklist), to a point system, multiple dimension tool with specific criteria (Poland Point System). These methods were used by six registered nurse raters in a general hospital assigned to two medical divisions. Each nurse rated a randomly designated patient using all four of the methods.
Pearson product-moment correlations were calculated to determine the inter-rater reliability. Of the four tools, the intuitive, subjective method of identifying patient needs was the least reliable, with correlations of .18 to .85 and an overall mean correlation of .57. The reliability coefficients for the Modified Georgette ranged from .00 to 1.00 with an overall mean correlation of .73, and the Pardee Checklist had correlations between -.09 and 1.00 with an overall mean correlation of .70. The Poland Point System, which is the most structured tool and has the greatest specificity of patient needs, was used most reliably, as indicated by correlations of .49 to .99 and an overall mean correlation of .83.

A possible source of error that may have been introduced into this study is the effect of repeated use of the various tools when assessing needs of patients. The overall weighted mean went from .57 to .73, to .70, to .83 on tools one, two, three, and four respectively. In every case the four tools were administered in a one, two, three, four order; consequently, more data were available for assessment by the time a rater reached tool four. Repeated use of these tools also increased the individual rater's consistency with other nurses. Those nurses with the greatest number of observations had the highest reliability coefficients. A recommendation is made that the study be repeated using one tool per patient by raters to establish whether the observed increase in reliability is due to internal features of the tool or to the learned behavior of the raters.

Experience seemed to be another factor that influenced the rater's ability to use the various tools. For example, nurses with less than one year of experience were much more consistent on the more structured tool, the Poland Point System. Raters with experience of three to twenty-five years and raters with less than one year of experience were not consistent with each other.
Further study on the role of experience in assessing patient needs is necessary to fully understand these findings.

Nurses who assumed administrative roles in addition to their clinical assignment were not as consistent in their use of the various tools as were those nurses with primarily clinical assignments. Further study is suggested on the time involved in administrative tasks and their influence on clinical responsibilities.

Raters within a division had high reliability coefficients. This study suggested high agreement between raters on a single division or unit but low consistency between raters across divisions. This suggests the influence of leadership and peer group suggestion on the assessment of patient needs. The scores do not indicate whether the influence was positive, only that raters on a unit tended to assess a patient in a similar way.

Given the limitations of sample size and the lack of consistency among raters on the various tools, it is recommended that further investigation be made of the reliability of these tools. Once that has been completed, subsequent validity studies need to be made to assure that nurses are measuring what needs to be measured. To quote Sanford (1965, p. 131):

    Once we succeed in making a reliable observation or in setting down a reliable test score, we still face the question of what it means, if anything.

REFERENCES

American Psychological Association. Publication manual of the American Psychological Association. Washington, D.C.: American Psychological Association, 1967.
Clark, L. Can the nursing workload be measured? Supervisor nurse, 1970, (12) 14-24.
Gaito, J. Scale classification and statistics. In Heerman, E. F. and Broskamp, L. A. (Eds.) Readings in statistics for the behavioral sciences. Englewood Cliffs, New Jersey: Prentice-Hall Inc., 1970, 64-66.
Georgette, J. K. Staffing by patient classification. Nursing clinics of North America, 1970, 5(2) 329-335.
Guilford, J.
Fundamental statistics in psychology and education. New York: McGraw-Hill, 1965.
Latter-day Saints Hospital. General guidelines for use in patient classification. Unpublished manuscript, 1972.
Pardee, G. Classifying patients to predict staff requirements. American journal of nursing, 1968, 68(3) 517-519.
Poland, M. PETO: A system for assessing and meeting patient care needs. American journal of nursing, 1970, 70(7) 1979-85.
Sanford, F. H. Psychology: a scientific study of man. 3rd edition. Belmont, Calif.: Wadsworth Pub. Co., 1965.
Wolfe, H. and Young, J. P. Staffing the nursing unit: controlled variable staffing. Nursing research, 1965, 14(3) (a).
Wolfe, H. and Young, J. P. Staffing the nursing unit: the multiple assignment technique. Nursing research, 1965, 14(4) (b).
Young, John P. Information nexus guides decision system. Modern hospital, 1966, 106(2) 101-105.

APPENDICES

APPENDIX A

LATTER-DAY SAINTS HOSPITAL
GUIDELINES FOR USE OF PATIENT CATEGORIZATION

I. Activities of Daily Living

A. Eating
   Category I:   Feeds self or needs little help.
   Category II:  Needs some help in preparing food for eating. May need encouragement.
   Category III: Cannot feed self but is able to chew and swallow all right.
   Category IV:  Cannot feed self at all and may have difficulty chewing and swallowing food.

B. Grooming
   Category I:   Almost entirely self-sufficient.
   Category II:  Needs some help with bathing, oral hygiene, hair combing, etc.
   Category III: Unable to do much for self.
   Category IV:  Completely dependent.

C. Excretion
   Category I:   Up and to BR alone or almost alone.
   Category II:  Needs some help in getting up to BR or using urinal.
   Category III: In bed and needs bedpan or urinal to be placed and removed after use. May be able to partially turn or lift self.
   Category IV:  Completely dependent.

D. Comfort
   Category I:   Self-sufficient.
   Category II:  Needs some help with adjustment of position of bed (tubes, I.V.).
   Category III: Cannot turn without help, get drink, adjust position of extremities, etc.
   Category IV:  Completely dependent.

II. General Health
   Category I:   Good--in for a diagnostic procedure or a simple treatment or surgery procedure (biopsy, D & C), simple and minor.
   Category II:  Mild symptoms, more than one mild illness, mild debility, mild emotional reaction, mild incontinence (not more than once/shift).
   Category III: Acute symptoms, severe emotional reaction to illness or surgery, more than one acute med/surg. problem, severe or frequent incontinence.
   Category IV:  Critically ill, may have a very severe emotional reaction.

III. Treatments
   Category I:   Simple--supervised ambulation, dangle, simple dressing, test procedure preparation not requiring medication, reinforcement of surgical dressing, K-pad, vital signs once/shift.
   Category II:  Any Category I treatment more than once/shift. Foley catheter care, I & O, bladder irrigations, sitz baths, compresses, test procedures requiring medication or follow-ups, simple (nonmedicated) I.V.'s, Clinitest-Acetest, simple enema for evacuation, vital signs every 4 hours.
   Category III: Any treatment more than twice/shift, medicated I.V.'s, complicated dressings, sterile proced., care of tracheotomy, Harris flush, suctioning, tube feeding, vital signs more than every 4 hours.
   Category IV:  Any elaborate or delicate procedure, one requiring 2 nurses, vital signs more often than every 2 hours.

IV. Medications
   Category I:   Simple, routine, not needing pre or post evaluation, P.R.N. medications no more than once/shift.
   Category II:  Diabetic, cardiac, hypo-hypertensive, diuretic, anticoagulant medications, P.R.N. medications more than once/shift, medication needing pre or post evaluation.
   Category III: Unusual amount of Category II medications, control of refractory diabetics (need to be monitored more than every 4 hours).
   Category IV:  More intensive Category III medications, I.V.'s with frequent, close observation and regulation.

V. Teaching and Emotional Support
   Category I:   Routine follow-up teaching, patients with no unusual or adverse emotional reactions.
   Category II:  Initial teaching of care of ostomies, new diabetics, tubes that will be in place for periods of time, etc., patients with conditions that require a major change in eating, living, or excretory practices. Patients with mild adverse reactions to the illness--depressions, overly demanding, etc.
   Category III: More intensive Category II items, teaching of apprehensive or mildly resistive patients, care of moderately upset or apprehensive patients, confused or disoriented patients.
   Category IV:  Teaching of resistive patients, care and support of patients with severe emotional reactions.

VI. Diagnostic Evaluation
   Category I:   Preparation for simple diagnostic tests & procedures (i.e., phisohex bath, routine enema, G.B. pills).
   Category II:  Preparing for multiple tests or procedures, or gathering multiple specimens (i.e., douche & enema, time scrub for skin prep., enemas till clear with no problems).
   Category III: Assistance of one nurse with diagnostic procedure requiring 15-30 minutes.
   Category IV:  Diagnostic procedure requiring nurse assistance of more than 30 minutes.

APPENDIX B

LATTER-DAY SAINTS HOSPITAL
GENERAL GUIDELINES FOR USE IN PATIENT CLASSIFICATION

CLASSIFICATION I
A patient requiring minimal nursing care whose condition is characterized by:
1. Self-sufficient in activities of daily living.
2. Few diagnostic tests.
3. Simple, uncomplicated treatments.
4. Few medications.
5. Acceptable behavior patterns.
6. Requirements for simple orientation and teaching to meet patient's needs.

CLASSIFICATION II
A patient requiring a moderate amount of nursing care whose condition is characterized by:
1. Need for assistance in activities of daily living.
2. Preparation for multiple tests or procedures or gathering of multiple specimens.
3. Periodic treatment and/or observation.
4. Periodic administration of medications requiring evaluation.
5. Occasional deviations from acceptable behavior patterns.
6. Requirements for more detailed teaching.

CLASSIFICATION III
A patient requiring a considerable amount of nursing care whose condition is characterized by:
1. Almost complete or total care required as to activities of daily living.
2. Frequent, time consuming diagnostic tests and procedures.
3. Frequent treatments and/or observation.
4. Numerous medications.
5. Significant deviation from acceptable behavior patterns.
6. Requiring specific teaching.

CLASSIFICATION IV
A patient requiring complete nursing care whose condition is characterized by:
1. Total dependency on the nurse for activities of daily living.
2. Excessively time consuming diagnostic tests and procedures.
3. Comprehensive treatments and/or close observation.
4. Comprehensive medication regime.
5. Severe deviation from acceptable behavior requiring intensive emotional support.

PARDEE PATIENT CLASSIFICATION CHECKLIST

DIET: Regular, Special, N.P.O.
ACTIVITY: Ambulate, B.R.P., Commode, Positioning/Turn or/ROM
VITAL SIGNS: q4H - qid, q6H, q2H-q4H
I.P.P.B.
Rebreath./Heat. Neb.
URINE: Spec., Brav./Frac./Vol.
MONITOR
DRESSINGS
DRAINAGE
SUCTION: N.G./N.T.
ISOLATION
Dressing
INCONTINENCE
DAY OF ADMISSION
ISOLATION
DAY OF SURGERY/OR MAJ. MED. EXAM

APPENDIX C

POLAND POINT SYSTEM
ELEMENTS OF PHYSICAL CARE AND THEIR LEVELS OF INTENSITY AS DENOTED BY POINT ASSESSMENT

Points   Criterion

Diet
   1     Feeds self without supervision, or parent feeds patient.
   2     Feeds self with supervision by staff.
   4     Feeds self but needs constant presence of staff, or gastrostomy feeding q4h.
   8     Total feeding by personnel, instructing the parent, continuous I.V., or blood transfusion.
  12     Tube feedings more frequently than q4h.

Toileting-Output
   1     Toilets without supervision.
   2     Toilets with supervision, specimen to be collected, or uses bedpan.
   4     Up to toilet with stand-by supervision, or output measurement every hour, or daily colostomy irrigation.
   8     Incontinent, average output.
  12     Incontinent with diarrhea, or immediate postoperative colostomy or urethrostomy, or drainage with frequent dressing change.

Vital Signs and Measurements
   1     Routine--daily temperature, pulse, and respiration.
   2     Vital signs q4h, or night observation q1h.
   4     Vital signs monitored plus hypothermia, or vital signs q2h.
   8     Vital signs and observation every hour, or vital signs monitored plus hypothermia and neurologic evaluation.
  12     BP, pulse, respirations, and neurologic evaluation q1/2h.

Respiratory Aids
   1     Bedside humidifier, or "blow bottle."
   2     Mist or Croupette when sleeping, or cough and deep breathe q2h, or IPPB without supervision q4h.
   4     Continuous oxygen, or cough and deep breathe q1h, or continuous assisted ventilation.
   8     Mechanical respiratory aid, or IPPB with supervision q4h.
  12     IPPB continuously with intermittent Ambu "bagging."

Suction
   1     Routine postoperative standby.
   2     Nasopharyngeal or oral suction prn.
   4     Tracheostomy suction every hour, or nasogastric tube irrigation q2h.
   8     Tracheostomy suction q1/2h, patient responsive.
  12     Tracheostomy suction q1/2h, patient not responsive.

Cleanliness
   1     Bathes self, bed straightened.
   2     Bathes self with help or supervision, daily change of bed.
   4     Bathed and dressed by personnel, or partial bath given, daily change of linen.
   8     Bathed and dressed by personnel, special skin care, occupied bed.

Turning and/or Assisted Activity
   1     Up in chair with assistance once in 8 hours.
   2     Up in chair with assistance twice in 8 hours, or walking with assistance.
   4     Bedfast with assistance in turning q2h, or up walking with assistance of two people twice in 8 hours.
   8     Bedfast with assistance in turning q1h.
  12     Turning on Foster frame or CircOlectric bed q1h.

Conversion table

Points    PCU's    Hours
4-11      1        1
12-19     2        2
20-27     3        3
28-35     4        4
36-43     5        5
44-51     6        6
52-59     7        7
60-67     8        8
68-75     9        9
76-80     10       10

To arrive at an intensity of care for a patient, add the appropriate points from each category and use this total to find the PCU's (patient care units in hours) from the conversion table. Point assessment was worked out from a time study, and flexibility was built in by rounding out minutes to the hour to allow for intangibles and unplanned incidents.
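The conversion step described in Appendix C (add the points from each category, then look up the total in the conversion table) can be sketched in Python as follows. This is a minimal illustration, not part of the Poland Point System itself: the names `CONVERSION`, `points_to_pcu`, and `example_patient` are hypothetical, the example patient's points are invented for demonstration, and the table is read as ten point ranges (4-11 through 76-80) mapped to 1 through 10 PCU's.

```python
# Conversion table as read from Appendix C: (lowest points, highest points, PCU's).
# One PCU is taken to equal one hour of care, per the appendix text.
CONVERSION = [
    (4, 11, 1), (12, 19, 2), (20, 27, 3), (28, 35, 4), (36, 43, 5),
    (44, 51, 6), (52, 59, 7), (60, 67, 8), (68, 75, 9), (76, 80, 10),
]


def points_to_pcu(category_points):
    """Sum the points across categories and return (total points, PCU's)."""
    total = sum(category_points.values())
    for low, high, pcu in CONVERSION:
        if low <= total <= high:
            return total, pcu
    raise ValueError(f"total of {total} points is outside the conversion table (4-80)")


# Hypothetical patient, invented for illustration only.
example_patient = {
    "diet": 2,
    "toileting_output": 4,
    "vital_signs": 4,
    "respiratory_aids": 1,
    "suction": 1,
    "cleanliness": 4,
    "turning_activity": 8,
}

if __name__ == "__main__":
    total, pcu = points_to_pcu(example_patient)
    print(f"{total} points -> {pcu} PCU's")  # 24 points -> 3 PCU's
```

Raising an error for totals outside 4-80 is a design choice of this sketch; the appendix does not say how such totals should be handled.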
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6wh2rvt