Online Course

NRSG 790 - Methods for Research and Evidence-Based Practice

Module 3: Research Methods

Estimating Reliability and Validity

Each type of reliability below is described in terms of what it estimates, the procedure used, the reliability index computed, and when it is appropriate to use.

Test-Retest (Repeated Measures)
What it estimates: Consistency of the performance a measure elicits from one group of subjects on two separate testing occasions.
Procedure: The test is administered under standardized conditions to a single group of subjects representative of the group for which the measure was designed. Two weeks later, the same test is given under the same conditions to the same group of subjects, and the correlation coefficient (rxy) between the two sets of scores is determined.
Reliability index: The coefficient of stability reflects the extent to which the measure rank orders subjects the same way on the two occasions; the closer rxy is to 1.00, the more stable the measure.
Appropriate to use: For tools that measure characteristics relatively stable over time, and for clinical tools that produce only one score and can be given two or more times.
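The coefficient of stability above is an ordinary Pearson product-moment correlation between the two administrations. A minimal sketch in Python; the scores below are made-up illustration data, not from the course materials:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation (rxy) between paired score lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for six subjects on two occasions, two weeks apart.
time1 = [12, 15, 9, 20, 17, 11]   # first administration
time2 = [13, 14, 10, 19, 18, 12]  # same subjects, second administration

print(round(pearson_r(time1, time2), 2))  # → 0.98
```

A value this close to 1.00 would indicate that the measure rank orders subjects almost identically on the two occasions.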
Parallel Forms
What it estimates: Consistency of the performance that alternate forms of a test elicit from one group of subjects. Two forms are parallel if they: 1) are constructed using the same objectives and procedures; 2) have approximately equal means; 3) have approximately equal standard deviations; and 4) correlate equally with a third variable.
Procedure: The two forms are given to one group of subjects on the same occasion, or on two separate occasions, and the correlation between the two sets of scores is determined using rxy.
Reliability index: Values above 0.80 provide evidence that the two forms can be used interchangeably. If both forms are given at the same time, rxy reflects equivalence of the forms; if they are given on two occasions, it reflects stability as well.
Appropriate to use: Whenever two or more forms of a tool are available, this is the preferred method.
Internal Consistency
What it estimates: Consistency of the performance of a group of individuals across the items of a single test.
Procedure: The test is administered under standardized conditions to a representative group on one occasion.
Reliability index: The alpha coefficient reflects the extent to which performance on any one item is a good indicator of performance on any other item. Alpha is preferred because it provides a single value for a given data set and is equal in value to the mean of the distribution of all possible split-half coefficients for that data set.
Appropriate to use: For interviews and multi-item scales.
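The alpha coefficient can be computed directly from an item-response matrix. A minimal sketch, using a small made-up data set (five subjects answering four items on a 1-5 scale):

```python
def variance(values):
    """Sample variance (n - 1 denominator)."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)

def cronbach_alpha(rows):
    """Cronbach's alpha: (k/(k-1)) * (1 - sum of item variances / total-score variance)."""
    k = len(rows[0])  # number of items
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: rows = subjects, columns = items.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]

print(round(cronbach_alpha(responses), 2))  # → 0.96
```

A high alpha here would indicate that the four items hang together: performance on any one item is a good indicator of performance on the others.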
Interrater
What it estimates: Consistency of performance among raters or judges (or the degree of agreement between them) in assigning scores to the objects, responses, or observational data being judged.
Procedure: Two or more competent raters score responses to a set of subjective items.
Reliability index: When two raters are employed, rxy is used to determine the degree of agreement between them; with more than two raters, alpha may be used. The kappa coefficient is used when the aim is to compare the ratings of two judges classifying patients into diagnostic categories. A coefficient of 0 indicates a complete lack of agreement; a coefficient of 1.00 indicates complete agreement. Agreement here refers to the relative ordering of the scores assigned by the raters. Raters are often trained to a high degree of agreement in scoring prior to data collection.
Appropriate to use: Especially useful with behavioral observations and/or proxy judgments of a patient's state.
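The kappa coefficient corrects observed agreement for the agreement expected by chance. A minimal sketch for two raters classifying the same patients into diagnostic categories; the ratings are illustrative, not course data:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    categories = sorted(set(rater_a) | set(rater_b))
    # Observed agreement: proportion of cases both raters labelled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal category proportions.
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical diagnostic classifications of eight patients by two judges.
a = ["dx1", "dx1", "dx2", "dx2", "dx1", "dx2", "dx1", "dx2"]
b = ["dx1", "dx1", "dx2", "dx1", "dx1", "dx2", "dx1", "dx2"]

print(round(cohens_kappa(a, b), 2))  # → 0.75
```

Here the raters agree on 7 of 8 patients (87.5%), but because half that agreement would be expected by chance alone, kappa is a more conservative 0.75.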
Intrarater
What it estimates: Consistency with which one rater assigns scores to a single set of test item responses on two occasions.
Procedure: One rater assigns scores to a subjective measure using a fixed scale, without recording answers on the scoring sheet. About two weeks later, the responses are shuffled and rescored using the same procedure as on the first occasion. Agreement between the two scorings is assessed using rxy.
Reliability index: A zero value for rxy is interpreted as inconsistency; a value of 1.00 as complete consistency.
Appropriate to use: For determining the extent to which an individual applies the same criteria when rating responses on different occasions, especially with subjective measures; it also enables one to determine the degree to which ratings are influenced by temporal factors.

Adapted from Waltz, C. F., Strickland, O. L., & Lenz, E. R. (2005). Measurement in nursing and health research (3rd ed., pp. 138-145). New York: Springer Publishing.

Estimating Validity
Each type of validity below is described in terms of the inference it supports, the procedure used, and when it is appropriate to use.

Content
Inference: How the subject performs at present in the domain of situations the tool is intended to represent.
Procedure: Review of the objectives and items on the tool by a panel of experts; a content validity index (CVI) or percent agreement is determined.
Appropriate to use: Where performance on a small set of items in one tool serves as the lone indicator of how well the content domain is represented.
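One common way to compute the CVI is as the proportion of experts rating each item relevant (3 or 4 on a 4-point relevance scale), averaged across items for a scale-level value. A minimal sketch with hypothetical panel ratings:

```python
def item_cvi(ratings):
    """Item-level CVI: proportion of experts rating the item 3 or 4 (relevant)."""
    return sum(r >= 3 for r in ratings) / len(ratings)

# Hypothetical relevance ratings: rows = items, columns = five expert raters,
# each rating on a 1-4 scale.
panel = [
    [4, 3, 4, 4, 3],
    [3, 4, 2, 4, 4],
    [4, 4, 4, 3, 4],
]

i_cvis = [item_cvi(item) for item in panel]
scale_cvi = sum(i_cvis) / len(i_cvis)  # scale-level CVI, averaging method

print([round(v, 2) for v in i_cvis], round(scale_cvi, 2))  # → [1.0, 0.8, 1.0] 0.93
```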
Construct
Inference: The degree to which the subject possesses some hypothetical trait or quality presumed to be reflected in performance on the tool.
Procedure: Comparison of subjects' scores on the tool with the scores of subjects known to have high and low amounts of the attribute, using the D index, factor analysis, or the multitrait-multimethod matrix (MTMM).
Appropriate to use: In cases where it is believed that individual differences in performance on the tool reflect differences in the trait about which the inference is being made.
Criterion-Related
Inference: The subject's future or present standing on a variable of particular significance that is different from the tool itself.
Procedure: Correlation of scores on the tool with present or future scores on a second measure; ROC curves to examine the predictive value of a diagnostic test.
Appropriate to use: For tools used to predict present or future performance.
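Each point on an ROC curve is a (sensitivity, 1 - specificity) pair computed at one cutoff of the tool's score; sweeping the cutoff traces the curve. A minimal sketch of the two quantities at a single hypothetical cutoff, with illustrative data:

```python
def sens_spec(scores, has_condition, cutoff):
    """Sensitivity and specificity of classifying scores >= cutoff as positive."""
    tp = sum(s >= cutoff and d for s, d in zip(scores, has_condition))
    fn = sum(s < cutoff and d for s, d in zip(scores, has_condition))
    tn = sum(s < cutoff and not d for s, d in zip(scores, has_condition))
    fp = sum(s >= cutoff and not d for s, d in zip(scores, has_condition))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical tool scores and true condition status for six patients.
scores = [2, 4, 5, 7, 8, 9]
truth = [False, False, True, False, True, True]

sens, spec = sens_spec(scores, truth, cutoff=6)
print(round(sens, 2), round(spec, 2))  # → 0.67 0.67
```

Raising the cutoff trades sensitivity for specificity; the ROC curve displays that trade-off across all possible cutoffs.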

Adapted from Waltz, C. F., Strickland, O. L., & Lenz, E. R. (2005). Measurement in nursing and health research (3rd ed., pp. 138-145). New York: Springer Publishing.

TRY THIS

Reflect on the following scenarios.

  1. You are asked about the reliability coefficient on a recent standardized test. The coefficient was reported as .89. How would you explain that .89 is an acceptable coefficient?
  2. You are looking for an assessment instrument to measure reading ability and have narrowed the selection to two possibilities. Test A provides data indicating that it has high validity, but there is no information about its reliability. Test B provides data indicating that it has high reliability, but there is no information about its validity. Which test would you recommend? Why?
