Online Course
NRSG 795: BIOSTATISTICS FOR EVIDENCE-BASED PRACTICE
Module 8: Associations Between Nominal or Ordinal Variables
Statistically Testing Association: Chi-Square
Chi-square is one of the most common statistics used to test for an association between nominal or ordinal variables. It is based on the differences between what is observed in the cells of the contingency table and what would be expected if the null hypothesis were true (i.e., the variables were totally unrelated). The chi-square statistic reflects this difference. The calculated statistic is compared to the critical value of a chi-square distribution with the specified degrees of freedom, which are based on the number of rows and columns in the table, (rows - 1) x (columns - 1), not on sample size. If the calculated value is greater than or equal to the critical value, you can conclude that a significant association exists. Unlike Pearson’s r and Spearman’s rho, the chi-square statistic does not give any information about the strength of the relationship. It only conveys the existence or non-existence of a relationship between nominal or ordinal variables. When the chi-square test leads to the rejection of the null hypothesis, the direction of the association between the variables can sometimes be determined by examining the percentages.
The chi-square test is used when both the independent and dependent variables are nominal level or when an ordinal variable has only a few categories (e.g., very low, low or normal birth weight). Interval or ratio data can be recoded into grouped categories; however, it is often preferable to use the more powerful parametric tests that can handle interval/ratio level data.
The idea behind the chi square test is to compare the observed frequencies with the frequencies that would be expected if the null hypothesis of no association / statistical independence were true [click here to see a hand calculation of a simple chi square].
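The comparison of observed and expected frequencies can be sketched in a few lines of Python. This is a minimal illustration, not the course's own worked example: the 2x2 counts are hypothetical, the function name `chi_square` is made up for this sketch, and the closed-form p-value shown applies only when df = 1.

```python
import math

def chi_square(observed):
    """Return (statistic, df, expected) for an r x c table of raw counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical 2x2 table (rows: two groups, columns: outcome yes / no)
observed = [[20, 30], [30, 20]]
statistic, df, expected = chi_square(observed)

CRITICAL_05_DF1 = 3.841  # tabled critical value for alpha = .05, df = 1
# For df = 1 only, the p-value has a closed form via the complementary error function
p_value = math.erfc(math.sqrt(statistic / 2))

print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.3f}")
print("significant at .05:", statistic >= CRITICAL_05_DF1)
```

In this hypothetical table every expected count is 25, each cell contributes (5 squared)/25 = 1 to the statistic, and the total of 4.0 exceeds the critical value of 3.841, so the null hypothesis of independence would be rejected at the .05 level.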
Specifications and assumptions
- The sample size must be large enough to fairly represent the population from which it is drawn. At least 20 observations should be used, with at least five members in every individual category.
- Observations are randomly and independently sampled from the population. Each participant must qualify for one and only one cell in the contingency table. If this is violated (e.g., the same participants are measured twice), a McNemar test can be used.
- Calculate chi-square using real counts, not percentages or ratios, so as to adequately represent the true number of observations counted.
- The chi-square test requires that the expected frequency of each cell in the contingency table be greater than zero (at least 5 is recommended).
When the sample size is too small or cells in the contingency table have too few expected observations (<5), Fisher’s exact test is performed instead of the chi-square test. Some people advocate using Yates’ continuity correction if there are only 5-9 expected cases in a cell.
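As a rough sketch of how this check might look in practice, the Python below computes the expected counts for a 2x2 table and, if any fall below 5, computes a two-sided Fisher's exact p-value directly from the hypergeometric distribution. The counts are hypothetical, the function names are made up for this illustration, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention rather than the only one.

```python
from math import comb

def expected_counts(table):
    """Expected cell counts under independence for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    # (the small tolerance guards against floating-point ties)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )

# Hypothetical small sample: 8 participants in a 2x2 table
table = [[3, 1], [1, 3]]
min_expected = min(min(row) for row in expected_counts(table))
use_fisher = min_expected < 5  # rule of thumb from the text above

if use_fisher:
    p_value = fisher_exact_2x2(table)
    print(f"smallest expected count = {min_expected:.1f}; Fisher's exact p = {p_value:.4f}")
```

Here every expected count is 2, well below 5, so the exact test is the safer choice for this hypothetical table.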
Effect Size
The chi-square statistic only provides information about the existence of a relationship. It does not tell us anything about its magnitude. To get an idea of the strength of the relationship, we can use several indexes. The most commonly used one is the odds ratio (more on that in the next sections). Two others are:
- The phi coefficient is used when both of the nominal variables under consideration have exactly two possible values (a 2x2 table).
- Cramer’s V statistic is used when either variable has more than two possible values, so that the data matrix is larger than 2x2 (2x3, 3x5, etc.).
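Both indexes are simple functions of the chi-square statistic and the sample size. The sketch below uses the standard formulas, phi = sqrt(chi-square / n) and V = sqrt(chi-square / (n x (k - 1))), where k is the smaller of the number of rows and columns. The tables are hypothetical and the function names are made up for this illustration.

```python
import math

def chi_square_statistic(observed):
    """Chi-square statistic for an r x c table of raw counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )

def phi_coefficient(observed):
    """Phi (unsigned) for a 2x2 table: sqrt(chi-square / n)."""
    n = sum(sum(row) for row in observed)
    return math.sqrt(chi_square_statistic(observed) / n)

def cramers_v(observed):
    """Cramer's V for an r x c table: sqrt(chi-square / (n * (k - 1)))."""
    n = sum(sum(row) for row in observed)
    k = min(len(observed), len(observed[0]))
    return math.sqrt(chi_square_statistic(observed) / (n * (k - 1)))

# Hypothetical 2x2 and 2x3 tables of counts
table_2x2 = [[20, 30], [30, 20]]
table_2x3 = [[10, 20, 30], [30, 20, 10]]

print(f"phi (2x2) = {phi_coefficient(table_2x2):.2f}")
print(f"Cramer's V (2x3) = {cramers_v(table_2x3):.2f}")
```

For a 2x2 table the two indexes give the same value, which is why phi is usually reported there and Cramer's V is reserved for larger tables.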
Presenting Results
This is an example of how results may be written for a Chi-Square test for independence:
The crosstabulation of smoking by gender showed that the proportion of women who smoked (18%) was similar to the proportion of men who smoked (20%). A chi-square test for independence indicated no significant association between gender and smoking (p=.34, df=1, n=220).
Note: The results from a chi-square test include a statement describing what is going on via a comparison of percentages, in addition to the p-value. This description helps someone who doesn’t know statistics understand the findings. It is similar to how results are reported for a t-test (akin to a 2x2 crosstab when the variables are categorical) and an ANOVA (akin to a 2x3 crosstab when the variables are categorical).
Required Videos
- Chi-Square Tests (11:52) https://www.youtube.com/watch?v=WXPBoFDqNVk
Learning Activities
- Practice interpreting findings by answering these questions about the table shown below.
Check your responses here.
- Practice running a chi-square test. This Excel spreadsheet represents data from a sample of patients with heart failure. Answer the following questions:
- How many individuals are included in the sample?
- How many males and females are in the sample?
- Is there an association between gender and group membership?
- Is there an association between gender and self care?
- Check your answers
Guide for those choosing to use IntellectusStatistics | Guide for those choosing to use Excel |
---|---|
Refer to the hint sheet and videos in the resource section for how to run a chi square (Chi-square test) | First watch the pivot table tutorial, then refer to the hint sheet and video for how to run a chi square in Excel. Additional resource: Making crosstabulations with pivot tables in Excel (12:02) https://www.youtube.com/watch?v=OXuQnro0UnE. Hints on how to run a chi square in Excel |
This website is maintained by the University of Maryland School of Nursing (UMSON) Office of Learning Technologies. The UMSON logo and all other contents of this website are the sole property of UMSON and may not be used for any purpose without prior written consent. Links to other websites do not constitute or imply an endorsement of those sites, their content, or their products and services. Please send comments, corrections, and link improvements to nrsonline@umaryland.edu.