Subject 4. Summarizing Data Using a Contingency Table
To find patterns between variables, we can use a contingency table: a table in matrix format that displays the frequency distribution of the variables in terms of joint frequencies and marginal frequencies. Contingency tables are heavily used in survey research, business intelligence, engineering, and scientific research.
The table below shows the favorite leisure activities for 50 adults - 20 men and 30 women.
Entries in the "Total" row and "Total" column are called marginal frequencies. They represent the frequency distribution for each variable. Entries in the body of the table are called joint frequencies.
One benefit of having data presented in a contingency table is that it allows one to more easily perform basic probability calculations.
For example, the probability that a randomly sampled person names TV as his or her favorite leisure activity is 16/50 (32%), and the probability that a random participant is female is 30/50 (60%). Contingency tables also make conditional probabilities easier to compute: the probability that a person's favorite leisure activity is dancing, given that the person is male, is 2/20 (10%), while the probability that a person is male, given that sports are preferred, is 10/16 (62.5%).
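The calculations above can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions, not the table as originally published: the text quotes only the male Dance (2) and Sports (10) counts plus the totals, so the remaining cells are inferred by assuming that Dance, Sports, and TV are the only three activities. The variable names are hypothetical.

```python
from fractions import Fraction

# Joint frequencies. ASSUMPTION: Dance, Sports, and TV are the only three
# activities; cells not quoted in the text are inferred from the totals
# (20 men, 30 women, 16 for Sports, 16 for TV, 50 overall).
table = {
    "Male":   {"Dance": 2,  "Sports": 10, "TV": 8},
    "Female": {"Dance": 16, "Sports": 6,  "TV": 8},
}

# Marginal frequencies: the "Total" column (by gender) and "Total" row (by activity).
row_totals = {gender: sum(cells.values()) for gender, cells in table.items()}
col_totals = {}
for cells in table.values():
    for activity, count in cells.items():
        col_totals[activity] = col_totals.get(activity, 0) + count
grand_total = sum(row_totals.values())

# Marginal probabilities: marginal frequency / grand total.
p_tv = Fraction(col_totals["TV"], grand_total)
p_female = Fraction(row_totals["Female"], grand_total)

# Conditional probabilities: joint frequency / marginal frequency of the condition.
p_dance_given_male = Fraction(table["Male"]["Dance"], row_totals["Male"])
p_male_given_sports = Fraction(table["Male"]["Sports"], col_totals["Sports"])

for label, p in [
    ("P(TV)", p_tv),
    ("P(Female)", p_female),
    ("P(Dance | Male)", p_dance_given_male),
    ("P(Male | Sports)", p_male_given_sports),
]:
    print(f"{label} = {p} = {float(p):.1%}")
```

Note that a conditional probability divides a joint frequency by the marginal frequency of the conditioning event (20 men, or 16 people who prefer sports), not by the overall total of 50.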
One application is evaluating the performance of a classification model (or "classifier") with a confusion matrix: a contingency table of predicted versus actual classes, built from a set of test data for which the true values are known.
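As a rough illustration of a confusion matrix, the sketch below tallies predicted versus actual labels for a binary classifier. The labels in `y_true` and `y_pred` are made up purely for this example, and the summary measures shown (accuracy, precision, recall) are common choices rather than anything the reading prescribes.

```python
from collections import Counter

# Hypothetical test data: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # known true values
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]  # classifier output

# The confusion matrix is a contingency table of (actual, predicted) pairs.
counts = Counter(zip(y_true, y_pred))
tp = counts[(1, 1)]  # true positives
fn = counts[(1, 0)]  # false negatives
fp = counts[(0, 1)]  # false positives
tn = counts[(0, 0)]  # true negatives

print("            pred=1  pred=0")
print(f"actual=1    {tp:6d}  {fn:6d}")
print(f"actual=0    {fp:6d}  {tn:6d}")

accuracy = (tp + tn) / len(y_true)   # share of all predictions that are correct
precision = tp / (tp + fp)           # share of predicted positives that are correct
recall = tp / (tp + fn)              # share of actual positives that are found
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

The four cells are joint frequencies, and the row and column sums are the marginal frequencies of actual and predicted classes.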
Another application is investigating a potential association between two categorical variables by performing a chi-square test of independence. For example, we can test whether the two variables being examined (here, gender and favorite leisure activity) are in fact independent. This is done by computing, for each cell, the expected frequency E under independence (row total × column total / overall total), comparing it to the observed frequency O, and then performing a chi-square test on the sum of (O - E)²/E across all cells.
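For readers who want to see the mechanics, here is a minimal sketch of the chi-square computation using the leisure-activity counts. It assumes, as an inference from the totals quoted earlier, that Dance, Sports, and TV are the only three activities. The p-value line relies on the closed form exp(-x/2) for the chi-square survival function, which holds only when there are exactly 2 degrees of freedom.

```python
import math

# Observed joint frequencies. ASSUMPTION: three activities only; cells not
# quoted in the text are inferred from the row and column totals.
observed = [
    [2, 10, 8],   # Male:   Dance, Sports, TV
    [16, 6, 8],   # Female: Dance, Sports, TV
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: row total * column total / overall total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

# Closed-form p-value, valid ONLY for dof == 2.
assert dof == 2
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.4f}")
```

Under these assumed counts the statistic is about 10.3 with 2 degrees of freedom, which exceeds the 5% critical value of 5.991, so independence of gender and favorite activity would be rejected. In practice a statistics library routine such as `scipy.stats.chi2_contingency` would normally be used instead of the hand-rolled arithmetic.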
Note that the LOS says "interpret a contingency table," so you probably don't need to know how to perform a chi-square test of independence for the exam.