CFA Practice Question

There are 363 practice questions for this topic.

CFA Practice Question

There are different methods for dealing with imbalanced datasets. They are:

I. Oversample minority class
II. Undersample majority class
III. Generate synthetic samples
Correct Answer: I, II and III

Imbalanced classes are a common problem in machine learning classification where there are a disproportionate ratio of observations in each class. Oversampling can be defined as adding more copies of the minority class. Oversampling can be a good choice when you don't have a ton of data to work with.

Undersampling can be defined as removing some observations of the majority class. Undersampling can be a good choice when you have a ton of data -think millions of rows. But a drawback is that we are removing information that may be valuable. This could lead to underfitting and poor generalization to the test set.

Generate synthetic samples: it's important to generate the new samples only in the training set to ensure our model generalizes well to unseen data.

User Contributed Comments 0

You need to log in first to add your comment.