2026 CFA Level II Exam: CFA Study Preparation

Subject 4. Unsupervised Machine Learning Algorithms


Principal Components Analysis

Principal components analysis (PCA) is often used to reduce a multidimensional data set to fewer dimensions for analysis. PCA retains the characteristics of the data set that contribute most to its variance by keeping the lower-order principal components (those that explain a large part of the variance in the data) and ignoring the higher-order ones (those that explain little of it). The low-order components often contain the most important aspects of the data.

Eigenvectors define the principal components (i.e., the new, mutually uncorrelated composite variables). Each eigenvalue measures the variance explained by its eigenvector and the associated principal component; dividing it by the sum of all eigenvalues gives the proportion of total variance in the initial data that the component explains.
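To make the link between eigenvectors, eigenvalues, and explained variance concrete, here is a minimal sketch for the two-variable case, where the eigenvalues of the covariance matrix have a closed form. The function name `pca_two_variables` and the return series are hypothetical, chosen only for illustration; a real application would normally use a numerical library and standardized features.

```python
import math

def pca_two_variables(xs, ys):
    """PCA for two variables via the eigen-decomposition of their covariance matrix."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    c = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

    total = a + c  # total variance, which also equals the sum of the eigenvalues
    if total == 0:
        raise ValueError("both variables are constant; there is no variance to explain")

    # Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
    mid = (a + c) / 2
    half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [mid + half_gap, mid - half_gap]

    if b == 0:
        # The variables are already uncorrelated, so the original axes are the components.
        eigenvectors = [(1.0, 0.0), (0.0, 1.0)] if a >= c else [(0.0, 1.0), (1.0, 0.0)]
    else:
        eigenvectors = []
        for lam in eigenvalues:
            vx, vy = b, lam - a  # solves (Cov - lam * I) v = 0 when b != 0
            norm = math.hypot(vx, vy)
            eigenvectors.append((vx / norm, vy / norm))

    # Proportion of total variance explained by each principal component.
    proportions = [lam / total for lam in eigenvalues]
    return eigenvalues, eigenvectors, proportions

# Hypothetical return series for two assets (illustrative numbers only).
returns_a = [0.01, 0.03, -0.02, 0.04, 0.00]
returns_b = [0.02, 0.05, -0.03, 0.07, 0.01]
eigenvalues, eigenvectors, proportions = pca_two_variables(returns_a, returns_b)
print("proportion of variance explained:", proportions)
```

The two eigenvectors are orthogonal, which is why the resulting composite variables are uncorrelated, and the proportions always sum to one.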

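The challenge is to decide how many principal components to retain, as there is always a trade-off: keeping fewer components gives a simpler, lower-dimensional data set but discards more of the original variance. The main drawback of PCA is that the components are composite variables, so they cannot be easily labeled or directly interpreted.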
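One common way to handle this trade-off is to keep the smallest number of components whose cumulative share of variance reaches a chosen threshold. The sketch below assumes that rule; the helper name `components_to_keep`, the 85% threshold, and the eigenvalues are all hypothetical choices for illustration, not values from the notes.

```python
def components_to_keep(eigenvalues, threshold=0.85):
    """Smallest number of components whose cumulative variance share reaches the threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    # Walk through the components from most to least variance explained.
    for count, lam in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += lam
        if cumulative / total >= threshold:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalues for a six-feature data set.
example_eigenvalues = [4.0, 2.5, 1.5, 1.0, 0.6, 0.4]
print(components_to_keep(example_eigenvalues, threshold=0.85))
```

Raising the threshold keeps more of the original variance at the cost of more dimensions; lowering it does the opposite.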

Clustering

Clustering focuses on sorting observations into groups (clusters) such that observations in the same cluster are more similar to each other than they are to observations in other clusters. Groups are formed based on a set of criteria that may or may not be pre-specified.

Cohesion: observations inside each cluster are similar.
Separation: observations in two different clusters are not similar.

Euclidean distance is the straight-line distance between two points and can be used to define "similarity": the smaller the distance, the more similar the observations. Once the distances are determined, the groups can be created.
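As a quick illustration, Euclidean distance is the square root of the sum of squared differences across features. The function name `euclidean_distance` and the sample observations below are hypothetical.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two observations with the same number of features."""
    if len(p) != len(q):
        raise ValueError("observations must have the same number of features")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two hypothetical observations described by two features each.
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))  # 5.0
```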

Two popular clustering approaches are discussed below.

K-means partitions observations into a fixed number (k) of non-overlapping clusters. Each cluster is characterized by its centroid, and each observation belongs to the cluster with the centroid to which that observation is closest.

The algorithm iterates, assigning each observation to its nearest centroid and then recalculating the centroids, until no observation is reassigned; the aim is to minimize intra-cluster distance and maximize inter-cluster distance. It runs fast and works well on large data sets. The final result, however, depends on the pre-specified number of clusters and on the initial location of the centroids.
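The iterative process can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the function name `k_means`, the iteration cap, and the sample points are hypothetical, and the initial centroids are passed in explicitly because the outcome depends on them.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid updates until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each observation joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)

        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                new_centroids.append(tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster in place

        if new_centroids == centroids:
            break  # no centroid moved, so no observation would be reassigned
        centroids = new_centroids
    return centroids, clusters

# Hypothetical two-feature observations forming two visible groups; k = 2.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, initial_centroids=[(1.0, 1.0), (8.0, 8.0)])
print(centroids)
```

Rerunning with different starting centroids, or a different k, is a simple way to see how sensitive the final clusters are to those choices.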

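Hierarchical clustering builds a hierarchy of clusters. Both strategies described here span the same two end points, one with every observation in its own cluster and one with all observations in a single cluster; they differ in the direction taken and in how the intermediate clusters are defined.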

Agglomerative (bottom-up) hierarchical clustering begins with each observation being its own cluster. Then, the algorithm finds the two closest clusters, defined by some measure of distance, and combines them into a new, larger cluster. This process is repeated until all the observations are clumped into a single cluster.
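A bottom-up pass can be sketched as below. The notes leave the distance measure between clusters open, so this example assumes single linkage (the distance between the two closest members) purely for illustration; the function names and sample points are hypothetical.

```python
import math
from itertools import combinations

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_linkage(c1, c2):
    """Assumed cluster distance: the gap between the two closest members."""
    return min(euclidean(a, b) for a in c1 for b in c2)

def agglomerative(points):
    """Merge the two closest clusters repeatedly; return each merge as (distance, members)."""
    clusters = [[p] for p in points]  # every observation starts as its own cluster
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        distance = single_linkage(clusters[i], clusters[j])
        merged = clusters[i] + clusters[j]
        merges.append((distance, merged))
        # Replace the two merged clusters with the new, larger one.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical one-feature observations.
for distance, members in agglomerative([(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]):
    print(distance, members)
```

With n observations there are always n - 1 merges, and the last one contains every observation.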

Divisive (top-down) hierarchical clustering starts with all observations belonging to a single cluster. The observations are then divided into two clusters based on some measure of distance. The algorithm then progressively partitions the intermediate clusters into smaller clusters until each contains only one observation.
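A top-down pass can be sketched in a similar way. The splitting rule here is an assumption made for illustration only: the cluster with the largest diameter is split next, and its members are assigned to the nearer of its two most distant observations. Other splitting rules are possible, and all function names and sample points are hypothetical.

```python
import math
from itertools import combinations

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def diameter(cluster):
    """Largest distance between any two members of a cluster."""
    return max(euclidean(a, b) for a, b in combinations(cluster, 2))

def split(cluster):
    """Assumed rule: seed with the two most distant members, assign the rest to the nearer seed."""
    seed_a, seed_b = max(combinations(cluster, 2), key=lambda pair: euclidean(*pair))
    left = [p for p in cluster if euclidean(p, seed_a) <= euclidean(p, seed_b)]
    right = [p for p in cluster if euclidean(p, seed_a) > euclidean(p, seed_b)]
    if not right:  # all members coincide, so split arbitrarily
        left, right = cluster[:1], cluster[1:]
    return left, right

def divisive(points):
    """Start from one cluster and split until every cluster holds a single observation."""
    clusters = [list(points)]
    levels = [list(clusters)]  # snapshot of the clustering after each split
    while any(len(c) > 1 for c in clusters):
        # Split the widest cluster that still has more than one member.
        idx = max(
            (k for k, c in enumerate(clusters) if len(c) > 1),
            key=lambda k: diameter(clusters[k]),
        )
        left, right = split(clusters[idx])
        clusters = clusters[:idx] + [left, right] + clusters[idx + 1:]
        levels.append(list(clusters))
    return levels

# Hypothetical one-feature observations.
for level in divisive([(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]):
    print(level)
```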
