DSC 140B
Problems tagged with dimensionality-reduction

Problem #078

Tags: pca, dimensionality-reduction, eigenvectors, quiz-04, lecture-06

Let \(C\) be the sample covariance matrix of a data set in \(\mathbb{R}^3\), and suppose \(\vec{u}^{(1)}, \vec{u}^{(2)}, \vec{u}^{(3)}\) are orthonormal eigenvectors of \(C\) with eigenvalues \(\lambda_1 = 9, \lambda_2 = 4, \lambda_3 = 1\), respectively, where:

\[\vec{u}^{(1)} = \begin{pmatrix} 2/3 \\ 2/3 \\ 1/3 \end{pmatrix}, \quad\vec{u}^{(2)} = \begin{pmatrix} -2/3 \\ 1/3 \\ 2/3 \end{pmatrix}, \quad\vec{u}^{(3)} = \begin{pmatrix} 1/3 \\ -2/3 \\ 2/3 \end{pmatrix}\]

Suppose a data point is \(\vec{x} = \begin{pmatrix} 3 \\ 6 \\ 6 \end{pmatrix}\).

If PCA is performed to reduce the dimensionality from 3 to 2, what is the new representation of \(\vec{x}\)?

Solution

\(\begin{pmatrix} 8 \\ 4 \end{pmatrix}\) In PCA, to reduce from \(d\) dimensions to \(k\) dimensions, we project each data point onto the top \(k\) eigenvectors (those with the largest eigenvalues).

Here, the top 2 eigenvectors are \(\vec{u}^{(1)}\) (with \(\lambda_1 = 9\)) and \(\vec{u}^{(2)}\) (with \(\lambda_2 = 4\)).

The new representation is obtained by computing the dot product of \(\vec{x}\) with each of the top \(k\) eigenvectors:

$$\begin{align*}\vec{x}\cdot\vec{u}^{(1)}&= 3 \cdot\frac{2}{3} + 6 \cdot\frac{2}{3} + 6 \cdot\frac{1}{3} = 2 + 4 + 2 = 8 \\\vec{x}\cdot\vec{u}^{(2)}&= 3 \cdot\left(-\frac{2}{3}\right)+ 6 \cdot\frac{1}{3} + 6 \cdot\frac{2}{3} = -2 + 2 + 4 = 4 \end{align*}$$

Therefore, the new representation is:

\[\vec{z} = \begin{pmatrix} \vec{x} \cdot \vec{u}^{(1)} \\ \vec{x} \cdot \vec{u}^{(2)} \end{pmatrix} = \begin{pmatrix} 8 \\ 4 \end{pmatrix}\]
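This projection can also be checked numerically. A minimal sketch using numpy (the eigenvector entries and the data point are copied from the problem statement):

```python
import numpy as np

# Top-2 eigenvectors as columns, in order of decreasing eigenvalue.
U2 = np.array([
    [2/3, -2/3],
    [2/3,  1/3],
    [1/3,  2/3],
])

x = np.array([3, 6, 6])

# New representation: dot product of x with each top eigenvector.
z = U2.T @ x
print(z)  # [8. 4.]
```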

Problem #079

Tags: pca, dimensionality-reduction, eigenvectors, quiz-04, lecture-06

Let \(C\) be the sample covariance matrix of a data set in \(\mathbb{R}^4\), and suppose \(\vec{u}^{(1)}, \vec{u}^{(2)}, \vec{u}^{(3)}, \vec{u}^{(4)}\) are orthonormal eigenvectors of \(C\) with eigenvalues \(\lambda_1 = 16, \lambda_2 = 9, \lambda_3 = 4, \lambda_4 = 1\), respectively, where:

\[\vec{u}^{(1)} = \begin{pmatrix} 0 \\ 1/3 \\ 2/3 \\ 2/3 \end{pmatrix}, \quad\vec{u}^{(2)} = \begin{pmatrix} 1/3 \\ 0 \\ 2/3 \\ -2/3 \end{pmatrix}, \quad\vec{u}^{(3)} = \begin{pmatrix} -2/3 \\ -2/3 \\ 1/3 \\ 0 \end{pmatrix}, \quad\vec{u}^{(4)} = \begin{pmatrix} 2/3 \\ -2/3 \\ 0 \\ 1/3 \end{pmatrix}\]

Suppose a data point is \(\vec{x} = \begin{pmatrix} 3 \\ 3 \\ 6 \\ 6 \end{pmatrix}\).

If PCA is performed to reduce the dimensionality from 4 to 2, what is the new representation of \(\vec{x}\)?

Solution

\(\begin{pmatrix} 9 \\ 1 \end{pmatrix}\) In PCA, to reduce from \(d\) dimensions to \(k\) dimensions, we project each data point onto the top \(k\) eigenvectors (those with the largest eigenvalues).

Here, the top 2 eigenvectors are \(\vec{u}^{(1)}\) (with \(\lambda_1 = 16\)) and \(\vec{u}^{(2)}\) (with \(\lambda_2 = 9\)).

The new representation is obtained by computing the dot product of \(\vec{x}\) with each of the top \(k\) eigenvectors:

$$\begin{align*}\vec{x}\cdot\vec{u}^{(1)}&= 3 \cdot 0 + 3 \cdot\frac{1}{3} + 6 \cdot\frac{2}{3} + 6 \cdot\frac{2}{3} = 0 + 1 + 4 + 4 = 9 \\\vec{x}\cdot\vec{u}^{(2)}&= 3 \cdot\frac{1}{3} + 3 \cdot 0 + 6 \cdot\frac{2}{3} + 6 \cdot\left(-\frac{2}{3}\right)= 1 + 0 + 4 - 4 = 1 \end{align*}$$

Therefore, the new representation is:

\[\vec{z} = \begin{pmatrix} \vec{x} \cdot \vec{u}^{(1)} \\ \vec{x} \cdot \vec{u}^{(2)} \end{pmatrix} = \begin{pmatrix} 9 \\ 1 \end{pmatrix}\]
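As in the previous problem, the arithmetic can be verified with a quick numpy sketch (values copied from the problem statement):

```python
import numpy as np

# Top-2 eigenvectors as columns, in order of decreasing eigenvalue.
U2 = np.array([
    [ 0,   1/3],
    [ 1/3, 0  ],
    [ 2/3, 2/3],
    [ 2/3, -2/3],
])

x = np.array([3, 3, 6, 6])

# Project x onto the top two principal directions.
z = U2.T @ x
print(z)  # [9. 1.]
```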

Problem #080

Tags: pca, dimensionality-reduction, eigenvectors, quiz-04, lecture-06

Let \(C\) be the sample covariance matrix of a data set in \(\mathbb{R}^5\), and suppose \(\vec{u}^{(1)}, \vec{u}^{(2)}, \vec{u}^{(3)}, \vec{u}^{(4)}, \vec{u}^{(5)}\) are orthonormal eigenvectors of \(C\) with eigenvalues \(\lambda_1 = 25, \lambda_2 = 16, \lambda_3 = 9, \lambda_4 = 4, \lambda_5 = 1\), respectively, where:

\[\vec{u}^{(1)} = \begin{pmatrix} 1/3 \\ 2/3 \\ 2/3 \\ 0 \\ 0 \end{pmatrix}, \quad\vec{u}^{(2)} = \begin{pmatrix} 2/3 \\ -1/3 \\ 0 \\ 2/3 \\ 0 \end{pmatrix}, \quad\vec{u}^{(3)} = \begin{pmatrix} -2/3 \\ 0 \\ 1/3 \\ 2/3 \\ 0 \end{pmatrix}, \quad\vec{u}^{(4)} = \begin{pmatrix} 0 \\ 2/3 \\ -2/3 \\ 1/3 \\ 0 \end{pmatrix}, \quad\vec{u}^{(5)} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}\]

Suppose a data point is \(\vec{x} = \begin{pmatrix} 3 \\ 6 \\ 6 \\ 3 \\ 2 \end{pmatrix}\).

If PCA is performed to reduce the dimensionality from 5 to 3, what is the new representation of \(\vec{x}\)?

Solution

\(\begin{pmatrix} 9 \\ 2 \\ 2 \end{pmatrix}\) In PCA, to reduce from \(d\) dimensions to \(k\) dimensions, we project each data point onto the top \(k\) eigenvectors (those with the largest eigenvalues).

Here, the top 3 eigenvectors are \(\vec{u}^{(1)}\) (with \(\lambda_1 = 25\)), \(\vec{u}^{(2)}\) (with \(\lambda_2 = 16\)), and \(\vec{u}^{(3)}\) (with \(\lambda_3 = 9\)).

The new representation is obtained by computing the dot product of \(\vec{x}\) with each of the top \(k\) eigenvectors:

$$\begin{align*}\vec{x}\cdot\vec{u}^{(1)}&= 3 \cdot\frac{1}{3} + 6 \cdot\frac{2}{3} + 6 \cdot\frac{2}{3} + 3 \cdot 0 + 2 \cdot 0 = 1 + 4 + 4 + 0 + 0 = 9 \\\vec{x}\cdot\vec{u}^{(2)}&= 3 \cdot\frac{2}{3} + 6 \cdot\left(-\frac{1}{3}\right)+ 6 \cdot 0 + 3 \cdot\frac{2}{3} + 2 \cdot 0 = 2 - 2 + 0 + 2 + 0 = 2 \\\vec{x}\cdot\vec{u}^{(3)}&= 3 \cdot\left(-\frac{2}{3}\right)+ 6 \cdot 0 + 6 \cdot\frac{1}{3} + 3 \cdot\frac{2}{3} + 2 \cdot 0 = -2 + 0 + 2 + 2 + 0 = 2 \end{align*}$$

Therefore, the new representation is:

\[\vec{z} = \begin{pmatrix} \vec{x} \cdot \vec{u}^{(1)} \\ \vec{x} \cdot \vec{u}^{(2)} \\ \vec{x} \cdot \vec{u}^{(3)} \end{pmatrix} = \begin{pmatrix} 9 \\ 2 \\ 2 \end{pmatrix}\]
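The same check works here with three eigenvectors. A minimal numpy sketch (entries copied from the problem statement):

```python
import numpy as np

# Top-3 eigenvectors as columns, in order of decreasing eigenvalue.
U3 = np.array([
    [ 1/3,  2/3, -2/3],
    [ 2/3, -1/3,  0  ],
    [ 2/3,  0,    1/3],
    [ 0,    2/3,  2/3],
    [ 0,    0,    0  ],
])

x = np.array([3, 6, 6, 3, 2])

# Project x onto the top three principal directions.
z = U3.T @ x
print(z)  # [9. 2. 2.]
```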

Problem #081

Tags: lecture-06, dimensionality-reduction, eigenvectors, pca

Consider the following data set of 5 points in \(\mathbb{R}^3\):

\[\vec{x}^{(1)} = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}, \quad\vec{x}^{(2)} = \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix}, \quad\vec{x}^{(3)} = \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix}, \quad\vec{x}^{(4)} = \begin{pmatrix} -2 \\ 0 \\ -1 \end{pmatrix}, \quad\vec{x}^{(5)} = \begin{pmatrix} -2 \\ -2 \\ -2 \end{pmatrix}\]

Perform PCA on this data set to reduce the dimensionality from 3 to 2. What are the new representations of each data point?

Note: You are not expected to compute the eigenvalues and eigenvectors by hand. Use software (such as numpy.linalg.eigh) to find them.

Solution

First, we form the data matrix and compute the sample covariance matrix:



>>> import numpy as np
>>> X = np.array([
...     [3, 1, 2],
...     [-1, 2, 0],
...     [2, -1, 1],
...     [-2, 0, -1],
...     [-2, -2, -2]
... ])
>>> mu = X.mean(axis=0)
>>> Z = X - mu
>>> C = 1 / len(X) * Z.T @ Z

Next, we compute the eigendecomposition of \(C\):



>>> eigenvalues, eigenvectors = np.linalg.eigh(C)
>>> idx = eigenvalues.argsort()[::-1]  # sort descending
>>> eigenvalues = eigenvalues[idx]
>>> eigenvectors = eigenvectors[:, idx]
>>> eigenvalues
array([6.49, 1.89, 0.01])

To reduce to 2 dimensions, we project onto the top 2 eigenvectors:



>>> U2 = eigenvectors[:, :2]
>>> Z_new = Z @ U2
>>> Z_new
array([[-3.74, -0.13],
       [ 0.34, -2.21],
       [-1.93,  1.51],
       [ 2.16, -0.57],
       [ 3.17,  1.40]])

The new representations are (rounded to two decimal places):

\[\vec{z}^{(1)} = \begin{pmatrix} -3.74 \\ -0.13 \end{pmatrix}, \quad\vec{z}^{(2)} = \begin{pmatrix} 0.34 \\ -2.21 \end{pmatrix}, \quad\vec{z}^{(3)} = \begin{pmatrix} -1.93 \\ 1.51 \end{pmatrix}, \quad\vec{z}^{(4)} = \begin{pmatrix} 2.16 \\ -0.57 \end{pmatrix}, \quad\vec{z}^{(5)} = \begin{pmatrix} 3.17 \\ 1.40 \end{pmatrix}\]

Note: The signs of the eigenvectors are not unique; flipping the sign of an eigenvector will flip the sign of the corresponding component in the new representation.

Problem #082

Tags: lecture-06, dimensionality-reduction, eigenvectors, pca

Consider the following data set of 5 points in \(\mathbb{R}^4\):

\[\vec{x}^{(1)} = \begin{pmatrix} 2 \\ 1 \\ 3 \\ 0 \end{pmatrix}, \quad\vec{x}^{(2)} = \begin{pmatrix} 0 \\ -2 \\ 1 \\ 1 \end{pmatrix}, \quad\vec{x}^{(3)} = \begin{pmatrix} -1 \\ 0 \\ -1 \\ 2 \end{pmatrix}, \quad\vec{x}^{(4)} = \begin{pmatrix} 1 \\ 2 \\ 0 \\ -1 \end{pmatrix}, \quad\vec{x}^{(5)} = \begin{pmatrix} -2 \\ -1 \\ -3 \\ -2 \end{pmatrix}\]

Perform PCA on this data set to reduce the dimensionality from 4 to 2. What are the new representations of each data point?

Note: You are not expected to compute the eigenvalues and eigenvectors by hand. Use software (such as numpy.linalg.eigh) to find them.

Solution

First, we form the data matrix and compute the sample covariance matrix:



>>> import numpy as np
>>> X = np.array([
...     [2, 1, 3, 0],
...     [0, -2, 1, 1],
...     [-1, 0, -1, 2],
...     [1, 2, 0, -1],
...     [-2, -1, -3, -2]
... ])
>>> mu = X.mean(axis=0)
>>> Z = X - mu
>>> C = 1 / len(X) * Z.T @ Z

Next, we compute the eigendecomposition of \(C\):



>>> eigenvalues, eigenvectors = np.linalg.eigh(C)
>>> idx = eigenvalues.argsort()[::-1]  # sort descending
>>> eigenvalues = eigenvalues[idx]
>>> eigenvectors = eigenvectors[:, idx]
>>> eigenvalues
array([6.35, 2.57, 1.06, 0.02])

To reduce to 2 dimensions, we project onto the top 2 eigenvectors:



>>> U2 = eigenvectors[:, :2]
>>> Z_new = Z @ U2
>>> Z_new
array([[-3.68,  0.43],
       [-0.40, -2.19],
       [ 0.96, -1.44],
       [-0.92,  2.18],
       [ 4.03,  1.02]])

The new representations are (rounded to two decimal places):

\[\vec{z}^{(1)} = \begin{pmatrix} -3.68 \\ 0.43 \end{pmatrix}, \quad\vec{z}^{(2)} = \begin{pmatrix} -0.40 \\ -2.19 \end{pmatrix}, \quad\vec{z}^{(3)} = \begin{pmatrix} 0.96 \\ -1.44 \end{pmatrix}, \quad\vec{z}^{(4)} = \begin{pmatrix} -0.92 \\ 2.18 \end{pmatrix}, \quad\vec{z}^{(5)} = \begin{pmatrix} 4.03 \\ 1.02 \end{pmatrix}\]

Note: The signs of the eigenvectors are not unique; flipping the sign of an eigenvector will flip the sign of the corresponding component in the new representation.

Problem #083

Tags: lecture-06, dimensionality-reduction, eigenvectors, pca

Consider the following data set of 5 points in \(\mathbb{R}^4\):

\[\vec{x}^{(1)} = \begin{pmatrix} 1 \\ 2 \\ 0 \\ 3 \end{pmatrix}, \quad\vec{x}^{(2)} = \begin{pmatrix} -1 \\ 0 \\ 2 \\ 1 \end{pmatrix}, \quad\vec{x}^{(3)} = \begin{pmatrix} 2 \\ -1 \\ 1 \\ 0 \end{pmatrix}, \quad\vec{x}^{(4)} = \begin{pmatrix} 0 \\ 1 \\ -2 \\ -2 \end{pmatrix}, \quad\vec{x}^{(5)} = \begin{pmatrix} -2 \\ -2 \\ -1 \\ -2 \end{pmatrix}\]

Perform PCA on this data set to reduce the dimensionality from 4 to 3. What are the new representations of each data point?

Note: You are not expected to compute the eigenvalues and eigenvectors by hand. Use software (such as numpy.linalg.eigh) to find them.

Solution

First, we form the data matrix and compute the sample covariance matrix:



>>> import numpy as np
>>> X = np.array([
...     [1, 2, 0, 3],
...     [-1, 0, 2, 1],
...     [2, -1, 1, 0],
...     [0, 1, -2, -2],
...     [-2, -2, -1, -2]
... ])
>>> mu = X.mean(axis=0)
>>> Z = X - mu
>>> C = 1 / len(X) * Z.T @ Z

Next, we compute the eigendecomposition of \(C\):



>>> eigenvalues, eigenvectors = np.linalg.eigh(C)
>>> idx = eigenvalues.argsort()[::-1]  # sort descending
>>> eigenvalues = eigenvalues[idx]
>>> eigenvectors = eigenvectors[:, idx]
>>> eigenvalues
array([5.72, 2.28, 1.34, 0.26])

To reduce to 3 dimensions, we project onto the top 3 eigenvectors:



>>> U3 = eigenvectors[:, :3]
>>> Z_new = Z @ U3
>>> Z_new
array([[-3.45, -1.17,  0.67],
       [-1.10,  1.83,  1.02],
       [-0.70,  0.83, -2.20],
       [ 1.84, -2.31, -0.10],
       [ 3.40,  0.82,  0.61]])

The new representations are (rounded to two decimal places):

\[\vec{z}^{(1)} = \begin{pmatrix} -3.45 \\ -1.17 \\ 0.67 \end{pmatrix}, \quad\vec{z}^{(2)} = \begin{pmatrix} -1.10 \\ 1.83 \\ 1.02 \end{pmatrix}, \quad\vec{z}^{(3)} = \begin{pmatrix} -0.70 \\ 0.83 \\ -2.20 \end{pmatrix}, \quad\vec{z}^{(4)} = \begin{pmatrix} 1.84 \\ -2.31 \\ -0.10 \end{pmatrix}, \quad\vec{z}^{(5)} = \begin{pmatrix} 3.40 \\ 0.82 \\ 0.61 \end{pmatrix}\]

Note: The signs of the eigenvectors are not unique; flipping the sign of an eigenvector will flip the sign of the corresponding component in the new representation.