Suggestions for the SVD lecture #361
Conversation
…kart-Young theorem is important. Added details on how to compute the PCA for any given matrix. Changed the n_components term in the code to r_components, so it is in accord with the definition of the Eckart-Young theorem and doesn't create confusion with the n columns of the matrix. Added an exercise and its solution at the end (I've never used Sphinx, so not sure if it's done correctly).
Many thanks @matheusvillasb for your very helpful corrections and suggestions! @HumphreyYang, could you please do a first-round review of this PR? @matheusvillasb is a first-time contributor, and hence might need some assistance with QE conventions and MyST syntax. @thomassargent30, just looping you into this discussion since the edits are to your SVD lecture. @matheusvillasb is a very enthusiastic and hard-working masters student based in Europe.
Great changes @matheusvillasb, many thanks.
Please kindly check my comments below and feel free to commit them. I will do another round in a separate PR once these changes are made.
Co-authored-by: Humphrey Yang <[email protected]>
Removed typos and excessive lines that were pointed out by HumphreyYang
just being sure I committed all changes
Thank you so much, Humphrey!
Thanks so much for reviewing everything I did and making it more concise! I committed the changes; there was just the third suggestion, where you suggested I write "the leading eigenvalues", but in that particular case the theorem works for any matrix, not only those that can be eigendecomposed, so "singular values" is more appropriate. Other than that, I think you made everything much better, thanks!
So as I said, I committed the changes and also made a new pull request; I hope this works.
I'm sorry if I did anything wrong, I'm still figuring out how to do it.
Thank you very much! If you don't mind, I'll start proofreading the DMD lectures and proposing exercises and small changes. Or if you need help with anything else, I'm just really glad to help!
Matheus
On Thu, 17 Aug 2023 at 08:38, Humphrey Yang ***@***.***> wrote:
… ***@***.**** requested changes on this pull request.
Great changes @matheusvillasb <https://github.com/matheusvillasb>,
Please kindly check my comments above and feel free to commit them. I will
do another round in a separate PR once these changes are made.
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -350,7 +350,7 @@ of dimension $m \times n$.
Three popular **matrix norms** of an $m \times n$ matrix $X$ can be expressed in terms of the singular values of $X$
-* the **spectral** or $l^2$ norm $|| X ||_2 = \max_{y \in \textbf{R}^n} \frac{||X y ||}{||y||} = \sigma_1$
+* the **spectral** or $l^2$ norm $|| X ||_2 = \max_{y \neq 0} \frac{||X y ||}{||y||} = \sigma_1$
⬇️ Suggested change
-* the **spectral** or $l^2$ norm $|| X ||_2 = \max_{y \neq 0} \frac{||X y ||}{||y||} = \sigma_1$
+* the **spectral** or $l^2$ norm $|| X ||_2 = \max_{||y|| \neq 0} \frac{||X y ||}{||y||} = \sigma_1$
Nice catch! I think it is clearer if we say $||y|| \neq 0$ instead of $y
\neq 0$.
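(As a quick aside, not part of the lecture text: a minimal NumPy sketch checking that this maximum, the spectral norm, equals the leading singular value $\sigma_1$; the matrix is made up for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))      # arbitrary m x n matrix, for illustration only

# approximate max ||Xy|| / ||y|| over nonzero y by sampling many random directions
ys = rng.standard_normal((3, 100_000))
ratios = np.linalg.norm(X @ ys, axis=0) / np.linalg.norm(ys, axis=0)

sigma_1 = np.linalg.svd(X, compute_uv=False)[0]      # leading singular value
print(ratios.max(), sigma_1, np.linalg.norm(X, 2))   # all (approximately) equal
```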
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -360,6 +360,13 @@ $$
\hat X_r = \sigma_1 U_1 V_1^\top + \sigma_2 U_2 V_2^\top + \cdots + \sigma_r U_r V_r^\top
$$ (eq:Ekart)
+This is a very powerful theorem, it says that we can take our $ m \times n $ matrix $X$ that in not full rank, and we can best approximate it to a full rank $p_x p$ matrix through the SVD.
⬇️ Suggested change
-This is a very powerful theorem, it says that we can take our $ m \times n $ matrix $X$ that in not full rank, and we can best approximate it to a full rank $p_x p$ matrix through the SVD.
+This is a very powerful theorem, it says that we can take our $ m \times n $ matrix $X$ that in not full rank, and we can best approximate it to a full rank $p \times p$ matrix through the SVD.
I think you are suggesting $p \times p$ here.
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -360,6 +360,13 @@ $$
\hat X_r = \sigma_1 U_1 V_1^\top + \sigma_2 U_2 V_2^\top + \cdots + \sigma_r U_r V_r^\top
$$ (eq:Ekart)
+This is a very powerful theorem, it says that we can take our $ m \times n $ matrix $X$ that in not full rank, and we can best approximate it to a full rank $p_x p$ matrix through the SVD.
+
+Moreover, if some of these $p$ singular values carry more information than others, and if we want to have the most amount of information with the least amount of data, we can order these singular values in a decreasing order by magnitude and set a threshold $r$, from where past this point we set all singular values to zero.
This is a great addition, but we often make sentences short in lectures. I
propose we shorten it as
⬇️ Suggested change
-Moreover, if some of these $p$ singular values carry more information than others, and if we want to have the most amount of information with the least amount of data, we can order these singular values in a decreasing order by magnitude and set a threshold $r$, from where past this point we set all singular values to zero.
+Moreover, if some of these $p$ singular values carry more information than others, and if we want to have the most amount of information with the least amount of data, we can take $r$ leading eigenvalues ordered by magnitude.
Please feel free to take or leave this suggestion.
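(Side note, just to illustrate what taking the $r$ leading singular values looks like in code: a minimal NumPy sketch of the rank-$r$ truncation in the Eckart-Young statement above; the matrix and the choice of $r$ are made up.)

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 6))                   # arbitrary m x n data matrix
U, S, Vt = np.linalg.svd(X, full_matrices=False)  # singular values come back in decreasing order

r = 3                                             # illustrative truncation threshold
X_r = U[:, :r] @ np.diag(S[:r]) @ Vt[:r, :]       # best rank-r approximation of X

# the Frobenius error of the best rank-r approximation equals sqrt(sum of discarded sigma_i^2)
print(np.linalg.norm(X - X_r, 'fro'), np.sqrt(np.sum(S[r:]**2)))
```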
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -360,6 +360,13 @@ $$
\hat X_r = \sigma_1 U_1 V_1^\top + \sigma_2 U_2 V_2^\top + \cdots + \sigma_r U_r V_r^\top
$$ (eq:Ekart)
+This is a very powerful theorem, it says that we can take our $ m \times n $ matrix $X$ that in not full rank, and we can best approximate it to a full rank $p_x p$ matrix through the SVD.
+
+Moreover, if some of these $p$ singular values carry more information than others, and if we want to have the most amount of information with the least amount of data, we can order these singular values in a decreasing order by magnitude and set a threshold $r$, from where past this point we set all singular values to zero.
+
+This is what model reduction is about, we project the data into a new space, where we extract the patterns that are behind this data, and then we can then keep most of the important patterns and truncate the rest.
I think the previous sentence is very clear already. Would you consider leaving this out for simplicity?
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -569,19 +576,78 @@ where for $j = 1, \ldots, n$ the column vector $X_j = \begin{bmatrix}X_{1j}\\X_{
In a **time series** setting, we would think of columns $j$ as indexing different __times__ at which random variables are observed, while rows index different random variables.
-In a **cross section** setting, we would think of columns $j$ as indexing different __individuals__ for which random variables are observed, while rows index different **attributes**.
+In a **cross-section** setting, we would think of columns $j$ as indexing different __individuals__ for which random variables are observed, while rows index different **attributes**.
+
+As we have seen before, the SVD is a way to decompose a matrix into useful components, just like polar decomposition, eigen decomposition and many others. PCA on the other hand, is method that builds on the SVD, to analyse data. The goal is to apply certain steps, to help better visualize patterns in data, using statistical tools to capture the most important patterns in data.
⬇️ Suggested change
-As we have seen before, the SVD is a way to decompose a matrix into useful components, just like polar decomposition, eigen decomposition and many others. PCA on the other hand, is method that builds on the SVD, to analyse data. The goal is to apply certain steps, to help better visualize patterns in data, using statistical tools to capture the most important patterns in data.
+As we have seen before, the SVD is a way to decompose a matrix into useful components, just like polar decomposition, eigendecomposition, and many others.
+
+PCA, on the other hand, is a method that builds on the SVD to analyze data. The goal is to apply certain steps, to help better visualize patterns in data, using statistical tools to capture the most important patterns in data.
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -360,6 +360,13 @@ $$
\hat X_r = \sigma_1 U_1 V_1^\top + \sigma_2 U_2 V_2^\top + \cdots + \sigma_r U_r V_r^\top
$$ (eq:Ekart)
+This is a very powerful theorem, it says that we can take our $ m \times n $ matrix $X$ that in not full rank, and we can best approximate it to a full rank $p_x p$ matrix through the SVD.
+
+Moreover, if some of these $p$ singular values carry more information than others, and if we want to have the most amount of information with the least amount of data, we can order these singular values in a decreasing order by magnitude and set a threshold $r$, from where past this point we set all singular values to zero.
+
+This is what model reduction is about, we project the data into a new space, where we extract the patterns that are behind this data, and then we can then keep most of the important patterns and truncate the rest.
+
+But more about it later when we present Principal Component Analysis.
You can read about the Eckart-Young theorem and some of its uses here <https://en.wikipedia.org/wiki/Low-rank_approximation>.
⬇️ Suggested change
-You can read about the Eckart-Young theorem and some of its uses here <https://en.wikipedia.org/wiki/Low-rank_approximation>.
+You can read about the Eckart-Young theorem and some of its uses [here](https://en.wikipedia.org/wiki/Low-rank_approximation).
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> The cells above illustrate application of the `fullmatrices=True` and `full-matrices=False` options.
Using `full-matrices=False` returns a reduced singular value decomposition.
⬇️ Suggested change
-The cells above illustrate application of the `fullmatrices=True` and `full-matrices=False` options.
-Using `full-matrices=False` returns a reduced singular value decomposition.
+The cells above illustrate the application of the `full_matrices=True` and `full_matrices=False` options.
+Using `full_matrices=False` returns a reduced singular value decomposition.
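(For reference, a minimal sketch of how `full_matrices` changes what `np.linalg.svd` returns; the matrix dimensions are made up.)

```python
import numpy as np

X = np.random.default_rng(2).standard_normal((5, 2))   # m > n, a "tall skinny" example

U, S, Vt = np.linalg.svd(X, full_matrices=True)         # full SVD
print(U.shape, S.shape, Vt.shape)                       # (5, 5) (2,) (2, 2)

Uh, Sh, Vth = np.linalg.svd(X, full_matrices=False)     # reduced SVD
print(Uh.shape, Sh.shape, Vth.shape)                    # (5, 2) (2,) (2, 2)
```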
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -569,19 +576,78 @@ where for $j = 1, \ldots, n$ the column vector $X_j = \begin{bmatrix}X_{1j}\\X_{
In a **time series** setting, we would think of columns $j$ as indexing different __times__ at which random variables are observed, while rows index different random variables.
-In a **cross section** setting, we would think of columns $j$ as indexing different __individuals__ for which random variables are observed, while rows index different **attributes**.
+In a **cross-section** setting, we would think of columns $j$ as indexing different __individuals__ for which random variables are observed, while rows index different **attributes**.
+
+As we have seen before, the SVD is a way to decompose a matrix into useful components, just like polar decomposition, eigen decomposition and many others. PCA on the other hand, is method that builds on the SVD, to analyse data. The goal is to apply certain steps, to help better visualize patterns in data, using statistical tools to capture the most important patterns in data.
+
+**Step 1: Standardize the data:** Because our data matrix may hold variables of different units and scales like mentioned above, we first need to standardize the data. First by computing the average of each row of $X$.
⬇️ Suggested change
-**Step 1: Standardize the data:** Because our data matrix may hold variables of different units and scales like mentioned above, we first need to standardize the data. First by computing the average of each row of $X$.
+**Step 1: Standardize the data:**
+
+Because our data matrix may hold variables of different units and scales, we first need to standardize the data.
+
+First by computing the average of each row of $X$.
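(Not part of the suggestion itself: a minimal NumPy sketch of this standardization step, assuming the lecture's convention that rows of $X$ are variables and columns are observations; the data are made up.)

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 10))             # 4 variables (rows), 10 observations (columns)

row_means = X.mean(axis=1, keepdims=True)    # average of each row of X
B = X - row_means                            # demeaned data matrix

# if the variables have very different scales, one can also divide by each row's std
B_scaled = B / X.std(axis=1, keepdims=True)
print(B.mean(axis=1))                        # approximately zero for every row
```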
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
>
-The number of positive singular values equals the rank of matrix $X$.
+**Step 2: Compute the covariance matrix:** Then because we want to extract the relationships between variables rather than just their magnitude, in other words, we want to know how they can explain each other, we compute the covariance matrix of $B$.
⬇️ Suggested change
-**Step 2: Compute the covariance matrix:** Then because we want to extract the relationships between variables rather than just their magnitude, in other words, we want to know how they can explain each other, we compute the covariance matrix of $B$.
+**Step 2: Compute the covariance matrix:**
+
+Then because we want to extract the relationships between variables rather than just their magnitude, in other words, we want to know how they can explain each other, we compute the covariance matrix of $B$.
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
>
-The number of positive singular values equals the rank of matrix $X$.
+**Step 2: Compute the covariance matrix:** Then because we want to extract the relationships between variables rather than just their magnitude, in other words, we want to know how they can explain each other, we compute the covariance matrix of $B$.
+
+$$
+C = \frac{1}{{n}} B^T B
⬇️ Suggested change
-C = \frac{1}{{n}} B^T B
+C = \frac{1}{{n}} B^\top B
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
>
-The number of positive singular values equals the rank of matrix $X$.
+**Step 2: Compute the covariance matrix:** Then because we want to extract the relationships between variables rather than just their magnitude, in other words, we want to know how they can explain each other, we compute the covariance matrix of $B$.
+
+$$
+C = \frac{1}{{n}} B^T B
+$$
+
+**Step 3: Decompose the covariance matrix and arrange the singular values:**
+
+If the matrix $C$ is diagonalizable, we can eigendecompose it, find its eigenvalues and rearrange the eigenvalue and eigenvector matrices in a decreasing other. If $C$ is not diagonalizable, we can perform an SVD of $C$:
⬇️ Suggested change
-If the matrix $C$ is diagonalizable, we can eigendecompose it, find its eigenvalues and rearrange the eigenvalue and eigenvector matrices in a decreasing other. If $C$ is not diagonalizable, we can perform an SVD of $C$:
+If the matrix $C$ is diagonalizable, we can eigendecompose it, find its eigenvalues and rearrange the eigenvalue and eigenvector matrices in a decreasing other.
+
+If $C$ is not diagonalizable, we can perform an SVD of $C$:
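(Not part of the suggestion above: an illustrative NumPy sketch of Steps 2 and 3, following the $C = \frac{1}{n} B^\top B$ formula as written; variable names and data are made up.)

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 10))      # demeaned data matrix from Step 1 (illustrative)
n = B.shape[1]

C = B.T @ B / n                       # covariance matrix as defined in the text above

# C is symmetric positive semi-definite, so here an eigendecomposition exists;
# np.linalg.eigh returns eigenvalues in ascending order, so reverse to get decreasing order
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# an SVD of C carries the same information, since C is symmetric PSD
S = np.linalg.svd(C, compute_uv=False)
print(np.allclose(S, eigvals))        # True (up to numerical error)
```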
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
>
## Relationship of PCA to SVD
-To relate a SVD to a PCA (principal component analysis) of data set $X$, first construct the SVD of the data matrix $X$:
+To relate an SVD to a PCA (principal component analysis) of data set $X$, first construct the SVD of the data matrix $X$:
⬇️ Suggested change
-To relate an SVD to a PCA (principal component analysis) of data set $X$, first construct the SVD of the data matrix $X$:
+To relate an SVD to a PCA of data set $X$, first construct the SVD of the data matrix $X$:
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> +```{code-cell} python3
+
+We can use SVD to compute the pseudoinverse:
+
+$$
+X = U \Sigma V^\top
+$$
+
+inverting $X$, we have:
+
+$$
+X^{+} = V \Sigma^{+} U^\top
+$$
+
+where:
+
+$$
+\Sigma^{+} \Sigma = \begin{bmatrix} I_p & 0 \cr 0 & 0 \end{bmatrix}
+$$
+
+and finally:
+
+$$
+\hat{\beta} = X^{+}y = V \Sigma^{+} U^\top y
+$$
+
I think this should not be in a code block.
Please remove
```{code-cell} python3
...
```
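(Once the block above is turned back into plain math text, a small NumPy sketch of the same construction may help as a cross-check: build $X^{+} = V \Sigma^{+} U^\top$ from the SVD and compare it with `np.linalg.pinv`; the data are made up.)

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 6))                  # "short fat" matrix: more columns than rows
y = rng.standard_normal(3)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
Sigma_plus = np.diag(1.0 / S)                    # invert the positive singular values
X_pinv = Vt.T @ Sigma_plus @ U.T                 # X^+ = V Sigma^+ U^T (X has full row rank here)

beta_hat = X_pinv @ y                            # minimum-norm least-squares solution
print(np.allclose(X_pinv, np.linalg.pinv(X)))    # True
print(np.allclose(X @ beta_hat, y))              # True: the underdetermined system is solved exactly
```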
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
>
-Arrange the positive singular values on the main diagonal of the matrix $\Sigma$ of into a vector $\sigma_R$.
+**Step 4: Select singular values, (optional) truncate the rest:**
+
+We can now decide how many singular values to pick, based on how much variance you want to retain. (e.g., retaining 95% of the total variance).
+
+$$
+\frac{\sum_{i = 1}^{r} \sigma^2_{i}}{\sum_{i = 1}^{p} \sigma^2_{i}}
+$$
+
+**Step 5: Create the Score Matrix:
⬇️ Suggested change
-**Step 5: Create the Score Matrix:
+**Step 5: Create the Score Matrix:**
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
>
-Arrange the positive singular values on the main diagonal of the matrix $\Sigma$ of into a vector $\sigma_R$.
+**Step 4: Select singular values, (optional) truncate the rest:**
+
+We can now decide how many singular values to pick, based on how much variance you want to retain. (e.g., retaining 95% of the total variance).
⬇️ Suggested change
-We can now decide how many singular values to pick, based on how much variance you want to retain. (e.g., retaining 95% of the total variance).
+We can now decide how many singular values to pick, based on how much variance you want to retain. (e.g., retaining 95% of the total variance).
+
+We can obtain the percentage by calculating the variance contained in the leading $r$ factors divided by the variance in total:
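(As an aside, a minimal sketch of computing that ratio and picking the smallest $r$ that retains, say, 95% of the variance; the singular values below are made up.)

```python
import numpy as np

sigma = np.array([5.0, 3.0, 1.5, 0.5, 0.1])           # singular values in decreasing order (made up)

explained = np.cumsum(sigma**2) / np.sum(sigma**2)    # cumulative share of total variance
r = int(np.searchsorted(explained, 0.95)) + 1         # smallest r with at least 95% retained
print(explained, r)                                   # r = 3 for these values
```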
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -926,6 +994,56 @@ def compare_pca_svd(da):
plt.show()
```
+## Exercises
+
+```{exercise}
+:label: svd_ex1
+
+In Ordinary Least Squares (OLS), we learn to compute $ \hat{\beta} = (X^T X)^{-1} X^T y $, but there are cases such as when we have colinearity or an underdetermined system: **short fat** matrix.
+
+In these cases, the $ (X^T X) $ matrix is not inversible. Its determinant is zero or close to zero and we cannot invert it.
⬇️ Suggested change
-In these cases, the $ (X^T X) $ matrix is not inversible. Its determinant is zero or close to zero and we cannot invert it.
+In these cases, the $ (X^\top X) $ matrix is not inversible. Its determinant is zero or close to zero and we cannot invert it.
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -926,6 +994,56 @@ def compare_pca_svd(da):
plt.show()
```
+## Exercises
+
+```{exercise}
+:label: svd_ex1
+
+In Ordinary Least Squares (OLS), we learn to compute $ \hat{\beta} = (X^T X)^{-1} X^T y $, but there are cases such as when we have colinearity or an underdetermined system: **short fat** matrix.
⬇️ Suggested change
-In Ordinary Least Squares (OLS), we learn to compute $ \hat{\beta} = (X^T X)^{-1} X^T y $, but there are cases such as when we have colinearity or an underdetermined system: **short fat** matrix.
+In Ordinary Least Squares (OLS), we learn to compute $ \hat{\beta} = (X^\top X)^{-1} X^\top y $, but there are cases such as when we have colinearity or an underdetermined system: **short fat** matrix.
------------------------------
In lectures/svd_intro.md
<#361 (comment)>
:
> @@ -926,6 +994,56 @@ def compare_pca_svd(da):
plt.show()
```
+## Exercises
+
+```{exercise}
+:label: svd_ex1
+
+In Ordinary Least Squares (OLS), we learn to compute $ \hat{\beta} = (X^T X)^{-1} X^T y $, but there are cases such as when we have colinearity or an underdetermined system: **short fat** matrix.
+
+In these cases, the $ (X^T X) $ matrix is not inversible. Its determinant is zero or close to zero and we cannot invert it.
+
+What we can do instead is to create what is called a pseudoinverse, a full rank approximation of the inverted matrix so we can compute $ \hat{\beta} $ with it.
⬇️ Suggested change
-What we can do instead is to create what is called a pseudoinverse, a full rank approximation of the inverted matrix so we can compute $ \hat{\beta} $ with it.
+What we can do instead is to create what is called a [pseudoinverse](https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse), a full rank approximation of the inverted matrix so we can compute $ \hat{\beta} $ with it.
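(To make the exercise concrete, a hedged NumPy sketch, with made-up data, of the situation it describes: $X^\top X$ is singular for a short fat $X$, but the pseudoinverse still delivers a $\hat{\beta}$.)

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((4, 8))             # underdetermined: 4 observations, 8 regressors
y = rng.standard_normal(4)

# X.T @ X is 8 x 8 but has rank at most 4, so it cannot be inverted
print(np.linalg.matrix_rank(X.T @ X))       # 4

beta_hat = np.linalg.pinv(X) @ y            # pseudoinverse-based (minimum-norm) solution
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))    # True
```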
Hi Matheus,
You are absolutely right. That was a typo from me.
No, you are doing great. Thanks for the great contribution. Please feel free to go ahead with the DMD lecture in a separate PR. Please do not hesitate to let me know if you need any help. Thanks,
lectures/svd_intro.md (Outdated)
@@ -360,8 +360,13 @@ $$
\hat X_r = \sigma_1 U_1 V_1^\top + \sigma_2 U_2 V_2^\top + \cdots + \sigma_r U_r V_r^\top
$$ (eq:Ekart)

This is a very powerful theorem, it says that we can take our $ m \times n $ matrix $X$ that in not full rank, and we can best approximate it to a full rank $p \times p$ matrix through the SVD.
Please change as follows:
"...powerful theorem that says..."
"...approximate it by a..."
lectures/svd_intro.md (Outdated)
You can read about the Eckart-Young theorem and some of its uses here <https://en.wikipedia.org/wiki/Low-rank_approximation>.
Moreover, if some of these $p$ singular values carry more information than others, and if we want to have the most amount of information with the least amount of data, we can take $r$ leading singular values ordered by magnitude.

But more about it later when we present Principal Component Analysis.
"We'll say more about this later when..."
lectures/svd_intro.md (Outdated)
In Ordinary Least Squares (OLS), we learn to compute $ \hat{\beta} = (X^\top X)^{-1} X^\top y $, but there are cases such as when we have colinearity or an underdetermined system: **short fat** matrix.

In these cases, the $ (X^\top X) $ matrix is not inversible. Its determinant is zero or close to zero and we cannot invert it.
"...not invertible (its determinant is zero) or ill-conditioned (its determinant is very close to zero)."
Thanks @matheusvillasb, these are nice changes. Thoughtful and well written. I've requested some very minor edits. Would you mind making those changes and pushing them to this PR? Once you have made those edits to the PR I'll pass the review over to @mmcky and @thomassargent30 so we can get this merged. (Thanks also to @HumphreyYang for a very useful review.)
Made the changes John asked for. Sorry for the delay, I thought I had committed them before, but the commit didn't go through.
Great, thanks @matheusvillasb. Over to you @mmcky and @thomassargent30.
corrected definition of pseudo inverse of sigma matrix in the exercise
@HumphreyYang, could you please look and see why this is failing?
Hi @jstac, I think @mmcky raised this GitHub Actions issue: it rejects running PRs from a fork, as it is reluctant to share the EC2 credentials. I think we have yet to find a good solution other than building it locally (CC @mmcky).
@HumphreyYang can you run this locally, cross-check the builds, and report back? It would be great if you could also fix the merge conflict.
Alternatively, please transfer the PR to a local branch and push (giving @matheusvillasb the credit to get
Closing this in preference for #375 (migrated to a local branch).
I have re-opened this as we can use #375 as a test environment, so long as we move those changes back to this PR (which can't execute the previews). @HumphreyYang makes a good point that we want attribution to be retained for @matheusvillasb, so it will be better to merge this PR. @HumphreyYang can you use
to bring this PR (fork) in line with #375?
@HumphreyYang I see the fix 0770ded and I can apply this if you're busy today. 👍
@jstac a replica of this is available here: #375, and a preview is here: https://65782eb0f8c9513bb97425e7--nostalgic-wright-5fa355.netlify.app/svd_intro.html
Hi @mmcky, thanks for organizing. Thanks @matheusvillasb for putting this together and @HumphreyYang for the review. I'm happy with these changes but they should be approved by @thomassargent30 before being made live.
Thanks @matheusvillasb for these proposed changes. Thanks @HumphreyYang @jstac for your comments and reviews. @thomassargent30 has reviewed and will make a few minor changes once merged into the main branch.
Thanks once again @matheusvillasb -- these changes are going live today.
Corrected some typos.
Added a brief description explaining why the Eckart-Young theorem is important.
Added details on how to compute the PCA for any given matrix.
Changed the n_components term in the code to r_components, so it is in accord with the definition of the Eckart-Young theorem and doesn't create confusion with the n columns of the matrix.
Added an exercise and its solution at the end (I've never used Sphinx, so not sure if it's done correctly).