The problem is finding A. There are two common methods: A can be computed by the Cholesky decomposition, or it can be computed from the eigenvectors and eigenvalues of the covariance matrix.
My question is which method (Cholesky or eigenvectors) is more efficient, and what the advantages and disadvantages of each method are. Cholesky is more efficient, and that is its primary advantage. The advantage of an eigenvector decomposition is that the A matrix is the product of an orthogonal matrix (the eigenvectors) and a diagonal matrix (the square roots of the eigenvalues).
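For readers who want to see the two matrix square roots side by side, here is a minimal sketch in Python/NumPy (not the SAS/IML code of the original post; the covariance matrix is a made-up example):

```python
import numpy as np

# Target covariance matrix (hypothetical example values)
Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])

# Method 1: Cholesky factor, a lower-triangular A with A @ A.T == Sigma
A_chol = np.linalg.cholesky(Sigma)

# Method 2: eigen decomposition, A = V @ diag(sqrt(lambda)), also with A @ A.T == Sigma
evals, V = np.linalg.eigh(Sigma)
A_eig = V @ np.diag(np.sqrt(evals))

# Both are valid "square roots" of Sigma
print(np.allclose(A_chol @ A_chol.T, Sigma))   # True
print(np.allclose(A_eig @ A_eig.T, Sigma))     # True
```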
For the eigen decomposition, Y1 depends on both X1 and X2. However, the eigen decomposition has a nice geometry in terms of variance, since it is equivalent to a PCA decomposition. Thank you for this clear explanation. Also, this method is usually described in the context of iid normals, but that is not necessary, is it? I know that more computational techniques like Iman-Conover will do this and are distribution-agnostic. If the marginal distributions are nonnormal, there is no reason to expect their shape to be preserved by a Cholesky transformation.
You can easily modify the example in this post to simulate uniform variates on (0,1). The Cholesky factor is a triangular matrix, so the first transformed variable z1 is still uniformly distributed, but the second, z2, is a linear combination of uniforms, which is not itself uniformly distributed.
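A quick way to check that claim numerically, sketched in Python/NumPy with an assumed target correlation of 0.6:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Uncorrelated uniform variates on (0, 1), one row per variable
X = rng.uniform(size=(2, n))

# Induce correlation 0.6 via the Cholesky factor (hypothetical target)
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])
L = np.linalg.cholesky(R)
Z = L @ X

print(L)               # second row is [0.6, 0.8]
print(np.corrcoef(Z))  # off-diagonal element is close to 0.6
```

The second row of L is [0.6, 0.8], so z2 = 0.6*x1 + 0.8*x2 takes values on (0, 1.4) and has a trapezoidal density rather than a uniform one, even though its correlation with z1 is the requested 0.6.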
Thanks for the great article. I'm trying to implement this for discrete complex random variables using Matlab. I think I've done it, but I don't get the expected identity covariance matrix. Discrete variables are tricky. See Chapter 9 of Simulating Data with SAS, which discusses how to generate correlated multivariate binary variables and correlated multivariate ordinal variables. Hi Rick, this is a nice article. Could you introduce some other methods to implement the transformation from uncorrelated variables to correlated variables?
I mean, Cholesky is an efficient way to do this, but I am wondering whether there is another, more convenient way. I have a large original sample matrix of several correlated variables, and I sample each variable separately, perhaps by using Latin hypercube sampling.
Thus I can get a sample matrix, called A, in which the samples are not correlated. Is the Cholesky transformation the only way to give A the desired correlation? The scenario in this article assumes that the data are a random sample from a multivariate distribution.
LHS is for designed experiments. You can't always use LHS when the inputs are correlated. For example, if the explanatory variables are age, weight, and BMI, you might not have any patients in your study who have a low age, high weight, and low BMI.
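Regarding the question about alternatives to the Cholesky transformation: the Iman-Conover method mentioned earlier in the thread reorders an existing sample (such as LHS output) rather than transforming its values, so the marginal distributions are untouched. Here is a simplified sketch in Python/NumPy that omits the decorrelation refinement of the full algorithm; the function name and example values are hypothetical:

```python
import numpy as np

def iman_conover(X, target_corr, rng):
    """Reorder each column of X so the sample acquires (approximately) the
    target rank correlation while leaving every marginal distribution intact."""
    n, k = X.shape
    # Normal scores with the target correlation imposed via a Cholesky factor
    L = np.linalg.cholesky(target_corr)
    scores = rng.standard_normal((n, k)) @ L.T
    Y = np.empty_like(X, dtype=float)
    for j in range(k):
        ranks = np.argsort(np.argsort(scores[:, j]))   # rank of each score, 0-based
        Y[:, j] = np.sort(X[:, j])[ranks]              # place sorted data by those ranks
    return Y

# Example: two independent uniform columns, target rank correlation about 0.8
rng = np.random.default_rng(0)
X = rng.uniform(size=(10_000, 2))
Y = iman_conover(X, np.array([[1.0, 0.8], [0.8, 1.0]]), rng)
print(np.corrcoef(Y, rowvar=False))   # off-diagonal element is roughly 0.8
```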
If my answer is not clear, you might try posting your question to the "Cross Validated" discussion forum. After running PROC SIMNORMAL, I found negative values among the simulated variables. My input data for the 4 variables were all positive. It is true that the correlation between two of them is negative.
How do I obtain only positive simulated values? Think about what you are doing: you are specifying that the data are multivariate normal with a given mean vector and covariance matrix.
It is easy to see how your situation occurs, even in univariate data. If you simulate from the N(2, 1) distribution, some of the simulated values will be negative. The underlying reason is that a normal distribution is not a good fit for these data. You have a few choices. For example, model the data by a lognormal or exponential distribution.
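A small numerical illustration of both points, sketched in Python/NumPy (the lognormal parameters are arbitrary choices for the example, not a fit to anyone's data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulating from N(2, 1): about 2.3% of the values fall below zero
x = rng.normal(loc=2.0, scale=1.0, size=100_000)
print((x < 0).mean())          # roughly 0.023

# One alternative for positive data: simulate on the log scale, then exponentiate
log_y = rng.normal(loc=np.log(2.0), scale=0.5, size=100_000)
y = np.exp(log_y)              # lognormal values, always positive
print(y.min() > 0)             # True
```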
I am trying to simulate the returns of a portfolio composed of 20 variables in Excel. Before introducing correlations between the variables using the Cholesky decomposition, I standardized the data, i.e., transformed each variable to have zero mean and unit standard deviation. Once the correlations had been introduced, I de-standardized the data, basically reversing the standardization, to keep the same mean and standard deviation for each variable.
I am wondering if this is correct from a statistical point of view. I am not sure standardization of the data is necessary before applying the Cholesky decomposition. It would be great to find a reference to some kind of scientific paper; I need this for work. Your help would be much appreciated.
Thank you! The ideas in this blog post are also presented in Wicklin, R., Simulating Data with SAS. No, you do not need to standardize the variables. This explanation is super intuitive and amazing! I finally have some grasp of the covariance matrix and its square root!
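To illustrate the point that standardization is not required: the Cholesky factor of the covariance matrix equals diag(sd) times the Cholesky factor of the correlation matrix, so the standardize-and-rescale route and the direct route give identical results. A sketch in Python/NumPy with made-up means, standard deviations, and correlation:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical target: means, standard deviations, and a correlation matrix
mu = np.array([10.0, 50.0])
sd = np.array([2.0, 5.0])
R  = np.array([[1.0, 0.7],
               [0.7, 1.0]])
Sigma = np.outer(sd, sd) * R            # covariance matrix

z = rng.standard_normal((2, 100_000))   # uncorrelated standard normals

# Route 1: Cholesky of the covariance matrix, no standardization step
y1 = np.linalg.cholesky(Sigma) @ z + mu[:, None]

# Route 2: Cholesky of the correlation matrix, then rescale and shift
y2 = sd[:, None] * (np.linalg.cholesky(R) @ z) + mu[:, None]

print(np.allclose(y1, y2))              # True: the two routes are identical
```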
The General Cholesky Transformation Correlates Variables

In the general case, a covariance matrix contains off-diagonal elements.
Apply the inverse of L.

Generating correlated random numbers: Why does Cholesky decomposition work?
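The short answer is that if x has identity covariance, then Cov(Lx) = L Cov(x) L' = L L' = Sigma, and applying the inverse of L reverses the transformation. A sketch of both directions in Python/NumPy rather than the SAS/IML of the original article, with an illustrative covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(7)

Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])
L = np.linalg.cholesky(Sigma)

# Correlate: x has (approximately) identity covariance, so L @ x has covariance Sigma
x = rng.standard_normal((2, 200_000))
y = L @ x
print(np.cov(y))                  # approximately Sigma

# Uncorrelate: apply the inverse of L to recover (approximately) identity covariance
x_back = np.linalg.solve(L, y)
print(np.cov(x_back))             # approximately the identity matrix
```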
Actually, the Cholesky factorization can be obtained for any Hermitian positive definite matrix, not just real symmetric ones. Hermitian matrices are the complex extension of real symmetric matrices.
A symmetric matrix is one that is equal to its transpose, which implies that its entries are symmetric with respect to the diagonal. In a Hermitian matrix, entries in symmetric positions with respect to the diagonal are complex conjugates of each other, i.e., a_ij = conj(a_ji).
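As a quick check of the complex case (relevant to the Matlab question above), NumPy's Cholesky routine also accepts a Hermitian positive definite matrix; the example entries below are made up:

```python
import numpy as np

# A Hermitian positive definite matrix: symmetric entries across the
# diagonal are complex conjugates of each other
H = np.array([[4.0 + 0.0j, 1.0 - 2.0j],
              [1.0 + 2.0j, 6.0 + 0.0j]])

L = np.linalg.cholesky(H)
print(np.allclose(L @ L.conj().T, H))   # True: H = L L^H
```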