.2 we will discuss this question in more detail.

At first sight, the matrices $S_f$ and $S_g$ look different, but they create almost the same scatterplots (see the discussion in Section 1.4). Similarly, the common principal component analysis in Chapter 9 suggests a joint analysis of the covariance structure as in Flury and Riedwyl (1988).

Scatterplots with point clouds that are upward-sloping, like the one in the upper left of Figure 1.14, show variables with positive covariance. Scatterplots with downward-sloping structure have negative covariance. In Figure 3.1 we show the scatterplot of $X_4$ vs. $X_5$ of the entire bank data set. The point cloud is upward-sloping. However, the two sub-clouds of counterfeit and genuine bank notes are downward-sloping.

3 Moving to Higher Dimensions

[Figure 3.1. Scatterplot of variables $X_4$ vs. $X_5$ of the entire bank data set. Panel title: Swiss bank notes; horizontal axis: X_4; vertical axis: X_5. MVAscabank45.xpl]

EXAMPLE 3.2 A textile shop manager is studying the sales of classic blue pullovers over 10 different periods. He observes the number of pullovers sold ($X_1$), variation in price ($X_2$, in EUR), the advertisement costs in local newspapers ($X_3$, in EUR) and the presence of a sales assistant ($X_4$, in hours per period). Over the periods, he observes the following data matrix:

$$
X = \begin{pmatrix}
230 & 125 & 200 & 109 \\
181 & 99 & 55 & 107 \\
165 & 97 & 105 & 98 \\
150 & 115 & 85 & 71 \\
97 & 120 & 0 & 82 \\
192 & 100 & 150 & 103 \\
181 & 80 & 85 & 111 \\
189 & 90 & 120 & 93 \\
172 & 95 & 110 & 86 \\
170 & 125 & 130 & 78
\end{pmatrix}.
$$

3.1 Covariance

[Figure 3.2. Scatterplot of variables $X_2$ vs. $X_1$ of the pullovers data set. Panel title: pullovers data; horizontal axis: price (X2); vertical axis: sales (X1). MVAscapull1.xpl]

He is convinced that the price must have a large influence on the number of pullovers sold. So he makes a scatterplot of $X_2$ vs. $X_1$, see Figure 3.2. A rough impression is that the cloud is somewhat downward-sloping. A computation of the empirical covariance yields

$$
s_{X_1 X_2} = \frac{1}{10} \sum_{i=1}^{10} \left(X_{1i} - \bar{X}_1\right)\left(X_{2i} - \bar{X}_2\right) = -80.02,
$$

a negative value as expected. Here the factor $\frac{1}{n} = \frac{1}{10}$ is used; with the factor $\frac{1}{n-1} = \frac{1}{9}$ the same sum of cross products gives about $-88.91$.
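The covariance of Example 3.2 can be checked directly from the first two columns of the data matrix. The following Python sketch is an illustration added for this purpose, not the MVAscapull1.xpl program referenced in Figure 3.2; the function name `empirical_cov` and its `ddof` argument are hypothetical names chosen for this example.

```python
# Pullover data from Example 3.2: X1 = pullovers sold, X2 = price (EUR).
sales = [230, 181, 165, 150, 97, 192, 181, 189, 172, 170]
price = [125, 99, 97, 115, 120, 100, 80, 90, 95, 125]


def empirical_cov(x, y, ddof=0):
    """Empirical covariance of x and y with factor 1 / (n - ddof).

    ddof=0 gives the factor 1/n, ddof=1 gives the factor 1/(n-1).
    (Hypothetical helper written for this illustration.)
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n - ddof <= 0:
        raise ValueError("need more observations than ddof")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross products of the deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / (n - ddof)


cov_n = empirical_cov(sales, price)           # factor 1/n
cov_n1 = empirical_cov(sales, price, ddof=1)  # factor 1/(n-1)
print(round(cov_n, 2), round(cov_n1, 2))      # -80.02 -88.91
```

With only 10 observations the two factors give visibly different values, although the sign, and hence the direction of the point cloud, is the same.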
The covariance function is scale dependent. Thus, if the prices in this example were in Japanese Yen (JPY), we would obtain a different answer (see Exercise 3.16). A measure of (linear) dependence independent of the scale is the correlation, which we introduce in the next section.

Summary
→ The covariance is a measure of dependence.
→ Covariance measures only linear dependence.
→ Covariance is scale dependent.
→ There are nonlinear dependencies that have zero covariance.
→ Zero covariance does not imply independence.
→ Independence implies zero covariance.
→ Negative covariance corresponds to downward-sloping scatterplots.
→ Positive covariance corresponds to upward-sloping scatterplots.
→ The covariance of a variable with itself is its variance: $\operatorname{Cov}(X, X) = \sigma_{XX} = \sigma_X^2$.
→ For small $n$, we should replace the factor $\frac{1}{n}$ in the computation of the covariance by $\frac{1}{n-1}$.

3.2 Correlation

The correlation between two variables $X$ and $Y$ is defined from the covariance as follows:

$$
\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}. \tag{3.7}
$$

The advantage of the correlation is that it is independent of the scale, i.e., changing the variables' scale of measurement does not change the value of the correlation. Therefore, the correlation is more useful as a measure of association between two random variables than the covariance. The empirical version of $\rho_{XY}$ is as follows:

$$
r_{XY} = \frac{s_{XY}}{\sqrt{s_{XX}\, s_{YY}}}. \tag{3.8}
$$

The correlation is in absolute value never greater than 1. It is zero if the covariance is zero and vice versa. For $p$-dimensional vectors $(X_1, \ldots, X_p)$ we have the theoretical correlation matrix

$$
P = \begin{pmatrix}
\rho_{X_1 X_1} & \cdots & \rho_{X_1 X_p} \\
\vdots & \ddots & \vdots \\
\rho_{X_p X_1} & \cdots & \rho_{X_p X_p}
\end{pmatrix},
$$

and its empirical version, the empirical correlation matrix, which can be calculated from the observations,

$$
R = \begin{pmatrix}
r_{X_1 X_1} & \cdots & r_{X_1 X_p} \\
\vdots & \ddots & \vdots \\
r_{X_p X_1} & \cdots & r_{X_p X_p}
\end{pmatrix}.
$$

EXAMPLE 3.3 We obtain the following correlation matrix for the genuine bank notes:

$$
R_g = \begin{pmatrix}
1.00 & 0.41 & 0.41 & 0.22 & 0.05 & 0.03 \\
0.41 & 1.00 & 0.66 & 0.24 & 0.20 & -0.25 \\
0.41 & 0.66 & 1.00 & 0.25 & 0.13 & -0.14 \\
0.22 & 0.24 & 0.25 & 1.00 & -0.63 & -0.00 \\
0.05 & 0.20 & 0.13 & -0.63 & 1.00 & -0.25 \\
0.03 & -0.25 & -0.14 & -0.00 & -0.25 & 1.00
\end{pmatrix} \tag{3.9}
$$
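The bank note data behind (3.9) are not reproduced in this section, so the following Python sketch illustrates the empirical correlation (3.8) and the empirical correlation matrix on the pullover data of Example 3.2 instead. It also shows the scale argument: converting the price to JPY changes the covariance but not the correlation. The helper names `cov` and `corr` and the exchange rate `JPY_PER_EUR` are assumptions made for this illustration; the rate is an arbitrary positive constant, not a real quotation.

```python
import math

# Pullover data from Example 3.2, one list per variable (column of X).
data = {
    "X1": [230, 181, 165, 150, 97, 192, 181, 189, 172, 170],  # sales
    "X2": [125, 99, 97, 115, 120, 100, 80, 90, 95, 125],      # price (EUR)
    "X3": [200, 55, 105, 85, 0, 150, 85, 120, 110, 130],      # advertisement
    "X4": [109, 107, 98, 71, 82, 103, 111, 93, 86, 78],       # assistant hours
}


def cov(x, y):
    """Empirical covariance with factor 1/n (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def corr(x, y):
    """Empirical correlation r_XY = s_XY / sqrt(s_XX * s_YY), as in (3.8)."""
    return cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))


# Empirical correlation matrix R of the four variables.
names = list(data)
R = [[corr(data[a], data[b]) for b in names] for a in names]
for row in R:
    print(" ".join(f"{value:6.2f}" for value in row))

# Hypothetical exchange rate, used only to change the scale of X2.
JPY_PER_EUR = 160.0
price_jpy = [JPY_PER_EUR * p for p in data["X2"]]

cov_eur = cov(data["X1"], data["X2"])
cov_jpy = cov(data["X1"], price_jpy)    # scaled by JPY_PER_EUR
r_eur = corr(data["X1"], data["X2"])
r_jpy = corr(data["X1"], price_jpy)     # same as r_eur up to rounding error
print(cov_eur, cov_jpy)
print(r_eur, r_jpy)
```

The factor $\frac{1}{n}$ cancels in the ratio (3.8), so the correlation is the same whether the covariance uses $\frac{1}{n}$ or $\frac{1}{n-1}$. The correlation between sales and price comes out at about $-0.17$ in both currencies.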