Minor revision of mvrnorm (from MASS) to facilitate replication

This is the mvrnorm function from the MASS package (Venables and Ripley, 2002), with one small modification to facilitate replication of random samples. The aim is to make sure that, after the seed is reset, the first rows of generated data are identical no matter what value is chosen for n. The one can draw 100 observations, reset the seed, and then draw 110 observations, and the first 100 will match exactly. This is done to prevent unexpected and peculiar patterns that are observed when n is altered with MASS package's mvrnorm.

Usage

mvrnorm(n = 1, mu, Sigma, tol = 1e-06, empirical = FALSE)

Arguments

n: the number of samples ("rows" of data) required.
mu: a vector giving the means of the variables.
Sigma: positive-definite symmetric matrix specifying the covariance matrix of the variables.
tol: tolerance (relative to largest variance) for numerical lack of positive-definiteness in Sigma
empirical: logical. If true, mu and Sigma specify the empirical not population mean and covariance matrix.

Value

If n = 1 a vector of the same length as mu, otherwise an n by length(mu) matrix with one sample in each row.

Details

To assure replication, only a very small change is made. The code in MASS::mvrnorm draws a random sample and fills a matrix by column, and that matrix is then decomposed. The change implemented here fills that matrix by row and the problem is eliminated.

Some peculiarities are noticed when the covariance matrix changes from a diagonal matrix to a more general symmetric matrix (non-zero elements off-diagonal). When the covariance is strictly diagonal, then just one column of the simulated multivariate normal data will be replicated, but the others are not. This has very troublesome implications for simulations that draw samples of various sizes and then base calculations on the separate simulated columns (i.e., some columns are identical, others are completely uncorrelated).

References

Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0

Author

Ripley, B.D. with revision by Paul E. Johnson

Examples

library(MASS)
#> 
#> Attaching package: ‘MASS’
#> The following object is masked from ‘package:rockchalk’:
#> 
#>     mvrnorm
library(rockchalk)

set.seed(12345)
X0 <- MASS::mvrnorm(n=10, mu = c(0,0,0), Sigma = diag(3))
## create a smaller data set, starting at same position
set.seed(12345)
X1 <- MASS::mvrnorm(n=5, mu = c(0,0,0), Sigma = diag(3))
## Create a larger data set
set.seed(12345)
X2 <- MASS::mvrnorm(n=15, mu = c(0,0,0), Sigma = diag(3))
## The first 5 rows in X0, X1, and X2 are not the same
X0
#>             [,1]       [,2]       [,3]
#>  [1,]  0.7796219 -0.1162478  0.5855288
#>  [2,]  1.4557851  1.8173120  0.7094660
#>  [3,] -0.6443284  0.3706279 -0.1093033
#>  [4,] -1.5531374  0.5202165 -0.4534972
#>  [5,] -1.5977095 -0.7505320  0.6058875
#>  [6,]  1.8050975  0.8168998 -1.8179560
#>  [7,] -0.4816474 -0.8863575  0.6300986
#>  [8,]  0.6203798 -0.3315776 -0.2761841
#>  [9,]  0.6121235  1.1207127 -0.2841597
#> [10,] -0.1623110  0.2987237 -0.9193220
X1
#>            [,1]       [,2]       [,3]
#> [1,] -0.1162478 -1.8179560  0.5855288
#> [2,]  1.8173120  0.6300986  0.7094660
#> [3,]  0.3706279 -0.2761841 -0.1093033
#> [4,]  0.5202165 -0.2841597 -0.4534972
#> [5,] -0.7505320 -0.9193220  0.6058875
X2
#>              [,1]       [,2]       [,3]
#>  [1,]  0.81187318  0.8168998  0.5855288
#>  [2,]  2.19683355 -0.8863575  0.7094660
#>  [3,]  2.04919034 -0.3315776 -0.1093033
#>  [4,]  1.63244564  1.1207127 -0.4534972
#>  [5,]  0.25427119  0.2987237  0.6058875
#>  [6,]  0.49118828  0.7796219 -1.8179560
#>  [7,] -0.32408658  1.4557851  0.6300986
#>  [8,] -1.66205024 -0.6443284 -0.2761841
#>  [9,]  1.76773385 -1.5531374 -0.2841597
#> [10,]  0.02580105 -1.5977095 -0.9193220
#> [11,]  1.12851083  1.8050975 -0.1162478
#> [12,] -2.38035806 -0.4816474  1.8173120
#> [13,] -1.06026555  0.6203798  0.3706279
#> [14,]  0.93714054  0.6121235  0.5202165
#> [15,]  0.85445172 -0.1623110 -0.7505320
set.seed(12345)
Y0 <- mvrnorm(n=10, mu = c(0,0,0), Sigma = diag(3))
set.seed(12345)
Y1 <- mvrnorm(n=5, mu = c(0,0,0), Sigma = diag(3))
set.seed(12345)
Y2 <- mvrnorm(n=15, mu = c(0,0,0), Sigma = diag(3))
# note results are the same in the first 5 rows:
Y0
#>             [,1]       [,2]       [,3]
#>  [1,]  0.7796219 -0.1162478  0.5855288
#>  [2,]  1.4557851  1.8173120  0.7094660
#>  [3,] -0.6443284  0.3706279 -0.1093033
#>  [4,] -1.5531374  0.5202165 -0.4534972
#>  [5,] -1.5977095 -0.7505320  0.6058875
#>  [6,]  1.8050975  0.8168998 -1.8179560
#>  [7,] -0.4816474 -0.8863575  0.6300986
#>  [8,]  0.6203798 -0.3315776 -0.2761841
#>  [9,]  0.6121235  1.1207127 -0.2841597
#> [10,] -0.1623110  0.2987237 -0.9193220
Y1
#>            [,1]       [,2]       [,3]
#> [1,] -0.1162478 -1.8179560  0.5855288
#> [2,]  1.8173120  0.6300986  0.7094660
#> [3,]  0.3706279 -0.2761841 -0.1093033
#> [4,]  0.5202165 -0.2841597 -0.4534972
#> [5,] -0.7505320 -0.9193220  0.6058875
Y2
#>              [,1]       [,2]       [,3]
#>  [1,]  0.81187318  0.8168998  0.5855288
#>  [2,]  2.19683355 -0.8863575  0.7094660
#>  [3,]  2.04919034 -0.3315776 -0.1093033
#>  [4,]  1.63244564  1.1207127 -0.4534972
#>  [5,]  0.25427119  0.2987237  0.6058875
#>  [6,]  0.49118828  0.7796219 -1.8179560
#>  [7,] -0.32408658  1.4557851  0.6300986
#>  [8,] -1.66205024 -0.6443284 -0.2761841
#>  [9,]  1.76773385 -1.5531374 -0.2841597
#> [10,]  0.02580105 -1.5977095 -0.9193220
#> [11,]  1.12851083  1.8050975 -0.1162478
#> [12,] -2.38035806 -0.4816474  1.8173120
#> [13,] -1.06026555  0.6203798  0.3706279
#> [14,]  0.93714054  0.6121235  0.5202165
#> [15,]  0.85445172 -0.1623110 -0.7505320
identical(Y0[1:5, ], Y1[1:5, ])
#> [1] FALSE
identical(Y1[1:5, ], Y2[1:5, ])
#> [1] FALSE

myR <- lazyCor(X = 0.3, d = 5)
mySD <- c(0.5, 0.5, 0.5, 1.5, 1.5)
myCov <- lazyCov(Rho = myR, Sd = mySD)

set.seed(12345)
X0 <- MASS::mvrnorm(n=10, mu = rep(0, 5), Sigma = myCov)
## create a smaller data set, starting at same position
set.seed(12345)
X1 <- MASS::mvrnorm(n=5, mu = rep(0, 5), Sigma = myCov)
X0
#>              [,1]        [,2]        [,3]        [,4]        [,5]
#>  [1,]  0.08586442 -0.42506610 -0.72775819 -0.54336544 -0.74968483
#>  [2,] -0.39244035 -1.62500085  0.26202476 -2.35665099  0.86875792
#>  [3,]  0.61053787 -0.57120046  0.61127560 -0.24668393  0.41111527
#>  [4,]  1.20401005  0.21343186  0.26811590 -0.03413171  0.88916091
#>  [5,]  0.58803330  0.41418849  0.06988804 -0.18843852 -1.52050066
#>  [6,]  0.23997744 -0.09060785 -0.66116271  1.60835635  3.05820974
#>  [7,] -0.33806060 -0.10754083  0.49266350 -0.01103830 -1.58416684
#>  [8,] -0.50358537  0.46296827 -0.35042124  0.67541860  0.08692676
#>  [9,]  0.48957328 -0.57118759 -0.29702946 -0.60437238  1.38469526
#> [10,]  0.00453532  0.02535062  0.68436595  0.83189714  1.36207911
X1
#>            [,1]       [,2]       [,3]        [,4]        [,5]
#> [1,]  0.2932491 -0.2110239 -0.3383668  0.89764835 -2.32890341
#> [2,] -0.6806095 -0.1969784 -1.2050331 -1.27521736 -0.15690337
#> [3,] -0.3034732 -0.0897417  0.1252289  0.40559282 -0.08458534
#> [4,] -0.0278204 -0.6472604  0.4841235  0.83960832  0.33527481
#> [5,] -0.1069345 -0.2393786  0.6516742  0.02669431 -1.60494038
##' set.seed(12345)
Y0 <- rockchalk::mvrnorm(n=10, mu = rep(0, 5), Sigma = myCov)
## create a smaller data set, starting at same position
set.seed(12345)
Y1 <- rockchalk::mvrnorm(n=5, mu = rep(0, 5), Sigma = myCov)
Y0
#>             [,1]        [,2]        [,3]       [,4]       [,5]
#>  [1,] -0.4124276 -0.76969141 -0.49336123 -1.7032570 -2.5580963
#>  [2,] -0.2689884 -1.24072192 -0.84597461 -2.7712637  1.1277290
#>  [3,]  0.9133846 -0.13199448  0.41973755 -0.4334065 -1.0086031
#>  [4,]  0.5147182 -0.06267782 -0.18888272  0.6686023 -3.5561141
#>  [5,] -0.5449612 -0.85354135 -0.01650729 -0.4651525 -2.9731541
#>  [6,]  0.0723811 -0.11692252  0.32953268 -1.0720930  2.3847186
#>  [7,] -0.1217320 -1.32462124  0.52923054 -0.8852664  0.3414409
#>  [8,]  0.1512240 -0.77219114  0.02787407  1.0535182 -1.3292388
#>  [9,] -0.1941025 -0.46378927  0.34651806  1.5462199  3.1225110
#> [10,]  1.2340517 -0.18190645 -0.13612065  0.5494518 -0.8431627
Y1
#>             [,1]       [,2]       [,3]       [,4]        [,5]
#> [1,] -0.10614068  0.1451396 -0.3014258 -1.3446805 -0.08550324
#> [2,]  0.20608050  0.3993785  0.7664380  1.6135876  2.73190157
#> [3,] -0.07672388 -0.3634412  0.1764648 -1.4438196  1.78158934
#> [4,]  0.30929116 -0.3612533 -0.1520808 -0.2249088 -1.79803733
#> [5,] -0.70715328  0.2547858  0.5543892 -2.2824882  0.30127402

Minor revision of mvrnorm (from `MASS`) to facilitate replication