Box-Cox, Box-Cox with Negatives Allowed, Yeo-Johnson and Basic Power Transformations

Transform the elements of a vector or the columns of a matrix using the Box-Cox, Box-Cox with negatives allowed, Yeo-Johnson, or simple power transformations.

Usage

bcPower(U, lambda, jacobian.adjusted=FALSE, gamma=NULL)

bcnPower(U, lambda, jacobian.adjusted = FALSE, gamma)

bcnPowerInverse(z, lambda, gamma)

yjPower(U, lambda, jacobian.adjusted = FALSE)

basicPower(U, lambda, gamma=NULL)

Arguments

U: A vector, matrix or data.frame of values to be transformed
lambda: Power transformation parameter with one element for each column of U, usually in the range from $-2$ to $2$.
jacobian.adjusted: If TRUE, the transformation is normalized to have Jacobian equal to one. The default FALSE is almost always appropriate.
gamma: For bcPower or basicPower, the transformation is of U + gamma, where gamma is a positive number called a start that must be large enough so that U + gamma is strictly positive. For bcnPower, the Box-Cox power family with negatives allowed, see the Details below.
z: A numeric vector, the result of a call to bcnPower with jacobian.adjusted=FALSE.

Details

The Box-Cox family of scaled power transformations equals $(x^{\lambda}-1)/\lambda$ for $\lambda \neq 0$, and $\log(x)$ if $\lambda = 0$. The bcPower function computes the scaled power transformation of $x = U + \gamma$, where $\gamma$ is set by the user. These transformations are defined only when $U+\gamma$ is strictly positive.
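As an illustration, the scaled power family can be written in a few lines of base R. This is a minimal sketch of the formula above, not the code used by bcPower; the helper name scaled_power is hypothetical and handles a single value of lambda.

```r
# Hypothetical helper: scaled power transformation of strictly positive x.
scaled_power <- function(x, lambda) {
  stopifnot(all(x > 0, na.rm = TRUE))     # the family requires x > 0
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

U <- c(NA, -3:3)
gamma <- 4                                # start chosen so that U + gamma > 0
scaled_power(U + gamma, 0)                # log(U + gamma)
scaled_power(U + gamma, 0.5)              # 2 * (sqrt(U + gamma) - 1)
```

Calling scaled_power(U, 0) without the start stops with an error, because U contains non-positive values.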

The Box-Cox family with negatives allowed was proposed by Hawkins and Weisberg (2017). It is the Box-Cox power transformation of $$z = .5 (U + \sqrt{U^2 + \gamma^2})$$ where for this family $\gamma$ is either user selected or is estimated. gamma must be positive if $U$ includes negative values and non-negative otherwise, ensuring that $z$ is always positive. The bcnPower transformations behave similarly to the bcPower transformations, and introduce less bias than is introduced by setting the parameter $\gamma$ to be non-zero in the Box-Cox family.
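The two steps of this family (map $U$ to a positive $z$, then apply the scaled power) can be sketched in base R. The helper name bcn_sketch is hypothetical, it takes a single lambda, and it is an illustration of the formula rather than the bcnPower implementation.

```r
# Hypothetical helper: Box-Cox with negatives allowed, for a single lambda.
bcn_sketch <- function(U, lambda, gamma) {
  z <- 0.5 * (U + sqrt(U^2 + gamma^2))    # strictly positive when gamma > 0
  if (lambda == 0) log(z) else (z^lambda - 1) / lambda
}

U <- c(NA, -3:3)
bcn_sketch(U, lambda = 0, gamma = 2)
```

With lambda = 0 and gamma = 2, $U = 0$ gives $z = 1$ and a transformed value of 0, in agreement with the bcnPower(U, 0, gamma=2) output in the Examples.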

The function bcnPowerInverse computes the inverse of the bcnPower function, so U = bcnPowerInverse(bcnPower(U, lambda=lam, jacobian.adjusted=FALSE, gamma=gam), lambda=lam, gamma=gam) is true for any permitted value of gam and lam.
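The round trip can be illustrated with a base-R sketch. Solving $z = .5(U + \sqrt{U^2 + \gamma^2})$ for $U$ gives $U = z - \gamma^2/(4z)$, so the inverse first undoes the scaled power and then applies this expression. Both helper names are hypothetical, and the code is a sketch under that algebra, not the code of bcnPowerInverse.

```r
# Hypothetical forward transformation for a single lambda.
bcn_sketch <- function(U, lambda, gamma) {
  z <- 0.5 * (U + sqrt(U^2 + gamma^2))
  if (lambda == 0) log(z) else (z^lambda - 1) / lambda
}

# Hypothetical inverse: recover z from the scaled power, then U from z.
bcn_inverse_sketch <- function(y, lambda, gamma) {
  z <- if (lambda == 0) exp(y) else (lambda * y + 1)^(1 / lambda)
  z - gamma^2 / (4 * z)
}

U <- c(-3, -1, 0, 2, 3)
all.equal(U, bcn_inverse_sketch(bcn_sketch(U, 0.5, 2), 0.5, 2))
```

The comparison uses all.equal rather than == because the round trip is exact only up to floating-point rounding.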

The yjPower function computes the Yeo-Johnson transformations. This is the Box-Cox transformation of $U+1$ for nonnegative values of $U$, and the negative of the Box-Cox transformation of $|U|+1$ with parameter $2-\lambda$ for negative values of $U$.
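A base-R sketch of the Yeo-Johnson rule for a single lambda follows. The helper name yj_sketch is hypothetical. The result for negative values carries a minus sign, which is what reproduces the yjPower(U, 0) output shown in the Examples.

```r
# Hypothetical helper: Yeo-Johnson transformation for a single lambda.
yj_sketch <- function(U, lambda) {
  bc <- function(x, lam) if (lam == 0) log(x) else (x^lam - 1) / lam
  out <- rep(NA_real_, length(U))          # missing values stay NA
  pos <- !is.na(U) & U >= 0
  neg <- !is.na(U) & U < 0
  out[pos] <- bc(U[pos] + 1, lambda)             # Box-Cox of U + 1
  out[neg] <- -bc(abs(U[neg]) + 1, 2 - lambda)   # sign-reversed, power 2 - lambda
  out
}

yj_sketch(c(NA, -3:3), 0)
```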

The basic power transformation returns $U^{\lambda}$ if $\lambda$ is not 0, and $\log(U)$ otherwise, for $U$ strictly positive.

If jacobian.adjusted is TRUE, then the scaled transformations are divided by the Jacobian, which is a function of the geometric mean of $U$ for bcnPower and yjPower and of $U + \gamma$ for bcPower. With this adjustment, the Jacobian of the transformation is always equal to 1. Jacobian adjustment facilitates computing the Box-Cox estimates of the transformation parameters.
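For the Box-Cox case the adjustment can be sketched in base R. The sketch assumes the Jacobian factor is $g^{\lambda - 1}$, where $g$ is the geometric mean of $x = U + \gamma$; this assumption reproduces the bcPower(U, .5, jacobian.adjusted=TRUE, gamma=4) output in the Examples, but the helper name bc_adjusted_sketch is hypothetical and the code is not taken from bcPower.

```r
# Hypothetical helper: Jacobian-adjusted scaled power for a single lambda.
bc_adjusted_sketch <- function(x, lambda) {
  gm <- exp(mean(log(x), na.rm = TRUE))    # geometric mean of x, ignoring NA
  z <- if (lambda == 0) log(x) else (x^lambda - 1) / lambda
  z / gm^(lambda - 1)                      # divide by the assumed Jacobian factor
}

U <- c(NA, -3:3)
bc_adjusted_sketch(U + 4, 0.5)
```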

Missing values are permitted, and the result is NA wherever U is NA.

Value

Returns a vector or matrix of transformed values.

References

Fox, J. and Weisberg, S. (2019) An R Companion to Applied Regression, Third Edition, Sage.

Hawkins, D. and Weisberg, S. (2017) Combining the Box-Cox Power and Generalized Log Transformations to Accommodate Nonpositive Responses in Linear and Mixed-Effects Linear Models. South African Statistics Journal, 51, 317-328.

Weisberg, S. (2014) Applied Linear Regression, Fourth Edition, Wiley, Chapter 7.

Yeo, In-Kwon and Johnson, Richard (2000) A new family of power transformations to improve normality or symmetry. Biometrika, 87, 954-959.

Author

Sanford Weisberg, <sandy@umn.edu>

Examples

U <- c(NA, (-3:3))
if (FALSE) bcPower(U, 0)  # not run: produces an error as U has negative values
bcPower(U, 0, gamma=4)
#>           Z1^0
#> [1,]        NA
#> [2,] 0.0000000
#> [3,] 0.6931472
#> [4,] 1.0986123
#> [5,] 1.3862944
#> [6,] 1.6094379
#> [7,] 1.7917595
#> [8,] 1.9459101
bcPower(U, .5, jacobian.adjusted=TRUE, gamma=4)
#>        Z1^0.5
#> [1,]       NA
#> [2,] 0.000000
#> [3,] 1.523048
#> [4,] 2.691724
#> [5,] 3.676964
#> [6,] 4.544977
#> [7,] 5.329721
#> [8,] 6.051368
bcnPower(U, 0, gamma=2)
#> [1]         NA -1.1947632 -0.8813736 -0.4812118  0.0000000  0.4812118  0.8813736
#> [8]  1.1947632
basicPower(U, lambda = 0, gamma=4)
#>        log(Z1)
#> [1,]        NA
#> [2,] 0.0000000
#> [3,] 0.6931472
#> [4,] 1.0986123
#> [5,] 1.3862944
#> [6,] 1.6094379
#> [7,] 1.7917595
#> [8,] 1.9459101
yjPower(U, 0)
#> [1]         NA -7.5000000 -4.0000000 -1.5000000  0.0000000  0.6931472  1.0986123
#> [8]  1.3862944
V <- matrix(1:10, ncol=2)
bcPower(V, c(0, 2))
#>           Z1^0 Z2^2
#> [1,] 0.0000000 17.5
#> [2,] 0.6931472 24.0
#> [3,] 1.0986123 31.5
#> [4,] 1.3862944 40.0
#> [5,] 1.6094379 49.5
basicPower(V, c(0,1))
#>        log(Z1) Z2^1
#> [1,] 0.0000000    6
#> [2,] 0.6931472    7
#> [3,] 1.0986123    8
#> [4,] 1.3862944    9
#> [5,] 1.6094379   10