Finding Univariate or Multivariate Power Transformations

powerTransform uses the maximum likelihood-like approach of Box and Cox (1964) to select a transformatiion of a univariate or multivariate response for normality, linearity and/or constant variance. Available families of transformations are the default Box-Cox power family and two additioal families that are modifications of the Box-Cox family that allow for (a few) negative responses. The summary method automatically computes two or three likelihood ratio type tests concerning the transformation powers.

Usage

powerTransform(object, ...)

# Default S3 method
powerTransform(object, family="bcPower", ...)

# S3 method for class 'lm'
powerTransform(object, family="bcPower", ...)

# S3 method for class 'formula'
powerTransform(object, data, subset, weights, na.action,
    family="bcPower", ...)

# S3 method for class 'lmerMod'
powerTransform(object, family="bcPower", ...)

Arguments

object: This can either be an object of class lm or lmerMod, a formula, or a matrix or vector; see below.
family: The quoted name of a family of transformations. The available options are "bcPower" for the default for the Box-Cox power family; "bcnPower" for a two-parameter modification of the Box-Cox family that allows negative responses (Hawkins and Weisberg (2017)), and the "yjPower" family (Yeo and Johnson(2000)), another modifiation of the Box-Cox family that allows a few negative values. All three families are documented at bcPower.
data: A data frame or environment, as in ‘lm’.
subset: Case indices to be used, as in ‘lm’.
weights: Weights as in ‘lm’.
na.action: Missing value action, as in ‘lm’.
...: Additional arguments that used in the interative algorithm; defaults are generally adequate. For use with the bcnPower family, a convergence criterion can be set with conv=.0001 the default, and a minimum positive value of the location parameter can be set, with default gamma.min=.1.

Details

This function implements the Box and Cox (1964) method of selecting a power transformation of a variable toward normality, and its generalization by Velilla (1993) to a multivariate response. Cook and Weisberg (1999) and Weisberg (2014) suggest the usefulness of transforming a set of predictors z1, z2, z3 for multivariate normality. It also includes two additional families that allow for negative values.

If the object argument is of class ‘lm’ or ‘lmerMod’, the Box-Cox procedure is applied to the conditional distribution of the response given the predictors. For ‘lm’ objects, the respose may be multivariate, and each column will have its own transformation. With ‘lmerMod’ the response must be univariate.

The object argument may also be a formula. For example, z ~ x1 + x2 + x3 will estimate a transformation for the response z from a family after fitting a linear model with the given formula. cbind(y1, y2, y3) ~ 1 specifies transformations to multivariate normality with no predictors. A vector value for object, for example powerTransform(ais$LBM), is equivalent topowerTransform(LBM ~ 1, ais). Similarly, powerTransform(cbind(ais$LBM, ais$SSF)), where the first argument is a matrix rather than a formula is equivalent to specification of a mulitvariate linear model powerTransform(cbind(LBM, SSF) ~ 1, ais).

Three families of power transformations are available. The default Box-Cox power family (family="bcPower") of power transformations effectively replaces a vector by that vector raised to a power, generally in the range from -3 to 3. For powers close to zero, the log-transformtion is suggested. In practical situations, after estimating a power using the powerTransform function, a variable would be replaced by a simple power transformation of it, for example, if $\lambda\approx 0.5$, then the correspoding variable would be replaced by its square root; if $\lambda$ is close enough to zero, the the variable would be replaced by its natural logarithm. The Box-Cox family requires the responses to be strictly positive.

The family="bcnPower", or Box-Cox with negatives, family proposed by Hawkins and Weisberg (2017) allows for (a few) non-positive values, while allowing for the transformed data to be interpreted similarly to the interpretation of Box-Cox transformed values. This family is the Box-Cox transformation of $z = .5 * (y + (y^2 + \gamma^2)^{1/2})$ that depends on a location parameter $\gamma$. The quantity $z$ is positive for all values of $y$. If $\gamma = 0$ and $y$ is strictly positive, then the Box-Cox and the bcnPower transformations are identical. When fitting the Box-Cox with negatives family, lambda is restricted to the range [-3, 3], and gamma is restricted to the range from gamma.min=.1 to the largest positive value of the variable, since values outside these ranges are unreasonable in practice. The value of gamma.min can be changed with an argument to powerTransform.

The final family family="yjPower" uses the Yeo-Johnson transformation, which is the Box-Cox transformation of $U+1$ for nonnegative values, and of $|U|+1$ with parameter $2-\lambda$ for $U$ negative and thus it provides a family for fitting when (a few) observations are negative. Because of the unusual constraints on the powers for positive and negative data, this transformation is not used very often, as results are difficult to interpret. In practical problems, a variable would be replaced by its Yeo-Johnson transformation computed using the yjPower function.

The function testTransform is used to obtain likelihood ratio tests for any specified value for the transformation parameter(s).

Computations maximize the likelihood-like functions described by Box and Cox (1964) and by Velilla (1993). For univariate responses, the computations are very stable and problems are unlikely, although for ‘lmer’ models computations may be very slow because the model is refit many times. For multivariate responses with the bcnPower family, the computing algorithm may fail. In this case we recommend adding the argument itmax = 1 to the call to powerTransform. This will return the starting value estimates of the transformation parameters, fitting a d-dimensional response as if all the d responses were independent.

Value

An object of class powerTransform or class bcnPowerTransform if family="bcnPower" that inherits from powerTransform is returned, including the components listed below.

A summary method presents estimated values for the transformation power lambda and for the ‘bcnPower’ family the location parameter gamma as well. Standard errors and Wald 95% confidence intervals based on the standard errors are computed from the inverse of the sample Hessian matrix evaluted at the estimates. The interval estimates for the gamma parameters will generally be very wide, reflecting little information available about the location parameter. Likelihood ratio type tests are also provided. For the ‘bcnPower’ family these are based on the profile loglikelihood for lambda alone; that is, we treat gamma as a nusiance parameter and average over it.

The components of the returned object includes

lambda: Estimated transformation parameter
roundlam: Convenient rounded values for the estimates. These rounded values will usually be the desired transformations.
gamma: Estimated location parameters for bcnPower, NULL otherwise
invHess: Estimated covariance matrix of the estimated parameters
llik: Value of the log-likelihood at the estimates

The summary method for powerTransform returns an array with columns labeled "Est Power" for the value of lambda that maximizes the likelihood; "Rounded Pwr" for roundlam, and columns "Wald Lwr Bnd" and "Wald Ur Bnd" for a 95 percent Wald normal theory confidence interval for lambda computed as the estimate plus or minus 1.96 times the standard error.

References

Box, G. E. P. and Cox, D. R. (1964) An analysis of transformations. Journal of the Royal Statisistical Society, Series B. 26 211-46.

Cook, R. D. and Weisberg, S. (1999) Applied Regression Including Computing and Graphics. Wiley.

Fox, J. and Weisberg, S. (2019) An R Companion to Applied Regression, Third Edition, Sage.

Hawkins, D. and Weisberg, S. (2017) Combining the Box-Cox Power and Generalized Log Transformations to Accomodate Nonpositive Responses In Linear and Mixed-Effects Linear Models South African Statistics Journal, 51, 317-328.

Velilla, S. (1993) A note on the multivariate Box-Cox transformation to normality. Statistics and Probability Letters, 17, 259-263.

Weisberg, S. (2014) Applied Linear Regression, Fourth Edition, Wiley.

Yeo, I. and Johnson, R. (2000) A new family of power transformations to improve normality or symmetry. Biometrika, 87, 954-959.

Author

Sanford Weisberg, <sandy@umn.edu>

Examples

# Box Cox Method, univariate
summary(p1 <- powerTransform(cycles ~ len + amp + load, Wool))
#> bcPower Transformation to Normality 
#>    Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
#> Y1   -0.0592           0      -0.1789       0.0606
#> 
#> Likelihood ratio test that transformation parameter is equal to 0
#>  (log transformation)
#>                             LRT df    pval
#> LR test, lambda = (0) 0.9213384  1 0.33712
#> 
#> Likelihood ratio test that no transformation is needed
#>                            LRT df       pval
#> LR test, lambda = (1) 84.07566  1 < 2.22e-16
# fit linear model with transformed response:
coef(p1, round=TRUE)
#> Y1 
#>  0 
summary(m1 <- lm(bcPower(cycles, p1$roundlam) ~ len + amp + load, Wool))
#> 
#> Call:
#> lm(formula = bcPower(cycles, p1$roundlam) ~ len + amp + load, 
#>     data = Wool)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.43592 -0.11250  0.00802  0.11635  0.26790 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 10.551813   0.616683  17.111 1.41e-14 ***
#> len          0.016648   0.000875  19.025 1.43e-15 ***
#> amp         -0.630866   0.043752 -14.419 5.22e-13 ***
#> load        -0.078524   0.008750  -8.974 5.66e-09 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.1856 on 23 degrees of freedom
#> Multiple R-squared:  0.9658,	Adjusted R-squared:  0.9614 
#> F-statistic: 216.8 on 3 and 23 DF,  p-value: < 2.2e-16
#> 

# Multivariate Box Cox uses Highway1 data
summary(powerTransform(cbind(len, adt, trks, sigs1) ~ 1, Highway1))
#> bcPower Transformations to Multinormality 
#>       Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
#> len      0.1439           0      -0.2728       0.5606
#> adt      0.0876           0      -0.1712       0.3464
#> trks    -0.6954           0      -1.9046       0.5139
#> sigs1   -0.2654           0      -0.5575       0.0267
#> 
#> Likelihood ratio test that transformation parameters are equal to 0
#>  (all log transformations)
#>                                  LRT df    pval
#> LR test, lambda = (0 0 0 0) 6.014218  4 0.19809
#> 
#> Likelihood ratio test that no transformations are needed
#>                                  LRT df       pval
#> LR test, lambda = (1 1 1 1) 127.7221  4 < 2.22e-16

# Multivariate transformation to normality within levels of 'htype'
summary(a3 <- powerTransform(cbind(len, adt, trks, sigs1) ~ htype, Highway1))
#> bcPower Transformations to Multinormality 
#>       Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
#> len      0.1451        0.00      -0.2733       0.5636
#> adt      0.2396        0.33       0.0255       0.4536
#> trks    -0.7336        0.00      -1.9408       0.4735
#> sigs1   -0.2959       -0.50      -0.5511      -0.0408
#> 
#> Likelihood ratio test that transformation parameters are equal to 0
#>  (all log transformations)
#>                                 LRT df    pval
#> LR test, lambda = (0 0 0 0) 13.1339  4 0.01064
#> 
#> Likelihood ratio test that no transformations are needed
#>                                  LRT df       pval
#> LR test, lambda = (1 1 1 1) 140.5853  4 < 2.22e-16

# test lambda = (0 0 0 -1)
testTransform(a3, c(0, 0, 0, -1))
#>                                   LRT df       pval
#> LR test, lambda = (0 0 0 -1) 31.12644  4 2.8849e-06

# save the rounded transformed values, plot them with a separate
# color for each highway type
transformedY <- bcPower(with(Highway1, cbind(len, adt, trks, sigs1)),
                        coef(a3, round=TRUE))
if (FALSE)  # generates a smoother warning
scatterplotMatrix( ~ transformedY|htype, Highway1) 
 # \dontrun{}

# With negative responses, use the bcnPower family
m2 <- lm(I1L1 ~ pool, LoBD)
summary(p2 <- powerTransform(m2, family="bcnPower"))
#> bcnPower transformation to Normality 
#> 
#> Estimated power, lambda
#>    Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
#> Y1    0.5411         0.5        0.254       0.8282
#> 
#> Estimated location, gamma
#>    Est gamma Std Err. Wald Lower Bound Wald Upper Bound
#> Y1   16.5676  12.4057                0          40.8829
#> 
#> Likelihood ratio tests about transformation parameters
#>                            LRT df         pval
#> LR test, lambda = (0) 56.69008  1 5.107026e-14
#> LR test, lambda = (1) 46.95805  1 7.252199e-12
testTransform(p2, .5)
#>                               LRT df      pval
#> LR test, lambda = (0.5) 0.3364893  1 0.5618627
summary(powerTransform(update(m2, cbind(LoBD$I1L2, LoBD$I1L1) ~ .), family="bcnPower"))
#> bcnPower transformation to Multinormality 
#> 
#> Estimated power, lambda
#>    Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
#> Y1     0.554         0.5       0.3121       0.7959
#> Y2     0.516         0.5       0.2204       0.8115
#> 
#> Estimated location, gamma
#>    Est gamma Std Err. Wald Lower Bound Wald Upper Bound
#> Y1   13.5844   9.4546                0          32.1154
#> Y2   17.6867  12.5295                0          42.2444
#> 
#> Likelihood ratio tests about transformation parameters
#>                               LRT df pval
#> LR test, lambda = (0 0) 119.24783  2    0
#> LR test, lambda = (1 1)  99.35734  2    0

if (FALSE)  # takes a few seconds:
  # multivariate bcnPower, with 8 responses
  summary(powerTransform(update(m2, as.matrix(LoBD[, -1]) ~ .), family="bcnPower"))
  # multivariate bcnPower, fit with one iteration using starting values as estimates
  summary(powerTransform(update(m2, as.matrix(LoBD[, -1]) ~ .), family="bcnPower", itmax=1))
#> Warning: One iteration only, results assume responses are uncorrelated
#> bcnPower transformation to Multinormality 
#> 
#> Estimated power, lambda
#>      Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
#> I1L1    0.6083       0.500       0.4417       0.7748
#> I1L2    0.6259       0.500       0.4628       0.7890
#> I2L1    0.4211       0.500       0.2328       0.6094
#> I2L2    0.5197       0.500       0.3855       0.6539
#> I3L1    0.5312       0.500       0.3491       0.7133
#> I3L2    0.7149       0.715       0.5267       0.9031
#> I4L1    0.5357       0.500       0.3320       0.7394
#> I4L2    0.6802       0.500       0.4666       0.8939
#> 
#> Estimated location, gamma
#>      Est gamma Std Err. Wald Lower Bound Wald Upper Bound
#> I1L1    9.2651   5.1016           0.0000          19.2642
#> I1L2    8.7832   5.2569           0.0000          19.0868
#> I2L1   10.9212   4.3390           2.4167          19.4257
#> I2L2   10.8445   5.7514           0.0000          22.1172
#> I3L1    6.9378   3.9271           0.0000          14.6350
#> I3L2   10.9228   7.0303           0.0000          24.7023
#> I4L1    8.9549   4.2884           0.5496          17.3602
#> I4L2    8.8423   6.2439           0.0000          21.0804
#> 
#> Likelihood ratio tests about transformation parameters
#>                                          LRT df pval
#> LR test, lambda = (0 0 0 0 0 0 0 0) 594.0722  8    0
#> LR test, lambda = (1 1 1 1 1 1 1 1) 342.2410  8    0
 # \dontrun{}

# mixed effects model
if (FALSE)  # uses the lme4 package
  data <- reshape(LoBD[1:20, ], varying=names(LoBD)[-1], direction="long", v.names="y")
  names(data) <- c("pool", "assay", "y", "id")
#> Error in names(data) <- c("pool", "assay", "y", "id"): names() applied to a non-vector
  data$assay <- factor(data$assay)
#> Error in data$assay: object of type 'closure' is not subsettable
  require(lme4)
  m2 <- lmer(y ~ pool + (1|assay), data)
#> Error in list2env(data): first argument must be a named list
  summary(l2 <- powerTransform(m2, family="bcnPower", verbose=TRUE))
#>   Iter   gamma    lambda      llik       crit
#> 1    1 8.84141 0.6580407 -60.13782 0.02351759
#>   Iter    gamma   lambda      llik        crit
#> 1    2 10.66403 0.625854 -59.84053 0.004968034
#>   Iter    gamma    lambda      llik        crit
#> 1    3 12.05922 0.6040561 -59.71077 0.002173172
#>   Iter    gamma   lambda      llik        crit
#> 1    4 13.16512 0.587776 -59.64198 0.001153379
#>   Iter   gamma    lambda      llik         crit
#> 1    5 14.0547 0.5751436 -59.60227 0.0006661582
#>   Iter    gamma    lambda      llik         crit
#> 1    6 14.77656 0.5651406 -59.57822 0.0004036805
#>   Iter    gamma   lambda      llik         crit
#> 1    7 15.36647 0.557109 -59.56318 0.0002525962
#>   Iter    gamma    lambda      llik         crit
#> 1    8 15.84687 0.5506555 -59.55363 0.0001602949
#>   Iter    gamma    lambda      llik         crit
#> 1    9 16.24248 0.5453949 -59.54742 0.0001042265
#>   Iter    gamma    lambda      llik         crit
#> 1   10 16.56765 0.5411059 -59.54336 6.825245e-05
#> bcnPower transformation to Normality 
#> 
#> Estimated power, lambda
#>    Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
#> Y1    0.5411         0.5        0.254       0.8282
#> 
#> Estimated location, gamma
#>    Est gamma Std Err. Wald Lower Bound Wald Upper Bound
#> Y1   16.5676  12.4057                0          40.8829
#> 
#> Likelihood ratio tests about transformation parameters
#>                            LRT df         pval
#> LR test, lambda = (0) 56.69008  1 5.107026e-14
#> LR test, lambda = (1) 46.95805  1 7.252199e-12
 # \dontrun{}