Computes a constructed variable for the Box-Cox transformation of the response variable in a linear model.

boxCoxVariable(y)

Arguments

y

response variable.

Details

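The constructed variable is defined as \(y[\log(y/\widetilde{y}) - 1]\), where \(\widetilde{y}\) is the geometric mean of y. Because the definition takes logarithms, y must be strictly positive (hence the use of interlocks + 1 in the example below).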
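The definition can be sketched by hand in base R. This is a minimal illustration, not the package's own source: the helper name `constructed_var` and the toy vector are assumptions made for the example, and the positivity check is an added safeguard.

```r
# Hypothetical helper (not part of car): computes y * (log(y / gm) - 1),
# where gm is the geometric mean of y.
constructed_var <- function(y) {
  stopifnot(is.numeric(y), all(y > 0, na.rm = TRUE))  # logs need y > 0
  gm <- exp(mean(log(y), na.rm = TRUE))               # geometric mean of y
  y * (log(y / gm) - 1)
}

# Toy data: the geometric mean of 1, 2, 4 is 2.
y <- c(1, 2, 4)
cv <- constructed_var(y)
cv
# approximately -1.693147 -2.000000 -1.227411
```

The middle element equals the geometric mean, so its log ratio is zero and the constructed value is simply \(-y = -2\).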

The constructed variable is meant to be added to the right-hand side of the linear model as an additional regressor. The t-test for its coefficient is an approximate score test for whether a transformation of the response is required.

If \(b\) is the coefficient of the constructed variable, then an estimate of the normalizing power transformation based on the score statistic is \(1 - b\). An added-variable plot for the constructed variable shows leverage and influence on the decision to transform y.
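The steps described above can be sketched end to end in base R. The simulated data, the seed, and every object name here (`x`, `y`, `cv`, `fit_aux`, `lambda_hat`, `score_t`) are assumptions for illustration only; the constructed variable is computed by hand from the definition rather than with `boxCoxVariable()`, so that the sketch runs without any package.

```r
# Simulated data (an assumption for illustration): log(y) is linear in x,
# so the untransformed response should show a need for transformation.
set.seed(1)
n <- 200
x <- runif(n, 1, 5)
y <- exp(0.5 + 0.3 * x + rnorm(n, sd = 0.2))

# Constructed variable from the definition: y * (log(y / gm) - 1)
gm <- exp(mean(log(y)))
cv <- y * (log(y / gm) - 1)

# Auxiliary regression: add the constructed variable to the right-hand side
fit_aux <- lm(y ~ x + cv)

# b is the coefficient of the constructed variable;
# 1 - b is the score-based estimate of the power
b <- unname(coef(fit_aux)["cv"])
lambda_hat <- 1 - b

# t-statistic for the constructed variable (the approximate score test)
score_t <- summary(fit_aux)$coefficients["cv", "t value"]

c(lambda_hat = lambda_hat, score_t = score_t)
```

Applied to the fitted model in the Examples section, where the coefficient of the constructed variable is about 0.743, the same arithmetic gives an estimated power of roughly 1 - 0.743 = 0.257.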

Value

a numeric vector of the same length as y.

References

Atkinson, A. C. (1985) Plots, Transformations, and Regression. Oxford.

Box, G. E. P. and Cox, D. R. (1964) An analysis of transformations. JRSS B 26 211–246.

Fox, J. (2016) Applied Regression Analysis and Generalized Linear Models, Third Edition. Sage.

Fox, J. and Weisberg, S. (2019) An R Companion to Applied Regression, Third Edition. Sage.

Author

John Fox jfox@mcmaster.ca

Examples

mod <- lm(interlocks + 1 ~ assets, data=Ornstein)
mod.aux <- update(mod, . ~ . + boxCoxVariable(interlocks + 1))
summary(mod.aux)
#> 
#> Call:
#> lm(formula = interlocks + 1 ~ assets + boxCoxVariable(interlocks + 
#>     1), data = Ornstein)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -23.1895  -6.7012   0.5411   6.7728  12.0506 
#> 
#> Coefficients:
#>                                  Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)                     1.461e+01  5.426e-01  26.920   <2e-16 ***
#> assets                         -7.142e-05  5.119e-05  -1.395    0.164    
#> boxCoxVariable(interlocks + 1)  7.427e-01  4.136e-02  17.956   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 7.247 on 245 degrees of freedom
#> Multiple R-squared:  0.7986,	Adjusted R-squared:  0.797 
#> F-statistic: 485.7 on 2 and 245 DF,  p-value: < 2.2e-16
#> 
# avPlots(mod.aux, "boxCoxVariable(interlocks + 1)")