Source: R/mdply.r

Call a multi-argument function with values taken from columns of a data frame or array, and combine results into a data frame.
mdply(
.data,
.fun = NULL,
...,
.expand = TRUE,
.progress = "none",
.inform = FALSE,
.parallel = FALSE,
.paropts = NULL
)

.data: matrix or data frame to use as source of arguments
.fun: function to apply to each piece
...: other arguments passed on to .fun
.expand: should output be 1d (.expand = FALSE), with an element for each row; or nd (.expand = TRUE), with a dimension for each variable.
.progress: name of the progress bar to use, see create_progress_bar
.inform: produce informative error messages? This is turned off by default because it substantially slows processing speed, but is very useful for debugging.
.parallel: if TRUE, apply function in parallel, using the parallel backend provided by foreach
.paropts: a list of additional options passed into the foreach function when parallel computation is enabled. This is important if (for example) your code relies on external data or packages: use the .export and .packages arguments to supply them so that all cluster nodes have the correct environment set up for computing.
A data frame, as described in the output section.
The m*ply functions are the plyr version of mapply,
specialised according to the type of output they produce. These functions
are just a convenient wrapper around a*ply with margins = 1
and .fun wrapped in splat.
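The relationship described above can be checked directly. The following is a minimal sketch, assuming plyr is installed; `add_scaled` and `args` are hypothetical names introduced only for illustration and are not part of plyr.

```r
library(plyr)

# Hypothetical helper: a deterministic two-argument function.
add_scaled <- function(x, y) x + 10 * y

args <- data.frame(x = 1:3, y = 4:6)

# mdply() matches the columns of `args` to the arguments of `add_scaled` by name.
via_mdply <- mdply(args, add_scaled)

# The same call spelled out: iterate over rows (margin 1) and let splat()
# unpack each one-row piece into named arguments.
via_adply <- adply(args, 1, splat(add_scaled))

# Base R counterpart: mapply() returns a bare vector without the input columns.
via_mapply <- mapply(add_scaled, args$x, args$y)

print(via_mdply)
print(via_mapply)
```

Unlike mapply, the plyr call keeps the argument columns next to the results, which makes it easier to see which inputs produced which row.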
Call a multi-argument function with values taken from columns of a data frame or array.
The most unambiguous behaviour is achieved when .fun returns a
data frame: in that case the pieces will be combined with
rbind.fill. If .fun returns an atomic vector of
fixed length, the vectors will be bound together row-wise with rbind and
converted to a data frame. Any other values will result in an error.
If there are no results, then this function will return a data
frame with zero rows and columns (data.frame()).
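The two supported return types can be compared side by side. This is a small sketch, assuming plyr is installed; `interval_df`, `interval_vec` and `args` are hypothetical names used only for this illustration.

```r
library(plyr)

args <- data.frame(mean = 1:3, sd = c(1, 1, 2))

# Hypothetical helper returning a one-row data frame;
# the pieces are combined with rbind.fill().
interval_df <- function(mean, sd) {
  data.frame(lower = mean - sd, upper = mean + sd)
}

# Hypothetical helper returning an unnamed atomic vector of fixed length 2;
# the vectors are row-bound and converted to a data frame.
interval_vec <- function(mean, sd) {
  c(mean - sd, mean + sd)
}

from_df <- mdply(args, interval_df)
from_vec <- mdply(args, interval_vec)

# Inspect the column names produced by each return type.
print(names(from_df))
print(names(from_vec))
```

Returning a data frame lets .fun choose its own column names, whereas an unnamed vector yields generated names such as V1 and V2, as in the rnorm examples on this page.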
Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. https://www.jstatsoft.org/v40/i01/.
mdply(data.frame(mean = 1:5, sd = 1:5), rnorm, n = 2)
#> mean sd V1 V2
#> 1 1 1 2.704609 0.9199264
#> 2 2 2 1.125438 1.7615698
#> 3 3 3 5.359389 1.2631643
#> 4 4 4 3.418292 6.1058320
#> 5 5 5 13.667891 12.2432861
mdply(expand.grid(mean = 1:5, sd = 1:5), rnorm, n = 2)
#> mean sd V1 V2
#> 1 1 1 2.5181931 0.6159927
#> 2 2 1 3.8271252 1.4485083
#> 3 3 1 2.1342465 2.6561685
#> 4 4 1 5.0628765 4.8130582
#> 5 5 1 6.8034834 4.8949313
#> 6 1 2 2.9649067 -2.4266052
#> 7 2 2 0.3359609 4.2009838
#> 8 3 2 2.6523598 3.3576240
#> 9 4 2 2.6031411 2.0791017
#> 10 5 2 3.0491539 4.3228470
#> 11 1 3 4.4570412 2.2153036
#> 12 2 3 0.5872325 1.6002469
#> 13 3 3 6.6800471 3.9988320
#> 14 4 3 2.9587346 3.7043479
#> 15 5 3 5.1042982 6.1583811
#> 16 1 4 1.0833249 1.0303471
#> 17 2 4 5.7233761 -0.7389998
#> 18 3 4 4.3496061 1.3514492
#> 19 4 4 7.7370445 11.3612670
#> 20 5 4 2.1807213 5.0340412
#> 21 1 5 11.1709494 -5.7084303
#> 22 2 5 7.7948959 0.9839552
#> 23 3 5 1.1098572 11.6805552
#> 24 4 5 -0.2262391 -0.8078575
#> 25 5 5 10.0874553 -2.4802687
mdply(cbind(mean = 1:5, sd = 1:5), rnorm, n = 5)
#> mean sd V1 V2 V3 V4 V5
#> 1 1 1 -0.1848187 1.6302344 3.1012525 0.38626319 -0.6346383
#> 2 2 2 1.9791178 0.6869877 0.6609331 1.04282194 4.6389126
#> 3 3 3 4.9096883 4.5429833 -2.2541253 5.68079256 3.6691151
#> 4 4 4 6.3232664 3.2887143 6.9638668 0.01022768 -7.7559102
#> 5 5 5 8.5950783 1.5099748 -4.4706292 5.38149624 9.3765425
mdply(cbind(mean = 1:5, sd = 1:5), as.data.frame(rnorm), n = 5)
#> mean sd value
#> 1 1 1 1.4538274
#> 2 1 1 0.1492831
#> 3 1 1 1.5662016
#> 4 1 1 2.1522120
#> 5 1 1 0.2438026
#> 6 2 2 1.0214833
#> 7 2 2 -0.3321047
#> 8 2 2 1.0406621
#> 9 2 2 2.2306964
#> 10 2 2 -1.5360968
#> 11 3 3 -1.2229168
#> 12 3 3 5.1275354
#> 13 3 3 -0.7225288
#> 14 3 3 1.8950180
#> 15 3 3 4.3862403
#> 16 4 4 2.7086676
#> 17 4 4 -1.1488592
#> 18 4 4 -0.1201610
#> 19 4 4 10.0563573
#> 20 4 4 5.3876143
#> 21 5 5 13.8972077
#> 22 5 5 6.9331546
#> 23 5 5 0.4065238
#> 24 5 5 -2.9216824
#> 25 5 5 4.5797055