R/smartbind.R
Efficient rbind of data frames, even if the column names don't match
smartbind(..., list, fill = NA, sep = ":", verbose = FALSE)
...: Data frames to combine.
list: List containing data frames to combine.
fill: Value to use when 'filling' missing columns. Defaults to NA.
sep: Character string used to separate column names when pasting them together.
verbose: Logical flag indicating whether to display processing messages. Defaults to FALSE.
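The list argument is useful when the data frames are already collected in a list, for example after reading several files. The following is a minimal sketch, assuming the gtools package is installed; the names parts and combined are illustrative only.

```r
library(gtools)  # assumption: smartbind() is provided by the gtools package

# Data frames gathered in a list rather than passed one by one
parts <- list(
  data.frame(A = 1:2, B = c("x", "y")),
  data.frame(A = 3:4, D = c(0.5, 1.5))
)

# Pass the whole list through the 'list' argument
combined <- smartbind(list = parts)

nrow(combined)   # number of rows across all inputs
names(combined)  # union of the input column names
```

This avoids building the call with do.call() when the number of data frames is not known in advance.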
The returned data frame will contain all columns present in any provided data frame, and a set of rows from each provided data frame, with values in columns not present in the given data frame filled with missing (NA) values.
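The shape of the result described above can be illustrated in base R for the two-data-frame case. This is only a sketch of the described behaviour, not the implementation used by smartbind; bind_fill is a hypothetical helper written for this illustration, and it assumes both inputs have at least one row.

```r
# Hypothetical helper (not part of any package): pad each data frame with the
# columns it lacks, then row-bind the two in a common column order.
bind_fill <- function(a, b, fill = NA) {
  all_cols <- union(names(a), names(b))
  for (col in setdiff(all_cols, names(a))) a[[col]] <- fill
  for (col in setdiff(all_cols, names(b))) b[[col]] <- fill
  rbind(a[all_cols], b[all_cols])
}

a <- data.frame(A = 1:2, B = c("x", "y"))
b <- data.frame(A = 3:4, D = c(0.5, 1.5))

out <- bind_fill(a, b)
names(out)  # every column that appears in either input
out$D       # rows that came from 'a' hold the fill value
```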
The data type of each column will be preserved, as long as all data frames with a given column name agree on the data type of that column. If the data frames disagree, the column will be converted to character strings. The user will need to coerce such character columns into an appropriate type.
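The manual coercion step mentioned above might look like the following sketch. It assumes the gtools package is installed; the data frames x and y and the column score are made up for illustration.

```r
library(gtools)  # assumption: smartbind() is provided by the gtools package

# 'score' is numeric in x but stored as text in y, so the two disagree
x <- data.frame(id = 1:2, score = c(1.5, 2.5))
y <- data.frame(id = 3:4, score = c("3.5", "4.5"), stringsAsFactors = FALSE)

# The disagreement is expected to raise a class-mismatch warning;
# it is suppressed here only to keep the example quiet.
combined <- suppressWarnings(smartbind(x, y))

# Coerce the column back to the intended type by hand.
# as.character() first guards against the column arriving as a factor.
combined$score <- as.numeric(as.character(combined$score))
```

Checking class() on each column after binding is a cheap way to spot columns that need this treatment.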
df1 <- data.frame(A = 1:10, B = LETTERS[1:10], C = rnorm(10))
df2 <- data.frame(A = 11:20, D = rnorm(10), E = letters[1:10])
# rbind would fail
if (FALSE) { # \dontrun{
rbind(df1, df2)
# Error in match.names(clabs, names(xi)) : names do not match previous
# names:
# D, E
} # }
# but smartbind combines them, appropriately creating NA entries
smartbind(df1, df2)
#> A B C D E
#> 1:1 1 A -1.6176047 NA <NA>
#> 1:2 2 B -0.7237319 NA <NA>
#> 1:3 3 C 0.3067410 NA <NA>
#> 1:4 4 D 0.2255962 NA <NA>
#> 1:5 5 E 0.9357160 NA <NA>
#> 1:6 6 F 0.4424049 NA <NA>
#> 1:7 7 G 0.4545190 NA <NA>
#> 1:8 8 H -0.9620740 NA <NA>
#> 1:9 9 I -1.1324652 NA <NA>
#> 1:10 10 J -0.6003270 NA <NA>
#> 2:1 11 <NA> NA -1.77506105 a
#> 2:2 12 <NA> NA -0.09171419 b
#> 2:3 13 <NA> NA -0.23262573 c
#> 2:4 14 <NA> NA -0.51310927 d
#> 2:5 15 <NA> NA 0.18558859 e
#> 2:6 16 <NA> NA -1.43162311 f
#> 2:7 17 <NA> NA -1.89864479 g
#> 2:8 18 <NA> NA 0.57123723 h
#> 2:9 19 <NA> NA -0.97562605 i
#> 2:10 20 <NA> NA -0.87647620 j
# specify fill=0 to put 0 into the missing row entries
smartbind(df1, df2, fill = 0)
#> A B C D E
#> 1:1 1 A -1.6176047 0.00000000 0
#> 1:2 2 B -0.7237319 0.00000000 0
#> 1:3 3 C 0.3067410 0.00000000 0
#> 1:4 4 D 0.2255962 0.00000000 0
#> 1:5 5 E 0.9357160 0.00000000 0
#> 1:6 6 F 0.4424049 0.00000000 0
#> 1:7 7 G 0.4545190 0.00000000 0
#> 1:8 8 H -0.9620740 0.00000000 0
#> 1:9 9 I -1.1324652 0.00000000 0
#> 1:10 10 J -0.6003270 0.00000000 0
#> 2:1 11 0 0.0000000 -1.77506105 a
#> 2:2 12 0 0.0000000 -0.09171419 b
#> 2:3 13 0 0.0000000 -0.23262573 c
#> 2:4 14 0 0.0000000 -0.51310927 d
#> 2:5 15 0 0.0000000 0.18558859 e
#> 2:6 16 0 0.0000000 -1.43162311 f
#> 2:7 17 0 0.0000000 -1.89864479 g
#> 2:8 18 0 0.0000000 0.57123723 h
#> 2:9 19 0 0.0000000 -0.97562605 i
#> 2:10 20 0 0.0000000 -0.87647620 j
#> Warning: Column class mismatch for 'C'. Converting column to class 'character'.
#> Warning: Column class mismatch for 'B'. Converting column to class 'character'.
#> Warning: Column class mismatch for 'E'. Converting column to class 'character'.
#> Warning: Column class mismatch for 'E'. Converting column to class 'character'.
#> Warning: Column class mismatch for 'B'. Converting column to class 'character'.