Make a specific row the column names for the specified data.frame

Many tables in Word documents have awkward layouts, with labels or other oddities mixed in that make it difficult to work with the underlying data. This function lets you identify a particular row in a scraped data.frame as the one containing the column names and promotes it to the column names, removing it and (optionally) all of the rows before it, since that is usually what needs to be done.

assign_colnames(dat, row, remove = TRUE, remove_previous = remove)

Arguments

dat: can be any data.frame, but is intended for use with ones returned by this package
row: numeric value indicating the row number that is to become the column names
remove: remove row specified by row after making it the column names? (Default: TRUE)
remove_previous: remove any rows preceding row? (Default: TRUE, but it takes whatever value is given for remove)
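
To make the effect of these arguments concrete, here is a minimal base R sketch of the behaviour described above. The helper `promote_row()` and the `scraped` data.frame are hypothetical names invented for illustration; this is not the package's implementation, and it assumes `remove` and `remove_previous` act independently, which may differ from what `assign_colnames()` actually does.

```r
# Hypothetical stand-in for a scraped table: a label row, a header row, then data.
scraped <- data.frame(
  V1 = c("Table 1: demo", "Country", "USA", "China"),
  V2 = c("", "Birthrate", "2.06", "1.62"),
  stringsAsFactors = FALSE
)

# Illustrative helper (an assumption, not the package's code):
# promote `row` to the column names, then optionally drop rows.
promote_row <- function(dat, row, remove = TRUE, remove_previous = remove) {
  colnames(dat) <- unlist(dat[row, ], use.names = FALSE)
  drop <- integer(0)
  if (remove) drop <- c(drop, row)                      # drop the header row itself
  if (remove_previous && row > 1) drop <- c(drop, seq_len(row - 1))  # drop rows above it
  if (length(drop) > 0) dat <- dat[-drop, , drop = FALSE]
  rownames(dat) <- NULL
  dat
}

cleaned <- promote_row(scraped, 2)                 # header and label rows removed
kept    <- promote_row(scraped, 2, remove = FALSE) # names assigned, all rows kept
```

Because `remove_previous` defaults to the value of `remove`, passing `remove = FALSE` alone keeps every row in this sketch while still assigning the new column names.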

Value

data.frame

Examples

library(docxtractr)

# a "real" Word doc
real_world <- read_docx(system.file("examples/realworld.docx", package = "docxtractr"))
docx_tbl_count(real_world)
#> [1] 8

# get all the tables
tbls <- docx_extract_all_tbls(real_world)

# make table 1 better
assign_colnames(tbls[[1]], 2)
#> # A tibble: 7 × 9
#>   Country Birthrate `Death Rate` `Population Growth 2005` Population Growth 20…¹
#>   <chr>   <chr>     <chr>        <chr>                    <chr>                 
#> 1 USA     2.06      0.51%        0.92%                    -0.06%                
#> 2 China   1.62      0.3%         0.6%                     -0.58%                
#> 3 Egypt   2.83      0.41%        2.0%                     1.32%                 
#> 4 India   2.35      0.34%        1.56%                    0.76%                 
#> 5 Italy   1.28      0.72%        0.35%                    -1.33%                
#> 6 Mexico  2.43      0.25%        1.41%                    0.96%                 
#> 7 Nigeria 4.78      0.26%        2.46%                    3.58%                 
#> # ℹ abbreviated name: ¹`Population Growth 2050`
#> # ℹ 4 more variables: `Relative place in Transition` <chr>,
#> #   `Social Factors 1` <chr>, `Social Factors 2` <chr>,
#> #   `Social Factors 3` <chr>

# make table 5 better
assign_colnames(tbls[[5]], 2)
#> # A tibble: 3 × 6
#>   Nigeria           Default Prediction    `+ 5 years` `+15 years` `-5 years`
#>   <chr>             <chr>   <chr>         <chr>       <chr>       <chr>     
#> 1 Birth rate        4.78    Goes Down     4.76        4.72        4.79      
#> 2 Death rate        0.36%   Stay the Same 0.42%       0.52%       0.3%      
#> 3 Population growth 3.58%   Goes Down     3.02%       2.32%       4.38%
