
Convert Data Frame to Correlation Matrix
Computes pairwise correlations between numeric columns and returns results in a tidy long format, sorted by absolute correlation.
Arguments
- data
A data.frame containing the variables to correlate.
- columns
Character vector of column names to include. If NULL, all numeric columns will be used.
- use
Method for handling missing values, passed to cor(). Default is "complete.obs".
- method
Correlation method, passed to cor(). Default is "pearson".
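Because use and method are forwarded to cor(), their effect can be checked against base R directly. The following is a minimal sketch on made-up data (the vectors x and y are illustrative and not part of the package):

```r
# Illustrative data with one missing value (not from the package)
x <- c(1, 2, 3, 4, NA, 6)
y <- c(2, 1, 4, 3, 5, 7)
m <- cbind(x = x, y = y)

# use = "complete.obs" drops every row that contains an NA before computing
r_complete <- cor(m, use = "complete.obs", method = "pearson")

# With cor()'s own default (use = "everything") the NA propagates to the x-y entry
r_default <- cor(m)

# method = "spearman" computes the correlation on ranks instead of raw values
r_spearman <- cor(m, use = "complete.obs", method = "spearman")
```

Any other value accepted by cor() for these two arguments (for example use = "pairwise.complete.obs" or method = "kendall") should behave the same way it does in cor() itself.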
Value
A tibble with columns:
- name1, name2
Variable pair names (lexicographically ordered)
- CORR
Correlation coefficient
- ABSCORR
Absolute correlation coefficient
Results are sorted by ABSCORR in descending order.
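To make the output shape concrete, here is a hedged base R sketch that produces a table with the same four columns. The function name cor_long_sketch is hypothetical, it returns a plain data.frame rather than a tibble, and it is not the package's actual implementation:

```r
# Hypothetical re-creation of the documented output shape; not the package source.
cor_long_sketch <- function(data, columns = NULL,
                            use = "complete.obs", method = "pearson") {
  # Default to every numeric column when no columns are named
  if (is.null(columns)) {
    columns <- names(data)[vapply(data, is.numeric, logical(1))]
  }
  # Sort the names so that name1 precedes name2 within every pair
  columns <- sort(columns)
  m <- cor(data[, columns, drop = FALSE], use = use, method = method)

  # Keep each unordered pair once by taking the strict upper triangle
  idx <- which(upper.tri(m), arr.ind = TRUE)
  out <- data.frame(
    name1 = rownames(m)[idx[, "row"]],
    name2 = colnames(m)[idx[, "col"]],
    CORR = m[idx],
    stringsAsFactors = FALSE
  )
  out$ABSCORR <- abs(out$CORR)

  # Largest absolute correlation first
  out <- out[order(out$ABSCORR, decreasing = TRUE), ]
  rownames(out) <- NULL
  out
}

# Small demonstration on made-up data
set.seed(1)
demo <- data.frame(x = rnorm(50), label = rep(c("a", "b"), 25))
demo$y <- demo$x + rnorm(50)
demo$z <- rnorm(50)
res <- cor_long_sketch(demo)
```

With three numeric columns the result has three rows, one per unordered pair, and the non-numeric label column is ignored.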
See also
Other statistics: cv(), geom_cv(), geom_mean(), geom_sd()
Examples
# Create sample data
set.seed(123)
df <- data.frame(
A = rnorm(100, 5, 2),
B = rnorm(100, 10, 3),
C = rnorm(100, 15, 1),
D = rep(letters, length.out = 100) # non-numeric; recycled because letters has only 26 elements
)
df$B <- df$A * 0.8 + rnorm(100, 0, 1) # Create some correlation
# All numeric columns
cor_df(df)
#> # A tibble: 3 × 4
#>   name1 name2   CORR ABSCORR
#>   <chr> <chr>  <dbl>   <dbl>
#> 1 A     B      0.806   0.806
#> 2 B     C     -0.134   0.134
#> 3 A     C     -0.129   0.129
# Specific columns
cor_df(df, columns = c("A", "B", "C"))
#> # A tibble: 3 × 4
#>   name1 name2   CORR ABSCORR
#>   <chr> <chr>  <dbl>   <dbl>
#> 1 A     B      0.806   0.806
#> 2 B     C     -0.134   0.134
#> 3 A     C     -0.129   0.129