Convert Data Frame to Correlation Matrix

Computes pairwise correlations between numeric columns and returns one row per variable pair in a tidy long format, sorted by descending absolute correlation.

Usage

cor_df(data, columns = NULL, use = "complete.obs", method = "pearson")

Arguments

data: A data.frame containing the variables to correlate
columns: Character vector of column names to include. If NULL, all numeric columns will be used.
use: Method for handling missing values, passed to cor(). Default is "complete.obs".
method: Correlation method, passed to cor(). Default is "pearson".
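
Because use and method are handed to cor(), their effect can be previewed with base R alone. The following is a minimal sketch on a small made-up matrix (the vectors x, y and z are illustrative, not part of the package):

```r
# Toy data with missing values in two different rows
x <- c(1, 2, 3, 4, NA)
y <- c(2, 4, 5, 4, 5)
z <- c(5, 3, NA, 1, 0)
m <- cbind(x, y, z)

# "complete.obs": drop every row that has an NA in any column (rows 3 and 5 here)
r_complete <- cor(m, use = "complete.obs", method = "pearson")

# "pairwise.complete.obs": each pair uses all rows where both variables are present
r_pairwise <- cor(m, use = "pairwise.complete.obs", method = "pearson")

# Rank-based alternative to the default Pearson coefficient
r_spearman <- cor(m, use = "complete.obs", method = "spearman")

r_complete["x", "y"]
r_pairwise["x", "y"]
```

The two x-y values differ because the pairwise option keeps row 3, which the complete-case option discards.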

Value

A tibble with columns:

name1, name2: Names of the two variables in each pair, ordered lexicographically within the pair
CORR: Correlation coefficient
ABSCORR: Absolute correlation coefficient

Results are sorted by ABSCORR in descending order.
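
To make the shape of the result concrete, here is a rough base-R sketch of how such a long table could be assembled from a correlation matrix. It is not the package's source: cor_long_sketch is a hypothetical name, and it returns a plain data.frame rather than a tibble.

```r
# Hypothetical illustration only; the real cor_df() may be implemented differently.
cor_long_sketch <- function(data, use = "complete.obs", method = "pearson") {
  # Keep numeric columns only
  num <- data[vapply(data, is.numeric, logical(1))]
  m <- stats::cor(num, use = use, method = method)

  # Each unordered pair appears exactly once in the upper triangle
  idx <- which(upper.tri(m), arr.ind = TRUE)
  a <- rownames(m)[idx[, "row"]]
  b <- colnames(m)[idx[, "col"]]

  out <- data.frame(
    name1 = ifelse(a < b, a, b),  # lexicographically smaller name first
    name2 = ifelse(a < b, b, a),
    CORR = m[idx],
    stringsAsFactors = FALSE
  )
  out$ABSCORR <- abs(out$CORR)

  # Strongest relationships first, regardless of sign
  out <- out[order(-out$ABSCORR), ]
  rownames(out) <- NULL
  out
}

toy <- data.frame(x = 1:10, y = (1:10)^2, z = 10:1, s = letters[1:10])
cor_long_sketch(toy)
```

With k numeric columns the table has k * (k - 1) / 2 rows, which is why three numeric columns produce three rows in the examples.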

Examples

# Create sample data
set.seed(123)
df <- data.frame(
  A = rnorm(100, 5, 2),
  B = rnorm(100, 10, 3),
  C = rnorm(100, 15, 1),
  D = rep_len(letters, 100) # non-numeric; recycled because letters has only 26 elements
)
df$B <- df$A * 0.8 + rnorm(100, 0, 1) # Overwrite B so that it correlates with A

# All numeric columns
cor_df(df)
#> # A tibble: 3 × 4
#>   name1 name2   CORR ABSCORR
#>   <chr> <chr>  <dbl>   <dbl>
#> 1 A     B      0.806   0.806
#> 2 B     C     -0.134   0.134
#> 3 A     C     -0.129   0.129

# Specific columns
cor_df(df, columns = c("A", "B", "C"))
#> # A tibble: 3 × 4
#>   name1 name2   CORR ABSCORR
#>   <chr> <chr>  <dbl>   <dbl>
#> 1 A     B      0.806   0.806
#> 2 B     C     -0.134   0.134
#> 3 A     C     -0.129   0.129
