Compute correlations between the columns of a data frame of mixed types. The data frame may contain columns of these four classes: integer, numeric, factor and character. A character column is treated as a categorical variable.

cor2(df, nproc = 1)

Arguments

df

Input data frame

nproc

Number of parallel processes to use

Value

A simil/dist object.

Details

The correlation is computed as follows:

  • integer/numeric pair: Pearson correlation is computed using the `cor` function. The value lies between -1 and 1.

  • integer/numeric - factor/categorical pair: ANOVA is performed and the effect size is computed. The value lies between 0 and 1.

  • factor/categorical pair: Cramér's V is computed from a chi-squared test using the `lsr::cramersV` function. The value lies between 0 and 1.

Pairwise complete observations are used to compute each correlation. For a more comprehensive implementation, use `polycor::hetcor`.
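The three cases above can be illustrated with base R alone. This is a minimal sketch, not the internals of `cor2`: it assumes the ANOVA effect size is eta-squared (the documentation does not say which effect size is reported), it computes Cramér's V directly from `chisq.test` rather than calling `lsr::cramersV`, and the `size` column is a hypothetical categorical variable derived only for this example.

```r
# Numeric vs numeric: Pearson correlation on pairwise complete observations.
num_num <- cor(iris$Sepal.Length, iris$Petal.Length,
               use = "pairwise.complete.obs", method = "pearson")

# Numeric vs factor: one-way ANOVA, then eta-squared as the effect size.
# Assumption: eta-squared = SS_between / SS_total; cor2 may use another measure.
fit <- aov(Sepal.Length ~ Species, data = iris)
ss <- summary(fit)[[1]][["Sum Sq"]]   # sums of squares: Species, Residuals
num_fac <- ss[1] / sum(ss)

# Factor vs factor: Cramer's V from the chi-squared statistic.
# `size` is a hypothetical second categorical column built for illustration.
size <- cut(iris$Petal.Length, breaks = c(0, 2, 5, Inf),
            labels = c("short", "medium", "long"))
tab <- table(iris$Species, size)
chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
fac_fac <- sqrt(unname(chi2) / (sum(tab) * (min(dim(tab)) - 1)))

print(c(num_num = num_num, num_fac = num_fac, fac_fac = fac_fac))
```

Only the first value can be negative; the other two measure strength of association without direction, which is why their range is 0 to 1.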

Examples

iris_cor <- cor2(iris)