ordinalcorr.hetcor
- ordinalcorr.hetcor(data: DataFrame, n_unique: int = 20) → DataFrame
Estimate the heterogeneous correlation matrix.
The heterogeneous correlation matrix includes:
Pearson product-moment correlations between continuous variables
Polychoric correlations between ordinal variables
Polyserial correlations between continuous and ordinal variables
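The pairing rule in the list above can be sketched as a small lookup. This is an illustration only, not the library's code: the function name `coefficient_for` and the string labels are hypothetical and chosen for this example.

```python
# Hypothetical labels for the two kinds of variables described above.
CONTINUOUS = "continuous"
ORDINAL = "ordinal"


def coefficient_for(kind_a: str, kind_b: str) -> str:
    """Name the coefficient used for a pair of variable kinds.

    Illustrative helper only; it is not part of ordinalcorr.
    """
    kinds = {kind_a, kind_b}
    if not kinds <= {CONTINUOUS, ORDINAL}:
        raise ValueError(f"unknown variable kind in {kind_a!r}, {kind_b!r}")
    if kinds == {CONTINUOUS}:
        return "pearson"      # continuous vs. continuous
    if kinds == {ORDINAL}:
        return "polychoric"   # ordinal vs. ordinal
    return "polyserial"       # one continuous, one ordinal (either order)


for pair in [(CONTINUOUS, CONTINUOUS), (ORDINAL, ORDINAL), (CONTINUOUS, ORDINAL)]:
    print(pair, "->", coefficient_for(*pair))
```

The lookup is symmetric in its arguments, which matches a correlation matrix where entry (i, j) equals entry (j, i).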
- Parameters:
data (pd.DataFrame) –
A DataFrame containing continuous and/or ordinal variables. Appropriate correlation coefficients are automatically selected based on the types of variables.
Columns with dtype float are treated as continuous variables.
Columns with dtype int and at most n_unique unique values are treated as ordinal variables; integer columns with more unique values are treated as continuous.
Columns with dtype category are treated as ordinal variables if they are ordered.
n_unique (int, default=20) – The maximum number of unique values for an integer column to be considered ordinal. If the number of unique values exceeds n_unique, the column is treated as continuous.
- Returns:
Estimated heterogeneous correlation matrix.
- Return type:
pd.DataFrame
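The dtype rules listed under Parameters can be mirrored in a few lines of pandas. The sketch below is a reading of the documented rules, not the library's implementation: `classify_column` is a hypothetical helper, and the "unsupported" label for unordered categoricals and other dtypes is an assumption, since the documentation does not say how those are handled.

```python
import pandas as pd


def classify_column(col: pd.Series, n_unique: int = 20) -> str:
    """Classify a column following the documented dtype rules.

    Hypothetical helper for illustration; not part of ordinalcorr.
    """
    dtype = col.dtype
    if isinstance(dtype, pd.CategoricalDtype):
        # Ordered categoricals are ordinal. The "unsupported" label for
        # unordered ones is an assumption; the docs do not cover that case.
        return "ordinal" if dtype.ordered else "unsupported"
    if pd.api.types.is_float_dtype(dtype):
        return "continuous"
    if pd.api.types.is_integer_dtype(dtype):
        # Few distinct integer values -> ordinal, otherwise continuous.
        return "ordinal" if col.nunique() <= n_unique else "continuous"
    return "unsupported"  # assumption: other dtypes are not documented


data = pd.DataFrame({
    "height": [1.62, 1.75, 1.80, 1.68],
    "rating": [1, 2, 2, 3],
    "user_id": [101, 102, 103, 104],
    "size": pd.Categorical(
        ["S", "M", "L", "M"], categories=["S", "M", "L"], ordered=True
    ),
})

# A small n_unique is used here so the integer rule is visible on toy data.
kinds = {name: classify_column(data[name], n_unique=3) for name in data.columns}
print(kinds)
```

With `n_unique=3`, the `rating` column (three distinct values) counts as ordinal, while `user_id` (four distinct values) falls over the threshold and counts as continuous.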
Examples
>>> from ordinalcorr import hetcor
>>> import pandas as pd
>>> data = pd.DataFrame({
...     "continuous": [0.1, 0.1, 0.2, 0.2, 0.3, 0.3],
...     "ordinal": [0, 0, 0, 1, 1, 2],
... })
>>> hetcor(data)
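Because the variable type is inferred from the dtype, it can help to set dtypes explicitly before calling hetcor. The following is a minimal sketch using only pandas; the column names and values are made up for illustration. An integer column with few distinct values would be read as ordinal under the rules above, so it is cast to float to mark it as continuous, and a text column is converted to an ordered categorical.

```python
import pandas as pd

# Made-up raw data: "age" is integer-valued but conceptually continuous,
# and "satisfaction" holds text labels with a natural order.
raw = pd.DataFrame({
    "age": [23, 35, 41, 52, 29, 60],
    "satisfaction": ["low", "low", "mid", "mid", "high", "high"],
})

data = pd.DataFrame({
    # Cast to float so the column is treated as continuous, not ordinal.
    "age": raw["age"].astype(float),
    # Declare the category order explicitly so the column is ordered.
    "satisfaction": pd.Categorical(
        raw["satisfaction"], categories=["low", "mid", "high"], ordered=True
    ),
})

print(data.dtypes)
```

The prepared frame would then be passed to hetcor(data) in the same way as in the example above.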