Introduction to the GMC Package

library(GMC)

Introduction

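The GMC package provides tools to compute the Generalized Measure of Correlation (GMC), a dependence measure proposed by Zheng, Shi, and Zhang (2012). GMC(Y|X) is the proportion of the variance of Y explained by the conditional mean E[Y|X], so it captures nonlinear dependence that the Pearson correlation can miss. It is also asymmetric: in general, GMC(Y|X) differs from GMC(X|Y).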
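
For reference, the definition from Zheng, Shi, and Zhang (2012) can be written as below. The notation is ours and is only meant to make the "explained variance" reading concrete; swapping the roles of X and Y gives GMC(X|Y).

```latex
\[
\mathrm{GMC}(Y \mid X)
  = \frac{\mathrm{Var}\bigl(\mathrm{E}[Y \mid X]\bigr)}{\mathrm{Var}(Y)}
  = 1 - \frac{\mathrm{E}\bigl[(Y - \mathrm{E}[Y \mid X])^{2}\bigr]}{\mathrm{Var}(Y)}
\]
```

The second equality follows from the law of total variance, and it shows that the measure lies between 0 and 1.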

Basic Usage

Computing GMC(Y|X)

# Generate sample data with a linear relationship
set.seed(123)
n <- 1000
X <- rnorm(n)
Y <- 2 * X + rnorm(n, sd = 0.5)

# Calculate GMC(Y|X)
gmc_y_given_x <- GMC_Y_given_X(X, Y)
print(paste("GMC(Y|X) =", round(gmc_y_given_x, 3)))
#> [1] "GMC(Y|X) = 0.87"
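
As a sanity check on the linear example, note that when E[Y|X] is linear in X, GMC(Y|X) reduces to the squared Pearson correlation. The sketch below uses only base R, not the package's estimator, and compares the sample value of cor(X, Y)^2 with the population value implied by the simulation design. The variable names `r2` and `theory` are ours.

```r
# Same simulated data as in the linear example
set.seed(123)
n <- 1000
X <- rnorm(n)
Y <- 2 * X + rnorm(n, sd = 0.5)

# Squared Pearson correlation: equals GMC(Y|X) when E[Y|X] is linear
r2 <- cor(X, Y)^2

# Population value: Var(2 * X) / (Var(2 * X) + Var(noise))
theory <- 2^2 / (2^2 + 0.5^2)

print(c(sample_r2 = r2, theory = theory))
```

A nonparametric GMC estimate, such as the value printed above, need not match these numbers exactly in a finite sample, but it should be of similar size.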

Computing GMC(X|Y)

# Generate sample data with a nonlinear relationship
set.seed(123)
n <- 1000
X <- rnorm(n)
Y <- X^2 + rnorm(n, sd = 0.5)

# Calculate GMC(X|Y)
gmc_x_given_y <- GMC_X_given_Y(X, Y)
print(paste("GMC(X|Y) =", round(gmc_x_given_y, 3)))
#> [1] "GMC(X|Y) = 0.061"
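
The small value of GMC(X|Y) is expected here: Y depends on X only through X^2, so by symmetry E[X|Y] is 0 and Y carries almost no information about the mean of X, while X explains most of the variance of Y. The sketch below illustrates this asymmetry with a deliberately crude binned estimator written in base R. `gmc_binned` is a hypothetical helper for illustration only; it is not part of the GMC package and is not the estimator the package uses.

```r
# Hypothetical helper (NOT from the GMC package): crude binned estimate
# of GMC(target | given) = Var(E[target | given]) / Var(target).
gmc_binned <- function(given, target, bins = 20) {
  # Split `given` into equal-count bins using its empirical quantiles
  breaks <- quantile(given, probs = seq(0, 1, length.out = bins + 1))
  bin <- cut(given, breaks = breaks, include.lowest = TRUE)
  # Replace each target value by its bin mean: a rough E[target | given]
  cond_mean <- ave(target, bin)
  # Share of the variance of `target` captured by the conditional mean
  var(cond_mean) / var(target)
}

set.seed(123)
n <- 1000
X <- rnorm(n)
Y <- X^2 + rnorm(n, sd = 0.5)

gmc_y_x <- gmc_binned(X, Y)  # GMC(Y|X): expected to be large
gmc_x_y <- gmc_binned(Y, X)  # GMC(X|Y): expected to be near zero
print(c("GMC(Y|X)" = gmc_y_x, "GMC(X|Y)" = gmc_x_y))
```

Binning biases the estimate (upward for weak dependence, downward when the conditional mean varies a lot within a bin), so the numbers are only indicative. The point is the gap between the two directions.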

Feature Ranking

# Generate sample data with multiple predictors
set.seed(123)
n <- 500
X1 <- rnorm(n)
X2 <- rnorm(n)
X3 <- rnorm(n)
Y <- 2 * X1 + X2^2 + rnorm(n, sd = 0.5)
X <- cbind(X1, X2, X3)

# Rank features by GMC
ranking <- GMC_feature_ranking(X, Y)
print(ranking)
#>   Variable        GMC
#> 1       X1 0.58288115
#> 2       X2 0.32093376
#> 3       X3 0.01087298
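
The ranking above can be compared with the population values implied by the simulation design. Because X1, X2, X3, and the noise are independent, each predictor's GMC is the variance of its own term in Y divided by Var(Y). The sketch below is plain arithmetic in base R; the variable names are ours.

```r
# Population variances of the additive terms in Y = 2 * X1 + X2^2 + noise
var_x1_term <- 2^2 * 1   # Var(2 * X1), with Var(X1) = 1
var_x2_term <- 2         # Var(X2^2) = 2 for a standard normal X2
var_noise   <- 0.5^2     # noise variance

var_y <- var_x1_term + var_x2_term + var_noise

# Population GMC(Y | Xj) for each predictor; X3 does not enter Y at all
gmc_theory <- c(X1 = var_x1_term, X2 = var_x2_term, X3 = 0) / var_y
print(gmc_theory)
#>   X1   X2   X3
#> 0.64 0.32 0.00
```

The estimated values in the ranking follow the same order and are roughly in line with these targets, including the near-zero score for the irrelevant predictor X3.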

References

Zheng, S., Shi, N.Z., & Zhang, Z. (2012). Generalized Measures of Correlation for Asymmetry, Nonlinearity, and Beyond. Journal of the American Statistical Association, 107(499), 1239-1252.