Provides tools for detecting XOR-like patterns in variable pairs. Includes visualizations for pattern exploration.
Traditional feature selection methods often miss complex non-linear
relationships where variables interact to produce class differences. The
detectXOR package specifically targets XOR
patterns - relationships where class discrimination only
emerges through variable interactions, not individual variables
alone.
π XOR pattern detection - Statistical
identification using ΟΒ² and Wilcoxon tests
π Correlation analysis - Class-wise Kendall Ο
coefficients
π Visualization - Spaghetti plots and decision
boundary visualizations
β‘ Parallel processing - Multi-core acceleration for
large datasets
π¬ Robust statistics - Winsorization and scaling
options for outlier handling
Install the development version from GitHub:
# Install devtools if needed
if (!requireNamespace("devtools", quietly = TRUE)) { install.packages("devtools") }
# Install detectXOR
devtools::install_github("JornLotsch/detectXOR")The package requires R β₯ 3.5.0 and depends on: - dplyr,
tibble (data manipulation) - ggplot2,
ggh4x, scales (visualization) -
future, future.apply, pbmcapply,
parallel (parallel processing) - reshape2,
glue (data processing and string manipulation) -
DescTools (statistical tools) - Base R packages:
stats, utils, methods,
grDevices
Optional packages (suggested): - testthat,
knitr, rmarkdown (development and
documentation) - doParallel, foreach
(additional parallel processing options)
library(detectXOR)
# Load example data
data(XOR_data)
# Detect XOR patterns with default settings
results <- detectXOR(XOR_data, class_col = "class")
# View summary
print(results$results_df)# Detection with custom thresholds and parallel processing
results <- detect_xor(
  data = XOR_data,
  class_col = "class",
  p_threshold = 0.01,
  tau_threshold = 0.4,
  max_cores = 4,
  extreme_handling = "winsorize",
  scale_data = TRUE
)detectXOR() -
Main detection function| Parameter | Type | Default | Description | 
|---|---|---|---|
data | 
data.frame | required | Input dataset with variables and class column | 
class_col | 
character | "class" | 
Name of the class/target variable column | 
check_tau | 
logical | TRUE | 
Compute class-wise Kendall Ο correlations | 
compute_axes_parallel_significance | 
logical | TRUE | 
Perform group-wise Wilcoxon tests | 
p_threshold | 
numeric | 0.05 | 
Significance threshold for statistical tests | 
tau_threshold | 
numeric | 0.3 | 
Minimum absolute Ο for βstrongβ correlation | 
abs_diff_threshold | 
numeric | 20 | 
Minimum absolute difference for practical significance | 
split_method | 
character | "quantile" | 
Tile splitting method: "quantile" or
"range" | 
max_cores | 
integer | NULL | 
Maximum cores for parallel processing (auto-detect if NULL) | 
extreme_handling | 
character | "winsorize" | 
Outlier handling: "winsorize", "remove",
or "none" | 
winsor_limits | 
numeric vector | c(0.05, 0.95) | 
Winsorization percentiles | 
scale_data | 
logical | TRUE | 
Standardize variables before analysis | 
use_complete | 
logical | TRUE | 
Use only complete cases (remove NA values) | 
The detectXOR() function returns a list with two
components: ### results_df - Summary data frame
| Column | Description | 
|---|---|
var1, var2 | 
Variable pair names | 
xor_shape_detected | 
Logical: XOR pattern identified | 
chi_sq_p_value | 
ΟΒ² test p-value for tile independence | 
tau_class_0, tau_class_1 | 
Class-wise Kendall Ο coefficients | 
tau_difference | 
Absolute difference between class Ο values | 
wilcox_p_x, wilcox_p_y | 
Wilcoxon test p-values for each axis | 
significant_wilcox | 
Logical: significant group differences detected | 
pair_list - Detailed
resultsContains comprehensive analysis for each variable pair including: - Tile pattern analysis results - Statistical test outputs - Processed data subsets - Intermediate calculations
| Function | Description | Key Parameters | 
|---|---|---|
generate_spaghetti_plot_from_results() | 
Creates connected line plots showing variable trajectories for XOR-detected pairs | results, data, class_col,
scale_data = TRUE | 
generate_xy_plot_from_results() | 
Generates scatter plots with decision boundary lines for detected XOR patterns | results, data, class_col,
scale_data = TRUE,
quantile_lines = c(1/3, 2/3),
line_method = "quantile" | 
Both functions return ggplot objects that can be displayed or saved manually.
# Generate plots
generate_spaghetti_plot_from_results(results, XOR_data) 
generate_xy_plot_from_results(results, XOR_data)| Function | Description | Key Parameters | 
|---|---|---|
generate_xor_reportConsole() | 
Creates console-friendly formatted report with optional plots | results, data, class_col,
scale_data = TRUE, show_plots = TRUE | 
generate_xor_reportHTML() | 
Generates comprehensive HTML report with interactive elements | results, data, class_col,
output_file, open_browser = TRUE | 
# Generate formatted report 
generate_xor_reportHTML(results, XOR_data, class_col = "class")The report will be automaticlaly opened in the system standard web browser.
future::multisession for
parallel processingpbmcapply::pbmclapply with fork-based parallelismdetectXOR/
βββ R/                 # Package source code
βββ man/               # Package documentation
βββ data/              # Example dataset
βββ issues/            # Problem reporting
βββ analyses/          # Files used to generate or plot publictaion data sets (not in library)Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests on GitHub. ## License GPL-3 ## Citation
For citation details or to request a formal publication reference, please contact the maintainer.