Add GA-RFD #750

Open
py-cyber wants to merge 14 commits into Desbordante:main from py-cyber:ga-rfd
Conversation

py-cyber (Collaborator) commented May 5, 2026

This pull request introduces GA‑RFD, a genetic algorithm for discovering relaxed functional dependencies (RFDs).

New GA‑RFD Discovery Algorithm

  • src/core/algorithms/rfd/ga_rfd/ga_rfd.h and src/core/algorithms/rfd/ga_rfd/ga_rfd.cpp
    Core implementation of the GaRfd class. It builds difference‑dataset bit matrices, maintains an LRU cache for support computation, and evolves a population of candidate RFDs using selection, crossover, and mutation operators. The algorithm works on a single table, stores data column‑wise for performance, and outputs a set of unique RFDs with their confidence and support.

  • src/core/algorithms/rfd/rfd.h
    Lightweight RFD struct (LHS mask, RHS index, confidence, support) with ordering for deduplication.

  • src/core/algorithms/rfd/similarity_metric.h and src/core/algorithms/rfd/similarity_metric.cpp
    Abstract SimilarityMetric interface and built‑in metrics. Users can provide custom metrics from Python.
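
For reviewers unfamiliar with the approach, here is a minimal Python sketch of how a candidate RFD could be scored against per‑column pair bitsets. It is not the C++ code from this PR. The names `Rfd`, `similar_pairs`, and `evaluate`, the field names, and the definitions of support (pairs satisfying both LHS and RHS over all pairs) and confidence (pairs satisfying both over pairs satisfying the LHS) are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True, order=True)
class Rfd:
    """Hypothetical mirror of the RFD struct; field names are assumptions."""
    lhs_mask: int      # bit i set => attribute i belongs to the LHS
    rhs_index: int     # index of the single RHS attribute
    confidence: float
    support: float


def equality(a, b):
    """Equality metric: 1.0 for identical values, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


def similar_pairs(column, metric, threshold):
    """Return an int bitset over row pairs; bit k is set when pair k is similar."""
    bits = 0
    for k, (i, j) in enumerate(combinations(range(len(column)), 2)):
        if metric(column[i], column[j]) >= threshold:
            bits |= 1 << k
    return bits


def evaluate(column_bits, n_pairs, lhs_mask, rhs_index):
    """Score one candidate under the assumed support/confidence definitions."""
    lhs_bits = (1 << n_pairs) - 1          # start with every pair selected
    for attr, bits in enumerate(column_bits):
        if (lhs_mask >> attr) & 1:         # attribute is part of the LHS
            lhs_bits &= bits
    both = lhs_bits & column_bits[rhs_index]
    lhs_count = bin(lhs_bits).count("1")
    both_count = bin(both).count("1")
    confidence = both_count / lhs_count if lhs_count else 0.0
    return Rfd(lhs_mask, rhs_index, confidence, both_count / n_pairs)


# Column-wise toy table: attribute 0 = city, attribute 1 = zip.
columns = [["A", "A", "B", "B"], ["1", "1", "2", "3"]]
column_bits = [similar_pairs(c, equality, 1.0) for c in columns]
n_pairs = 4 * 3 // 2
candidate = evaluate(column_bits, n_pairs, lhs_mask=0b01, rhs_index=1)
print(candidate)
```

Because the dataclass is frozen and ordered, equal candidates collapse when placed in a `set`, which is one simple way to obtain the deduplication described above.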

Python Bindings

  • src/python_bindings/rfd/bind_ga_rfd.cpp and src/python_bindings/rfd/bind_ga_rfd.h
    Expose GaRfd, RFD, and SimilarityMetric to Python. A helper PySimilarityMetric wraps Python callables so that user‑defined metrics can be passed directly;

  • src/python_bindings/rfd/py_similarity_metric.h
    Python‑friendly metric adapter with GIL handling;

  • src/python_bindings/py_util/py_to_any.cpp
    Registers a converter for std::shared_ptr<SimilarityMetric> to enable passing Python functions as metrics;

  • Example
    examples/basic/mining_ga_rfd.py demonstrates loading Iris, configuring metrics, setting GA parameters, and printing the discovered RFDs.
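
To illustrate the adapter idea behind PySimilarityMetric, the following pure‑Python sketch wraps an arbitrary callable behind a metric interface. It does not use the bindings from this PR. The method name `compare`, the class `CallableMetric`, and the convention that metrics return a similarity in [0, 1] are assumptions; the real adapter lives in C++ and also handles the GIL.

```python
from abc import ABC, abstractmethod


class SimilarityMetric(ABC):
    """Hypothetical interface; the method name `compare` is an assumption."""

    @abstractmethod
    def compare(self, a: str, b: str) -> float:
        ...


class CallableMetric(SimilarityMetric):
    """Adapter that lets a plain Python callable act as a metric."""

    def __init__(self, fn):
        if not callable(fn):
            raise TypeError("metric must be callable")
        self._fn = fn

    def compare(self, a: str, b: str) -> float:
        return float(self._fn(a, b))


def levenshtein_similarity(a: str, b: str) -> float:
    """Normalised similarity: 1 - edit_distance / max(len(a), len(b))."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))         # distances for the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return 1.0 - prev[-1] / max(len(a), len(b))


metric = CallableMetric(levenshtein_similarity)
print(metric.compare("kitten", "sitting"))
```

Wrapping the callable once, rather than checking its type at every comparison, keeps the hot loop free of Python‑side validation.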

Tests

  • src/tests/unit/test_ga_rfd.cpp
    Extensive test suite covering:
    • Exceptions on empty tables, single attributes, too many attributes, and single rows;
    • Parameterised tests on various datasets (Iris, BreastCancer, Neighbors10k, TestFD, TestLong, TestWide, etc.);
    • Correctness checks (LHS ≠ 0, RHS not in LHS, confidence ≥ ε);
    • Performance benchmarks (Neighbors10k with both fast and slow settings);
    • Determinism with fixed seed;
    • Levenshtein and equality metric variants.
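
The correctness and determinism checks listed above can be sketched in Python as follows. This is an illustration, not the test code from src/tests/unit/test_ga_rfd.cpp. The helper names `is_valid_rfd` and `mutate`, the parameter `min_confidence` (standing in for ε), and the bit‑flip mutation scheme are assumptions.

```python
import random


def is_valid_rfd(lhs_mask, rhs_index, confidence, min_confidence):
    """Check the three invariants: LHS != 0, RHS not in LHS, confidence >= epsilon."""
    lhs_non_empty = lhs_mask != 0
    rhs_outside_lhs = ((lhs_mask >> rhs_index) & 1) == 0
    return lhs_non_empty and rhs_outside_lhs and confidence >= min_confidence


def mutate(lhs_mask, n_attrs, rng, rate):
    """Flip each attribute bit independently with probability `rate` (assumed scheme)."""
    for attr in range(n_attrs):
        if rng.random() < rate:
            lhs_mask ^= 1 << attr
    return lhs_mask


# Two generators seeded identically must yield identical mutation results.
first = mutate(0b0101, 4, random.Random(42), 0.5)
second = mutate(0b0101, 4, random.Random(42), 0.5)
print(first == second)
print(is_valid_rfd(0b0101, 1, 0.9, 0.8))
```

Passing an explicit `random.Random` instance instead of using the module‑level functions keeps each run reproducible for a given seed.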

Help links:
report
FlameGraph
paper

@py-cyber py-cyber mentioned this pull request May 6, 2026
@Desbordante Desbordante deleted a comment from github-actions Bot May 6, 2026
@py-cyber py-cyber force-pushed the ga-rfd branch 2 times, most recently from 0cdb179 to 9f46b95 Compare May 6, 2026 19:26
@py-cyber py-cyber force-pushed the ga-rfd branch 3 times, most recently from c45f483 to 30f2c99 Compare May 11, 2026 21:02
@py-cyber py-cyber force-pushed the ga-rfd branch 2 times, most recently from 0125af1 to 27cc596 Compare May 16, 2026 13:59
@py-cyber py-cyber force-pushed the ga-rfd branch 2 times, most recently from 202b724 to d42f01c Compare May 17, 2026 00:39