Commit

refactor(setup): update scikit-learn to scikit-lexicographical-trees [cd build]
simonprovost committed Jul 31, 2024
1 parent 9f4b78f commit 213cc0f
Showing 5 changed files with 65 additions and 20 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/wheels.yml
@@ -112,7 +112,7 @@ jobs:
           # Non-macos arm64 envrionments already have conda installed
           echo "CONDA_HOME=/usr/local/miniconda" >> $GITHUB_ENV

-      - name: Build and test wheels
+      - name: 👀Build and test wheels
         env:
           CIBW_PRERELEASE_PYTHONS: ${{ matrix.prerelease }}
           CIBW_ENVIRONMENT: SKLEARN_SKIP_NETWORK_TESTS=1
@@ -133,7 +133,7 @@ jobs:

   # Build the source distribution under Linux
   build_sdist:
-    name: Source distribution
+    name: ⚙️ Source distribution
     runs-on: ubuntu-latest
     needs: check_build_trigger
     if: needs.check_build_trigger.outputs.build
29 changes: 29 additions & 0 deletions README.rst
@@ -1,3 +1,32 @@
+Scikit-lexicographical-trees
+=============================
+
+**Scikit-lexicographical-trees** is an adaptation of the Scikit-Learn trees module to support lexicographical approaches
+for longitudinal data. Refer to the following document for further information:
+`Lexico Decision Tree Classifier <https://simonprovost.github.io/scikit-longitudinal/API/estimators/trees/lexico_decision_tree_classifier/>`_.
+
+Classifiers and regressors supporting lexicographical approaches:
+
+🌲 Decision Tree Classifier
+🌲 Random Forest Classifier
+🌲 Decision Tree Regressor
+
+For more information, refer to the Scikit-Longitudinal
+– main library utilizing the current fork – : `Scikit-Longitudinal <https://simonprovost.github.io/scikit-longitudinal>`_.
+
+Acknowledgements
+----------------
+
+This fork is from NeuroData, an endeavor that paved the path for improving trees/forests in Scikit-Learn.
+Nonetheless, while our compliments go to the NeuroData team, we also like to thank the original Scikit-Learn
+team for their excellent effort over the years in providing a robust and versatile library for machine learning.
+
+Do not forget to cite them!
+
+💬💬💬💬💬💬💬💬💬💬
+
+🔄🔄🔄 Original Scikit-Learn README 🔄🔄🔄
+
 .. -*- mode: rst -*-

 |Azure| |CirrusCI| |Codecov| |CircleCI| |Nightly wheels| |Black| |PythonVersion| |PyPi| |DOI| |Benchmark|
16 changes: 8 additions & 8 deletions pyproject.toml
@@ -1,10 +1,10 @@
 [project]
-name = "scikit-learn"
-version = "1.6.dev0"
+name = "scikit-lexicographical-trees"
+version = "0.0.4"
 description = "A set of python modules for machine learning and data mining"
 readme = "README.rst"
 maintainers = [
-    {name = "scikit-learn developers", email="[email protected]"},
+    {name = "scikit-longitudinal-developpers", email="[email protected]"},
 ]
 dependencies = [
     "numpy>=1.19.5",
@@ -36,11 +36,11 @@ classifiers=[
 ]

 [project.urls]
-homepage = "https://scikit-learn.org"
-source = "https://github.com/scikit-learn/scikit-learn"
-download = "https://pypi.org/project/scikit-learn/#files"
-tracker = "https://github.com/scikit-learn/scikit-learn/issues"
-"release notes" = "https://scikit-learn.org/stable/whats_new"
+homepage = "https://simonprovost.github.io/scikit-longitudinal/"
+source = "https://github.com/simonprovost/scikit-lexicographical-trees"
+download = "https://pypi.org/project/scikit-lexicographical-trees/#files"
+tracker = "https://github.com/simonprovost/scikit-lexicographical-trees/issues"
+"release notes" = "https://github.com/simonprovost/scikit-longitudinal/releases"

 [project.optional-dependencies]
 build = ["numpy>=1.19.5", "scipy>=1.6.0", "cython>=3.0.10", "meson-python>=0.16.0"]
20 changes: 11 additions & 9 deletions setup.py
@@ -28,21 +28,22 @@
 builtins.__SKLEARN_SETUP__ = True


-DISTNAME = "scikit-lexicographical_trees"
+DISTNAME = "scikit-lexicographical-trees"
 DESCRIPTION = "A set of python modules for machine learning and data mining"
 with open("README.rst") as f:
     LONG_DESCRIPTION = f.read()
-MAINTAINER = "scikit-learn developers"
-MAINTAINER_EMAIL = "[email protected]"
-URL = "https://scikit-learn.org"
-DOWNLOAD_URL = "https://pypi.org/project/scikit-learn/#files"
+MAINTAINER = "Simon Provost"
+MAINTAINER_EMAIL = "[email protected]"
+URL = "https://simonprovost.github.io/scikit-longitudinal/"
+DOWNLOAD_URL = "https://pypi.org/project/scikit-lexicographical-trees/#files"
 LICENSE = "new BSD"
 PROJECT_URLS = {
-    "Bug Tracker": "https://github.com/scikit-learn/scikit-learn/issues",
-    "Documentation": "https://scikit-learn.org/stable/documentation.html",
-    "Source Code": "https://github.com/scikit-learn/scikit-learn",
+    "Bug Tracker": "https://github.com/simonprovost/scikit-lexicographical-trees/issues",
+    "Documentation": "https://simonprovost.github.io/scikit-longitudinal/",
+    "Source Code": "https://github.com/simonprovost/scikit-lexicographical-trees",
 }

+
 # We can actually import a restricted version of sklearn that
 # does not need the compiled code
 import sklearn  # noqa
@@ -51,7 +52,8 @@
 from sklearn.externals._packaging.version import parse as parse_version  # noqa


-VERSION = sklearn.__version__
+# VERSION = sklearn.__version__
+VERSION = "0.0.4"

 # Custom clean command to remove build artifacts

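A note on the `DISTNAME` change in `setup.py`: package indexes that follow PEP 503 compare project names after normalization, so the underscore and hyphen spellings should resolve to the same project, and the rename mainly aligns `setup.py` with `pyproject.toml`. A minimal sketch of that normalization rule follows; the helper name `normalize` is illustrative and not part of this repository.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name following the rule given in PEP 503."""
    # Runs of "-", "_" and "." collapse to a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


old_distname = "scikit-lexicographical_trees"  # value removed by this commit
new_distname = "scikit-lexicographical-trees"  # value added by this commit

# Both spellings map to the same normalized project name.
same_project = normalize(old_distname) == normalize(new_distname)
print(same_project)
```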
16 changes: 15 additions & 1 deletion sklearn/ensemble/_gb.py
@@ -394,6 +394,9 @@ def __init__(
         validation_fraction=0.1,
         n_iter_no_change=None,
         tol=1e-4,
+        splitter="best",
+        features_group=None,
+        threshold_gain=0.0015,
     ):
         self.n_estimators = n_estimators
         self.learning_rate = learning_rate
@@ -416,6 +419,9 @@ def __init__(
         self.validation_fraction = validation_fraction
         self.n_iter_no_change = n_iter_no_change
         self.tol = tol
+        self.splitter = splitter
+        self.features_group = features_group
+        self.threshold_gain = threshold_gain

     @abstractmethod
     def _encode_y(self, y=None, sample_weight=None):
@@ -470,7 +476,7 @@ def _fit_stage(
         # induce regression tree on the negative gradient
         tree = DecisionTreeRegressor(
             criterion=self.criterion,
-            splitter="best",
+            splitter=self.splitter,
             max_depth=self.max_depth,
             min_samples_split=self.min_samples_split,
             min_samples_leaf=self.min_samples_leaf,
@@ -480,6 +486,8 @@ def _fit_stage(
             max_leaf_nodes=self.max_leaf_nodes,
             random_state=random_state,
             ccp_alpha=self.ccp_alpha,
+            threshold_gain=self.threshold_gain,
+            features_group=self.features_group,
         )

         if self.subsample < 1.0:
@@ -1470,6 +1478,9 @@ def __init__(
         n_iter_no_change=None,
         tol=1e-4,
         ccp_alpha=0.0,
+        splitter="best",
+        features_group=None,
+        threshold_gain=0.0015,
     ):
         super().__init__(
             loss=loss,
@@ -1492,6 +1503,9 @@ def __init__(
             n_iter_no_change=n_iter_no_change,
             tol=tol,
             ccp_alpha=ccp_alpha,
+            splitter=splitter,
+            features_group=features_group,
+            threshold_gain=threshold_gain,
         )

     def _encode_y(self, y, sample_weight):
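The `_gb.py` change follows a common pattern: the ensemble stores the extra constructor arguments unchanged and forwards them to each base tree it builds. A minimal sketch of that pattern with stand-in classes is shown here. `ToyTree` and `ToyBoosting` are hypothetical and are not the fork's estimators; only the parameter names and defaults are taken from the diff, and the `features_group` value is an assumed example shape.

```python
class ToyTree:
    """Hypothetical stand-in for a base tree; it only records its options."""

    def __init__(self, splitter="best", features_group=None, threshold_gain=0.0015):
        self.splitter = splitter
        self.features_group = features_group
        self.threshold_gain = threshold_gain


class ToyBoosting:
    """Hypothetical stand-in for an ensemble that forwards tree options."""

    def __init__(
        self,
        n_estimators=3,
        splitter="best",
        features_group=None,
        threshold_gain=0.0015,
    ):
        # Store constructor arguments as-is, mirroring the diff's __init__.
        self.n_estimators = n_estimators
        self.splitter = splitter
        self.features_group = features_group
        self.threshold_gain = threshold_gain

    def fit_stages(self):
        # One tree per stage; each receives the ensemble's stored options.
        self.estimators_ = [
            ToyTree(
                splitter=self.splitter,
                features_group=self.features_group,
                threshold_gain=self.threshold_gain,
            )
            for _ in range(self.n_estimators)
        ]
        return self


# Assumed example values; the grouping below is purely illustrative.
model = ToyBoosting(splitter="random", features_group=[[0, 1], [2, 3]]).fit_stages()
print(len(model.estimators_), model.estimators_[0].splitter)
```

Before this commit the splitter passed to each stage was hard-coded to `"best"`, so the ensemble-level argument is what makes the per-tree behavior configurable.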
