
Development #9

Merged
merged 5 commits into from
Jan 20, 2024
51 changes: 51 additions & 0 deletions .github/workflows/deploymemt.yml
@@ -0,0 +1,51 @@
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Python application

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: ['main']
  workflow_run:
    workflows: ['ci']
    types:
      - completed

permissions:
  contents: read

jobs:
  cd:
    # Only run this job if new work is pushed to "main"
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    # Set up operating system
    runs-on: ubuntu-latest
    permissions:
      id-token: write
    environment:
      name: pypi
      url: https://pypi.org/p/ordinalgbt
    # Define job steps
    steps:
      - name: Set up Python 3.9
        uses: actions/setup-python@v3
        with:
          python-version: 3.9
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build
      - uses: actions/checkout@v3
      # Here we run build to create a wheel and a
      # .tar.gz source distribution.
      - name: Build package
        run: python -m build --sdist --wheel
      # Finally, we use a pre-defined action to publish
      # our package in place of twine.
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
      - name: Test install from PyPI
        run: |
          pip install ordinalgbt
35 changes: 1 addition & 34 deletions .github/workflows/python-app.yml
@@ -37,37 +37,4 @@ jobs:
      - uses: chartboost/ruff-action@v1
      - name: Test with pytest
        run: |
          pytest
  cd:
    needs: ci
    # Only run this job if new work is pushed to "main"
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    # Set up operating system
    runs-on: ubuntu-latest
    permissions:
      id-token: write
    environment:
      name: pypi
      url: https://pypi.org/p/ordinalgbt
    # Define job steps
    steps:
      - name: Set up Python 3.9
        uses: actions/setup-python@v3
        with:
          python-version: 3.9
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build
      - uses: actions/checkout@v3
      # Here we run build to create a wheel and a
      # .tar.gz source distribution.
      - name: Build package
        run: python -m build --sdist --wheel
      # Finally, we use a pre-defined action to publish
      # our package in place of twine.
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
      - name: Test install from PyPi
        run: |
          pip install ordinalgbt
          pytest
11 changes: 10 additions & 1 deletion README.md
@@ -56,4 +56,13 @@ The `predict_proba` method can be used to get the probabilities of each class:
y_proba = model.predict_proba(X_new)

print(y_proba)
```
```
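For readers skimming the diff, a self-contained version of the usage above might look like the sketch below. The `LGBMOrdinal` import path, its scikit-learn-style `fit`/`predict_proba` interface, and the synthetic data are assumptions made for illustration, not the package's verified API.

```python
# Hypothetical end-to-end example; the import path and estimator interface are assumed.
import numpy as np
from ordinalgbt.lgb import LGBMOrdinal

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))                 # 500 samples, 5 features
y = np.digitize(X[:, 0], bins=[-0.5, 0.5])    # ordinal labels 0 < 1 < 2

model = LGBMOrdinal()
model.fit(X, y)

X_new = rng.normal(size=(10, 5))
y_proba = model.predict_proba(X_new)          # assumed to return one probability column per ordered class
print(y_proba)
```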

## TODOs
* Create XGBoost and Catboost implementations
* Bring test coverage to 100%
* Implement the all-thresholds loss function
* Implement the ordistic loss function
* Create more stable sigmoid calculation (see the sketch after this list)
* Experiment with bounded and unbounded optimisation for the thresholds
* Identify way to reduce jumps due to large gradient
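On the "more stable sigmoid calculation" item: the usual concern is that a naive `1 / (1 + exp(-z))` overflows for large negative `z`. A minimal sketch of one common piecewise formulation follows; it is not the library's current implementation, just an illustration of the idea.

```python
import numpy as np

def stable_sigmoid(z):
    """Numerically stable logistic function, evaluated piecewise to avoid overflow."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, rewrite as exp(z) / (1 + exp(z)) so exp() only sees negative inputs.
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))  # [0. 0.5 1.] with no overflow warnings
```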
10 changes: 9 additions & 1 deletion docs/motivation.ipynb
@@ -9,7 +9,7 @@
"\n",
"Usually when faced with prediction problems involving ordered labels (i.e. low, medium, high) and tabular data, data scientists turn to regular multinomial classifiers from the gradient boosted tree family of models, because of their ease of use, speed of fitting, and good performance. Parametric ordinal models have been around for a while, but they have not been popular because of their poor performance compared to the gradient boosted models, especially for larger datasets.\n",
"\n",
"Although classifiers can predict ordinal labels adequately, they require building as many classifiers as there are labels to predict. This approach, however, leads to slower training times, and confusing feature interpretations. For example, a feature which is positively associated with the increasing order of the label set (i.e. as the feature's value grows, so do the probabilities of the higher ordered labels), will va a positive association with the highest ordered label, negative with the lowest ordered, and a \"concave\" association with the middle ones."
"Although classifiers can predict ordinal labels adequately, they require building as many classifiers as there are labels to predict. This approach, however, leads to slower training times, and confusing feature interpretations. For example, a feature which is positively associated with the increasing order of the label set (i.e. as the feature's value grows, so do the probabilities of the higher ordered labels), will va a positive association with the highest ordered label, negative with the lowest ordered, and a \"concave\" association with the middle ones.\n"
]
},
{
@@ -33,6 +33,14 @@
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"There's been recurring requests from the community for an ordinal loss implementation in all of the major gradient boosting model frameworks ([LightGBM](https://github.com/microsoft/LightGBM/issues/5882), [XGBoost](https://github.com/dmlc/xgboost/issues/5243), [XGBoost](https://github.com/dmlc/xgboost/issues/695), [CatBoost](https://github.com/catboost/catboost/issues/1994))."
]
},
{
"cell_type": "markdown",
"metadata": {