Readying to cut 0.5.0 release
caseykneale committed Aug 2, 2019
1 parent 61c1f67 commit c169ef0
Showing 23 changed files with 39 additions and 32 deletions.
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
 name = "ChemometricsTools"
 uuid = "a9718f02-dbee-5ae5-ad0e-dfbd07fa387b"
 authors = ["caseykneale "]
-version = "0.4.6"
+version = "0.5.0"
 
 [deps]
 Arpack = "7d9fca2a-8960-54d3-9f78-7d1dccf2cb97"
16 changes: 8 additions & 8 deletions README.md
@@ -18,13 +18,14 @@ This package contains a collection of tools to perform fundamental and advanced
 - [Regression](https://github.com/caseykneale/ChemometricsTools.jl/blob/master/shootouts/RegressionShootout.jl)
 - [Fault Detection](https://github.com/caseykneale/ChemometricsTools.jl/blob/master/shootouts/AnomalyShootout.jl)
 
-### Package Status => "Registrator" release! (v 0.4.6)
-ChemometricsTools is pretty new, and was recently accepted to be registered as an official Julia package! Yep, so you can ```Pkg.add("ChemometricsTools")``` to install it. The git repo's master branch has the most stable version right now; I've fixed a lot of bugs since 0.2.3. In 0.4.6 almost all of the functionality available can reliably be used/abused, and the documentation is getting there, but it's hard to keep up with all the features I've been adding. There are probably still a few bugs. Some interesting plans for v0.5.0, but I've personally been testing this package doing some work with a fellow analytical chemist.
+### Package Status => 50/50 release! (v 0.5.0)
+ChemometricsTools has been accepted as an official Julia package! Yep, so you can ```Pkg.add("ChemometricsTools")``` to install it. The git repo's master branch has the most advanced version right now, but may be less reliable because I like to do dev on it. I've fixed a lot of bugs since 0.2.3 (the first release). In 0.4.8 almost all of the functionality available can reliably be used/abused, and the documentation is getting there, but it's hard to keep up with all the features. There are probably still a few bugs; if you find one, don't be shy - file an issue. Still working out some plans for v0.6.0 - slowly but surely we'll get there.
 
 ### Version Release Strategy
 - < 0.3.0 : Mapping functionality, prototyping
-- *< 0.5.0 : Testing via actual usage on real data, look for missing essentials*
-- < 0.7.5 : Public input (find those bugs!). Complete docs with examples. Adequate Unit Tests.
+- < 0.5.0 : Testing via actual usage on real data, look for missing essentials
+- *< 0.6.0 : Bake in convenience functions for ease of use. Flesh out Documentation.*
+- < 0.7.5 : Public input (find those bugs!). Adequate Unit Tests.
 - < 1.0.0 : Focus on performance, stability, generalizability, lock down the package syntax.
 
 # Package Highlights
@@ -37,13 +38,13 @@ Multiple transformations can easily be chained together and stored using "Pipeli
 ChemometricsTools offers easy to use iterators for K-fold validations, and moving window sampling/training. More advanced sampling methods, like Kennard Stone, are just a function call away. Convenience functions for interval selections, weighting regression ensembles, etc. are also available. These allow for ensemble models like SIPLS, P-DS, P-OSC, etc. to be built quickly. With the tools included both in this package and Base Julia, nothing should stand in your way.
 
 ### Regression Modeling
-This package features dozens of regression performance metrics, and a few built-in plots (Bland Altman, QQ, Interval Overlays, etc.) are included. The list of regression methods currently includes: CLS, Ridge, Kernel Ridge, LS-SVM, PCR, PLS(1/2), ELMs, Regression Trees, Random Forest... More to come. Chemometricians love regressions!
+This package features dozens of regression performance metrics, and a few built-in plots (Bland Altman, QQ, Interval Overlays, etc.) are included. The list of regression methods currently includes: CLS, Ridge, Kernel Ridge, LS-SVM, PCR, PLS(1/2), ELMs, Regression Trees, Random Forest, Monotone Regression... More to come. Chemometricians love regressions!
 
 ### Classification Modeling
-In-house classification encodings (one cold/one hot), and easy to retrieve global or multiclass performance statistics. ChemometricsTools currently includes: LDA/PCA with Gaussian discriminants, also Hierarchical LDA, multinomial softmax/logistic regression, PLS-DA, K-NN, Gaussian Naive Bayes, Classification Trees, Random Forest, Probabilistic Neural Networks, LinearPerceptrons, and more to come.
+In-house classification encodings (one cold/one hot), and easy to retrieve global or multiclass performance statistics. ChemometricsTools currently includes: LDA/PCA with Gaussian discriminants, also Hierarchical LDA, multinomial softmax/logistic regression, PLS-DA, K-NN, Gaussian Naive Bayes, Classification Trees, Random Forest, Probabilistic Neural Networks, LinearPerceptrons, and more to come. You can also conveniently dump classification statistics to LaTeX/CSV reports!
 
 ## Specialized tools?
-This package has tools for specialized fields of analysis. For instance, fractional derivatives for the electrochemists (and the adventurous), a handful of smoothing methods for spectroscopists, curve resolution for forensics, process fault detection methods, etc. There are certainly plans for other tools for analyzing chemical data that packages in other languages have seemingly left out. Stay tuned.
+This package has tools for specialized fields of analysis. For instance, fractional derivatives for the electrochemists (and the adventurous), a handful of smoothing methods for spectroscopists, curve resolution (unimodal and nonnegativity constraints available) for forensics, process fault detection methods, etc. There are certainly plans for other tools for analyzing chemical data that packages in other languages have seemingly left out. Stay tuned.
 
 ## Where's the Data?
 Right now I don't have rights to provide much data; but the iris, Tecator meat data, and a NASA fault detection dataset are included. I'd love for a collaborator to contribute some: spectra, chromatograms, etc. Please reach out to me if you wish to collaborate/contribute. There's a good chance in a week or so I'll be reaching out to the community for these sorts of things; in the meantime you can load in your own datasets using the Julia ecosystem.
@@ -52,7 +53,6 @@ Right now I don't have rights to provide much data; but the iris, Tecator meat d
 Well, I'd love to hammer in some time series methods. That was originally part of the plan. Then I realized [OnlineStats.jl](https://github.com/joshday/OnlineStats.jl) already has pretty much everything covered. Similarly, if you want clustering methods, just install [Clustering.jl](https://github.com/JuliaStats/Clustering.jl). I may add a few supportive odds and ends in here (or contribute to the packages directly) but really, most of the Julia 1.0+ ecosystem is really reliable, well made, and community supported.
 
 ## ToDo:
-- Make function to dump classification statistics to an autogenerated LaTeX report.
 - Gaussian Discriminant plotting function (needs cleaning and documenting)
 - Long-term: SIMCA, and MultiWAY PLS
 - Hyperspectral data preprocessing methods that fit into pipelines/transforms.